PRODUS - Integrations are Unavailable

Major incident Production US Integrations
2023-10-17 06:02 UTC · 1 hour, 53 minutes

Updates

Post-mortem

Summary:

On Oct 16, 2023, multiple EPs were found to be stuck in the waiting state within the PRODUS environment. The issue was initially detected when the operational team observed that integration jobs were running longer than expected, resulting in reports of delays. Upon investigation, it was determined that the issue was caused by a problem with database optimization. To permanently resolve the issue, improved optimizations were applied to the database system, which restored service. All integrations were subsequently validated, and EPs were confirmed to be functioning without issues.

Customer Impact:

Delays in integration jobs impacted customer operations.

Root Cause:

In certain scenarios, database performance was impacted by misconfigured optimization parameters.

Remediations:

To address the immediate issue, the impacted service was restarted. Configuration updates to optimize the database for the relevant scenarios were also implemented.

Future Mitigating Actions:

The Engineering team are reviewing enhancements to the product and to performance testing to mitigate similar occurrences in the future, and will implement the actions identified by that review.

November 8, 2023 · 08:53 UTC
Resolved

We have confirmed internally and with our customers that the Aera platform is now fully restored.

We appreciate your patience during this incident and apologise for any inconvenience that this issue may have caused. Our teams are now documenting a comprehensive root cause analysis, which we will share with you shortly.

If you have any questions or experience any further problems, please don’t hesitate to reach out to our Support team at support@aeratechnology.com.

October 17, 2023 · 07:54 UTC
Investigating

We are continuing to work towards restoring service for the integration issues. Our engineers are diligently working to narrow down the root cause. We will continue to keep you informed as the investigation progresses. We appreciate your continued patience whilst we work towards resolution.

October 17, 2023 · 07:15 UTC
Investigating

We are continuing to investigate the integration issues. Our engineers are actively working to restore service as quickly as possible. Thank you for bearing with us whilst we work through these issues.

October 17, 2023 · 06:02 UTC
