Performance degradation

Minor incident
UAT IRL: Platform Access, Analytics, Cognitive WorkBench, Data WorkBench, Discovery, Integrations, Other Features, Skills
UAT US: Platform Access, Analytics, Cognitive WorkBench, Data WorkBench, Discovery, Integrations, Other Features, Skills
2024-06-20 12:33 UTC · 10 hours

Updates

Post-mortem

Summary:

On 20 June 2024, a small subset of customers experienced slowdowns in parts of our platform, including the Monitoring Console and Cognitive WorkBench (CWB). Unexpected demand on the system exceeded its capacity at that time. To address this, our engineers temporarily reduced the system load by pausing specific system components and then restarting them in a controlled manner. This remediation allowed the system to recover fully and restored access to the platform for everyone. Moving forward, we are collaborating with our infrastructure provider and refining our deployment processes for system changes to prevent similar incidents in the future.

Customer Impact:

Some customers experienced slowness in platform features such as the Monitoring Console and CWB, which degraded their user experience and access to the platform.

Root Cause:

Unexpected demand on the system exceeded its capacity because of a capacity issue at the cloud service provider, which led to performance degradation.

Remediations:

To address the issue, our engineers implemented measures to temporarily reduce system load. They achieved this by carefully pausing specific system components and then restarting them in a controlled manner to restore system performance and ensure platform accessibility.

Future Mitigating Actions:

Moving forward, we are enhancing our deployment processes for system changes to better anticipate and manage system demands.
Internal change management processes are under review and will be updated accordingly.

July 1, 2024 · 17:54 UTC
Resolved

We have confirmed internally and with our customers that the Aera platform is now fully restored.

We appreciate your patience during this incident and apologise for any inconvenience it may have caused. Our teams are now documenting a comprehensive root cause analysis, which we will share with you shortly.

If you have any questions or experience any further problems, please don’t hesitate to reach out to our Support team at support@aeratechnology.com.

June 20, 2024 · 22:33 UTC
Investigating

Our engineers have identified a potential cause of the slowness issue and are working on a solution. We understand the business impact this issue may have and are working to restore service as quickly as possible. Again, we thank you for your continued patience and understanding.

June 20, 2024 · 18:09 UTC
Investigating

Our engineers continue to investigate the root cause of the slowness issues with the platform. We understand the business impact this issue may have and are working to restore service as quickly as possible. Again, we thank you for your continued patience and understanding.

June 20, 2024 · 16:08 UTC
Investigating

We are continuing to work towards restoring services for the impacted customers. Our engineers are diligently working to narrow down the root cause. We will continue to keep you informed as the investigation progresses. We appreciate your continued patience whilst we work towards resolution.

June 20, 2024 · 14:34 UTC
Investigating

We are continuing to investigate the slowness issues with the platform. Our engineers are actively working to restore service as quickly as possible. Thank you for bearing with us whilst we work through these issues.

June 20, 2024 · 13:14 UTC
Issue

This notice is to advise you that we are receiving reports that a subset of our customers are experiencing slowness with the platform. We are actively investigating and will provide regular updates until the issues are resolved.

We apologise for the inconvenience this may be causing and appreciate your patience as we investigate further.

June 20, 2024 · 12:33 UTC
