PRODUS is unavailable

Major incident · Production US
Affected components: Platform Access, Analytics, Cognitive WorkBench, Cortex, Data WorkBench, Discovery, Integrations, Other Features, Skills
2025-04-01 16:35 UTC · 51 minutes

Updates

Post-mortem

Summary:

On 1st April 2025, the Aera Platform experienced a service issue impacting the PRODUS environment. Following an investigation by the Production Services and Site Reliability Engineering (SRE) teams, the root cause was identified as infrastructure instability affecting the systems responsible for running core platform services. The SRE team took immediate action by replacing the affected nodes and increasing system capacity. These actions successfully restored platform performance and stability. The service has since been fully recovered and is functioning as expected.

Customer Impact:

A few users experienced service unavailability in the PRODUS environment.

Root Cause:

The issue was caused by infrastructure instability due to resource limitations on nodes running core platform services.

Remediations:

The affected nodes were replaced and system capacity was increased, which restored platform performance and stability.

Future Mitigating Actions:

Refine internal processes to accelerate detection and response times.
Regularly review infrastructure capacity monitoring and adjust it as needed.

April 8, 2025 · 16:00 UTC
Resolved

We have confirmed internally and with our customers that the Aera platform is now fully restored.

We appreciate your patience during this incident and apologise for any inconvenience it may have caused. Our teams are now documenting a comprehensive root cause analysis, which we will share with you shortly.

If you have any questions or experience any further problems, please don’t hesitate to reach out to our Support team via the Aera Support Portal.

April 1, 2025 · 17:26 UTC
Investigating

We are continuing to investigate the issues with platform availability. Our engineers are actively working to restore service as quickly as possible. Thank you for bearing with us whilst we work through these issues.

April 1, 2025 · 16:59 UTC
Issue

We are receiving reports that a subset of our customers is experiencing difficulties with the platform. We are actively investigating and will provide regular updates until the issues are resolved.

We apologise for the inconvenience this may be causing and appreciate your patience as we investigate further.

April 1, 2025 · 16:35 UTC
