UATIRL - Notebook services are unavailable
Updates
Summary:
On 14th February 2025, users reported being unable to access notebooks in UATIRL. The Production Services and Site Reliability Engineering teams investigated and identified instability and performance degradation affecting a single node within the IRL cluster’s storage infrastructure, which made notebooks inaccessible. After initial recovery attempts, the affected node was removed and a new node was provisioned, which restored access.
Customer Impact:
Users were unable to open, edit, or run notebooks within the UATIRL environment.
Root Cause:
The root cause was identified as instability and performance degradation affecting a single node within the IRL cluster’s storage infrastructure.
Remediations:
The team investigated the issue and attempted service restarts. The issue was then resolved by removing the affected node, provisioning a new node and restoring services, which returned notebook access and system stability.
Future Mitigating Actions:
Review and enhance proactive monitoring and alerting mechanisms to detect and address storage performance degradation at an early stage.
We have confirmed internally and with our customers that the Aera platform is now fully restored.
We appreciate your patience during this incident and apologise for any inconvenience it may have caused. Our teams are now documenting a comprehensive root cause analysis, which we will share with you shortly.
If you have any questions or experience any further problems, please don’t hesitate to reach out to our Support team via the Aera Support Portal.
We are continuing to investigate the issues with Notebook services. Our engineers are actively working to restore service as quickly as possible. Thank you for bearing with us whilst we work through these issues.
This notice is to advise you that we are receiving reports from a subset of customers experiencing difficulties with the Notebook services. We are actively investigating and will provide regular updates until the issues are resolved.
We apologise for the inconvenience this may be causing and appreciate your patience as we investigate further.