AUTODESK EVENT ANALYSIS
Incident Number: #COE-INC113840
Incident Date: December 7, 2024
Summary
Between December 7, 2024 at 11:49 PM PST and December 8, 2024 at 10:37 AM PST, customers using the Autodesk Fusion Home tab ‘search panel’ experienced timeouts or significant latency in search results.
Impacted Services
Root Cause
- After implementation of a planned change to resize the Fusion Folders and Files Search cluster for better performance, the cluster started showing very high CPU resource utilization.
- Further analysis showed that the search process was bottlenecked on memory resources, resulting in frequent memory cleanups and very high CPU utilization.
- This led to significant resource constraints, causing high latency and timeouts on user-initiated search requests.
Autodesk Actions
Autodesk has completed a post-incident analysis of the event and identified actions to be taken. These include the following:
- Immediate Action:
o Scaled out the Search cluster.
- Short term:
o Enhance monitoring and alerting on the Fusion Folders and Files search cluster for faster detection of issues.
o Improve stakeholder communication about major changes in the production environment.
o Keep the cloud service provider on standby during high-impact changes.
- Long term:
o Simulate cluster scaling exercises in lower environments.
o Engage with the cloud service provider for faster scale up/scale out options.
o Apply stricter review of the deployment procedures, including:
- Time to complete the change.
- Backout/rollback plan: documented, reviewed, timed, and tested.
- Post-deployment testing and monitoring.
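As an illustration of the enhanced monitoring and alerting listed under the short-term actions, the following is a minimal sketch of a check that raises an alert when a search node shows sustained high CPU and high memory utilization together, the pattern described in the root cause. It is not Autodesk's actual monitoring configuration: the `NodeSample` schema, the threshold values, and the `should_alert` function are all hypothetical and would need to be tuned against real capacity data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeSample:
    """One metrics sample for a search node (hypothetical schema)."""
    cpu_percent: float     # CPU utilization, 0-100
    memory_percent: float  # memory utilization, 0-100


# Hypothetical thresholds; real values would come from capacity testing.
CPU_THRESHOLD = 85.0
MEMORY_THRESHOLD = 90.0
SUSTAINED_SAMPLES = 3  # consecutive samples required before alerting


def should_alert(samples: List[NodeSample]) -> bool:
    """Return True when the most recent SUSTAINED_SAMPLES samples all
    show high CPU and high memory utilization at the same time."""
    if len(samples) < SUSTAINED_SAMPLES:
        return False
    recent = samples[-SUSTAINED_SAMPLES:]
    return all(
        s.cpu_percent >= CPU_THRESHOLD and s.memory_percent >= MEMORY_THRESHOLD
        for s in recent
    )
```

Requiring several consecutive samples over both thresholds is one way to avoid alerting on short spikes while still detecting the combined memory and CPU pressure early.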
Thank you for your patience and understanding.