AUTODESK EVENT ANALYSIS
Incident Number: #COE-INC113840
Incident Date: December 7, 2024
Summary
Between December 7, 2024 at 11:49 PM PST and December 8, 2024 at 10:37 AM PST, customers using the Autodesk Fusion Home tab ‘search panel’ experienced timeouts or significant latency in search results.
Impacted Services
Root Cause
- After a planned change to resize the Fusion Folders and Files Search cluster for better performance was implemented, the cluster started showing very high CPU utilization.
 
- On further analysis, it was discovered that the search process was getting bottlenecked on memory resources, resulting in frequent memory cleanups and very high CPU utilization.
 
- This led to significant resource constraints, causing high latency and timeouts on user-initiated search requests.
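
The pattern described above (memory pressure driving frequent memory cleanups, which in turn drive CPU utilization) can be illustrated with a minimal detection sketch. This is not Autodesk's monitoring implementation. The `NodeSample` type, the `memory_bound_cpu_alert` function, and all threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class NodeSample:
    """One metrics sample from a search node (hypothetical schema)."""
    cpu_pct: float           # CPU utilization, 0-100
    mem_pct: float           # memory used by the search process, 0-100
    cleanups_per_min: float  # memory cleanup cycles per minute


# Hypothetical thresholds; real values would come from baseline measurements.
CPU_HIGH = 85.0
MEM_HIGH = 90.0
CLEANUPS_HIGH = 30.0


def memory_bound_cpu_alert(samples: Sequence[NodeSample],
                           min_fraction: float = 0.8) -> bool:
    """Return True when most samples show high CPU together with
    high memory use and frequent cleanups, which suggests the CPU
    load is a symptom of memory pressure rather than query volume."""
    if not samples:
        return False
    hits = sum(
        1 for s in samples
        if s.cpu_pct >= CPU_HIGH
        and s.mem_pct >= MEM_HIGH
        and s.cleanups_per_min >= CLEANUPS_HIGH
    )
    return hits / len(samples) >= min_fraction
```

Requiring all three signals together is a deliberate choice in this sketch: high CPU alone may only reflect heavy search traffic, whereas high CPU combined with memory saturation and frequent cleanups points to the kind of bottleneck identified in this incident.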
 
Autodesk Actions
Autodesk has completed a post-incident analysis of the event and identified actions to be taken. These include the following:
- Immediate Action:
o   Scaled out the Search cluster. 
- Short term:
o   Enhance monitoring and alerting on the Fusion Folders and Files Search cluster for faster detection of issues.
o   Improve stakeholder communication about major changes in the production environment.
o   Keep the cloud service provider on standby during high-impact changes.
- Long term:
o   Simulate cluster scaling exercises in lower environments.
o   Engage with the cloud service provider on faster scale-up/scale-out options.
o   Apply stricter review of deployment procedures, including:
        - Time to complete the change.
        - Backout/rollback plan: documented, reviewed, timed, and tested.
        - Post-deployment testing and monitoring.
Thank you for your patience and understanding.