app.circle.so is down
Incident Report for Circle
Postmortem

Post-Mortem: Database Locking Issue on November 14, 2024

Incident Duration:

  • Start: November 14, 2024, at 14:44 UTC
  • End: November 14, 2024, at 14:55 UTC

Impact:
Our services experienced 3 minutes of downtime, followed by 8 minutes of degraded performance. During these 11 minutes, users had difficulty accessing Circle.

Root Cause:
A recent code deployment caused a database locking issue. The lock created a bottleneck that prevented requests from being processed.

Resolution:
Once we identified the locking issue, our team manually stopped the process that was holding the lock on the database, which allowed services to return to normal operation. After clearing the lock, we rolled back the problematic code deployment to prevent further disruptions.
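
For readers who want to see what "finding the process holding the lock" can look like, here is a minimal sketch. It is an illustration only, not a record of the exact steps taken during this incident. It assumes a PostgreSQL database (the engine is not named in this report), where a query such as SELECT pid, pg_blocking_pids(pid) FROM pg_stat_activity lists, for each session, the sessions blocking it. The function name root_blockers and the sample pids are hypothetical.

```python
def root_blockers(blocked_by):
    """Return the pids that block other sessions but are not blocked themselves.

    `blocked_by` maps each session pid to the list of pids blocking it,
    e.g. the rows of (assuming PostgreSQL):
        SELECT pid, pg_blocking_pids(pid) FROM pg_stat_activity;
    """
    # Every pid that appears as a blocker of some other session.
    blockers = {b for pids in blocked_by.values() for b in pids}
    # Every pid that is itself waiting on at least one other session.
    waiting = {pid for pid, pids in blocked_by.items() if pids}
    # Root blockers hold locks others need and wait on nobody.
    return sorted(blockers - waiting)


# Hypothetical snapshot: 101 holds the lock, everything else queues behind it.
snapshot = {101: [], 202: [101], 303: [202], 404: [101]}
print(root_blockers(snapshot))
```

In PostgreSQL, a session identified this way can then be stopped with pg_terminate_backend(pid). Whether terminating it is safe depends on what that session is doing.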

Next Steps:
To mitigate the risk of similar incidents in the future, we are implementing the following measures:

  1. Foreign Key Changes with Strong Migrations: We will enforce strong migration practices, particularly for changes that add or modify foreign keys, to minimize locking risks during database changes.
  2. Enhanced Testing: We will expand our pre-deployment testing to include more rigorous checks for database locks and contention, so that locking risks are identified before deployment.
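
As an illustration of the first measure, the sketch below shows one widely used low-lock pattern for adding a foreign key. It assumes PostgreSQL, which this report does not confirm, and it is not Circle's actual migration code. The idea is to add the constraint as NOT VALID, so existing rows are not scanned while the stronger lock is held, and then validate it in a separate statement that does not block normal reads and writes. The helper name safe_add_foreign_key_sql, the constraint naming scheme, and the table and column names are hypothetical.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def safe_add_foreign_key_sql(table, column, ref_table, ref_column="id",
                             lock_timeout="5s"):
    """Build the SQL statements for a two-step foreign key addition.

    Assumes PostgreSQL syntax. All identifiers must be plain names.
    """
    for ident in (table, column, ref_table, ref_column):
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    name = f"fk_{table}_{column}"  # hypothetical naming scheme
    return [
        # Give up quickly instead of queueing behind a long-held lock.
        f"SET lock_timeout = '{lock_timeout}';",
        # Step 1: add the constraint without checking existing rows.
        f'ALTER TABLE "{table}" ADD CONSTRAINT "{name}" '
        f'FOREIGN KEY ("{column}") REFERENCES "{ref_table}" ("{ref_column}") '
        f"NOT VALID;",
        # Step 2: check existing rows under a weaker lock.
        f'ALTER TABLE "{table}" VALIDATE CONSTRAINT "{name}";',
    ]


for statement in safe_add_foreign_key_sql("posts", "user_id", "users"):
    print(statement)
```

The two ALTER TABLE statements are typically run in separate transactions, so the brief lock taken by the first is released before validation starts.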

We apologize for the inconvenience caused by this incident and appreciate your patience as we work to improve Circle’s stability and reliability. If you have any further questions or concerns, please feel free to reach out to our support team at support@circle.so.

Posted Nov 14, 2024 - 22:10 UTC

Resolved
This incident has been resolved.
Posted Nov 14, 2024 - 15:32 UTC

Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 14, 2024 - 15:15 UTC

Identified
The issue has been identified and a fix is being implemented.
Posted Nov 14, 2024 - 14:47 UTC

Investigating
We are currently investigating this issue.
Posted Nov 14, 2024 - 14:44 UTC