You trust us to host your developer documentation, and we take that responsibility seriously. Any incident that affects service, especially for your customers, gets our immediate attention until it’s resolved. We only look good when you look good. Our goal is to decrease the frequency of incidents, and today we let you down. Here’s what happened and what we’re doing about it:
Result: The pages for the Reference sections of all ReadMe projects failed to load for almost an hour, from 5:30pm to 6:30pm PST.
Root cause: The caching mechanism for our JS bundle did not properly sync with the HTML we were serving from a prior rollback. Purging the cache of the JS bundle fixed the issue, but finding the root cause took quite a bit of time.
We’ve added more granular health checks so we can detect an outage like this more quickly. We’ve also optimized our cachebust script so future rollbacks reach production faster. Long-term, we’ll rework our mechanisms to prevent cache invalidation issues from reaching production.
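To illustrate the kind of check that catches this class of failure, here is a minimal sketch in Python. It is not our production code. It assumes a hypothetical naming scheme in which the bundle is served as bundle.<hash>.js, where the hash is derived from the bundle's contents; the function names and the pattern are invented for the example. The check reports healthy only when the HTML references the same bundle the cache is serving.

```python
import hashlib
import re

# Hypothetical naming scheme: the bundle is served as bundle.<8 hex chars>.js
BUNDLE_PATTERN = re.compile(r'src="[^"]*bundle\.([0-9a-f]{8})\.js"')


def bundle_fingerprint(bundle_bytes: bytes) -> str:
    """Return a short content hash for a JS bundle."""
    return hashlib.sha256(bundle_bytes).hexdigest()[:8]


def check_bundle_matches_html(html: str, cached_bundle: bytes) -> bool:
    """Return True when the HTML references the bundle the cache is serving."""
    match = BUNDLE_PATTERN.search(html)
    if match is None:
        # The page references no recognizable bundle: treat as unhealthy.
        return False
    return match.group(1) == bundle_fingerprint(cached_bundle)


if __name__ == "__main__":
    current = b"console.log('current build');"
    stale = b"console.log('stale build');"
    html = f'<script src="/static/bundle.{bundle_fingerprint(current)}.js"></script>'

    print(check_bundle_matches_html(html, current))  # HTML and cache agree
    print(check_bundle_matches_html(html, stale))    # cache holds a stale bundle
```

Because the fingerprint comes from the bundle's contents, a stale cached bundle produces a different hash than the one the HTML references, so a mismatch after a rollback shows up as a failed check rather than as pages that fail to load.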
We have a lot of great things in store for 2021. Subscribe to our changelog (https://docs.readme.com/changelog.rss) to be notified of our weekly progress as we work to improve your ReadMe experience!