On April 28, 2024, a scheduled database maintenance window led to a severe service outage that impacted the Signable API and document processing services for over 19 hours. Signers were still able to sign envelopes and send documents using a template. The root cause was a misconfiguration during the database cluster cloning process, which prevented the public API and other internal services from connecting to the new database cluster.
During the maintenance window on April 28, the production database cluster was cloned with a subnet misconfiguration. This meant the public API and other internal services attempted to connect to the cluster using a private IP. As they resided in a separate VPC, they lost connectivity to the new database cluster.
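As an illustration only, a check of the following kind could be run after a clone from inside each network that consumes the database. It is a minimal sketch and not Signable's actual tooling: the function names are hypothetical, and it assumes the database is reached over plain TCP by hostname. It resolves the endpoint, reports which resolved addresses are private, and attempts a connection, so a cluster that resolves only to private addresses and cannot be reached from a separate VPC is caught before traffic is switched over.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if ip is in a private (not internet-routable) range."""
    return ipaddress.ip_address(ip).is_private


def resolve_addresses(host: str, port: int) -> list[str]:
    """Resolve host to the distinct IP addresses a TCP client would try."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Attempt a TCP connection; False on timeout, refusal or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_database_endpoint(host: str, port: int) -> dict:
    """Summarise how the endpoint looks from the machine running the check."""
    try:
        addresses = resolve_addresses(host, port)
    except OSError:
        addresses = []  # DNS resolution failed from this network
    return {
        "addresses": addresses,
        "private": [ip for ip in addresses if is_private_address(ip)],
        "reachable": can_connect(host, port),
    }
```

For example, `check_database_endpoint("db.example.internal", 5432)` (hostname and port are placeholders) returning a non-empty `private` list together with `reachable: False` would point to the kind of subnet problem described above. The result only reflects the network the check is run from, so it would need to be run from every VPC that hosts a dependent service.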
While the services that run the Signable Web app and Signing page were largely unaffected, the issue was initially raised because documents were not being processed through the Signable App.
Issues with error logging and monitoring in the affected services made it difficult to identify the root cause promptly.
The outage severely impacted Signable's ability to process customer documents and accept calls from the public API, causing significant disruption for customers, many of whom rely on the platform for time-sensitive transactions.
The incident was declared at 8:09 AM on April 29, and the product team immediately began working towards a resolution.
The issue was resolved at 1:30 PM on April 29 by cloning the database cluster again, this time with its instances in public subnets, which restored connectivity for the public API and document processing services.
In order of criticality, Signable will be implementing improvements in the following areas:
Signable is committed to learning from this incident and implementing the necessary improvements to prevent similar outages and provide a more reliable service to our customers.
If you have further questions or require more information, please get in touch with us at help@signable.co.uk.