Uploading documents during the Envelope Send Flow process
Incident Report for Signable Status Page
Postmortem

On May 21, 2024, at 15:22 UTC, a spike in Elasticsearch indexing latency caused significant delays in document processing. The increased latency created a backlog in our job queues, which degraded the overall performance of our app. After we manually scaled several parts of our infrastructure, all job queues were fully processed and normal service resumed at 17:12 UTC.

Customer Impact

The following services were severely slowed:

  • Document processing
  • Syncing to Elasticsearch
  • Bulk sends
  • Envelope expiries and reminders
  • Integrations with Google Drive and Dropbox
  • Zapier integration upon signup

Lessons Learned and Improvements

To prevent a recurrence, we implemented the following changes:

  1. Queue Prioritisation: Adjusted the priorities of our busiest queues and raised the priority of essential jobs such as Document Processing, so that a slowdown in Elasticsearch does not impact other core services.
  2. Incident Response: Improved documentation and procedures for scaling services (cache, database, search).
  3. Scaling Enhancements: Enhanced Redis cluster scaling mechanisms.
  4. Latency Investigation: Initiated an investigation into the cause of high latency on a specific Elasticsearch shard.
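
To illustrate the idea behind the queue prioritisation in item 1, here is a minimal Python sketch. It is not our production code: the queue names, the priority values, and the `PriorityJobQueue` class are hypothetical and chosen only to show how higher-priority jobs can be served ahead of a backlog of search-sync jobs.

```python
import heapq
import itertools

# Hypothetical priorities (assumption): a lower number is served first.
PRIORITY = {
    "document_processing": 0,
    "reminders": 1,
    "search_sync": 2,
}


class PriorityJobQueue:
    """In-memory job queue that always hands out the highest-priority job."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker, so jobs with the same
        # priority are returned in the order they were added (FIFO).
        self._counter = itertools.count()

    def push(self, queue_name, payload):
        entry = (PRIORITY[queue_name], next(self._counter), queue_name, payload)
        heapq.heappush(self._heap, entry)

    def pop(self):
        _, _, queue_name, payload = heapq.heappop(self._heap)
        return queue_name, payload

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Pop every job and return them in the order a worker would see them."""
    order = []
    while len(queue):
        order.append(queue.pop())
    return order
```

In this sketch, a document-processing job added behind a backlog of search-sync jobs is still picked up first, which is the behaviour the change aims for when Elasticsearch is slow.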

We apologise for the inconvenience caused and are committed to improving our services to prevent such incidents in the future. Thank you for your understanding and continued support.

If you have any further questions or concerns, please contact our support team at help@signable.co.uk.

Posted May 28, 2024 - 11:29 BST

Resolved
This incident has been resolved.
Posted May 21, 2024 - 18:18 BST
Update
We are continuing to monitor for any further issues.
Posted May 21, 2024 - 18:14 BST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted May 21, 2024 - 18:11 BST
Investigating
We are currently investigating this issue.
Posted May 21, 2024 - 16:51 BST
This incident affected: Web App and Document Processing.