author Adam Stück <adam@adast.dk> 2024-08-05 14:21:53 +0200
committer Adam Stück <adam@adast.dk> 2024-08-05 14:21:53 +0200
commit 363f85d32b6062dedaf4b22c1c7404b47ebf6c2a
tree e219fd6feddb0b074dff11b2bf3c81125be5b7df /content/issues/2018-04-13-unavailable-guilds-connection-issues.md
parent 37085c8f9a398d2d9cee596ee8ece555dc69c2bb
Initial commit as status.adast.dk
Diffstat (limited to 'content/issues/2018-04-13-unavailable-guilds-connection-issues.md')
-rw-r--r-- content/issues/2018-04-13-unavailable-guilds-connection-issues.md 25
1 files changed, 0 insertions, 25 deletions
diff --git a/content/issues/2018-04-13-unavailable-guilds-connection-issues.md b/content/issues/2018-04-13-unavailable-guilds-connection-issues.md
deleted file mode 100644
index 170bd1f..0000000
--- a/content/issues/2018-04-13-unavailable-guilds-connection-issues.md
+++ /dev/null
@@ -1,25 +0,0 @@
----
-title: Unavailable Guilds & Connection Issues
-date: 2018-04-13 15:54:00
-resolved: true
-resolvedWhen: 2018-04-13 17:30:00
-# Possible severity levels: down, disrupted, notice
-severity: down
-affected:
- - API
- - Media Proxy
-section: issue
----
-
-*Post-mortem*
-
-At approximately 14:01, a Redis instance acting as the primary for a highly-available cluster used by our API services was migrated automatically by Google’s Cloud Platform. This migration caused the node to incorrectly drop offline, forcing the cluster to rebalance and trigger known issues with the way our API instances handle Redis failover. After resolving this partial outage, unnoticed issues on other services caused a cascading failure through Example Chat App’s real time system. These issues caused enough critical impact that Example Chat App’s engineering team was forced to fully restart the service, reconnecting millions of clients over a period of 20 minutes.
-
-
----
-
-*Update* - A fix has been implemented and we are monitoring the results. Looks like this has been fixed. {{< track "2018-04-13 17:30:00" >}}
-
-*Monitoring* - After hitting the ole reboot button Example Chat App is now recovering. We're going to continue to monitor as everyone reconnects. {{< track "2018-04-13 16:50:00" >}}
-
-*Investigating* - We're aware of users experiencing unavailable guilds and issues when attempting to connect. We're currently investigating. {{< track "2018-04-13 15:54:00" >}}