Over the past several months, our team has been working diligently to improve our failover plans with Amazon Web Services (AWS). After the AWS outage last Friday, the urgency of that work has increased, and we have a new implementation plan, briefly outlined below. I know how essential the phone is for each of you to do business, and I acknowledge how frustrating it is when the vendor you selected to provide phone service is unable to do so.
With that in mind, I would like to break down the outage at AWS and explain how we will move forward from it. At 10:32 AM EDT last Friday, a critical alarm notified us that the servers we use with AWS were experiencing connectivity issues. When we attempted to move traffic off the affected server, the load-balancing technology we rely on to move traffic from server to server failed. A short time later, AWS acknowledged the interruption and began working to rectify the issue.
At approximately 11:54 AM EDT, we regained connectivity to our servers, and service was restored to our clients.
For a more detailed breakdown of the outage, please visit this Help Center page.
As Fathom Voice's CEO, and as a user of our services, I do not take this outage lightly; I view it as unacceptable. Our team will continue to work with AWS to understand the cause of this incident and is continuing its work to implement the new strategy to keep incidents like this from happening in the future.
Although the outage originated with AWS, we will be applying a Service Level Agreement (SLA) credit to your account equal to triple the amount of time our service was down.
I know an SLA credit cannot make up for the impact the outage had on your business last Friday, but I hope you see this credit as a demonstration of how much we value our promises and our relationship with you.
Since the outage, we have worked internally and with AWS to ensure issues like this do not occur in the future for our clients. Already, we have added substantial resources to our "always-on" environment in AWS across multiple regions; this addition to our network started shortly after Friday's outage and will be completed by this weekend. Additionally, we have modified our alarm detection system to help us detect potential issues earlier, allowing us more time to react to and solve issues before they become client-impacting.
We will continue to work with AWS to ensure our clients do not see any downtime in service with Fathom Voice moving forward. We will be publishing blog posts in the near future that go into more detail regarding these actions.
An additional part of our implementation plan is to fill two new positions: a VP of Infrastructure and a VP of Quality Assurance. The people in these roles will be responsible not only for quickly resolving issues within our network but also for innovating and creating forward-looking policies to prevent such issues in the future.
I encourage you to read the breakdown of this event if you haven't already. If you have any questions, please contact our support team at 855-249-3357.
On behalf of all of us at Fathom Voice, thank you for your business and your patience. We look forward to continuing our relationship.
Cameron Weeks, CEO