Apache server down. Incident report

By Oscar Morales

The following is the incident report for the Apache server outage that occurred on December 9, 2020.

Issue Summary

  • Duration: From 09/12/2020 12:00:00 AM to 09/12/2020 8:06:00 AM (GMT-5)

The issue was detected on the morning of 09/12/2020. Users reported that they could not access the website and were receiving a 500 HTTP response status code.

Timeline (all times in GMT-5)

  • 8:00 AM: Users report the issue

Root Cause

At 7:50 AM GMT-5, the apache2 service was stopped so that some changes could be applied to the Apache server configuration, but it was never started again. This caused a server error when users tried to access the website.
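One way to guard against a stop without a matching start is to wrap the maintenance step so that the start always runs. The following is a minimal sketch, not the procedure used during the incident. It assumes the service is controlled with the `service apache2 stop|start` commands; `run_service`, `service_stopped`, and the `runner` parameter are hypothetical names introduced here for illustration.

```python
import subprocess
from contextlib import contextmanager


def run_service(action, service="apache2", runner=subprocess.run):
    """Run `service <name> <action>` and raise if the command fails."""
    # Assumption: the container exposes the `service` command.
    return runner(["service", service, action], check=True)


@contextmanager
def service_stopped(service="apache2", runner=subprocess.run):
    """Stop the service for the duration of the block, then start it again."""
    run_service("stop", service, runner)
    try:
        yield
    finally:
        # Runs even if the configuration change raises an exception,
        # so the service is not left stopped by accident.
        run_service("start", service, runner)


# Usage (apply_configuration_changes is a hypothetical function):
#
#     with service_stopped():
#         apply_configuration_changes()
```

The `runner` parameter exists only so the commands can be replaced with a stub when trying the sketch outside the server.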

Resolution and Recovery

At 8:00 AM GMT-5, users reported the issue and debugging began immediately. By 8:03 AM, it was identified from the HTTP response status code that the issue involved the server.

At 8:04 AM, an SSH connection was made to the container that runs the server to check its status, which showed that the apache2 service was not running. The apache2 service was then started.
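The manual check and start described above could be scripted roughly as follows. This is a sketch under assumptions: it supposes that `service apache2 status` exits with code 0 when the service is running and with a non-zero code otherwise, which is the usual convention but should be verified on the actual container. The names `is_running` and `ensure_running` are hypothetical.

```python
import subprocess


def is_running(service="apache2", runner=subprocess.run):
    """Return True if `service <name> status` reports a running service."""
    # Assumption: exit code 0 means running, non-zero means not running.
    result = runner(["service", service, "status"], capture_output=True)
    return result.returncode == 0


def ensure_running(service="apache2", runner=subprocess.run):
    """Start the service if it is not running.

    Returns True if a start command was issued, False if it was already up.
    """
    if is_running(service, runner):
        return False
    runner(["service", service, "start"], check=True)
    return True
```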

At 8:06 AM, a request to the website returned a 200 HTTP response status code, confirming that the connection between users and the server had been restored.

Corrective and Preventative Measures

After review and analysis, the following actions are required to avoid these issues in the future:

  • It is necessary to implement a monitoring service
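As an illustration of what such a monitoring service might do, the sketch below polls a URL and raises an alert whenever the response is not 200. It is a minimal example, not a chosen tool or design. The names `http_status` and `monitor`, the 60-second interval, and the `alert` callback are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def monitor(url, alert, probe=http_status, interval=60.0,
            max_checks=None, sleep=time.sleep):
    """Probe url repeatedly and call alert(message) on any non-200 result.

    max_checks=None keeps polling forever; a number stops after that many checks.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        status = probe(url)
        if status != 200:
            alert(f"{url} returned {status}")
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval)
```

In practice the `alert` callback could send an email or a chat message, so that an outage is noticed before users have to report it.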

It is necessary to continually and quickly improve our technology and operational processes to prevent issues. We apologize to all users for any inconvenience this may have caused.

Chemical engineer. Software developer student.