DQ+ AU UAT Unavailable

Incident Report for Precisely

Postmortem

Restarting the application servers several times and reviewing the logs did not point to a cause. However, there was very little communication between the application and the Aurora Postgres database. On checking the database, we found that the primary (writer) node had several locks and 20 connections, and that the locks had started at the time of the incident (2:30 AM CDT). A manual failover of the Postgres nodes (secondary to primary) restored communication between the application and the database, which brought the application back online.
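For reference, a lock and connection check of the kind described above could look roughly like the sketch below. It is an illustration only, not the exact query or tooling used during this incident. The query uses the standard PostgreSQL views pg_locks and pg_stat_activity and the function pg_blocking_pids(); the names LOCK_QUERY and summarize_locks are hypothetical, and the rows are assumed to be fetched by whatever database driver is available and passed in as dictionaries.

```python
# Illustrative sketch only: names below are hypothetical, not from the incident tooling.

# Lists every lock held or awaited by other sessions, with the session state,
# the start time of its current query, and the PIDs blocking it (if any).
LOCK_QUERY = """
SELECT a.pid,
       a.state,
       a.query_start,
       l.locktype,
       l.mode,
       l.granted,
       pg_blocking_pids(a.pid) AS blocked_by
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE a.pid <> pg_backend_pid()
ORDER BY a.query_start;
"""


def summarize_locks(rows):
    """Summarize rows returned by LOCK_QUERY.

    rows: iterable of dicts with at least the keys
          'pid', 'granted', and 'query_start'.
    """
    rows = list(rows)
    # Locks that have been requested but not yet granted (sessions waiting).
    waiting = [r for r in rows if not r["granted"]]
    # Start times of the current query for each lock-holding session.
    starts = [r["query_start"] for r in rows if r["query_start"] is not None]
    return {
        "sessions": len({r["pid"] for r in rows}),
        "locks": len(rows),
        "waiting": len(waiting),
        "oldest_query_start": min(starts) if starts else None,
    }
```

Comparing oldest_query_start with the time the errors began is one way to see whether the locks line up with the start of an outage.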

We are reviewing the DB logs with the DBA to determine why there were so many locks, and whether they caused the issue or something else in the database was responsible.

Posted Jun 06, 2024 - 03:47 EDT

Resolved

The DQ+ AU UAT environment was returning HTTP 5xx errors and was offline.

Posted Jun 05, 2024 - 03:30 EDT