Ghost blogs fail to load - probably DB connection issues on AWS EC2 + RDS



I am not sure whether this is related to Ghost itself, but I am posting it here in case other people have run into it.

Almost every morning, at around 6 am my time, my Ghost installations stop working (New Relic notifies me that its pings to them fail). I think the issue lies in my current setup, which is the following:

  • Amazon EC2 (runs multiple websites, a server I’ve configured for myself)
  • Amazon RDS (database - it does its daily backup at around 6 am)

In the nginx error logs I get multiple messages like this:

2018/05/24 06:09:30 [error] 1276#1276: *60914 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xxx.xxx.xxx.xxx, server: www.secareanu.com, request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:2370/", host: "www.secareanu.com"
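To check whether these timeouts really cluster around the backup window, the error log can be bucketed by hour. The following is only a minimal sketch: it assumes the log lines have the same shape as the one above, and the function name timeouts_per_hour is made up for illustration.

```python
import re
from collections import Counter

# Matches the timestamp at the start of an nginx error-log line,
# e.g. "2018/05/24 06:09:30 [error] ..."
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2}) (\d{2}):\d{2}:\d{2} \[error\]")


def timeouts_per_hour(lines):
    """Count 'upstream timed out' errors, keyed by (date, hour)."""
    counts = Counter()
    for line in lines:
        if "upstream timed out" not in line:
            continue
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def print_report(path):
    """Print one line per (date, hour) bucket for the log file at `path`."""
    # The log location differs between setups, so it is passed in explicitly.
    with open(path, errors="replace") as log:
        for (day, hour), n in sorted(timeouts_per_hour(log).items()):
            print(f"{day} {hour}:00  {n}")
```

Calling print_report with the path to the nginx error log prints one count per hour. If nearly all entries fall in the same hour each day, that supports the theory that a scheduled job is involved.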

The AWS RDS backup seems to take a long time. It starts at 5:00 am and finishes at 5:30 am, but at around 8 am RDS reports this message:

DB Instance xxx has a large number of tables which can increase database recovery time significantly. This DB Instance also contains InnoDB tables that have not been modified to use the shared tablespace. Consider modifying these tables to use the InnoDB shared tablespace.

I am not sure how to check the Ghost logs, as the ghost log website-name command only provides limited information, e.g.

[2018-05-24 09:10:49] INFO "GET /assets/js/bootstrap.min.js?v=30599452a1" 200 4ms
[2018-05-24 09:10:49] INFO "GET /assets/js/bootstrap.min.js?v=30599452a1" 200 4ms
[2018-05-24 09:11:26] INFO "HEAD /" 200 125ms
[2018-05-24 09:11:26] INFO "HEAD /" 200 125ms

However, even if the DB connection fails because the RDS instance is in some other state and temporarily unavailable, Ghost should probably retry the connection after a while (the WordPress sites on the same EC2 machine, connected to the same RDS database, do not fail).
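By "retry" I mean something like the pattern below. This is a generic illustration of retrying with exponential backoff, not how Ghost or its database layer actually behaves. The connect argument stands in for whatever opens the database connection, and all names here are hypothetical.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1s, 2s, 4s, ... (scaled by base_delay) before trying again.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, a database that is unavailable for a few seconds would be reconnected on the second or third attempt instead of leaving the application stuck with a dead connection.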

Also, the RDS logs show something like this (so it seems the database is running at that time):

2018-05-24 3:26:55 47756575479936 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
Version: '10.1.31-MariaDB' socket: '/tmp/mysql.sock' port: 3306 MariaDB Server
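One way to confirm whether the database endpoint is reachable from the EC2 machine during the backup window would be a small probe run every minute (from cron, for example). The sketch below only tests that the TCP port accepts connections, which does not prove that queries succeed. The host passed in would be the RDS endpoint, and the function names are made up.

```python
import socket
from datetime import datetime


def db_port_reachable(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_line(host, port=3306):
    """Format one timestamped log line describing the probe result."""
    state = "reachable" if db_port_reachable(host, port) else "UNREACHABLE"
    return f"{datetime.now():%Y-%m-%d %H:%M:%S} {host}:{port} {state}"
```

Appending the output of probe_line to a file would give a timeline that can be compared with the nginx timeouts and the RDS events.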

Restarting the Ghost instances solves the problem temporarily, even though when I check them in the morning they already appear to be up and running (based on ghost ls or on checking the running processes). But then the next morning it all happens again.

It seems to be a database connection issue, because this morning one of the blogs was showing an SQL command in the admin interface (most probably an error; I did not get the chance to copy it, as I hit reload too quickly while checking whether that blog would load).

If you have any ideas about where else to look to debug this, that would be a great help. :)

Thanks!


Back with the Ghost sites failing to load following an AWS RDS maintenance/backup window