Backups in GHOST are not well supported.
Apologies for the long post, but this is how I have incorporated GHOST into our archive-and-backup system.
You need to back up TWO things:
- Database content
- File-system content and configuration
When you restore, you need to restore both parts together.
As GHOST blogging is not a high-volume, transaction-oriented system, the chance of timing differences between the two parts is low. You probably do not need to worry about journal-snapshots on your file-system.
I administer about a dozen blogs on three self-managed servers.
The GHOST application takes up a lot of space - you have to decide if you want to back this up too - for example, just 5 versions of GHOST (3.21.0, 3.21.1, 3.22.1, 3.23.1, 3.27.0) take up 1.8G.
You probably DON'T want to back up this 1.8G for every Ghost blog - for example, that would be 18G every day for 10 blogs, or about 540G for a month, 6.5TB for a year - just for the app, without any data!
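The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the 1.8G and 10-blog figures come from the post, while the 30-day month, the 365-day year and the function name are my assumptions.

```python
# Naive storage cost of copying the application code into every daily backup.
# Sizes come from the post; 30-day months and 365-day years are assumptions.
APP_SIZE_GB = 1.8   # five installed Ghost versions
BLOGS = 10


def app_backup_cost_gb(days: int, blogs: int = BLOGS,
                       app_size_gb: float = APP_SIZE_GB) -> float:
    """Gigabytes used if a full app copy per blog is kept for `days` days."""
    return app_size_gb * blogs * days


if __name__ == "__main__":
    for label, days in (("day", 1), ("month", 30), ("year", 365)):
        print(f"per {label}: {app_backup_cost_gb(days):,.0f} GB")
```

This assumes every daily copy is stored in full, with no de-duplication, which is the worst case the paragraph above is warning about.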
I install the GHOST app in /usr/share/ghost and then use links in each installation to point to the specific version being run. When I back up the file-system I do NOT follow links, so each file-system backup only contains the config, uploaded images and generated images, which keeps them lean.
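To see how much a no-follow backup of one installation would actually hold, you can measure the tree without descending into links. A minimal sketch in Python; the function name is hypothetical and not part of my scripts.

```python
import os


def tree_size_bytes(root: str) -> int:
    """Sum file sizes under `root` without following symbolic links."""
    total = 0
    # followlinks=False (the default) means linked directories are not entered.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            # lstat measures a link itself, never the file it points to.
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


if __name__ == "__main__":
    import sys
    print(tree_size_bytes(sys.argv[1]))
```

Run against an installation directory, the result should be close to the size of the config plus images, because the linked application code is skipped.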
I wrote a simple shell script that runs every day at 3:00am (outside the time-zone change window). It uses rsync to perform a local-disk backup of all GHOST installations from /var/www/$domain to /var/mycompany/backup/ghost-$host/$domain.
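My own script is shell, but the idea can be sketched in Python. The paths are the ones above; the function names and the dry-run default are assumptions for illustration. rsync's -a (archive) mode copies symbolic links as links rather than following them, which is what keeps the application code out of the backup.

```python
import subprocess
from typing import Iterable, List


def local_backup_cmd(domain: str, host: str) -> List[str]:
    """Build the rsync command for one installation (hypothetical helper)."""
    src = f"/var/www/{domain}/"
    dst = f"/var/mycompany/backup/ghost-{host}/{domain}/"
    # -a: archive mode; symlinks are copied as symlinks, not followed.
    return ["rsync", "-a", src, dst]


def run_local_backups(domains: Iterable[str], host: str,
                      dry_run: bool = True) -> None:
    """Print, or with dry_run=False execute, one rsync per domain."""
    for domain in domains:
        cmd = local_backup_cmd(domain, host)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_local_backups(["blog.example.com"], "web1")
```

A cron entry of the form `0 3 * * *` would give the 3:00am schedule. Whether to add rsync's --delete, so that files removed from the site also disappear from the local copy, is a choice I have left out of the sketch.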
At 3:05am, I also back up all the MySQL databases on each host into /var/mycompany/backup/mysql-$host/$database. I wrote a simple script that automates the mysqlbackup script - the important thing is to create a file-per-table with no embedded timestamp. This makes the remote rsync efficient, as it will only copy tables where the content has actually changed. I create time-stamped canary files so I can check that all the backup hosts are synchronised to origin. I also include the time-stamp in the CREATE DATABASE backup file so that at least one other file changes every day.
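A rough sketch of the two ideas, one file per table without a timestamp and a canary file, might look like this in Python. It is not my actual script: it swaps in plain mysqldump (whose --skip-dump-date option suppresses the "Dump completed on" comment), and the CANARY file name and all function names are made up for the example.

```python
import time
from pathlib import Path
from typing import List, Optional


def table_dump_cmd(backup_dir: str, database: str, table: str) -> List[str]:
    """mysqldump command that writes one table to its own file."""
    out = Path(backup_dir) / database / f"{table}.sql"
    # --skip-dump-date keeps the file byte-identical when the data is unchanged.
    return ["mysqldump", "--skip-dump-date", f"--result-file={out}",
            database, table]


def write_canary(backup_dir: str, now: Optional[float] = None) -> Path:
    """Record the backup time (seconds since the epoch) in a canary file."""
    stamp = time.time() if now is None else now
    canary = Path(backup_dir) / "CANARY"   # hypothetical file name
    canary.write_text(f"{int(stamp)}\n")
    return canary


def canary_age_seconds(backup_dir: str, now: Optional[float] = None) -> float:
    """How stale a copy of the backup is, judged by its canary file."""
    stamp = int((Path(backup_dir) / "CANARY").read_text().strip())
    current = time.time() if now is None else now
    return current - stamp
```

On a backup host, a canary age well over 24 hours would suggest that the overnight sync from origin did not run.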
The on-host local backups run very quickly - in only a few seconds. These local copies are useful for an immediate restore to this morning's 3:00am state - it is possible to lose updates made since the backup in the event of a server loss. My blog users are not so active that this has ever been a problem. In a higher-volume system, you could increase the frequency of file-system rsync (or set up a trigger-based system etc.), and implement MySQL replication.
I take off-host backups an hour later (4:00am), to a host in an amber zone. This pulls information into internal networks. The gateway-host uses rsync (always over SSH) and only copies the file-system and database-table-files that have changed since yesterday's backup. It consolidates all the installations from multiple hosts onto mirrored ZFS volumes.
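The pull from the gateway-host can be sketched the same way. The remote paths follow the layout described earlier; the local root /tank/backup, the host name and the function name are assumptions.

```python
from typing import List


def pull_cmd(remote_host: str, kind: str, local_root: str) -> List[str]:
    """rsync-over-SSH command to pull one backup tree from an origin host.

    kind is "ghost" or "mysql", matching the local backup directories.
    """
    if kind not in ("ghost", "mysql"):
        raise ValueError(f"unknown backup kind: {kind}")
    src = f"{remote_host}:/var/mycompany/backup/{kind}-{remote_host}/"
    dst = f"{local_root}/{kind}-{remote_host}/"
    # -e ssh makes the transport explicit; -a preserves modification times,
    # which rsync's default quick check (size + mtime) relies on.
    return ["rsync", "-a", "-e", "ssh", src, dst]


if __name__ == "__main__":
    for kind in ("ghost", "mysql"):
        print(" ".join(pull_cmd("web1", kind, "/tank/backup")))
```

By default rsync skips any file whose size and modification time match the destination. A table file that was rewritten with identical content gets a new modification time, so it is re-examined, but the delta transfer then sends very little data; rsync's --checksum option compares content instead, at the cost of reading every file on both sides.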
This primary backup-host then uses rsync to push the changed file-system and mysql-backup files to secondary hosts in the US, Europe and Oceania.
On each of the four backup-hosts, I run a de-duplicating archive operation using borg - which means I can identify and retrieve any individual file or table from any day in the last few years.
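As an illustration of the daily archive step, here is a sketch that builds borg commands, assuming borg 1.x command syntax (REPO::ARCHIVE) and one archive per day named by date. The repository path, the naming scheme and the function names are assumptions, not a description of my setup.

```python
import datetime
from typing import List


def borg_create_cmd(repo: str, source: str, day: datetime.date) -> List[str]:
    """Create a de-duplicated archive of `source`, named after the date."""
    return ["borg", "create", "--stats", f"{repo}::{day.isoformat()}", source]


def borg_extract_cmd(repo: str, day: datetime.date, member: str) -> List[str]:
    """Extract a single file or table dump from one day's archive."""
    # borg stores paths without the leading slash.
    return ["borg", "extract", f"{repo}::{day.isoformat()}",
            member.lstrip("/")]


if __name__ == "__main__":
    today = datetime.date.today()
    print(" ".join(borg_create_cmd("/tank/borg/ghost", "/tank/backup", today)))
```

Because borg de-duplicates chunks across archives, a day on which only a few tables changed adds very little to the repository, which is what makes keeping years of daily archives practical.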
On a monthly basis, in each off-site location, we physically connect external drives and copy the encrypted archives to these devices, which are then disconnected from the network. This provides an off-network encrypted backup.
It's probably overkill.
I have tested the restore and recovery process, which is easy but a bit manual. I also use these backups to move blogs / domains from host to host, during server rebuilds, and occasionally to create development installations.