Searching for a Perfect Backup Solution: Borg and Restic

Stefano Marinelli included in Borg Restic Burp Backuppc Data Linux Obnam Recovery Restore Rsync Attic Backup Urbackup

30-06-2020 About 2100 words 10 minutes


Note: This article is dated, and while the core principles may still be correct, technologies have evolved. I currently have an update in progress, which will include more recent content. Please check back soon for the updated version.

Backup: Methods

I have already addressed the challenges of performing a correct backup, providing guidance on the appropriate methods and explaining why it’s crucial to do so. I’ve discussed why RAID should not be considered a form of backup and shared some insights into software I have evaluated and used in the past. In this article, however, I will delve deeper into how I manage backups and how I ensure a certain level of security.

Backups: The Basics

Here are the key aspects to consider:

Operating System and Data to Save: Every operating system is its own world. It’s unlikely to find a one-size-fits-all solution, even though many leading open-source software options are cross-platform. The significant divide is between Unix-like systems (GNU/Linux, *BSDs, macOS, etc.) and Windows. The same software, even when available on multiple platforms, may not suit everyone’s needs.
Type of Data to be Saved: Different types of data require distinct backup solutions. In some scenarios, you might need continuous incremental copies of files that grow but are never altered. In such cases, any incremental system (like Duplicity) could be sufficient. However, for data that is rewritten constantly (e.g., a continuously changing database), an incremental tool of this kind can become highly inefficient, particularly over time.
Need for Encryption: For instance, backups of my invoices can be stored unencrypted…
Speed of Execution and Recovery: Some tools are exceptionally efficient at performing backup operations but make the recovery process painfully slow. A notable drawback I found with BURP Backup, which I mentioned in a previous article and still use in certain situations, is that files must be restored before they can be read: the backup cannot be browsed directly as a local file system. Proxmox’s native backup is a similar case: it’s easy to set up and comprehensive, but recovery can take a long time, especially from remote locations over slow connections. Often, recovering a single file means restoring the entire virtual machine.
Snapshot: Last but definitely not least. Backing up a “live” file system involves a “start” and an “end” moment. During this time, the data can change, leading to inconsistencies. I’ve encountered such issues in the past: a large MySQL database was compromised by a client, and I was tasked with its recovery. I confidently took their last backup and restored various files (not a native dump). Unsurprisingly, the database failed to restart: the large file had changed too much between the start and end of the backup, rendering it inconsistent. Note that I also had the dump, so I managed to recover from that. But the issue is evident: backing up a live file system is risky, unless you’re dealing with simple folders like “My Documents” or “Images”. An open database, even a basic one like a browser’s, is highly likely to get corrupted, making the backup useless. The solution is to create a snapshot of the entire file system before beginning the backup. This approach has its risks (the backup will reflect the state of a suddenly unplugged machine), but they are significantly lower. To date, using snapshots, I have managed to recover everything.

Backup: Snapshot

Capturing a snapshot of the file system is crucial for achieving a sufficiently consistent backup. Over the years, I’ve explored several methods:

Native File System Snapshot (e.g., BTRFS or ZFS): If your file system inherently supports snapshots, it’s wise to use this feature. It’s likely to be the most efficient and technically sound option.
LVM Snapshot: For those using LVM, creating a snapshot of the logical volume, mounting it, and then performing the backup is a viable approach. You need to reserve space for the snapshot, sized for the maximum amount of data that may change during the snapshot’s lifetime; this space must be free in the volume group (VG) and not assigned to any logical volume (LV). This method can lead to some space wastage, proportional to the working set of data being written within a specific timeframe. While I still use this method, it has occasionally caused the file system to freeze while the snapshot was being destroyed, necessitating a reboot. This has been a rare but recurring issue across different hardware setups.
DattoBD: I’ve tracked the development of this tool from its inception. Early versions had issues that were resolved in subsequent releases. It’s stable and reliable, based on my experience. For snapshots with Datto, I often use UrBackup scripts (another excellent backup system I’ve frequently used, though now mainly for Windows systems), which are convenient and efficient.
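
The LVM approach from the list above could be sketched roughly as follows. This is a minimal sketch under assumptions, not a tested procedure: the volume group `vg0`, the logical volume `root`, the snapshot name, the `5G` size, and the mount point are all hypothetical placeholders, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: LVM snapshot -> read-only mount -> backup -> cleanup.
# Every name below is a hypothetical placeholder; adjust before use.
VG=${VG:-vg0}
LV=${LV:-root}
SNAP_NAME=${SNAP_NAME:-backup_snap}
SNAP_SIZE=${SNAP_SIZE:-5G}        # must be free (unassigned) space in the VG
SNAP_MNT=${SNAP_MNT:-/mnt/lvm_snap}
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

lvm_snapshot_backup() {
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP_NAME" "/dev/$VG/$LV" || return 1
    run mkdir -p "$SNAP_MNT"
    # Note: XFS snapshots usually need "-o ro,nouuid" instead of "-o ro".
    if run mount -o ro "/dev/$VG/$SNAP_NAME" "$SNAP_MNT"; then
        # Point the backup tool of your choice at "$SNAP_MNT" here.
        run umount "$SNAP_MNT"
    fi
    # Always drop the snapshot so it cannot fill up while nobody is watching.
    run lvremove --force "/dev/$VG/$SNAP_NAME"
}
```

Calling `lvm_snapshot_backup` with the default `DRY_RUN=1` prints the five commands in order, which is a cheap way to review them before running anything as root.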

Backup: Push or Pull?

A longstanding debate among experts is whether backups should be initiated by the client (push) or requested by the server (pull).

In my view, it depends. Generally, I prefer centralized backup systems on dedicated servers, maintained in highly secure environments with minimal services running. Sometimes, I use Docker to encapsulate the entire backup system, minimizing other services. Therefore, I lean towards the “pull” method, where the server connects to the client to initiate the backup, or a mixed approach like that of BURP. In BURP’s case, the client connects to the server, which then decides whether a backup is necessary and manages the process. Here, the server is more than just storage; it’s an active participant in the backup.

Unfortunately, my current preferred systems don’t support this functionality. As a result, I’ve developed alternative solutions to address this limitation.
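
One possible way to approximate a pull model with push-only tools is to let a central host open an SSH session to each client and start the client’s own backup script there. The sketch below illustrates that idea only; it is not necessarily the solution referred to above. The host names, the `root` login, and the script path `/usr/local/bin/backup.sh` are hypothetical, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: a central host triggers push-style backups on each client over SSH.
# Host names, login user and remote script path are hypothetical placeholders.
BACKUP_SCRIPT=${BACKUP_SCRIPT:-/usr/local/bin/backup.sh}
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: trigger_backups host1 host2 ...
trigger_backups() {
    local host failed=0
    for host in "$@"; do
        # BatchMode avoids hanging on a password prompt in unattended runs.
        if run ssh -o BatchMode=yes -o ConnectTimeout=10 "root@$host" "$BACKUP_SCRIPT"; then
            echo "$host ok"
        else
            echo "$host FAILED" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every client reported success.
    [ "$failed" -eq 0 ]
}
```

With this layout the decision about when a backup runs stays on the central host, while the data still flows from client to repository as the tool expects.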

Next, I will discuss the two tools I use most frequently, alongside BURP: Borg Backup and Restic.

Borg Backup

Borg Backup has been ensuring the security of almost all my backups for over two years. Its positive aspects include:

Compression and Deduplication: Borg compresses and deduplicates data within the same repository.
Easy Navigation and Recovery: You can mount backups in a directory, allowing for straightforward navigation, browsing, and recovery.
Speed: Borg is fast in every aspect, maintaining a local cache to track already copied files, which speeds up subsequent backups.

Borg eliminates the need for traditional incremental systems. It allows you to store all backups deduplicated and compressed as if they were full backups, avoiding slow reconstructions during file browsing or restoration.
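
As a rough illustration of the browsing workflow, the following sketch mounts one archive with `borg mount` and releases it with `borg umount`. The repository address, the mount point `/mnt/borg`, and the archive name in the usage note are placeholders chosen for the example; commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: mount a single Borg archive for browsing, then release it.
# Repository address and mount point are placeholders for illustration.
REPOSITORY=${REPOSITORY:-user@192.168.2.254:repo}
MOUNT_POINT=${MOUNT_POINT:-/mnt/borg}
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: borg_browse ARCHIVE_NAME   (names are listed by "borg list REPOSITORY")
borg_browse() {
    local archive=$1
    run mkdir -p "$MOUNT_POINT"
    run borg mount "${REPOSITORY}::${archive}" "$MOUNT_POINT"
}

# Unmount once the needed files have been copied out.
borg_release() {
    run borg umount "$MOUNT_POINT"
}
```

For example, `borg_browse daily-2020-06-30T02:00:00` followed by an ordinary `cp` from the mount point and then `borg_release` would recover individual files without restoring the whole archive.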

However, there are some limitations to consider:

Repository Use: It’s recommended to use a separate repository for each server, because a running backup locks the repository and blocks any other backup to it. This means deduplication occurs only between backup generations of the same server. For similar servers, like the hundreds I manage, this can lead to significant space usage. This aspect can be both a drawback and an advantage, as damage to one repository won’t affect others.
Python-based Execution: Being written in Python, Borg takes a few seconds to load and run. While not a major issue, it can be slightly inconvenient in certain scenarios.
Indexing Delays: When mounting a repository, Borg creates internal indexes to display directories. This can take time for backups with large amounts of data, which can be critical when quick access is required.
Push Solution: Borg operates as a push solution, where the client connects to the storage server. This process is more efficient when the storage server is equipped with the Borg executable.

My experience with Borg has been positive. It’s fast, effective, and reliable for recovering entire servers and individual files quickly and thoroughly.

Restic

I’ve been observing Restic for some time and initially preferred Borg. Both are similar in approach and functionality, yet Restic, written in Go, has recently made significant improvements, including supporting compression.

Advantages of Restic:

Cross-Server Deduplication: Unlike Borg, Restic advises using a single data repository, enhancing deduplication possibilities across different machines. This is especially beneficial in environments with many similar servers.
Speed: My tests suggest Restic is faster than Borg.
Convenient Navigation: Restic allows you to mount backups in a directory for easy browsing and restoration. It builds the directory structure as you navigate, so performance is acceptable, although slightly slower than Borg.
Separate Forget and Prune Operations: Restic lets you separate the deletion of old backups (forget) from the actual removal of data (prune), optimizing backup time.
Supportive Community: Led by Alexander Neumann, Restic’s community is exceptionally helpful and responsive.
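
The forget/prune split mentioned in the list above might look like this in practice. It is a sketch under assumptions: the repository name `myrepo` is a placeholder, the retention values simply mirror the ones used in the Restic script later in this article, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: run "forget" after every backup, and "prune" separately and less often.
# The repository name is a placeholder.
REPO=${REPO:-myrepo}
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Quick step: drops snapshots that fall outside the retention policy,
# but leaves their data in the repository.
restic_forget() {
    run restic -r "$REPO" forget \
        --keep-daily 30 --keep-weekly 8 --keep-monthly 12 --keep-yearly 1
}

# Slow step: removes data no longer referenced by any snapshot.
# Suitable for a separate, less frequent job (for example weekly).
restic_prune() {
    run restic -r "$REPO" prune
}
```

Keeping the two steps in separate jobs means the nightly backup window only pays for the cheap `forget`, while the expensive `prune` can run when the repository is otherwise idle.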

Challenges with Restic:

Push Solution: Like Borg, Restic operates on a push model.
Repository Mounting: Mounting the repository for browsing backups does not return the prompt immediately, requiring a separate shell for navigating backups. This can be inconvenient in emergency situations.
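
A possible workaround for the blocked prompt is to start `restic mount` in the background and wait until the mount point becomes active. The sketch below assumes the repository password is supplied non-interactively (for example through the `RESTIC_PASSWORD_FILE` environment variable), since a background process cannot prompt for it; the repository name, mount point, and 60-second timeout are placeholders.

```shell
#!/bin/bash
# Sketch: run "restic mount" in the background so the shell stays usable.
# Repository name, mount point and timeout are placeholders.
REPO=${REPO:-myrepo}
MOUNT_POINT=${MOUNT_POINT:-/mnt/restic}
RESTIC_MOUNT_PID=""

restic_mount_bg() {
    local waited=0
    mkdir -p "$MOUNT_POINT" || return 1
    restic -r "$REPO" mount "$MOUNT_POINT" &
    RESTIC_MOUNT_PID=$!
    # Poll for up to 60 seconds until the FUSE mount shows up.
    while [ "$waited" -lt 60 ]; do
        if mountpoint -q "$MOUNT_POINT"; then
            return 0
        fi
        # Give up early if restic has already exited (wrong password, etc.).
        kill -0 "$RESTIC_MOUNT_PID" 2>/dev/null || return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

restic_unmount() {
    # Unmounting makes restic exit; unprivileged users may need
    # "fusermount -u" instead of "umount".
    umount "$MOUNT_POINT"
    wait "$RESTIC_MOUNT_PID"
}
```

After `restic_mount_bg` returns successfully, the snapshots can be browsed under the mount point from the same shell, and `restic_unmount` cleans up afterwards.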

Restic, while not perfect, offers robust and efficient backup solutions, with ongoing improvements and active community support.

Example: Script Used to Back Up My Laptop Using Borg and Restic

Let’s examine a practical example: the script I use for backing up my notebooks. Initially, I verify that the computer is not on battery power and that the backup server is accessible. Then, I lock and create a snapshot using DattoBD, redirecting all outputs to a file in the temporary directory.

Borg:

#!/bin/bash
PATH="/usr/local/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/pkg/bin:/usr/pkg/sbin"
export PATH
server=192.168.2.254
export server
STATE=$(upower -i /org/freedesktop/UPower/devices/battery_BAT0 | grep state | grep discharging)
export STATE
if [[ $STATE == *'discharging'* ]]; then
   exit
fi
if nc -w 10 -z $server 22 2>/dev/null; then
    echo "$server ✓"
else
    echo "$server ✗"
    exit
fi
if mkdir /tmp/backuphappening; then
    echo "Locking succeeded" >&2
else
    echo "Lock failed - exit" >&2
    exit 1
fi
exec > /tmp/borg_backup_log
exec 2>&1
date
/usr/local/share/urbackup/dattobd_create_filesystem_snapshot 1 /
REPOSITORY=user@$server:repo
TAG=daily
ionice -c3 borg create -v --progress --compression zlib --stats \
    $REPOSITORY::$TAG'-{now:%Y-%m-%dT%H:%M:%S}' \
    /mnt/urbackup_snaps/ /boot /boot/efi \
    --exclude '*.cache*' \
    --exclude '*/home/*/.cache*' \
    --exclude '*/home/*/Scaricati*' \
    --exclude '*.datto*' \
    --exclude '*.overlay*' \
    --exclude '*.crdownload' \
    --exclude '*.rpm' \
    --exclude '*.deb' \
    --exclude '*swapfile*' \
    --exclude '*/home/*/Virtualbox VMs*' \
    --exclude '*/home/*/VirtualBox VMs*' \
    --exclude '*/home/*/.vagrant.d*' \
    --exclude '*/root/.cache*' \
    --exclude '*/var/lib/docker*' \
    --exclude '*/tmp'
/usr/local/share/urbackup/dattobd_remove_filesystem_snapshot 1 /mnt/urbackup_snaps/1
borg prune -v $REPOSITORY --stats --prefix $TAG'-' \
    --keep-hourly=12 --keep-daily=60 --keep-weekly=12 --keep-monthly=24
rm -Rf /tmp/backuphappening
date

Restic:

#!/bin/bash
PATH="/usr/local/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/pkg/bin:/usr/pkg/sbin"
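export PATH
# Backup server address (same value as in the Borg script); used by the nc check below
server=192.168.2.254
export server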
STATE=$(upower -i /org/freedesktop/UPower/devices/battery_BAT0 | grep state | grep discharging)
export STATE
if [[ $STATE == *'discharging'* ]]; then
   exit
fi
if nc -w 10 -z $server 22 2>/dev/null; then
    echo "$server ✓"
else
    echo "$server ✗"
    exit
fi
if mkdir /tmp/backuphappening; then
    echo "Locking succeeded" >&2
else
    echo "Lock failed - exit" >&2
    exit 1
fi
exec > /tmp/restic_backup_log
exec 2>&1
date
/usr/local/share/urbackup/dattobd_create_filesystem_snapshot 1 /
ionice -c3 restic -r myrepo \
    backup \
    --exclude='*/home/*/.cache*' \
    --exclude='*.cache*' \
    --exclude='*/home/*/Scaricati*' \
    --exclude='*.datto*' \
    --exclude='*.overlay*' \
    --exclude='*.crdownload' \
    --exclude='*.rpm' \
    --exclude='*.deb' \
    --exclude='*swapfile*' \
    --exclude='*/home/*/Virtualbox VMs*' \
    --exclude='*/home/*/VirtualBox VMs*' \
    --exclude='*/home/*/.vagrant.d*' \
    --exclude='*/root/.cache*' \
    --exclude='*/var/lib/docker*' \
    --exclude='/sys' \
    --exclude='/proc' \
    --exclude='/dev' \
    --exclude='*/tmp' \
    --exclude='/run' \
    /mnt/urbackup_snaps/1/
/usr/local/share/urbackup/dattobd_remove_filesystem_snapshot 1 /mnt/urbackup_snaps/1
restic forget -d 30 -w 8 -m 12 -y 1 --host myhost -r myrepo
rm -Rf /tmp/backuphappening
date

Restic’s script is a simplified adaptation of the Borg script. Unlike Borg, it doesn’t immediately execute the prune operation but marks older backups for removal. It’s advisable to consult the manuals of both tools to avoid common copy-paste issues.
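
One weakness of the `mkdir` lock in both scripts is that the lock directory is only removed on the last lines, so a run that aborts midway leaves it behind and blocks the next backup. A small variation, sketched here as a suggestion rather than as part of the original scripts, releases the lock from an `EXIT` trap. The function name `acquire_lock` is hypothetical; the lock path is the one used above.

```shell
#!/bin/bash
# Sketch: mkdir-based lock that is released automatically when the script exits.
LOCK_DIR=${LOCK_DIR:-/tmp/backuphappening}

acquire_lock() {
    # mkdir is atomic: it fails if the directory already exists.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        # Remove the lock on any exit of this shell, including early failures.
        trap 'rmdir "$LOCK_DIR"' EXIT
        echo "Locking succeeded" >&2
        return 0
    fi
    echo "Lock failed - exit" >&2
    return 1
}
```

In a backup script this would be used as `acquire_lock || exit 1` near the top, replacing both the `mkdir` test and the final `rm -Rf` of the lock directory.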

And Now…

Currently, I’m using both Borg and Restic. Borg serves as my primary backup system, while Restic is in an ‘emerging’ stage as I thoroughly test it. The introduction of compression in Restic narrows the gap with Borg, particularly in terms of deduplication efficiency. I run a nightly rsync of the entire repository to remote storage, so the backups are replicated across multiple locations. Additionally, a Jenkins setup manages the connections to individual machines, performs backups, and alerts me in case of any issues.
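
The nightly replication step could be as simple as the following sketch. The local repository path, the remote account, and the remote path are hypothetical placeholders, and the command is only printed unless `DRY_RUN=0` is set. Note that `--delete` mirrors removals too, so damage to the local repository would also propagate to the replica on the next run.

```shell
#!/bin/bash
# Sketch: nightly replication of a backup repository to remote storage.
# Paths and remote host are hypothetical placeholders.
LOCAL_REPO=${LOCAL_REPO:-/srv/backup/repo}
REMOTE=${REMOTE:-backup@offsite.example.org}
REMOTE_PATH=${REMOTE_PATH:-/srv/replica/repo}
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replicate_repo() {
    # Trailing slashes copy the directory contents rather than the directory itself.
    # Best run while no backup is writing to the repository.
    run rsync -a --delete "$LOCAL_REPO/" "$REMOTE:$REMOTE_PATH/"
}
```

Scheduling `replicate_repo` after the backup window closes keeps the remote copy consistent with a repository that is not being modified during the transfer.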