Traditional backup tools can mostly be subdivided by the following characteristics:
- file-based vs. image-based
  Image-based solutions make sure everything is backed up, but are potentially difficult to restore on other (less powerful) hardware. Additionally, creating images with traditional tools like dd requires the disk being backed up to be unmounted (to avoid consistency issues). This makes image-based backups better suited to filesystems that support advanced operations like snapshots or zfs send-style images that contain a consistent snapshot of the data of interest (see the sketch after this list). Among file-based tools there is a further distinction between tools that exactly replicate the source file structure in the backup target (e.g. rsync or rdiff-backup) and tools that use an archive format to store the backup contents (tar).
- networked vs. single-host
  Networked solutions allow backing up multiple hosts and, to some extent, centralized administration. Traditionally, a dedicated client has to be installed on every machine to be backed up. Networked solutions can be pull-based (the server fetches backups from the clients) or push-based (the client sends backups to the server). Single-host solutions consist of a single tool that is invoked to back up data from the current host to a target storage. Since that target can be a network location, the distinction between networked and single-host solutions is not exactly clear-cut.
- incremental vs. full
  Traditionally, tools either make an actual 1:1 copy (full backup) or copy “just the differences”, which can mean anything from “copy all changed files” to “copy the changes within files”. Incremental schemes allow multiple backup states to be kept without needing much disk space. However, traditional tools require another full backup to be made in order to free the space used by previous changes.
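To make the image-based distinction concrete, here is a minimal sketch of the two approaches; the device, pool and dataset names are placeholders and do not come from the article.

```sh
# Traditional dd imaging: the filesystem has to be unmounted first so the
# resulting image is consistent.
umount /dev/sdb1
dd if=/dev/sdb1 of=/mnt/backup/sdb1.img bs=1M status=progress

# Snapshot-capable filesystem (ZFS): take an atomic snapshot and stream a
# consistent image of it while the dataset stays mounted and in use.
zfs snapshot tank/data@2024-01-01
zfs send tank/data@2024-01-01 > /mnt/backup/tank-data-2024-01-01.zfs
```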
Modern tools mostly advance things on the incremental vs. full front by being incremental forever, without the negative impacts that such a scheme has when realized with traditional tools. Additionally, modern tools mostly rely on their own custom archival format. While this may seem like a step back from tools that replicate the file structure, it brings numerous potential advantages:
- Enclosing files in archives allows them and their metadata to be encrypted and made portable across file systems.
- Given that many backups will eventually be stored on online storage like Dropbox, Mega, Microsoft OneDrive or Google Drive, portability across file systems is especially useful. Even when not storing backups online, portability ensures that backup data can be copied with simple operations like cp without damaging the contained metadata. Since online stores are often not exactly trustworthy, encryption is also required. (A minimal sketch of such a setup follows this list.)
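As an illustration (not taken from the article), an encrypted archive-format repository such as Borg's can be created locally and then moved around with plain cp, because all metadata lives inside the repository files; the paths here are made up.

```sh
borg init --encryption=repokey /mnt/backup/repo          # encrypted repository
borg create /mnt/backup/repo::home-2024-01-01 /home      # archive; metadata stays inside the repo
cp -r /mnt/backup/repo /mnt/dropbox/backup-repo          # portable copy across filesystems
```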
Abstract
This article attempts to compare three modern backup tools with respect to their features and performance. The tools of interest are Borg, Bupstash and Kopia.
BorgTUI -- A simple TUI and CLI to automate your Borg backups :^)
Can someone please help decide what is the "best" backup software?
- Restic (https://restic.net/)
- Borg backup (https://www.borgbackup.org/)
- Duplicati (https://www.duplicati.com/)
- Kopia (https://kopia.io/)
- Duplicacy (https://duplicacy.com/)
- Duplicity (https://duplicity.us/)
mekster 79 days ago
Do yourself a favor and use zfs as your primary backup. Even though it means you'll have to replace your filesystem, it's just that good.
It's faster than any other backup software (being the filesystem itself, it knows what has changed since the last snapshot, whereas external backup tools always have to scan entire directories to find out what's changed), and it has battle-tested reliability with added benefits like transparent compression.
A bit of explanation of how much faster it can be than external tools. (I don't work for or promote the service mentioned in the article.)
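Roughly, the snapshot-based workflow the commenter alludes to looks like this (a sketch with made-up pool, dataset and host names, assuming an initial full send/receive was done earlier):

```sh
zfs snapshot tank/home@today
# Send only the blocks that changed between the two snapshots; no walk over
# the directory tree is needed to find the changes.
zfs send -i tank/home@yesterday tank/home@today | ssh backuphost zfs receive -u backup/home
```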
Then you'll realize Borg is the one with the fewest data corruption complaints on the internet, which makes it a good secondary backup.
Easily checked by searching for "[app name] data corruption" on Google.
And see who else lists vulnerability and corruption bugs upfront like Borg does; you'll know the developers are forthcoming about these important issues.
https://borgbackup.readthedocs.io/en/stable/changes.html
The term "best" apparently means reliable for backups, and also that the tool doesn't start choking on large data sets, eating huge amounts of memory and round-trip time.
They don't work against your favorite S3-compatible targets, but there are services that can be targeted by those tools, or you can just roll your own dedicated $5 Linux backup instance to avoid crying in the future.
With those 2, I don't care what other tools exist anymore.
donmcronald 79 days ago
I use ZFS + Sanoid + Syncoid locally and Borg + Borgmatic + BorgBase for offsite.
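A hypothetical crontab in the spirit of that setup might look as follows; the schedules, pool names and repository details are assumptions, not the commenter's actual configuration.

```sh
# Local ZFS snapshots and replication, plus an offsite Borg run.
*/15 * * * *  sanoid --cron                                         # create/prune snapshots per sanoid.conf
0 2 * * *     syncoid --recursive tank root@backuphost:backup/tank  # replicate snapshots to the local backup box
0 3 * * *     borgmatic --verbosity 1                               # push Borg archives offsite (e.g. to BorgBase)
```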
WhrRTheBaboons 76 days ago
Seconding zfs
Linux-Fan 80 days ago
Bupstash (https://bupstash.io/) beats Borg and Kopia in my tests (see https://masysma.net/37/backup_tests_borg_bupstash_kopia.xhtml). It is a modern take very close to Borg in terms of feature set, but with significantly better performance (in terms of resource use for running tasks; the backups were slightly larger than Borg's in my tests).
dpbriggs 79 days ago
Personally I use borg with BorgTUI (https://github.com/dpbriggs/borgtui) to schedule backups and manage sources/repositories. I'm quite pleased with the simplicity of it compared to some of the other solutions.
Kopia’s development has accelerated in 2020 and is quickly approaching 1.0. While a number of new features have shown up within the tool, this post will concentrate on the performance improvements made over the last few months. To do that, we will compare v0.4.0 (January 2020), v0.5.2 (March 2020), and v0.6.0-rc1 (July 2020). We will also compare it to restic, another popular open-source backup tool. All binaries were downloaded from GitHub. With the exception of the s2-standard compression scheme being enabled with kopia, the default options were used for all tools.
As can be seen in the above results, kopia’s performance has improved significantly over the last few releases. The time taken to back up 200 GiB of data has been reduced from ~840 seconds to ~200! For just a single process, this translates to an effective processing bandwidth of 1 GiB/second and an upload bandwidth utilization of 3.5 Gbps.
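One way to read those figures (a back-of-the-envelope check of my own, not part of the post):

```sh
# 200 GiB of source data processed in ~200 s.
echo '200 / 200'        | bc -l   # ~1.0 GiB/s of source data processed
echo '200 * 8.59 / 200' | bc -l   # ~8.6 Gbit/s if every byte had to be uploaded
# The quoted ~3.5 Gbps of actual upload therefore suggests that deduplication
# and compression cut the bytes sent to roughly 40% of the source size.
```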
For storing rarely used secrets that should not be kept on a networked computer, it is convenient to print them on paper. However, ordinary barcodes cannot store much more than 2000 octets of data, and in practice even such small amounts cannot be reliably read by widely used software (e.g. ZXing).
In this note I show a script for splitting small amounts of data across multiple barcodes and generating a printable document. Specifically, the script is limited to less than 7650 alphanumeric characters, such as those from the Base64 alphabet. It can be used for archiving Tarsnap keys, GPG keys, SSH keys, etc.
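The note ships its own script; as a rough illustration of the idea only (not the author's code), one could split a Base64-encoded key into chunks small enough to decode reliably and render one QR code per chunk with qrencode. The chunk size and filenames here are arbitrary.

```sh
base64 < tarsnap.key > key.b64               # text-only form of the secret
split -b 1000 -d key.b64 keypart.            # ~1000 characters per barcode
for part in keypart.*; do
    qrencode -l M -o "$part.png" < "$part"   # one QR code image per chunk
done
```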
On Sun, Apr 04, 2021 at 10:37:47AM -0700, jerry wrote:
Ideas? Right now, I'm experimenting with printed barcodes.
You might be interested in:
https://lab.whitequark.org/notes/2016-08-24/archiving-cryptographic-secrets-on-paper/
which was written specifically for tarsnap keys.
Cheers,
- Graham Percival
I use Tarsnap for my critical data. Case in point: I use it to back up my Bacula database dump. I use Bacula to back up my hosts. The database in question keeps track of what was backed up, from what host, the file size, checksum, where that backup is now, and many other items. Losing this data is annoying but not a disaster. It can be recreated from the backup volumes, but that is time consuming. As it is, the file is dumped daily and rsynced to multiple locations.
I also back up that database daily via Tarsnap. I’ve been doing this since at least 2015-10-09.
The uncompressed dump of this PostgreSQL database is now about 117G.
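A minimal sketch of what such a daily dump-and-archive job could look like; the database name, paths and archive naming are assumptions, not the author's actual setup.

```sh
#!/bin/sh
set -e
DUMP=/var/backups/bacula.sql
pg_dump bacula > "$DUMP"                             # dump the Bacula catalog
rsync -a "$DUMP" backuphost:/srv/dumps/              # copy the dump to another location
tarsnap -c -f "bacula-$(date +%Y-%m-%d)" "$DUMP"     # create the daily Tarsnap archive
```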
I was interested in trying out a service like OneDrive or Dropbox, but one thing always held me back: the idea that at any moment, and for any reason, the company could lock me out of my files.
The problem
No one wants to have their data held hostage by a third-party. How can you get the benefits of using cloud storage while also retaining ownership rights and having a level of assurance that your files will always be accessible?
The solution
Luckily, there’s a simple solution: Perform full backups of your cloud files in an environment that you control.
"Backup your data, you say?! What a novel idea!" /S
The setup
I use rclone to sync files from my cloud storage accounts to a VM running Alpine Linux. rclone works with over 40 cloud storage providers, has a very easy-to-use CLI, and works with modern authentication systems.
A cron job runs daily, pulling down any file changes into the backup.
I have the replication job set to exhaustively copy all files in the account to the local machine.
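A minimal sketch of that kind of daily pull; the remote names, paths and schedule are assumptions, not the author's actual configuration.

```sh
#!/bin/sh
# cloud-pull.sh: mirror each cloud account into a local backup directory.
# Scheduled from the Alpine VM's crontab, e.g.:  0 3 * * *  /usr/local/bin/cloud-pull.sh
rclone sync onedrive: /backups/onedrive --log-file /var/log/rclone-onedrive.log
rclone sync dropbox:  /backups/dropbox  --log-file /var/log/rclone-dropbox.log
```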