Should I be surprised that this works so well? (dump over ssh)
To my astonishment, this has consistently worked:
# dump -0 -f - / | ssh backup-server.example.com "cd /vault && cat > dump.0"
There's a little more to it in that I specify ssh keys, ports, a per-server destination, etc., but that's essentially the command. I've examined the dump file on the backup server, done restores from it, etc. Of course, change level 0 to any level you like.
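With keys, ports, and a per-server directory it might look something like this (a sketch - the key path, port, and layout here are made up):
# dump -0 -u -z -f - / | ssh -i /root/.ssh/backup_key -p 2222 backup-server.example.com "cd /vault/$(hostname) && cat > dump.0"
(-u records the dump date in the dumpdates file so later incrementals know their baseline, and $(hostname) expands on the client, so each server lands in its own directory.) Pulling a subset back out on the backup server is just interactive restore against the file:
# restore -i -f /vault/myhost/dump.0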
So, um, why isn't everyone using dump for backing up their VMs? I mean, I'm doing this over the WAN and ending up with a nice full/incremental rotation, I can pull out subsets for restore, it's compressed/secure, I could probably pipe a gpg encryption in there if I wished...
Let's Do Some Tests
Backup source: 1-core, 768M Vultr in Seattle.
Backup destination: DO in NYC. ~28ms.
Backing up:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   11G   18G  39% /
Dump command: dump -0 -f - /
Where compression is listed, the flag was -z (defaults to level 2), -z5, or -z9.
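For example, the level-5 run was the same pipeline as above with the flag added:
# dump -0 -z5 -f - / | ssh backup-server.example.com "cd /vault && cat > dump.0"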
Test Results (SCIENCE!)
Compression Level   Time        Dump Size   Source Server Impact
None                3m28.997s   11.0G       nil (6% cpu)
2 (default)         3m36.289s   7.0G        noticeable if you look
5                   4m5.272s    6.9G        noticeable even if you don't look
9 (max)             6m5.260s    6.9G        this is all you're doing
I'm being comical about the source server impact, but for example with level 5 or 9 the load average was well over 2.0, while with level 2 it was usually around 1.0.
Destination side barely showed load - sshd was using 6% of CPU.
I was being lazy and using du -sh...I'm sure level 9 is a little smaller than level 5, but not so much that I'd care.
Of course, these are all full backups of the entire OS and in practice, I'd exclude some things (/tmp, etc.) and the daily incrementals would be much, much smaller (files changed since yesterday, compressed).
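(For the exclusions: dump honors the ext2/3/4 nodump attribute (chattr +d), though by default only for incrementals - passing -h 0 extends that to full dumps too. A sketch, assuming /tmp is what you want to skip:)
# chattr -R +d /tmp
# dump -0 -h 0 -z -f - / | ssh backup-server.example.com "cd /vault && cat > dump.0"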
Given SSD disk speed these days, I think one could do a level 0 less frequently than the traditional once a week...more incrementals to play back but SSDs are fast.
Honorable Intentions
This method seems to meet all my needs/wants:
- captures everything - include by default, not "remember to include" by default
- can do incrementals, which saves on my bandwidth
- can extract a subset of files to restore
- encrypted in transit
- compressible
- haven't played with at-rest encryption yet, but that's just a gpg command in the pipeline before ssh (see the sketch after this list)
- doesn't require staging space on the client
- can run unattended with passwordless ssh
- on the backup server, I can move the backups somewhere out of the clients' access once backups are done; the client doesn't depend on seeing them for an rsync-style incremental (and can't destroy backups with a malicious rsync)
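That gpg step would slot into the pipeline before ssh - something like this (a sketch, using public-key encryption to a hypothetical backup@example.com key so the client never holds a passphrase):
# dump -0 -z -f - / | gpg --batch --encrypt --recipient backup@example.com | ssh backup-server.example.com "cd /vault && cat > dump.0.gpg"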
Only negative is that I'd prefer to go over sftp, so the client is completely locked down and limited to sftp only. But I can chroot the client into an incoming directory where it can only put files, with no escape to do anything else.
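Short of a full sftp chroot, the key can also be pinned down with an authorized_keys forced command on the backup server, which still works with the cat pipeline - a sketch (the path and key are hypothetical):
command="cat > /vault/incoming/dump.upload",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-ed25519 AAAA... client-backup-key
Whatever command the client requests is ignored; stdin still flows through, so the dump works, but that key can only ever write that one file.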
I was concerned that maybe going over the WAN would result in broken connections, etc. but I just did half a dozen transcontinental dumps (please, no crude humor) and things seem to be working fine...
Someone stop me before I fall in love with this solution, get it pregnant, and elope to Buffalo.
Comments
That's a really good find. Just want to know: what happens if there's a local file error or network hiccups? Did you try to pipe to rsync?
Hm? Your way is already overcomplicated. Just run the entire tar stream over SSH and gpg it:
https://paster.li/?9d26cd53e1ffbb15#2GssgcudHKu8W/jfdVc416tGMmz8te8XBpqH3Q1scdk=
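Roughly the shape of that pipeline (a sketch, not the actual paste - the paths and recipient key here are made up):
# tar -cf - /etc /home /var/www | gpg --batch --encrypt --recipient backup@example.com | ssh backup-server.example.com "cat > /vault/full.tar.gpg"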
@jarland your crap free CF account again blocks anyone using the word "ssh" in a post. Get a better CDN provider or pay... a tech forum that dies on pasting any bash excerpt is just lame.
Can you please give an example command line?
I tried, but every time I do, CF tells me I can't post this and trashes my entire post. I added a link with basic info but will not type it up again.
https://github.com/willgrz/Autobackup/blob/master/backup.sh
I've been doing tar piped into SSH for years. Works fantastic; everything I back up tends to be damn near wire-speed. More info here: damtp.cam.ac.uk/user/ejb48/sshspeedtests.html#newer
Because then we wouldn't have customers who've had a service with us for four years get angry when their "life's work" gets deleted because they missed/ignored the 10+ invoice/overdue emails, and who then contact us three months later having never taken a backup of their "life's work" in four years.
Can I do incrementals with tar?
You cannot realistically do incrementals with GPG.
Also, typing MySQL and PHP sets it off.
^This
I think there are at least a couple of ways.
dump's knowledge of what time to base an incremental off is just a file in /var (/var/lib/dumpdates, updated when you dump with -u).
There's tar -u, but I'd have to think about how that would work over a networked pipeline...mmm, maybe not.
Why no love for tar --listed-incremental?
dump is obsolete and by using it you are setting yourself up for trouble in the future:
http://dump.sourceforge.net/isdumpdeprecated.html
With ext4 you have pretty aggressive caching of write operations. You are going to lose data if you use dump.
Yes, I eventually came across that...boo. Not sure if xfsdump suffers the same limitations, and there seems to be considerable difference of opinion.
Well, on to tar --listed-incremental, or something else that can write an incremental to stdout...
Hmmm: http://unix.stackexchange.com/questions/124531/linux-tar-listed-incremental-untrustworthy
Well, on to star then...
https://linux.die.net/man/1/star
... or dar, then: http://dar.linux.free.fr/doc/presentation.html
All-in-one solution: archive, compress, diff/incr backup, encrypt
Doesn't borg fit your/everyone's use case? https://github.com/borgbackup/borg
Someone opened a thread about it IIRC?
Also, there are duply and duplicity; I use the second one and find it good enough.
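(duplicity handles the full/incremental cycle and GPG encryption itself; a minimal sketch, with a hypothetical sftp target:)
# duplicity /home sftp://backup@backup-server.example.com//vault/home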
Borg is push-only.
But boy does it do a good job of de-duplicating similar blocks if you have an intermediate pull-backup machine (I have a $12 2TB Kimsufi special that only 'pulls' in from all the VPSes using rsnapshot over ssh keys).
(Do NOT run anything besides a key-only SSH server on this box. Use dropbear-unlocked full-disk LUKS encryption to mitigate against OVH-management-level 'attacks'.)
Borg then creates incremental snapshots based on changed blocks over ALL the VPS 'pulls', compresses that with LZMA (level 3-6 is good if you have an i3/i5 CPU), encrypts that, and pushes to Time4VPS, maxing out my 100mbit Kimsufi uplink.
So I can afford to lose either the Kimsufi or the Time4VPS box in this setup.
Rsnapshot gives you quick filesystem snapshots for the occasional FUBAR.
Borg serves as the long-term archival tier.
I haven't set up pruning on Borg yet.
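For the curious, the borg leg of a setup like that boils down to something like this (a sketch - the repo location and rsnapshot path are assumptions):
# borg create --compression lzma,6 --stats backup@time4vps.example.com:repo::{hostname}-{now} /srv/rsnapshot/daily.0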
Life is short. I'm just going to use Duplicity like a normal person.
New version, based on ZFS snapshots, with rsync for easy access on a secure storage machine and GPG for external storage (also via rsync). Works well for me:
https://github.com/willgrz/wBak-Autobackup-ZFS