Discussion:
ZFS - Backup server - recv send
Bertram Huber
2014-01-06 17:11:49 UTC
Hi,

I am just about done with my homeserver setup and my next task will be
to build a backup server. I bought a HP Proliant Microserver N54L and 4
2TB drives and would like to run a raidz1 on them. The plan is to wake
the backup server up once a week, send zvol snapshots to it via ssh, run
a scrub, and put it back to sleep. My problem is: who should run the
backup command? The production server, or the backup server? How do I do
authentication for ssh? Hardcode the password into a cron script?
Certificates?

Has someone already done something like this? Any ideas, hints, suggestions?

kind regards,

Bertram

Turbo Fredriksson
2014-01-06 17:20:55 UTC
Post by Bertram Huber
who should run the backup command?
The server.
Post by Bertram Huber
How do I do authentication for ssh?
Check out ssh-agent.
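
For instance (key type, file names and host names are just placeholders),
a rough sketch could be:

# create a dedicated key pair for the backup job
ssh-keygen -t rsa -b 4096 -f ~/.ssh/backup_key
# install the public key on the other machine
ssh-copy-id -i ~/.ssh/backup_key.pub root@backupserver
# start an agent and load the key once per session
eval `ssh-agent`
ssh-add ~/.ssh/backup_key

Note that a cron job won't automatically see an agent started in a login
session, so for fully unattended runs a passphrase-less, restricted key
(as suggested elsewhere in this thread) may be simpler.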
--
You know, boys, a nuclear reactor is a lot like a woman.
You just have to read the manual and press the right buttons
- Homer Simpson

Patrick Hahn
2014-01-06 17:33:36 UTC
One guideline I like for thinking about backups is this: the production
server should not be able to delete the backups, even as root. Maybe some
clever use of sudo can do this but it's easier to reason about if the
backup location /pulls/ from production. It saves you from your own 'oops'
moments as well as limiting some of the damage an attacker can do (*cough*
cryptolocker *cough*).
Post by Turbo Fredriksson
who should run the backup command?
The server.
How do I do authentication for ssh?
Check out ssh-agent.
--
Patrick Hahn

Michael Kjörling
2014-01-06 17:39:31 UTC
Post by Bertram Huber
My problem is: who should
run the backup command? The productive server, or the backup server?
Assuming that it is an either/or situation, I'd probably make the
_sender_ (where the working copy of the data lives) initiate the
backup process. That also makes it a lot easier to initiate a backup
manually. However, as an extra reminder, you certainly could write a
script that runs periodically on the backup storage server and bugs
you if a backup hasn't been run within the expected time frame. (For
example, one that sends you an email if a backup is more than two days
overdue according to the schedule.) The backup server could also, when
it comes up, ping the clients to say "I'm ready to accept a backup
now". That should all be just a few quite simple scripts and little
more.
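
A minimal sketch of such an overdue check (pool name, threshold and mail
address are placeholders, and it assumes a working local mail setup):

#!/bin/sh
# newest snapshot under the backup pool, by creation time
NEWEST=`zfs list -H -t snapshot -o name -s creation -r tank | tail -n 1`
# -p makes zfs print the creation time as seconds since the epoch
CREATED=`zfs get -Hp -o value creation "$NEWEST"`
NOW=`date +%s`
if [ $(( (NOW - CREATED) / 86400 )) -gt 2 ]; then
    echo "newest snapshot $NEWEST is overdue" \
        | mail -s "backup overdue on `hostname`" admin@example.org
fi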
Post by Bertram Huber
How do I do authentication for ssh? Hardcode the password into a
cron script? Certificates?
Consider using passphrase-less SSH keys, limited on the backup server
to be usable only from the specific host that should be using it and
limited to only exactly what is needed. Use a different authentication
method (different key, or password; possibly a different SSH server
instance) for administrative access.
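
On the backup server, a forced-command entry in the backup user's
~/.ssh/authorized_keys could look something like this (address, dataset
and key are placeholders):

from="192.0.2.10",command="zfs receive -u -F -d backuppool/production",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-rsa AAAA...rest-of-key... backup@production

With a forced command, sshd ignores whatever command the client asks for
and runs only the one given here, so the key is useless for anything else.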
Post by Bertram Huber
Has someone already done something like this? Any ideas, hints, suggestions?
Consider whether RAIDZ1 will provide enough redundancy in your
situation when a disk fails and the array needs to be resilvered.
There has been plenty of discussion about that over the last several
months at least; it shouldn't be hard to find the different sides of
the argument through the list archive. For this, at the very least
consider the size of the array, the fact that resilvering means
reading all the data stored on it, and the probability of hitting one
of those uncorrectable errors on any of the remaining drives during
the resilvering process.
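To put some very rough numbers on that: losing one disk in a 4x2TB
RAIDZ1 means reading the remaining ~6 TB (about 4.8x10^13 bits) to
resilver. At the 1-per-10^14-bits unrecoverable read error rate
typically quoted for consumer drives, that works out to ~0.5 expected
errors per resilver - very roughly a one-in-three chance of hitting at
least one - if you take the spec sheet at face value.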

Also, be careful with plain consumer-grade drives; IMO not so much for
their higher rated error rate (regular scrubbing and a bit of
redundancy can take care of that) as their longer read timeouts. You
don't want a drive to get kicked out of the pool because of a single
failed read.

And one more thing I can think of: make sure you're monitoring those
disks. Run smartd and have it report anything out of the ordinary to
you in a way that you will actually notice. A set of drives bought at
the same time and exposed to the same environmental conditions are
much more likely to see multiple failures in short order than a random
set of the same number of drives of the same model, so if any drive
starts to fail, you want to know right away.
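
As a starting point, a one-line /etc/smartd.conf along these lines will
do that (the schedule and the address are just examples):

# monitor all disks, track health and attributes, run a short self-test
# daily at 02:00 and a long one on Saturdays at 03:00, and send mail
DEVICESCAN -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.org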
--
Michael Kjörling • http://michael.kjorling.se • michael-/***@public.gmane.org
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

Schlacta, Christ
2014-01-06 18:14:52 UTC
Your setup is a little simplistic, but look into using Amanda for backups.
It might be a little heavy for you, but you can at least examine it from a
high level to see how it works in a nutshell.
Bertram Huber
2014-01-07 00:22:01 UTC
Thank you very much for all your answers!

I do not really know how to use the mailing list properly, but I hope
this reaches everyone who gave me some much needed tips.
There are quite a few things to be considered, but I think I have some
ideas I can build upon.

kind regards
Bertram
k***@public.gmane.org
2014-01-07 03:17:15 UTC
Hi-

I just finished setting up an offsite backup, including encrypting the
incremental file stream. I run a cron.daily script on the backup server
that checks for a new snapshot on the file server and initiates a send if
there is a new snapshot. This allows the file server to create snapshots
asynchronously with respect to the send request from the backup server,
because the backup server is only intermittently turned on. The file
server is creating daily snapshots using the zfs-auto-snapshot.sh script
as detailed here:
http://bernaerts.dyndns.org/linux/75-debian/279-debian-wheezy-zfs-raidz-pool


My backup server script is created as follows:

sudo nano /etc/cron.daily/zfs-remote-backup-daily
---------------------------------------------------------------------------
#!/bin/bash

# ip address of the file server, which is remote from the point of view
# of the backup server
REMOTE_HOST=xxx.xxx.xxx.xxx

# non-standard port for ssh
SSH_PORT=yyyy


# set PATH
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# script assumes the remote host is taking daily snapshots of the entire
# pool that are named @backup_daily_YYYY_MM_DD_TTTT
# this snapshot naming is done by the zfs-auto-snapshot.sh script
# the remote host uses zfs-auto-snapshot.sh to recursively snapshot the
# entire pool and has set the prefix to "backup"
# tank02 is the name of the pool on the file server

# determine the latest local backup daily snapshot
LOCAL_NEW=`zfs list -t snapshot | grep '@backup_daily' | tail -n 1 | cut -d ' ' -f 1 | cut -d '@' -f 2`

# determine the latest remote backup daily snapshot and encrypt it
# the encryption key is contained in /root/streampass on the remote host
# so as to not have it show up in a process listing
ssh $REMOTE_HOST -p $SSH_PORT "zfs list -t snapshot | grep '@backup_daily' | tail -n 1 | cut -d ' ' -f 1 | cut -d '@' -f 2 | openssl enc -aes-256-cbc -a -salt -pass file:/root/streampass" > remote-new.asc

# decrypt the latest remote backup daily snapshot name
# the decryption key is contained in /root/streampass on the backup server
openssl enc -d -aes-256-cbc -a -pass file:/root/streampass -in ./remote-new.asc > remote-new

REMOTE_NEW=`cat remote-new`

# grab an incremental stream if the remote backup daily snapshot is
# different from the local backup daily snapshot
if [ "$LOCAL_NEW" != "$REMOTE_NEW" ]
then

    echo "${REMOTE_NEW}: grabbing from remote server"
    # grab a stream spanning the latest local backup daily snapshot to the
    # latest remote daily snapshot, recursively, including any intervening
    # snapshots; compress and then encrypt using the key contained in
    # /root/streampass on the remote host
    ssh $REMOTE_HOST -p $SSH_PORT "zfs send -R -I @$LOCAL_NEW tank02@$REMOTE_NEW | xz | openssl enc -aes-256-cbc -a -salt -pass file:/root/streampass" > incremental.img.xz.asc

    echo "${REMOTE_NEW}: decrypting and uncompressing"
    # decrypt and uncompress the incremental stream using the key contained
    # in /root/streampass on the backup server
    openssl enc -d -aes-256-cbc -a -pass file:/root/streampass -in ./incremental.img.xz.asc | unxz > incremental.img

    echo "${REMOTE_NEW}: receiving incremental backup into pool"
    # receive the incremental backup into the backup pool, retaining only
    # the snapshots present in the remote pool (allows expiration of
    # snapshots); assumes backups are stored in pool tank04/backups/
    zfs receive -u -F -d tank04/backups/tank02 < incremental.img

    echo "${REMOTE_NEW}: removing temporary files"

    rm incremental.img.xz.asc
    rm incremental.img
    rm remote-new.asc
    rm remote-new
else
    echo "${LOCAL_NEW}: nothing to grab"
    echo "${REMOTE_NEW}: removing temporary files"
    rm remote-new.asc
    rm remote-new
fi
---------------------------------------------------------------------------

sudo chmod +x /etc/cron.daily/zfs-remote-backup-daily
Michael Kjörling
2014-01-07 08:51:32 UTC
Post by k***@public.gmane.org
# determine the latest remote backup daily snapshot and encrypt it
# the encryption key is contained in /root/streampass on the remote host
# so as to not have it show up in a process listing
ssh $REMOTE_HOST -p $SSH_PORT "zfs list -t snapshot | grep '@backup_daily' | tail -n 1 | cut -d ' ' -f 1 | cut -d '@' -f 2 | openssl enc -aes-256-cbc -a -salt -pass file:/root/streampass" > remote-new.asc
# decrypt the latest remote backup daily snapshot name
# the decryption key is contained in /root/streampass on the backup server
openssl enc -d -aes-256-cbc -a -pass file:/root/streampass -in ./remote-new.asc > remote-new
REMOTE_NEW=`cat remote-new`
That's an interesting script. Out of curiosity, though, is there any
particular reason why you're shipping double-encrypted data? Assuming
that the ssh connection is encrypted (seems like a safe assumption to
me), what you are doing amounts to the following:

* take some input data - zfs list | ... | cut -d '@'
* encrypt it using OpenSSL
* encrypt it as part of SSH communications
* ship the package across the Internet
* decrypt it as part of SSH communications
* decrypt it using OpenSSL
* magic

The AES key also becomes an additional shared secret between the
systems.

It seems to me like double-encrypting the _snapshot list_ is rather
overkill. What was your reason for doing it that way?
--
Michael Kjörling • http://michael.kjorling.se • michael-/***@public.gmane.org
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

Matthew McDonald
2014-01-07 13:37:52 UTC
For what it's worth, I've just put the python script I use for keeping a
local ZFS filesystem in sync with a remote one by transferring snapshots
up on github here:

https://github.com/mafm/random-public-code

Basically, it looks for the most recent snapshot name the two filesystems
have in common and then incrementally transfers all snapshots since then,
after destroying any local snapshot that doesn't appear on the remote.

It seems to work fine for me in practice, though it could be improved. If
anyone wants to take it, and improve it, feel free.

The script gets called by the root crontab something like this:
*/5 * * * * export PYTHONPATH=/wherever; /usr/bin/python $PYTHONPATH/replicate_zfs_snapshots.py sydney tank-microserver-0-mirror-2tb/share/kapsia tank/sydney-tank-replica/share/kapsia

Code below:
#!/usr/bin/env python
"""Usage: replicate_zfs_snapshots.py <remote-host> <remote-filesystem> <local-filesystem> [-h | --help | -v | --verbose | -q | --quiet | -n | --dry-run]

-n --dry-run
-h --help     Show this
-v --verbose  Log more than default
-q --quiet    Log less than default

Example:
python replicate_zfs_snapshots.py sydney tank-microserver-0-mirror-2tb/share/kapsia tank/sydney-tank-replica/share/kapsia

This script synchronizes ZFS snapshots between filesystems on a local and
a remote linux box.

All logging output generated by this script is written to syslog.

We assume that:
 * passwordless ssh is set up between the host running this script
   and the remote host.
 * the user the ssh connection logs in to on the remote host is allowed
   password-less sudo on read-only commands (see /etc/sudoers.d/zfs).
 * the user running this script is allowed to use destructive ZFS
   commands: zfs destroy, zfs receive, etc.
 * the local and remote filesystems have at least one initial
   snapshot in common.

This script could be smarter and better:
 * We could add command line arguments.
 * We could add another script to check that the two filesystems were
   actually synchronised successfully.
 * We could make sure that all snapshots matched. Even if we have one
   snapshot in common, we could make sure that *all* snapshots on the
   remote filesystem were present locally.
 * We could compress zfs send output - or does ssh do that anyway?
 * As long as the local filesystem exists, and the remote has snapshots,
   we could synchronise the two filesystems by first transferring a
   non-incremental initial snapshot, and then an incremental one,
   instead of failing because we don't initially have a common snapshot
   on both sides.
"""

from docopt import docopt

import subprocess

from kmds.lib import simple_syslog as logger


class ZfsReplicationNoLocalSnapshots(Exception):
    pass


class ZfsReplicationNoRemoteSnapshots(Exception):
    pass


class ZfsReplicationNoSnapshotsInCommon(Exception):
    pass


def snapshots_in_creation_order(filesystem, host=None):
    "Return list of snapshots on FILESYSTEM in order of creation."
    result = []
    if host:
        cmd = "ssh {} sudo zfs list -r -t snapshot -s creation {} -o name".format(host, filesystem)
    else:
        cmd = "sudo zfs list -r -t snapshot -s creation {} -o name".format(filesystem)
    lines = subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=True).split('\n')
    snapshot_prefix = filesystem + "@"
    for line in lines:
        if line.startswith(snapshot_prefix):
            result.append(line)
    return result


def strip_filesystem_name(snapshot_name):
    """Given the name of a snapshot, strip the filesystem part.

    We require (and check) that the snapshot name contains a single
    '@' separating the filesystem name from the 'snapshot' part of the name.
    """
    assert snapshot_name.count("@") == 1
    return snapshot_name.split("@")[1]


def execute_shell_command(cmd, dry_run=True):
    if dry_run:
        logger.info("would execute: {}".format(cmd))
    else:
        logger.info("executing: {}".format(cmd))
        text = subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=True).split('\n')
        if not text:
            logger.debug("  no output")
        else:
            logger.debug("  output:")
            for line in text:
                logger.debug("    {}".format(line))


def replicate_snapshots(remote_host, remote_filesystem, local_filesystem, dry_run=True):
    """Synchronise ZFS snapshots from a remote filesystem to a local filesystem."""

    logger.info("Started. remote host: {}, remote-fs: {}, local-filesystem: {}, dry-run: {}".format(
        remote_host, remote_filesystem, local_filesystem, dry_run))

    local_snapshots = snapshots_in_creation_order(local_filesystem)
    if not local_snapshots:
        raise ZfsReplicationNoLocalSnapshots("No local snapshots", local_filesystem)
    remote_snapshots = snapshots_in_creation_order(remote_filesystem, remote_host)

    if not remote_snapshots:
        raise ZfsReplicationNoRemoteSnapshots("No remote snapshots",
                                              "host: {}".format(remote_host),
                                              "filesystem: {}".format(remote_filesystem))

    remote_set = set(map(strip_filesystem_name, remote_snapshots))
    local_set = set(map(strip_filesystem_name, local_snapshots))

    last_common_snapshot = next((s for s in reversed(remote_snapshots) if strip_filesystem_name(s) in local_set), None)
    snapshots_missing_in_remote = [s for s in local_snapshots if not strip_filesystem_name(s) in remote_set]
    last_remote_snapshot = remote_snapshots[-1]

    logger.debug("Local snapshots:")
    for snapshot in local_snapshots:
        logger.debug("  {}".format(snapshot))
    logger.debug("Remote snapshots:")
    for snapshot in remote_snapshots:
        logger.debug("  {}".format(snapshot))

    logger.debug("Last common snapshot: {}".format(last_common_snapshot))
    logger.debug("Last remote snapshot: {}".format(last_remote_snapshot))

    if snapshots_missing_in_remote:
        logger.debug("Present locally, but not in remote:")
        for snapshot in snapshots_missing_in_remote:
            logger.debug("  {}".format(snapshot))
    if not last_common_snapshot:
        raise ZfsReplicationNoSnapshotsInCommon("No snapshots in common",
                                                "host: {}".format(remote_host),
                                                "remote_filesystem: {}".format(remote_filesystem),
                                                "local_filesystem: {}".format(local_filesystem))

    if last_remote_snapshot == last_common_snapshot:
        logger.info("No work to do. Last remote snapshot '{}' already on local filesystem.".format(
            strip_filesystem_name(last_remote_snapshot)))
        return

    # destroy local snapshots that are gone from the remote, so the
    # incremental receive below can apply cleanly
    for snapshot in snapshots_missing_in_remote:
        execute_shell_command("sudo zfs destroy {}".format(snapshot), dry_run)
    execute_shell_command(("ssh {} ".format(remote_host)
                           + "sudo zfs send -I {} {} ".format(last_common_snapshot, last_remote_snapshot)
                           + "| sudo zfs receive -F {}".format(local_filesystem)),
                          dry_run)


if __name__ == '__main__':
    arguments = docopt(__doc__)

    logger.init("Kapsia/ZfsReplicateSnapshots", logger.INFO)

    if arguments['--verbose']:
        logger.setLevel(logger.DEBUG)
        logger.debug('Arguments: {}'.format(arguments))
    if arguments['--quiet']:
        logger.setLevel(logger.ERROR)

    try:
        replicate_snapshots(arguments['<remote-host>'],
                            arguments['<remote-filesystem>'],
                            arguments['<local-filesystem>'],
                            arguments['--dry-run'])
    except Exception as e:
        logger.critical("Exception: {}: {}".format(type(e), e))

    logger.info("Finished.")
k***@public.gmane.org
2014-01-07 15:16:58 UTC
The double encryption is because of my unfamiliarity with ssh. I was
unsure whether all data is encrypted on an ssh connection or just the
handshaking data. If ssh is encrypting all data then there is no need to
add separate encryption on top of it.
Michael Kjörling
2014-01-07 15:55:43 UTC
Post by k***@public.gmane.org
The double encryption is because of my unfamiliarity with ssh. I was
unsure if all data was being encrypted with an ssh connection or just
handshaking data. If ssh is encrypting all data then there is no need to
add separate encryption on top of it.
SSH encrypts all data that is transmitted, both channel control as
well as payload data. Not doing so would make it not very "secure";
imagine how easy it would be to do passive eavesdropping or for that
matter active data injection/rejection/modification attacks.

Tunneling a random network application's data through SSH is a quick
and easy way to make that application use strong encryption.
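
For example (port and host are placeholders),

ssh -N -L 5432:localhost:5432 user@remotehost

forwards local port 5432 over the encrypted connection to port 5432 on
the remote machine, without running any remote command.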

The only exception would be if you are using the "none" cipher, which
at least my version of OpenSSH (OpenSSH_6.0p1 Debian-4, OpenSSL 1.0.1e
11 Feb 2013, Debian Wheezy) does not support and which is optional to
implement as well as "not recommended" in the standard. See [1]. Using
SSH with the "none" cipher essentially turns it into telnet.

[1] https://tools.ietf.org/html/rfc4253#section-6.3
--
Michael Kjörling • http://michael.kjorling.se • michael-/***@public.gmane.org
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

Gordan Bobic
2014-01-07 15:59:21 UTC
Post by Michael Kjörling
Using
SSH with the "none" cipher essentially turns it into telnet.
RSH should be a slightly closer comparison.

Michael Kjörling
2014-01-07 16:01:46 UTC
Post by Gordan Bobic
Using SSH with the "none" cipher essentially turns it into telnet.
RSH should be a slightly closer comparison.
Good point, but still close enough IMO and more likely to be familiar. :-)
--
Michael Kjörling • http://michael.kjorling.se • michael-/***@public.gmane.org
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

k***@public.gmane.org
2014-01-07 17:36:16 UTC
Ok, with that understanding of ssh I have recast the script with no
separate encryption. This simplifies it nicely, so I will have a prayer
of a chance of figuring out what I was doing when I look back on it in a
year or so:

sudo nano /etc/cron.daily/zfs-remote-backup-daily
---------------------------------------------------------------------------
#!/bin/bash

# ip address of the file server, which is remote from the point of view
# of the backup server
REMOTE_HOST=xxx.xxx.xxx.xxx

# non-standard port for ssh
SSH_PORT=yyyy


# set PATH
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# script assumes the remote host is taking daily snapshots of the entire
# pool that are named @backup_daily_YYYY_MM_DD_TTTT
# this snapshot naming is done by the zfs-auto-snapshot.sh script
# the remote host uses zfs-auto-snapshot.sh to recursively snapshot the
# entire pool and has set the prefix to "backup"
# tank02 is the name of the pool on the file server

# determine the latest local backup daily snapshot
# note to self: those aren't single quotes around the command, they are
# backquotes, to allow for command substitution
LOCAL_NEW=`zfs list -t snapshot | grep '@backup_daily' | tail -n 1 | cut -d ' ' -f 1 | cut -d '@' -f 2`

# determine the latest remote backup daily snapshot
# note to self: backquotes again
REMOTE_NEW=`ssh $REMOTE_HOST -p $SSH_PORT "zfs list -t snapshot | grep '@backup_daily' | tail -n 1 | cut -d ' ' -f 1 | cut -d '@' -f 2"`

# grab an incremental file stream if the remote backup daily snapshot is
# different from the local backup daily snapshot
if [ "$LOCAL_NEW" != "$REMOTE_NEW" ]
then

    echo "${REMOTE_NEW}: grabbing from remote server"
    # grab an incremental file stream spanning the latest local backup
    # daily snapshot to the latest remote daily snapshot, recursively,
    # including any intervening snapshots; compress on the remote host
    ssh $REMOTE_HOST -p $SSH_PORT "zfs send -R -I @$LOCAL_NEW tank02@$REMOTE_NEW | xz" > incremental.img.xz

    echo "${REMOTE_NEW}: uncompressing"
    # uncompress the incremental file stream on the backup server
    # (-c writes to stdout and keeps the .xz file for the cleanup below)
    xz -dc ./incremental.img.xz > incremental.img

    echo "${REMOTE_NEW}: receiving incremental file stream into pool"
    # receive the incremental file stream into the backup pool, retaining
    # only the snapshots present in the remote pool (allows expiration of
    # snapshots); assumes backups are stored in pool tank04/backups/
    zfs receive -u -F -d tank04/backups/tank02 < incremental.img

    echo "${REMOTE_NEW}: removing temporary files"
    rm incremental.img.xz
    rm incremental.img

else
    echo "${LOCAL_NEW}: nothing to grab"

fi
---------------------------------------------------------------------------

sudo chmod +x /etc/cron.daily/zfs-remote-backup-daily

If my backup server has been too lazy in triggering this script by way of
daily cron, it can be forced to run it by:
sudo /etc/cron.daily/zfs-remote-backup-daily
Cold Canuck
2014-01-06 19:20:27 UTC
In addition to the excellent points others have made, I would add:

- I use the exact same server, with a RAIDZ1 pool. I chose RAIDZ1 as it is a backup server (in fact it is an offsite backup of a backup machine); if all goes well I'll never read from it, and I am willing to accept the risk of having 2 disks go down on it simultaneously, coincident with a multiple drive failure on the primary machine. The machines are NOT co-located. This should be an informed decision, and you may be unwilling to take the risk. Note that even a second disk death during a resilver is not catastrophic as long as the primary machine still has the data. In my case it is a "backup server", not "an archive machine".


- I use a push from the primary machine as it is the only machine with the knowledge of when the data sets are stable and consistent. It can then snapshot this state on the primary machine and push it out to the remote backup machine; an asynchronous pull from the backup machine risks snapshotting an inconsistent state of the data (think databases, or active VMs).


- I use zfs send/recv over ssh to a non-standard port; between the firewall and sshd on the remote site, connections are only accepted from specific IPs (i.e. my primary site). This keeps the port scanners and script kiddies down to a dull roar.


- The user on the target machine that the ssh 'tunnel' connects to is only allowed to run a small number of commands (such as "zfs recv"). Someone could still trash the backup, but only with pretty good knowledge of the structure. Safe enough for me, you may have different thoughts.


- please do NOT put passwords in cron jobs; at the very least use ssh 'authorized_keys' or cut your own certificates.


Contact me off-list if you want more info.




Andrew
Gregor Kopka
2014-01-07 22:13:45 UTC
Post by Cold Canuck
- I use a push from the primary machine as it is the only machine with the knowledge of when the data sets are stable and consistent. It can then snapshot this state on the primary machine and push it out to the remote backup machine; an asynchronous pull from the backup machine risks snapshotting an inconsistent state of the data (think databases, or active VMs).
I thought about that too, but then the primary machine can destroy the
backup (since it can access it), so it's better to have the snapshots
made on the primary (whenever it thinks it's safe to do so) while having
the backup server pull from it with something along the lines of
"ssh <primary> zfs send ... | zfs recv <local pool>".
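Spelled out a bit more (dataset and snapshot names are placeholders),
such a pull from the backup server could look like:

# pull everything between the last snapshot we already have and the
# newest one on the primary, and apply it locally
ssh primary "zfs send -I tank@2014-01-01 tank@2014-01-07" | zfs recv -F backuppool/tank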
Post by Cold Canuck
- I use zfs send/recv over ssh to a non standard port; between the firewall and sshd on the remote site connections are only accepted from specific IPs (i.e my primary site). This keeps the port scanners and script kiddies done to a dull roar.
While non-standard ports sound nice at first glance, they are pointless
(since nmap exists).
Should you want to hide services: use port knocking.
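For illustration, knockd's stock example configuration looks roughly
like this (sequence and firewall rule are just the man page defaults,
not a recommendation):

[options]
    logfile = /var/log/knockd.log

[openSSH]
    sequence    = 7000,8000,9000
    seq_timeout = 5
    command     = /sbin/iptables -A INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
    tcpflags    = syn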
Post by Cold Canuck
- The user on the target machine that the ssh 'tunnel' connects to is only allowed to run a small number commands (such as "zfs recv"). Someone could still trash the backup, but only with pretty good knowledge of the structure. Safe enough for me, you may have different thoughts.
You also would need access to "zfs destroy" to remove snapshots existing
on the backup which are no longer on the primary - for scenarios where
the backup host was down for a while so an older snapshot (f.ex. monthly
one) needs to be used since the more regular ones (daily/weekly) have
already been rotated out by the primary system. This would open the
backup server to "zfs destroy <pool> -R" - imho a very bad idea.
Post by Cold Canuck
- please do NOT put passwords in cron jobs; at the very least us ssh 'authorized_keys' or cut your own certificates.
Yes, passwords in cron jobs are especially bad since you don't need to
be root to see them (ps will show the command lines of what is running
to normal users).

Nevertheless, leaving attackers the ability to gain access to backup
systems is a road to disaster:
One of my boxes got rooted once. Since I had a backup system /pulling/
from it, the problem was contained to that one box, which could be
restored from the last good backup quickly - in case the rooted box had
pushed to the backup (as a local root user on the backup, which we
currently need since ZoL doesn't support delegation yet) I would have
had to start from scratch (since I couldn't be sure that the backup
host wasn't compromised too).

Gregor

Cold Canuck
2014-01-08 02:09:12 UTC
Post by Gregor Kopka
While non-standard ports sound nice at first glance, they are pointless (since nmap exists).
Should you want to hide services: use port knocking.
True, but in my case someone would have to be IP spoofing on either the normal or the moved port to get through the fw. Moving the port just reduces the noise on std. port scans, it does not improve security. The reality is that you need some sort of hole to get into your remote machine, so someone else can find it if they are determined enough. So we are in thunderous agreement.
Post by Gregor Kopka
You also would need access to "zfs destroy" to remove snapshots existing on the backup which are no longer on the primary - for scenarios where the backup host was down for a while so an older snapshot (f.ex. a monthly one) needs to be used since the more regular ones (daily/weekly) have already been rotated out by the primary system. This would open the backup server to "zfs destroy <pool> -R" - imho a very bad idea.
Yes, I agree; I'm just not *as* worried about that (the primary is a locked-down backup server) as I am about my point below. I also mitigate it somewhat by only allowing a 'zfs destroy <snapshot>' command. Not that this is foolproof, but at least it keeps me from shooting myself in the foot.
Post by Gregor Kopka
One of my boxes got rooted once. Since I had a backup system pulling from it, the problem was contained to that one box, which could be restored from the last good backup quickly - in case the rooted box had pushed to the backup (as a local root user on the backup, which we currently need since ZoL doesn't support delegation yet) I would have had to start from scratch (since I couldn't be sure that the backup host wasn't compromised too).
The problem is that if you pull, you have to allow remote access to the primary machine. The remote gets rooted and there is a free pass to the primary machine, or at the very least its data. So you can choose your poison :o).
In my case the primary machine's data are unencrypted, the remote machine is encrypted. So while I would greatly dislike someone rooting the remote machine and destroying the data, the loss would be much less than having that happen to the primary machine. So in my case there are no paths into the primary machine, only paths out.


Take home message for the OP:

The reality is that either push or pull can be made to work; each has its own set of advantages for some use cases, and for a specific user one might be superior to the other. I guess we were both trying to tell the OP what those advantages might be without being dogmatic about the approach. Hope we accomplished that.


Andrew


Michael Kjörling
2014-01-08 07:44:08 UTC
Post by Cold Canuck
The reality is that either push or pull can be made to work; each
has its own set of advantages for some use cases, and for a specific
user one might be superior to the other. I guess we were both trying
to tell the OP what those advantages might be without being dogmatic
about the approach. Hope we accomplished that.
There is also another aspect to this that a lot of people seem to be
overlooking: namely, that which host _initiates_ the backup does not
_necessarily_ determine whether the scheme is push-to-backup or
pull-from-production. As a trivial example, the production server can
initiate the backup flow by making a web request to the backup server
which sets a flag of some sort on the backup server, which a cron job
checks regularly and reacts to. It could even be implemented with port
_blocking_ plus fail2ban with a custom action.

Now, I'm not saying that the above is necessarily a good way to go.
But it is one way that separates the question of who initiates the
flow from whether the backup process itself is push or pull.
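
A minimal sketch of that flag idea (paths, host names and the trigger
mechanism are made up - here a plain file touched over ssh rather than
a web request):

# on the production server, when its data is in a consistent state:
ssh backupserver touch /var/spool/backup-requested

# on the backup server, run from cron every few minutes:
#!/bin/sh
if [ -e /var/spool/backup-requested ]; then
    rm /var/spool/backup-requested
    /usr/local/sbin/run-backup    # whatever actually performs the pull
fi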
--
Michael Kjörling • http://michael.kjorling.se • michael-/***@public.gmane.org
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

Niels de Carpentier
2014-01-15 07:25:03 UTC
Post by Gregor Kopka
Post by Cold Canuck
- I use a push from the primary machine as it is the only machine with
the knowledge of when the data sets are stable and consistent. It can
then snapshot this state on the primary machine and push it out to the
remote backup machine; an asynchronous pull from the backup machine
risks snapshotting an inconsistent state of the data (think databases,
or active VMs).
I thought about that too, but then the primary machine can destroy the
backup (since it can access it), so it's better to have the snapshots
made on the primary (whenever it thinks it's safe to do so) while having
the backup server pull from it with something along the lines of
"ssh <primary> zfs send ... | zfs recv <local pool>".
You either need to initiate from the primary machine if it doesn't have
the rights to delete older data on the backup machine, or initiate from
the backup machine with no rights to modify any production data. If you
do this, both machines need to be hacked to compromise the data.

Niels

Manuel Amador (Rudd-O)
2014-01-15 23:09:53 UTC
Create an SSH key on the backup server.

Install the pubkey for the root user of the host to be backed up.

Then run (with cron) zreplicate, from github.com/Rudd-O/zfs-tools, on
your backup server:

ssh hosttobackup zfs snapshot -r poolname@autosnapshot-daily-`date +"%Y-%m-%d"`
zfs create -p backuppoolname/hosttobackup/poolname
zreplicate -vtc hosttobackup:poolname backuppoolname/hosttobackup/poolname

That is really all there is to it. zsnap, also in
github.com/Rudd-O/zfs-tools, is a more sophisticated snapshotting tool
that removes obsolete snapshots, but there is another one that I use
which I can't quite recall right now.

I have the exact same server, and this is what I do.