Here, we investigate the behaviour of LXD when moving containers between LXD cluster nodes, with a focus on what happens to various types of (filesystem) snapshots.
LXD containers can be snapshotted by LXD itself, but when using a ZFS storage backend, one can also use a tool like Sanoid to snapshot a container’s filesystem directly at the ZFS level. When moving an LXD container from one cluster node to another, one, of course, wants those filesystem snapshots to move along as well. Spoiler: this isn’t always the case.
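For reference: with Sanoid, such ZFS-level snapshots are typically driven by an entry in /etc/sanoid/sanoid.conf. The snippet below is only an illustrative sketch; the dataset path points at a hypothetical container dataset, and the retention values are made-up examples.

# /etc/sanoid/sanoid.conf -- illustrative sketch only; dataset path
# and retention values are example assumptions
[rpool/lxd/containers/snapmovetest]
        use_template = production

[template_production]
        # create snapshots automatically and prune expired ones
        autosnap = yes
        autoprune = yes
        hourly = 24
        daily = 7
        monthly = 3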
Let’s create a test container on my home LXD cluster (which uses ZFS as the default storage backend), starting on node wiske2:
lxc launch ubuntu:22.04 snapmovetest --target=wiske2
Check the container is running:
lxc list snapmovetest
+--------------+---------+-----------------------+-------------------------------------------+-----------+-----------+----------+
|     NAME     |  STATE  |         IPV4          |                   IPV6                    |   TYPE    | SNAPSHOTS | LOCATION |
+--------------+---------+-----------------------+-------------------------------------------+-----------+-----------+----------+
| snapmovetest | RUNNING | 192.168.10.158 (eth0) | 2a10:3781:782:1:216:3eff:fed5:ef48 (eth0) | CONTAINER | 0         | wiske2   |
+--------------+---------+-----------------------+-------------------------------------------+-----------+-----------+----------+
Now, let’s use LXD to create two snapshots:
lxc snapshot snapmovetest "Test1"
sleep 10
lxc snapshot snapmovetest "Test2"
Check the snapshots have been made:
lxc info snapmovetest | awk '$1=="Snapshots:" {toprint=1}; {if(toprint==1) {print $0}}'
Snapshots:
+-------+----------------------+------------+----------+
| NAME  |       TAKEN AT       | EXPIRES AT | STATEFUL |
+-------+----------------------+------------+----------+
| Test1 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
| Test2 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
At the ZFS level:
zfs list -rtall rpool/lxd/containers/snapmovetest
NAME                                                USED  AVAIL  REFER  MOUNTPOINT
rpool/lxd/containers/snapmovetest                  24.7M   192G   748M  legacy
rpool/lxd/containers/snapmovetest@snapshot-Test1     60K      -   748M  -
rpool/lxd/containers/snapmovetest@snapshot-Test2     60K      -   748M  -
All is fine! Now, let’s move the container to node wiske3:
lxc stop snapmovetest
lxc move snapmovetest snapmovetest --target=wiske3
lxc list snapmovetest
+--------------+---------+------+------+-----------+-----------+----------+
|     NAME     |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+--------------+---------+------+------+-----------+-----------+----------+
| snapmovetest | STOPPED |      |      | CONTAINER | 2         | wiske3   |
+--------------+---------+------+------+-----------+-----------+----------+
Check the snapshots:
lxc info snapmovetest | awk '$1=="Snapshots:" {toprint=1}; {if(toprint==1) {print $0}}'
Snapshots:
+-------+----------------------+------------+----------+
| NAME  |       TAKEN AT       | EXPIRES AT | STATEFUL |
+-------+----------------------+------------+----------+
| Test1 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
| Test2 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
At the ZFS level:
zfs list -rtall rpool/lxd/containers/snapmovetest
NAME                                                USED  AVAIL  REFER  MOUNTPOINT
rpool/lxd/containers/snapmovetest                   749M   202G   748M  legacy
rpool/lxd/containers/snapmovetest@snapshot-Test1     60K      -   748M  -
rpool/lxd/containers/snapmovetest@snapshot-Test2     60K      -   748M  -
So far so good: snapshots taken with the native LXD toolchain get moved. Now let’s manually create a ZFS snapshot:
zfs snapshot rpool/lxd/containers/snapmovetest@manual_zfs_snap
zfs list -rtall rpool/lxd/containers/snapmovetest
NAME                                                 USED  AVAIL  REFER  MOUNTPOINT
rpool/lxd/containers/snapmovetest                    749M   202G   748M  legacy
rpool/lxd/containers/snapmovetest@snapshot-Test1      60K      -   748M  -
rpool/lxd/containers/snapmovetest@snapshot-Test2      60K      -   748M  -
rpool/lxd/containers/snapmovetest@manual_zfs_snap      0B      -   748M  -
Now, move the container back to node wiske2:
lxc move snapmovetest snapmovetest --target=wiske2
lxc list snapmovetest
+--------------+---------+------+------+-----------+-----------+----------+
|     NAME     |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+--------------+---------+------+------+-----------+-----------+----------+
| snapmovetest | STOPPED |      |      | CONTAINER | 2         | wiske2   |
+--------------+---------+------+------+-----------+-----------+----------+
What happened to the snapshots?
lxc info snapmovetest | awk '$1=="Snapshots:" {toprint=1}; {if(toprint==1) {print $0}}'
Snapshots:
+-------+----------------------+------------+----------+
| NAME  |       TAKEN AT       | EXPIRES AT | STATEFUL |
+-------+----------------------+------------+----------+
| Test1 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
| Test2 | 2023/03/11 22:22 CET |            | NO       |
+-------+----------------------+------------+----------+
zfs list -rtall rpool/lxd/containers/snapmovetest
NAME                                                USED  AVAIL  REFER  MOUNTPOINT
rpool/lxd/containers/snapmovetest                   749M   191G   748M  legacy
rpool/lxd/containers/snapmovetest@snapshot-Test1     60K      -   748M  -
rpool/lxd/containers/snapmovetest@snapshot-Test2     60K      -   748M  -
Somehow, the manually created ZFS-level snapshot has been removed… I guess this part of the LXD documentation should be written in bold (emphasis mine):
LXD assumes that it has full control over the ZFS pool and dataset. Therefore, you should never maintain any datasets or file system entities that are not owned by LXD in a ZFS pool or dataset, because LXD might delete them.
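One way to spot snapshots LXD doesn’t know about before a move is to compare LXD’s snapshot list with the ZFS one. A quick sketch, reusing the commands from this post (lxc query prints the raw REST API response; LXD-managed snapshots show up as @snapshot-<name> at the ZFS level):

# Snapshots LXD tracks (returned as JSON from the REST API)
lxc query /1.0/instances/snapmovetest/snapshots

# Snapshots present at the ZFS level; anything here without a matching
# "snapshot-" prefixed entry above is invisible to LXD and may be
# deleted on the next move
zfs list -rtall rpool/lxd/containers/snapmovetest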
Consequently, in an LXD cluster one shouldn’t use Sanoid to snapshot ZFS-backed LXD container filesystems. Instead, use LXD’s built-in automatic snapshot capabilities (see the snapshots.expiry and snapshots.schedule options).
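For example, to have LXD snapshot the container daily and expire old snapshots automatically (the schedule, pattern, and expiry values below are just illustrative choices):

# snapshot daily at 02:00 (cron syntax)
lxc config set snapmovetest snapshots.schedule "0 2 * * *"
# name scheduled snapshots snap0, snap1, ... (the default pattern)
lxc config set snapmovetest snapshots.pattern "snap%d"
# let snapshots expire after two weeks
lxc config set snapmovetest snapshots.expiry 2w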
Clean up:
lxc delete snapmovetest