---
title: "Seafile Mirror - Simple automatic backup of your Seafile libraries"
date: 2023-09-22
categories:
  - blog
  - english
tags:
  - Code
  - SystemAdministration
headerimage:
  src: /blog/library.jpg
  text: Wouldn't it be a shame if your library were to be destroyed?
---

I have been using [Seafile](https://www.seafile.com/) for years to host and
synchronise files on my own server. It's fast and reliable, especially when
dealing with many or very large files. But making reliable backups of all its
files isn't trivial. This is because the files are stored in a layout similar to
bare Git repositories, and Seafile's headless tool, seafile-cli, is...
suboptimal. So I created what started out as a wrapper for it and ended up as a
full-blown tool for automatically synchronising your libraries to a backup
location: [**Seafile Mirror**](https://src.mehl.mx/mxmehl/seafile-mirror).

## My requirements

Of course, you could just take snapshots of the whole server, or copy the raw
Seafile data files and import them into a newly created Seafile instance as a
disaster recovery measure, but I want to be able to **directly access the
current state of the files** whenever I need them in case of an emergency.

It was also important for me to have a **snapshot**, not just another real-time
sync of a library. This is because I also want to have a backup in case I (or an
attacker) mess up a Seafile library. A real-time sync would immediately fetch
that broken state.

I also want to take a snapshot at a **configurable interval**. Some libraries
should be synchronised more often than others. For example, my picture albums do
not change as often as my miscellaneous documents, but they use at least 20
times the disk space and therefore cause correspondingly more network traffic
when running a full sync.

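To make the interval requirement more concrete, here is a minimal sketch of how
per-library intervals could be modelled. It is not taken from Seafile Mirror's
code; the library names, the interval values and the helper functions
`needs_resync` and `due_libraries` are assumptions made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# Hypothetical per-library intervals; names and values are illustrative only.
INTERVALS: Dict[str, timedelta] = {
    "documents": timedelta(days=1),  # changes often, small
    "pictures": timedelta(days=7),   # changes rarely, large
}


def needs_resync(
    last_sync: Optional[datetime],
    interval: timedelta,
    now: datetime,
    force: bool = False,
) -> bool:
    """Return True if a library should be downloaded again."""
    # A forced run or a library that was never synced is always due.
    if force or last_sync is None:
        return True
    # Otherwise it is due once the configured interval has elapsed.
    return now - last_sync >= interval


def due_libraries(
    last_syncs: Dict[str, Optional[datetime]], now: datetime
) -> List[str]:
    """Return the names of all configured libraries that are due."""
    return [
        name
        for name, interval in INTERVALS.items()
        if needs_resync(last_syncs.get(name), interval, now)
    ]
```

With this shape, a daily run only touches the large picture library once a
week, while the documents are refreshed on every run.
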
Also, the backup service must have **read-only access** to the files.

A version-controlled backup of the backup (i.e. the plain files) wasn't in
scope. I handle this separately by backing up my backup location, which also
contains similar backups of other services and machines. For this reason, my
current solution does not do incremental backups, even though this may be
relevant for other use cases.

## The problems

In theory, [seafile-cli](https://help.seafile.com/syncing_client/linux-cli/)
should have been everything you'd need to fulfil these requirements. But no. It
turned out that this tool has a number of fundamental issues:

* You can make the host the tool is running on a sync peer. However, this easily
  leads to sync errors if the user only has read-only permissions on the
  library.
* You can also just download a library, but that may again lead to strange sync
  errors.
* It requires a running daemon, which crashes at irregular intervals during
  larger sync tasks or has other issues.
* Download/sync intervals cannot be set manually.

## The solution

[seafile-mirror](https://src.mehl.mx/mxmehl/seafile-mirror) takes care of all
these stumbling blocks:

* It downloads/syncs defined libraries at customisable intervals
* It de-syncs libraries immediately after they have been downloaded to avoid
  sync errors
* You can force a re-sync of a library even if its re-sync interval hasn't been
  reached yet
* Extensive informative and error logging is provided
* It is built with automation in mind, so you can run it from cron jobs or
  systemd timers
* And as explained, it deals with the numerous caveats of `seaf-cli` and Seafile
  in general

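The download-then-de-sync idea from the list above could be sketched roughly as
follows. This is not Seafile Mirror's actual implementation: the function names
are hypothetical, and the `seaf-cli` flags (`-l`, `-s`, `-d`, `-u`, `-p`) follow
my reading of the Seafile CLI documentation, so please check them against
`seaf-cli download --help` and `seaf-cli desync --help` on your version. The
functions only build the argument lists and do not execute anything.

```python
from typing import List, Optional


def download_command(
    server: str,
    library_id: str,
    parent_dir: str,
    username: str,
    password: Optional[str] = None,
) -> List[str]:
    """Build the argument list for downloading one library."""
    cmd = [
        "seaf-cli", "download",
        "-l", library_id,   # ID of the library to fetch
        "-s", server,       # URL of the Seafile server
        "-d", parent_dir,   # local parent directory for the library
        "-u", username,     # ideally a user with read-only access
    ]
    # Assumption: without -p, seaf-cli asks for the password interactively.
    if password is not None:
        cmd += ["-p", password]
    return cmd


def desync_command(local_dir: str) -> List[str]:
    """Build the argument list for detaching a local folder from syncing."""
    return ["seaf-cli", "desync", "-d", local_dir]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. A real wrapper
would also have to make sure the daemon is running and wait until the download
has finished before de-syncing.
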
Full installation and usage documentation can be found in the project
repository. Installation is as simple as running `pip3 install seafile-mirror`,
and a sample configuration is provided.

In my setup, I run this application on a headless server with systemd under a
separate user account. Therefore the systemd service needs to be set up first,
which is also covered in the tool's documentation. And as an Ansible power user,
I also provide an [Ansible
role](https://src.mehl.mx/mxmehl/seafile-mirror-ansible) that does all the setup
and configuration.

## Possible next steps

The tool has been running every day for a couple of months without any issues.
However, I could imagine a few more features being helpful for more people:

* Support for login tokens: Currently, only user/password auth is supported,
  which is fine for my use case as it's just a read-only user. This wouldn't be
  hard to add either, since seafile-cli supports it (at least in theory).
  ([#2](https://src.mehl.mx/mxmehl/seafile-mirror/issues/2))
* Support for encrypted libraries: This shouldn't be a big issue, as it would
  only require passing the password to the underlying seafile-cli command.
  ([#3](https://src.mehl.mx/mxmehl/seafile-mirror/issues/3))

If you have encountered problems or would like to point out the need for
specific features, please feel free to contact me or comment on the Mastodon
post. I'd also love to hear if you've become a happy user of the tool 😊.