---
title: "Seafile Mirror - Simple automatic backup of your Seafile libraries"
date: 2023-09-22
categories:
- blog
- english
tags:
- Code
- SystemAdministration
headerimage:
  src: /blog/library.jpg
  text: Wouldn't it be a shame if your library were to be destroyed?
mastodon_toot_url: "https://mastodon.social/@mxmehl/111109533835169070"
---
I have been using [Seafile](https://www.seafile.com/) for years to host and
synchronise files on my own server. It's fast and reliable, especially when
dealing with many or very large files. But making reliable backups of
all its files isn't trivial. This is because the files are stored in a layout
similar to bare Git repositories, and Seafile's headless tool, seafile-cli,
is... suboptimal. So I created what started out as a wrapper for it and ended up
as a full-blown tool for automatically synchronising your libraries to a backup
location: [**Seafile Mirror**](https://src.mehl.mx/mxmehl/seafile-mirror).

## My requirements

Of course, you could just take snapshots of the whole server, or copy the raw
Seafile data files and import them into a newly created Seafile instance for
disaster recovery, but I want to be able to **directly access the current
state of the files** whenever I need them in an emergency.

It was also important for me to have a **snapshot**, not just another real-time
sync of a library. This is because I also want to have a backup in case I (or an
attacker) mess up a Seafile library. A real-time sync would immediately
replicate that broken state.

I also want to take a snapshot at a **configurable interval**. Some libraries
should be synchronised more often than others. For example, my picture albums do
not change as often as my miscellaneous documents, but they use at least 20
times the disk space and therefore network traffic when running a full sync.
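
To make the interval idea concrete, here is a minimal sketch of a per-library "is it due yet?" check. It is not taken from Seafile Mirror itself: the library names, the `RESYNC_INTERVAL_DAYS` mapping and the `is_due` helper are hypothetical and only illustrate the principle.

```python
from datetime import datetime, timedelta

# Hypothetical per-library settings. The names and values are illustrative
# only and do not reflect seafile-mirror's real configuration format.
RESYNC_INTERVAL_DAYS = {
    "documents": 1,  # small and changes often, so mirror it daily
    "pictures": 7,   # large and changes rarely, so mirror it weekly
}


def is_due(last_sync, interval_days, now):
    """Return True if a library should be downloaded again.

    last_sync is None when the library has never been mirrored.
    """
    if last_sync is None:
        return True
    return now - last_sync >= timedelta(days=interval_days)


now = datetime(2023, 9, 22, 12, 0)
last_sync = datetime(2023, 9, 20, 12, 0)  # both libraries were mirrored two days ago

due = {
    name: is_due(last_sync, days, now)
    for name, days in RESYNC_INTERVAL_DAYS.items()
}
print(due)  # {'documents': True, 'pictures': False}
```

With a check like this, a job that runs every day only re-downloads the large picture library once a week, which keeps the network traffic down.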

Also, the backup service must have **read-only access** to the files.

A version-controlled backup of the backup (i.e. the plain files) wasn't in
scope. I handle this separately by backing up my backup location, which also
contains similar backups of other services and machines. For this reason, my
current solution does not do incremental backups, even though this may be
relevant for other use cases.

## The problems

Actually, [seafile-cli](https://help.seafile.com/syncing_client/linux-cli/)
should have been everything you'd need to fulfill the requirements. But no. It
turned out that this tool has a number of fundamental issues:

* You can make the host the tool is running on a sync peer. However, this easily
  leads to sync errors if the user only has read-only permissions to the
  library.
* You can also download a library, but then again it may lead to strange sync
  errors.
* It requires a running daemon, which crashes irregularly during larger sync
  tasks or has other issues.
* Download/sync intervals cannot be set manually.

## The solution

[seafile-mirror](https://src.mehl.mx/mxmehl/seafile-mirror) takes care of all
these stumbling blocks:

* It downloads/syncs defined libraries at customisable intervals
* It de-syncs libraries immediately after they have been downloaded to avoid sync
  errors
* You can force a re-sync of a library even if its re-sync interval hasn't been
  reached yet
* Extensive informational and error logging is provided
* It was created with automation in mind, so you can run it in cron jobs or
  systemd timers
* And as explained, it deals with the numerous caveats of `seaf-cli` and Seafile
  in general
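
The "sync, then immediately de-sync" idea from the list above can be sketched roughly as follows. This is not Seafile Mirror's actual code. It assumes the library is fetched with `seaf-cli sync` and detached again with `seaf-cli desync`, and the `mirror_commands` helper as well as the server URL, library ID, path and user name are made up for illustration. The sketch only builds the commands and does not execute anything.

```python
import shlex


def mirror_commands(library_id, server_url, target_dir, username):
    """Build the seaf-cli calls for one snapshot of a single library.

    Nothing is executed here. A real run would also have to pass the
    password (without -p, seaf-cli asks for it interactively) and wait
    until the download has finished, for example by polling
    `seaf-cli status`, before de-syncing.
    """
    sync = [
        "seaf-cli", "sync",
        "-l", library_id,   # ID of the library to fetch
        "-s", server_url,   # URL of the Seafile server
        "-d", target_dir,   # existing local folder to sync into
        "-u", username,     # ideally a user with read-only access
    ]
    # Detaching the folder afterwards turns it into a plain snapshot that
    # later changes on the server can no longer reach.
    desync = ["seaf-cli", "desync", "-d", target_dir]
    return [sync, desync]


# Hypothetical example values
commands = mirror_commands(
    library_id="00000000-0000-0000-0000-000000000000",
    server_url="https://seafile.example.org",
    target_dir="/backup/seafile/documents",
    username="backup@example.org",
)
for command in commands:
    print(shlex.join(command))
```

Keeping the library attached only for the duration of the download is what makes the result a snapshot rather than yet another real-time sync peer.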

Full installation and usage documentation can be found in the project
repository. Installation is as simple as running `pip3 install seafile-mirror`,
and a sample configuration is provided.

In my setup, I run this application on a headless server with systemd under a
separate user account. Therefore the systemd service needs to be set up first.
This is also covered in the tool's documentation. And as an Ansible power user,
I also provide an [Ansible
role](https://src.mehl.mx/mxmehl/seafile-mirror-ansible) that does all the setup
and configuration.

## Possible next steps

The tool has been running every day for a couple of months without any issues.
However, I could imagine a few more features being helpful for more people:

* Support for login tokens: Currently, only user/password auth is supported, which
  is fine for my use case as it's just a read-only user. This wouldn't be hard
  to add either, as seafile-cli supports it (at least in theory).
  ([#2](https://src.mehl.mx/mxmehl/seafile-mirror/issues/2))
* Support for encrypted libraries: This shouldn't be a big issue, as it would
  require passing the password to the underlying seafile-cli command.
  ([#3](https://src.mehl.mx/mxmehl/seafile-mirror/issues/3))

If you have encountered problems or would like to point out the need for
specific features, please feel free to contact me or comment on the Mastodon
post. I'd also love to hear if you've become a happy user of the tool 😊.