add blog post about seafile-mirror
106
content/blog/2023-09-seafile-mirror.md
Normal file
@@ -0,0 +1,106 @@
---
title: "Seafile Mirror - Simple automatic backup of your Seafile libraries"
date: 2023-09-22
categories:
  - english
tags:
  - python
  - server
  - tools
headerimage: /blog/library.jpg
headercredits: Wouldn't it be a shame if your library were to be destroyed?
---

I have been using [Seafile](https://www.seafile.com/) for years to host and
synchronise files on my own server. It's fast and reliable, especially when
dealing with a large number of files or very large files. But making reliable
backups of all its files isn't trivial. This is because the files are stored in
a layout similar to bare Git repositories, and Seafile's headless tool,
seafile-cli, is... suboptimal. So I created what started out as a wrapper for it
and ended up as a full-blown tool for automatically synchronising your libraries
to a backup location:
[**Seafile Mirror**](https://src.mehl.mx/mxmehl/seafile-mirror).

## My requirements

Of course, you could just take snapshots of the whole server, or copy the raw
Seafile data files and import them into a newly created Seafile instance for
disaster recovery, but I want to be able to **directly access the current
state of the files** whenever I need them in case of an emergency.

It was also important for me to have a **snapshot**, not just another real-time
sync of a library. This is because I also want to have a backup in case I (or an
attacker) mess up a Seafile library. A real-time sync would immediately
replicate that broken state.

I also want to take a snapshot at a **configurable interval**. Some libraries
should be synchronised more often than others. For example, my picture albums do
not change as often as my miscellaneous documents, but they use at least 20
times the disk space and therefore cause far more network traffic during a full
sync.

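The idea of per-library intervals can be illustrated with a small sketch. This
is not the tool's actual code: the library names, the interval values and the
`is_sync_due` helper are hypothetical and only show how a library whose last
snapshot is still recent enough could be skipped.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical per-library snapshot intervals (names and values are assumptions)
INTERVALS = {
    "documents": timedelta(hours=6),
    "pictures": timedelta(days=7),
}


def is_sync_due(library: str, last_sync: Optional[datetime], now: datetime) -> bool:
    """Return True if the library was never synced or its interval has elapsed."""
    if last_sync is None:
        return True
    return now - last_sync >= INTERVALS[library]


now = datetime(2023, 9, 22, 12, 0)
last_run = datetime(2023, 9, 21, 12, 0)  # both libraries were last synced a day ago

for name in INTERVALS:
    print(name, "due" if is_sync_due(name, last_run, now) else "skipped")
```

With these example values, the documents library is due again after one day,
while the picture albums are skipped until a full week has passed.
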
Also, the backup service must have **read-only access** to the files.

A version-controlled backup of the backup (i.e. the plain files) wasn't in
scope. I handle this separately by backing up my backup location, which also
contains similar backups of other services and machines. For this reason, my
current solution does not do incremental backups, even though this may be
relevant for other use cases.

## The problems

In theory, [seafile-cli](https://help.seafile.com/syncing_client/linux-cli/)
should have been everything you'd need to fulfil these requirements. But no. It
turned out that this tool has a number of fundamental issues:

* You can make the host the tool is running on a sync peer. However, this easily
  leads to sync errors if the user only has read-only permissions on the
  library.
* You can also download a library, but this too may lead to strange sync
  errors.
* It requires a running daemon, which crashes irregularly during larger sync
  tasks or has other issues.
* Download/sync intervals cannot be set manually.

## The solution

[seafile-mirror](https://src.mehl.mx/mxmehl/seafile-mirror) takes care of all
these stumbling blocks:

* It downloads/syncs defined libraries at customisable intervals
* It de-syncs libraries immediately after they have been downloaded to avoid
  sync errors
* You can force a re-sync of a library even if its re-sync interval hasn't
  elapsed yet
* It provides extensive informational and error logging
* It was, of course, created with automation in mind, so you can run it in
  cronjobs or systemd triggers
* And as explained, it deals with the numerous caveats of `seaf-cli` and Seafile
  in general

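The sync-then-de-sync approach from the list above can be sketched roughly as
follows. This is a simplified illustration, not the tool's real implementation:
the `mirror_commands` helper and all example values are hypothetical, and it
assumes the `sync` and `desync` subcommands and options of `seaf-cli` as
described in the Seafile CLI documentation. The commands are only built and
printed, not executed.

```python
import shlex
from typing import List


def mirror_commands(library_id: str, server: str, target_dir: str, user: str) -> List[List[str]]:
    """Build the sync and de-sync calls for one library without running them."""
    # Assumed seaf-cli options: -l library ID, -s server URL, -d local dir, -u user.
    # Password handling is deliberately left out of this sketch.
    sync = ["seaf-cli", "sync", "-l", library_id, "-s", server, "-d", target_dir, "-u", user]
    # A real tool would wait here until the download has finished before
    # detaching the folder again, so the snapshot is complete.
    desync = ["seaf-cli", "desync", "-d", target_dir]
    return [sync, desync]


# Example values are placeholders, not real identifiers
commands = mirror_commands(
    "00000000-0000-0000-0000-000000000000",
    "https://seafile.example.org",
    "/backup/seafile/documents",
    "backup@example.org",
)

for cmd in commands:
    print(shlex.join(cmd))
```

Detaching the folder right after the download is what turns the result into a
snapshot rather than a permanently syncing peer.
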
Full installation and usage documentation can be found in the project
repository. Installation is as simple as running `pip3 install seafile-mirror`,
and a sample configuration is provided.

In my setup, I run this application on a headless server with systemd under a
separate user account. Therefore, the systemd service needs to be set up first.
This is also covered in the tool's documentation. And as an Ansible power user,
I also provide an [Ansible
role](https://src.mehl.mx/mxmehl/seafile-mirror-ansible) that does all the setup
and configuration.

## Possible next steps

The tool has been running every day for a couple of months without any issues.
However, I could imagine a few more features that would be helpful for more
people:

* Support for login tokens: Currently, only user/password authentication is
  supported, which is fine for my use case as it's just a read-only user. This
  wouldn't be hard to add either, since seafile-cli supports it (at least in
  theory). ([#2](https://src.mehl.mx/mxmehl/seafile-mirror/issues/2))
* Support for encrypted libraries: This shouldn't be a big issue, as it would
  only require passing the password to the underlying seafile-cli command.
  ([#3](https://src.mehl.mx/mxmehl/seafile-mirror/issues/3))

If you have encountered problems or would like to point out the need for
specific features, please feel free to contact me or comment on the Mastodon
post. I'd also love to hear if you've become a happy user of the tool 😊.