Commit a1050d79 authored by Denis Zalevskiy's avatar Denis Zalevskiy


Signed-off-by: Denis Zalevskiy's avatarDenis Zalevskiy <>
This diff is collapsed.
* the-vault
Set of tools to perform backup/restoration of the data.
It uses git for data and metadata management. Binaries are stored
separately now inside .git repository. Some ideas are borrowed from
git-annex but this framework has more permissive license.
** Rationale
Application developers ordinary understand backup as a process of
performing a snapshots of an application state and restoration as a
replacement of application state with a previously saved state. So,
the common strategy of backup is to freeze application workflow, make
a snapshot of application data and continue to work. Restoration is
the reverse process.
This is the bad practice: imagine, application database is damaged as
a result of some bug in application code (this is becomes more and
more common) and initially there is no any visible changes in
application behaviour, user performs backup, but later user notices
issues in application/system behaviour and want to restore its state
from backup was done eariler. But because database was corrupted
before, restoring data from this backup will result in the same issues
to reappear soon. Also, maybe user added some data in the mean time,
so restoration also will result in this data will be lost. Finally,
user still has damaged database while one lost data he/she added in
the meantime.
The proper and flexible way to backup/restore application data is to
work with exporting/importing structured data and saving/restoring
opaque data. So, like in relational databases world application data
can be separated:
- data (structured) can be represented in human-readable form
(e.g. for relational database this is SQL, for contacts database it
can be vCard files, maybe, EXIF information from images, MP3 tags
etc.). Also small binary files can be put here
- opaque, binary data aka blobs (e.g. images, videos, audio files, can
be also something like PDF files (potentially can be exported but it
is hard to do it and can result in some data loss), maybe documents
in binary formats etc.
So, application communication with backup world is described in terms
of the following operations: export/import/clear
** Backup API
Each application should register path to executable to be invoked for
backup/restore. The API itself is just a set of command line options.
- --dir -- points to destination directory to save application
structured data;
- --bin-dir -- destination directory for blobs;
- --home-dir -- user home directory path;
- --action -- which action should be executed. Possible values are:
import, export, clear.
** TODO Examples
** Planned features
- Huge files should be detected and managed automatically in many
- It looks like it is possible to improve usability replacing separate
storage for BLOBs (.git/blobs) with git submodule with
pack.windowmemory set to finite value - multiply of device RAM and
splitting BLOBs to smaller chunks. So, it will be possible to clone
repo and submodule using standard git w/o any enhancements.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment