XS backup restore: Difference between revisions

No edit summary
m (Adding SchoolServer category)
 
(10 intermediate revisions by 2 users not shown)
Line 26: Line 26:
''<protocol version>'' is the integer representing the latest
''<protocol version>'' is the integer representing the latest
backup protocol version supported by this XO. In protocol version 1,
backup protocol version supported by this XO. In protocol version 1,
a successful reply consists of a single integer:
a successful reply is a 200 OK with an empty body.
'''timestamp''' -- timestamp of latest backed up item for this user
or 0 if there are no previous backups
If the sent protocol version is not supported by the school server,
If the sent protocol version is not supported by the school server,
Line 55: Line 52:
overwriting (atomically) any previously existing version.
overwriting (atomically) any previously existing version.
4. Run rsync between the datastore and a remote directory called '''datastore-current/'''
4. Run rsync-over-ssh between the datastore and a remote directory
in the user's home directory on the school server.
called '''datastore-current/''' in the user's home directory on
the XS.

The remote datastore-current directory will have a complete set of files
The remote datastore-current directory will have a complete set of files
so use the rsync facilities available to optimise the transfer and
so use the rsync facilities available to optimise the transfer and
delete stale files:
delete stale files:

--times
--times
--partial (to make retries faster)
--partial (to make retries faster)
Line 68: Line 66:
Check the exit value from rsync. If non-zero, retry up to 3 times.
Check the exit value from rsync. If non-zero, retry up to 3 times.
If still non-zero, abort until next backup.
If still non-zero, abort until next backup.

5. Store the epoch of the end time of step 4.
5. Store the epoch of the end time of step 4.


'''Note''': This backup scheme is not atomic. Users of the backed-up data must be prepared for slightly inconsistent state between metadata and files - a large window exists between steps 3 and 5. Solutions to this could come from the FS (a ZFS-like implementation) or from a higher-level later (a git-based DS for example).
'''Note''': This backup scheme is not atomic. Users of the backed-up data must be prepared for slightly inconsistent state between metadata and files - a large window exists between steps 3 and 5. Solutions to this could come from the FS (a ZFS-like implementation) or from a higher-level layer (a git-based DS for example).


== XS side ==
== XS side ==
Line 86: Line 84:
based on transfer rate or number of rsync processes), if so, return
based on transfer rate or number of rsync processes), if so, return
503.
503.
3. Check if backups for this machine exist. In protocol version 1, if
3. Check if backups for this machine exist. In protocol version 1, if
backups don't exist, let timestamp be 0. Otherwise, find the
backups don't exist, let timestamp be 0. Otherwise, find the
Line 92: Line 90:
it.
it.
4. Check system and network load metrics - can we offer service to this
4. Return timestamp in the body of a 200 OK
response.
client?



=== request for '''new''' ===

On the school server, when getting a request for
'''/backup/''<protocol version>''/new/''<SN>''''':
1. Check if we support the protocol version. If not, return 404 and a list
5. Return a 200 OK response.
of supported versions. Otherwise, proceed.


When the rsync-over-ssh connection comes in, we need to have an rsync wrapper
==== protocol version 1 ====
script that will


1. Establish a lock using flock to prevent overlaps
2. If no 'Backup-Auth' header is present, return 403, otherwise
proceed.
3. Load the contents of the '''nonce''' file from the backup hierarchy for
this XO (e.g. /backups/''<SN>''/nonce) in the nonce variable. If there
is no nonce file, use '0' for the nonce variable.
2. Cleanup/sanitise parameter list to rsync
4. Find the XO's UUID in the local database, load into '''XO_UUID'''
variable. Verify that the contents of the '''Backup-Auth''' header
match exactly the contents of SHA1(''<nonce>''+''<XO_UUID>''). If not,
return 403, otherwise return empty (no body) 200 OK request to the
client and proceed to next step.
(Note: the nonce circus is required to keep a malicious actor
from inhibiting all backups on his network by watching for /last
GETs, then issuing /new gets 5 seconds later for the same XO. As
the backup won't have completed, getting an updater running on the
server would invalidate the backup, as will be seen in the following
steps.)
5. Spawn an updater process in the background that does this:
5.1. Issue a call to a setuid helper command that makes the
'backup-new' folder in the XO's home directory (on the server)
writable by the updater UID.
5.2. Check if a file exists in the XO's home directory, within the
dir '''backup-new''', called '''backup.idx.processing'''. If the file
does not exist, go to step '''5.3'''.
If its timestamp is NOT older than 10 minutes, exit the
updater. (We don't allow users to force us to do index
updates for backups more frequently than once in 10 minutes.)
If the timestamp is older than 10 minutes:
* Check if a file called '''backup.idx.processing.pid''' exists
AND is owned by us. If so, read its contents -- it contains
a PID of the updater that tried to deal with the new backup
-- and issue a SIGKILL to that PID.
* Go to step '''5.4'''.
5.3. Move '''backup.idx''' to '''backup.idx.processing''' and write our own
PID to '''backup.idx.processing.pid'''. If the move
fails (because '''backup.idx''' doesn't exist), go to last step.
Check if 'backup-state.idx' exists. If not, go to last step.
5.4. Read '''backup.idx.processing''' by line. The first line is a
single backup protocol integer. If this updater doesn't
support this version, the client sent a backup even though we
told it not to. Go to last step.
For every following line, check that the object filename it
references exists in the '''backup-new''' folder. If it exists,
move this file to the server's real backup hierarchy,
e.g. /backups/''<SN>''/ and add a record to the server's backup DB
backend (whatever it is) for this object. If the file doesn't
exist, move to next line.
5.5. Move '''backup-state.idx''' to server backup hierarchy, e.g.
/backup/''<SN>''/backup-state.idx. Generate a 64-bit nonce
and write it out to /backup/''<SN>''/nonce.
5.6. Delete everything in '''backup-new''' and exit the updater.
3. Upon successful completion, set a success flag

== XS maintenance ==


* A regular cronjob checks for recent success flags. Home directories that are marked as successfully backed up will be 'shadowed' with a hardcopy script similar to pdumpfs.
** It might be a good idea to spot partial/failed backups and checkpoint/shadow them anyway. If our handling of inconsistent data is reasonably good, a partial backup might be a passable data source for per-document restores.
* A low-freq cronjob runs [http://code.google.com/p/hardlinkpy/ hardlink.py]
* A cronjob removes old pdumpfs snapshots, ideally with some auto-tuning for space usage.


= XO-initiated full restore =
= XO-initiated full restore =


== XO side ==
== XO side ==

1. Issue a HTTP GET to the XS with path
1. Issue a HTTP GET to the XS with path
/backup/''<protocol version>''/restore/''<this_XO_serial_number>''
/backup/''<protocol version>''/restore/''<this_XO_serial_number>''
Line 193: Line 130:
otherwise proceed.
otherwise proceed.
2. rsync the directory provided in step 1, restoring mode and
2. Let variable index_path be the concatenation of the path variable
times. Retry 3 times; if still failing, abort restore and
from step 1 and the string '''restore.idx'''. Rsync the file whose path
report to user.
is index_path from the XS to the XO. This file is a set of lines
formatted like the contents of a '''backup.idx''' file -- produced
exactly like in step 4 of the XO-side backup. In other words, the
first line repeats the protocol version, and every next line
describes a single DS object. If the rsync fails, retry 3 times;
if still failing, abort restore and report to user.
(Do we need to remove the fetched files in case of a dropped
3. For every item in this list, parse out any paths to files,
rsync? rsync guarantees we won't get partial files in place, so
and write each one (e.g. one for the binary object, one for the
it is reasonably safe, and makes retries "incremental". As long
thumbnail) to a local file called '''restore-files.idx''', one per line.
as the metadata is restored only once step 2 succeeds, the Journal
should be ok...)
3. Rebuild the metadata in Xapian, based on the metadata.json file
Note that the paths contained in 'restore.idx' that we
that should have been restored by rsync in #2. Might make sense
received in step 2 are absolute paths '''on the schoolserver''',
to apply some checks.
e.g. /backups/''<SN>''/''<filename>'', and those paths MUST be preserved
when writing to '''restore-files.idx'''.
Check
4. Run a rsync on the XO, going from the schoolserver to the XO, and
- do the files named by the metadata file exist?
pass it '''restore-files.idx''' as the list of files to rsync.
The race conditions that exist during the backup generation mean that
5. Check the rsync exit value. If non-zero, retry 3 times. If still
the document may have changed or vanished after the metadata was created.
non-zero, abort and report failure to the user.
4. We have succeeded with the restore. Inform
6. Go back through the list received in step 2 line by line. For
user. '''Eat some ice cream. Dance salsa'''
every file path in the current line (there might be several for
e.g. binary object, thumbnail, etc), strip everything except the
filename -- remove the directory components. Verify that the files
exist locally on the XO.
If they don't exist, rsync didn't get all the files back, but there
should have been some (because we didn't get 0 for timestamp in
step 1) AND rsync thinks it succeeded (because of step 5). Abort
restore and report to user that something is wrong.
If the files exist, issue a request to the DS to create the object
based on the metadata in the line, and pass in stripped file paths
for the contents/thumbnail.
(Note: if the DS does not support setting creation timestamps or
thumbnails through the present API, another function might have to
be added specifically for the restore system to use, where such
functions are allowed.)
7. If the last line in the list returned in step 1 is processed and
stored in the DS, we have succeeded with the restore. Inform
user. '''Eat some ice cream. Do the macarena.'''


== XS side ==
== XS side ==
Line 252: Line 164:
only body contents is '''0'''. Otherwise, proceed.
only body contents is '''0'''. Otherwise, proceed.
3. Check if a file called '''restore.idx''' exists in the backup hierarchy
3. Check for system and network traffic load metrics. Return 503 for "not now".
for the XO. If so, return absolute path to this XO's files in the
server backup hierarchy (e.g. /backups/''<SN>''/) as sole body of a 200
OK response. If it doesn't exist, proceed.
4. Check if a file called '''restore-state.idx''' in the backup hierarchy
for the XO exists. If not, return error 500. For some reason we don't
have a state file for this machine; this shouldn't happen, but it means
the user has to pick out objects to restore individually from the
web interface.
5. Return 503 service unavailable, and in the background, spawn
a restore process that does the following:
4. Find the latest complete backup - it should be the most recent directory
5.1. Check if a file called '''restore-state.idx.processing''' in the
following this format in the home directory for the laptop:
backup hierarchy for this XO exists. If not, proceed to next
step. If it exists, and its timestamp is older than 10
minutes, we tried to prepare a restore list for this machine
already and somehow failed (e.g. database timeouts,
etc). Check if '''restore-state.idx.processing.pid''' exists and
is owned by us; if so, load its contents and send SIGKILL
to the PID, then move to step '''5.3'''. If the timestamp is
younger than 10 minutes, exit.
5.2. Move '''restore-state.idx''' in the XO's backup hierarchy to
'''restore-state.processing.idx'''
~/datastore-YYYY-MM-DD
5.3. Write our own PID to '''restore-state.processing.idx.pid'''.
Note: 'Most recent' should be intepreted on the parsed datestamp from the
5.4. To a temporary file, write a line containing the backup
directory name, not the FS ctime/mtime.
protocol version.
Return the directory path in a '200' response.
5.5. For each line in '''restore-state.processing.idx'''

(representing a UUID), query all the relevant metadata from
= Listing of stored backups =
the XS store and write it, one JSON dictionary-encoded line

per object, to the temporary file. Paths of any referenced
The XS will also answer requests to
files (binary objects, thumbnails) must be absolute paths on

the XS. If any queries fail, retry with a timeout, and if
/backup/''<protocol version>''/list/''<SN>''
failure continues, exit the updater.

with a 200 OK message with a body of new-line-separated paths to available snapshots. The XO client can then initiate a restore of any of those available snapshots over SSH.
5.6. When finished, move temporary file to '''restore.idx''' in the

backup hierarchy for this XO. Unlink
[[Category:Specifications]]
'''restore-state.processing.idx'''.
[[category:SchoolServer]]

Latest revision as of 23:21, 18 August 2008

Goals

  • Simple, efficient (minimise processing, traffic), quick dev turnaround, debuggable
  • Sane, fail-safe, atomic-ish
  • Independent of the actual storage strategy (DS-agnostic)
  • And yet, it must work well with the current DS (as of April 2008), and avoid restricting the evolution of the DS
  • Safe for XO and XS
  • The server can refuse to back up due to traffic/load
  • Simple version negotiation
  • Supports full homedir restore
  • Supports per-document restore (via journal and/or web-based)
    • There is some interest in leveraging a web-based 'document restore' facility as an 'async document share/publish' mechanism.

Notes

  • All timestamps are integers representing seconds elapsed since the UNIX epoch.
  • There is a REST meta-protocol versioning scheme. Outside of that initial check, what this page describes is version 1 of the backup/restore protocol.

XO-initiated backup

XO side

1. Issue an HTTP GET to the XS with path
  /backup/<protocol version>/available/<this_XO_serial_number>
  
  <protocol version> is the integer representing the latest
  backup protocol version supported by this XO. In protocol version 1,
  a successful reply is a 200 OK with an empty body.

  If the sent protocol version is not supported by the school server,
  it will return a 404 not found error, whose only body contents is 
  a comma-separated list of integers representing the backup protocol
  versions supported by this school server.

  If this school server refuses to provide backup service for this XO,
  it will return a 403 forbidden error.

  If the school server is too busy to deal with the XO's backup request,
  it will return a 503 service unavailable error. The XO will sleep 5
  minutes and retry.


2. If the request in step 1 succeeded, go to step 3. Otherwise, 
  if none of the backup system versions on the XO (multiple may be
  present) are in the 'versions' variable listed in the 404 error, abort
  until next scheduled backup time (we cannot back up to this XS). If
  a version was returned that also exists locally, go back to step 1
  and use that protocol version.

3. Write out all the metadata for all the documents available for
  backup, in CanonicalJSON format. Save it as metadata.json
  overwriting (atomically) any previously existing version.

4. Run rsync-over-ssh between the datastore and a remote directory
  called datastore-current/ in the user's home directory on
  the XS.

  The remote datastore-current directory will have a complete set of files
  so use the rsync facilities available to optimise the transfer and 
  delete stale files:

   --times
   --partial (to make retries faster)
   --delete
 
  Check the exit value from rsync. If non-zero, retry up to 3 times.
  If still non-zero, abort until next backup.

5. Store the epoch of the end time of step 4.
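
The version negotiation in steps 1 and 2 can be sketched as a small pure function. This is a minimal sketch, not the XO client's actual code: the function name, the action labels and the choice of the highest common version are assumptions made for illustration. The page only says to retry with a version that exists on both sides.

```python
def next_action(status, body, local_versions, tried_version):
    """Map the reply to /backup/<version>/available/<SN> onto the XO's next move.

    Returns a (label, value) pair; the labels are illustrative only.
    """
    if status == 200:
        # Step 2: the request succeeded, carry on with step 3.
        return ("proceed", tried_version)
    if status == 404:
        # The body is a comma-separated list of versions the XS supports.
        offered = {int(v) for v in body.split(",") if v.strip().isdigit()}
        common = offered & set(local_versions)
        if common:
            # Assumption: prefer the highest version both sides know.
            return ("retry", max(common))
        return ("abort", None)
    if status == 503:
        # The XS is busy: sleep 5 minutes, then retry step 1.
        return ("sleep", 300)
    # 403 (service refused) and anything unexpected: wait for the next run.
    return ("abort", None)
```

For example, an XO that knows versions 1 and 2 and receives a 404 with body `1` would retry step 1 with version 1.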

Note: This backup scheme is not atomic. Users of the backed-up data must be prepared for slightly inconsistent state between metadata and files - a large window exists between steps 3 and 5. Solutions to this could come from the FS (a ZFS-like implementation) or from a higher-level layer (a git-based DS for example).
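
Steps 3 to 5 can be sketched as follows. This is a hedged illustration under several assumptions: the helper names are hypothetical, `json.dump` with sorted keys merely stands in for a real CanonicalJSON encoder, and `--recursive` and `--rsh=ssh` are added to the three rsync options listed above so that the command works on a directory over ssh.

```python
import json
import os
import subprocess
import tempfile
import time


def write_metadata_atomically(entries, path):
    """Step 3: write to a temporary file, then rename it over the old copy."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".metadata-")
    try:
        with os.fdopen(fd, "w") as handle:
            # Stable key order only; NOT a full CanonicalJSON implementation.
            json.dump(entries, handle, sort_keys=True, separators=(",", ":"))
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def rsync_command(datastore_dir, remote):
    """Step 4: the rsync invocation, with the options listed on this page."""
    return ["rsync", "--recursive", "--times", "--partial", "--delete",
            "--rsh=ssh", datastore_dir.rstrip("/") + "/", remote]


def run_backup(command, run=subprocess.call, retries=3, clock=time.time):
    """Steps 4 and 5: one attempt plus up to `retries` retries.

    Returns the end-of-transfer epoch on success, or None to abort until
    the next scheduled backup.
    """
    for _ in range(1 + retries):
        if run(command) == 0:
            return int(clock())
    return None
```

A caller would pass something like `rsync_command(datastore_path, "SN@schoolserver:datastore-current/")`, where the user and host names are placeholders, and store the returned epoch.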

XS side

On the school server, when getting a request for /backup/<protocol version>/available/<SN>:

1. Check if we support the protocol version. If not, return 404 and a list
  of supported versions. Otherwise, proceed.

2. Check if we know this machine (can find it in our registration DB on
  the XS). If not, return 403. We will not offer it backup service.
  Check if we're too busy to process another concurrent backup (e.g.
  based on transfer rate or number of rsync processes); if so, return
  503.
 
3. Check if backups for this machine exist. In protocol version 1, if
  backups don't exist, let timestamp be 0. Otherwise, find the
  timestamp of the last backed-up object for this machine and return
  it.

4. Check system and network load metrics - can we offer service to this
  client?


5. Return a 200 OK response.
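
The status codes chosen by the steps above can be sketched as one decision function. This is only an illustration: `SUPPORTED_VERSIONS`, the `registered` collection and the `busy` callable are hypothetical stand-ins for the registration DB and the load metrics, and the timestamp lookup of step 3 is left out because the version 1 reply has an empty body.

```python
SUPPORTED_VERSIONS = (1,)  # assumption: this XS only speaks version 1


def available_response(version, serial, registered, busy):
    """Return (status, body) for GET /backup/<version>/available/<serial>."""
    if version not in SUPPORTED_VERSIONS:
        # Step 1: 404 plus a comma-separated list of supported versions.
        return 404, ",".join(str(v) for v in SUPPORTED_VERSIONS)
    if serial not in registered:
        # Step 2: unknown machine, no backup service.
        return 403, ""
    if busy():
        # Steps 2 and 4: too much load right now.
        return 503, ""
    # Step 5: empty-bodied 200 OK.
    return 200, ""
```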

When the rsync-over-ssh connection comes in, we need an rsync wrapper script that will:

1. Establish a lock using flock to prevent overlaps

2. Cleanup/sanitise parameter list to rsync

3. Upon successful completion, set a success flag
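
A minimal sketch of such a wrapper, assuming it is installed as an ssh forced command and receives the client's request as a string (sshd exposes it as `SSH_ORIGINAL_COMMAND`). The lock and flag file names, and the rule of accepting only a receiving `rsync --server` whose destination is `datastore-current/`, are assumptions for illustration. A real wrapper would need a more careful whitelist of rsync options.

```python
import contextlib
import fcntl
import os
import shlex
import subprocess
import time


@contextlib.contextmanager
def exclusive_lock(path):
    """Hold a non-blocking exclusive flock; raises OSError if already held."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield handle
    finally:
        handle.close()  # closing the file releases the lock


def sanitise(original_command, allowed_dest="datastore-current/"):
    """Accept only a receiving 'rsync --server ... . <allowed_dest>'."""
    argv = shlex.split(original_command)
    if len(argv) < 4 or argv[0] != "rsync" or argv[1] != "--server":
        raise ValueError("not an rsync server invocation")
    if "--sender" in argv:
        raise ValueError("this sketch only accepts uploads")
    if argv[-2] != "." or argv[-1] != allowed_dest:
        raise ValueError("unexpected destination")
    return argv


def run_wrapper(original_command, home, run=subprocess.call):
    """Lock, run the sanitised rsync, and flag success."""
    argv = sanitise(original_command)
    with exclusive_lock(os.path.join(home, ".backup.lock")):  # hypothetical name
        status = run(argv, cwd=home)
        if status == 0:
            # Hypothetical flag file; the cronjob would look for it.
            with open(os.path.join(home, ".backup.success"), "w") as flag:
                flag.write(str(int(time.time())))
    return status
```

Using a non-blocking lock means an overlapping second backup fails immediately instead of queueing behind the first, which is one possible reading of "prevent overlaps".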

XS maintenance

  • A regular cronjob checks for recent success flags. Home directories that are marked as successfully backed up will be 'shadowed' with a hardcopy script similar to pdumpfs.
    • It might be a good idea to spot partial/failed backups and checkpoint/shadow them anyway. If our handling of inconsistent data is reasonably good, a partial backup might be a passable data source for per-document restores.
  • A low-freq cronjob runs hardlink.py
  • A cronjob removes old pdumpfs snapshots, ideally with some auto-tuning for space usage.
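
The 'shadowing' step can be illustrated with a hard-link copy of the current tree. This is a rough sketch of the idea, not pdumpfs itself and not the script the XS runs. The function name is hypothetical, and the `datastore-YYYY-MM-DD` snapshot name is borrowed from the restore section further down this page.

```python
import datetime
import os


def shadow(current, home, day=None):
    """Hard-link every file under `current` into <home>/datastore-YYYY-MM-DD.

    Returns the snapshot path. Hard links share data blocks with the
    originals, so an unchanged file costs no extra space.
    """
    day = day or datetime.date.today()
    snapshot = os.path.join(home, "datastore-" + day.isoformat())
    for root, _dirs, files in os.walk(current):
        target = os.path.normpath(
            os.path.join(snapshot, os.path.relpath(root, current)))
        os.makedirs(target, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target, name))
    return snapshot
```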

XO-initiated full restore

XO side

1. Issue an HTTP GET to the XS with path
  /backup/<protocol version>/restore/<this_XO_serial_number>

  The response is 0 or a single absolute path on the XS, pointing to
  the location of this XO's backup files in the backup hierarchy. If
  the response is 0, abort and report to user; there are no backups
  to restore. Otherwise store the path variable for future use.

  If the request returns a 500, abort and report to user that they
  must pick out restore files individually from the web interface.

  If the request returns a 503, wait 1 minute, then retry step 1,
  otherwise proceed.

2. rsync the directory provided in step 1, restoring mode and
  times. Retry 3 times; if still failing, abort restore and
  report to user.

  (Do we need to remove the fetched files in case of a dropped
  rsync? rsync guarantees we won't get partial files in place, so
  it is reasonably safe, and makes retries "incremental". As long
  as the metadata is restored only once step 2 succeeds, the Journal
  should be ok...)

3. Rebuild the metadata in Xapian, based on the metadata.json file
  that should have been restored by rsync in #2. Might make sense
  to apply some checks. 

  Check
  - do the files named by the metadata file exist?

  The race conditions that exist during the backup generation mean that
  the document may have changed or vanished after the metadata was created.
 
4. We have succeeded with the restore. Inform
  user. Eat some ice cream. Dance salsa
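
The check suggested in step 3 can be sketched as below. The layout of metadata.json is not defined on this page, so the sketch assumes, purely for illustration, a JSON list of dictionaries in which a hypothetical `filename` key names the object's file. Entries without that key are treated as missing.

```python
import json
import os


def split_restorable(metadata_path, restored_dir):
    """Split metadata entries into (present, missing) by checking the disk."""
    with open(metadata_path) as handle:
        entries = json.load(handle)
    present, missing = [], []
    for entry in entries:
        name = entry.get("filename")  # hypothetical key, see lead-in
        # basename() keeps an entry from pointing outside restored_dir.
        if name and os.path.isfile(
                os.path.join(restored_dir, os.path.basename(name))):
            present.append(entry)
        else:
            missing.append(entry)
    return present, missing
```

Because of the race conditions noted above, a non-empty `missing` list is expected now and then. One option is to rebuild the Xapian index from `present` only and report the rest to the user.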

XS side

On the school server, when getting a request for /backup/<protocol version>/restore/<SN>:

1. Check if we support the protocol version. If not, return 404 and a list
  of supported versions. Otherwise, proceed.

2. Check if backups for this machine exist. If not, return 200 OK whose
  only body contents is 0. Otherwise, proceed.

3. Check for system and network traffic load metrics. Return 503 for "not now".

4. Find the latest complete backup - it should be the most recent directory
   following this format in the home directory for the laptop: 

   ~/datastore-YYYY-MM-DD

  Note: 'Most recent' should be interpreted on the parsed datestamp from the
  directory name, not the FS ctime/mtime.

  Return the directory path in a '200' response.
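
Step 4 can be sketched as follows: the date is parsed out of each directory name and file timestamps are never consulted. The function name is hypothetical, and the sketch does not attempt to decide whether a snapshot is 'complete'.

```python
import datetime
import re

SNAPSHOT_NAME = re.compile(r"^datastore-(\d{4})-(\d{2})-(\d{2})$")


def latest_snapshot(names):
    """Return the name with the most recent parsed date, or None."""
    dated = []
    for name in names:
        match = SNAPSHOT_NAME.match(name)
        if not match:
            continue  # e.g. datastore-current
        try:
            day = datetime.date(*(int(part) for part in match.groups()))
        except ValueError:
            continue  # looks like a date but is not one, e.g. month 13
        dated.append((day, name))
    return max(dated)[1] if dated else None
```

The same parsed dates could feed the cronjob that removes old snapshots.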

Listing of stored backups

The XS will also answer requests to

   /backup/<protocol version>/list/<SN>

with a 200 OK message whose body is a newline-separated list of paths to available snapshots. The XO client can then initiate a restore of any of those available snapshots over SSH.