Annotation of embedaddon/rsync/rsync3.txt, revision 1.1.1.2

1.1       misho       1: -*- indented-text -*-
                      2: 
                      3: Notes towards a new version of rsync
                      4: Martin Pool <mbp@samba.org>, September 2001.
                      5: 
                      6: 
                      7: Good things about the current implementation:
                      8: 
                      9:   - Widely known and adopted.
                     10: 
                     11:   - Fast/efficient, especially for moderately small sets of files over
                     12:     slow links (transoceanic or modem).
                     13: 
                     14:   - Fairly reliable.
                     15: 
                     16:   - The choice of running over a plain TCP socket or tunneling over
                     17:     ssh.
                     18: 
                     19:   - rsync operations are idempotent: you can always run the same
                     20:     command twice to make sure it worked properly without any fear.
                     21:     (Are there any exceptions?)
                     22: 
                     23:   - Small changes to files cause small deltas.
                     24: 
                     25:   - There is a way to evolve the protocol to some extent.
                     26: 
                     27:   - rdiff and rsync --write-batch allow generation of standalone patch
                     28:     sets.  rsync+ is pretty cheesy, though.  xdelta seems cleaner.
                     29: 
                     30:   - Process triangle is creative, but seems to provoke OS bugs.
                     31: 
                     32:   - "Morning-after property": you don't need to know anything on the
                     33:     local machine about the state of the remote machine, or about
                     34:     transfers that have been done in the past.
                     35: 
                     36:   - You can easily push or pull simply by switching the order of
                     37:     files.
                     38: 
                     39:   - The "modules" system has some neat features compared to
                     40:     e.g. Apache's per-directory configuration.  In particular, because
                     41:     you can set a userid and chroot directory, there is strong
                     42:     protection between different modules.  I haven't seen any calls
                     43:     for a more flexible system.
                     44: 
                     45: 
                     46: Bad things about the current implementation:
                     47: 
                     48:   - Persistent and hard-to-diagnose hang bugs remain
                     49: 
                     50:   - Protocol is sketchily documented, tied to this implementation, and
                     51:     hard to modify/extend
                     52: 
                     53:   - Both the program and the protocol assume a single non-interactive
                     54:     one-way transfer
                     55: 
                     56:   - A list of all files is held in memory for the entire transfer,
                     57:     which cripples scalability to large file trees
                     58: 
                     59:   - Opening a new socket for every operation causes problems,
                     60:     especially when running over SSH with password authentication.
                     61: 
                     62:   - Renamed files are not handled: the old file is removed, and the
                     63:     new file created from scratch.
                     64: 
                     65:   - The versioning approach assumes that future versions of the
                     66:     program know about all previous versions, and will do the right
                     67:     thing.
                     68: 
                     69:   - People always get confused about ':' vs '::'
                     70: 
                     71:   - Error messages can be cryptic.
                     72: 
                     73:   - Default behaviour is not intuitive: in too many cases rsync will
                     74:     happily do nothing.  Perhaps -a should be the default?
                     75: 
                     76:   - People get confused by trailing slashes, though it's hard to think
                     77:     of another reasonable way to make this necessary distinction
                     78:     between a directory and its contents.
                     79: 
                     80: 
                     81: Protocol philosophy:
                     82: 
                     83:    *The* big difference between rsync and protocols like HTTP, FTP, and
                     84:     NFS is that their fundamental operations are "read this file",
                     85:     "delete this file", and "make this directory", whereas rsync's is
                     86:     "make this directory like this one".
                     87: 
                     88: 
                     89: Questionable features:
                     90: 
                     91:   These are neat, but not necessarily clean or worth preserving.
                     92: 
                     93:   - The remote rsync can be wrapped by some other program, such as in
                     94:     tridge's rsync-mail scripts.  The general feature of sending and
                     95:     retrieving mail over rsync is good, but this is perhaps not the
                     96:     right way to implement it.
                     97: 
                     98: 
                     99: Desirable features:
                    100: 
                    101:   These don't really require architectural changes; they're just
                    102:   something to keep in mind.
                    103: 
                    104:   - Synchronize ACLs and extended attributes
                    105: 
                    106:   - Anonymous servers should be efficient
                    107: 
                    108:   - Code should be portable to non-UNIX systems
                    109: 
                    110:   - Should be possible to document the protocol in RFC form
                    111: 
                    112:   - --dry-run option
                    113: 
                    114:   - IPv6 support.  Pretty straightforward.
                    115: 
                    116:   - Allow the basis and destination files to be different.  For
                    117:     example, you could use this when you have a CD-ROM and want to
                    118:     download an updated image onto a hard drive.
                    119: 
                    120:   - Efficiently interrupt and restart a transfer.  We can write a
                    121:     checkpoint file that says where we're up to in the filesystem.
                    122:     Alternatively, as long as transfers are idempotent, we can just
                    123:     restart the whole thing.  [NFSv4]
                    124: 
                    125:   - Scripting support.
                    126: 
                    127:   - Propagate atimes and do not modify them.  This is very ugly on
                    128:     Unix.  It might be better to try to add O_NOATIME to kernels, and
                    129:     call that.
                    130: 
                    131:   - Unicode.  Probably just use UTF-8 for everything.
                    132: 
                    133:   - Open authentication system.  Can we use PAM?  Is SASL an adequate
                    134:     mapping of PAM to the network, or useful in some other way?
                    135: 
                    136:   - Resume interrupted transfers without the --partial flag.  We need
                    137:     to leave the temporary file behind, and then know to use it.  This
                    138:     leaves a risk of large temporary files accumulating, which is not
                    139:     good.  Perhaps it should be off by default.
                    140: 
                    141:   - tcpwrappers support.  Should be trivial; can already be done
                    142:     through tcpd or inetd.
                    143: 
                    144:   - Socks support built in.  It's not clear this is any better than
                    145:     just linking against the socks library, though.
                    146: 
                    147:   - When run over SSH, invoke with predictable command-line arguments,
                    148:     so that people can restrict what commands sshd will run.  (Is this
                    149:     really required?)
                    150: 
                    151:   - Comparison mode: give a list of which files are new, gone, or
                    152:     different.  Set return code depending on whether anything has
                    153:     changed.
                    154: 
                    155:   - Internationalized messages (gettext?)
                    156: 
                    157:   - Optionally use real regexps rather than globs?
                    158: 
                    159:   - Show overall progress.  Pretty hard to do, especially if we insist
                    160:     on not scanning the directory tree up front.
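
The comparison mode item above could be prototyped roughly as follows.
This is only a sketch in Python that compares two local trees by
content; the names list_files, compare_trees and exit_status are
made up for illustration, and the choice of 1 as the "something
changed" status is an assumption, not existing rsync behaviour.

```python
import filecmp
import os


def list_files(root):
    """Return the set of non-directory paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_trees(src, dst):
    """Return (new, gone, different) lists of relative paths."""
    src_files = list_files(src)
    dst_files = list_files(dst)
    new = sorted(src_files - dst_files)     # only on the source
    gone = sorted(dst_files - src_files)    # only on the destination
    different = sorted(
        rel
        for rel in src_files & dst_files
        # shallow=False forces a byte-by-byte comparison
        if not filecmp.cmp(os.path.join(src, rel),
                           os.path.join(dst, rel), shallow=False)
    )
    return new, gone, different


def exit_status(new, gone, different):
    """Assumed convention: 0 if the trees match, 1 if anything changed."""
    return 1 if (new or gone or different) else 0
```

A caller would print the three lists and pass exit_status(...) to
sys.exit().  A real implementation would compare checksums over the
wire rather than reading both trees locally.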
                    161: 
                    162: 
                    163: Regression testing:
                    164: 
                    165:   - Support automatic testing.
                    166: 
                    167:   - Have hard internal timeouts against hangs.
                    168: 
                    169:   - Be deterministic.
                    170: 
                    171:   - Measure performance.
                    172: 
                    173: 
                    174: Hard links:
                    175: 
                    176:   At the moment, we can recreate hard links, but it's a bit
                    177:   inefficient: it depends on holding a list of all files in the tree.
                    178:   Every time we see a file with a linkcount >1, we need to search for
                    179:   another known name that has the same (fsid,inum) tuple.  We could do
                    180:   that more efficiently by keeping a list of only files with
                    181:   linkcount>1, and removing files from that list as all their names
                    182:   become known.
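
A minimal sketch of that idea in Python, assuming the file names are
available locally and using (st_dev, st_ino) as the (fsid,inum)
tuple.  The function name group_hard_links is made up for this
example.

```python
import os
import stat


def group_hard_links(paths):
    """Group names that refer to the same inode.

    Only files with a link count above 1 are tracked, and an entry is
    dropped from the pending table once all of its names have been seen.
    """
    pending = {}   # (st_dev, st_ino) -> names seen so far
    groups = []    # finished groups, one list of names per inode
    for path in paths:
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode) or st.st_nlink <= 1:
            continue                       # nothing to remember
        key = (st.st_dev, st.st_ino)
        names = pending.setdefault(key, [])
        names.append(path)
        if len(names) == st.st_nlink:      # every name is now known
            groups.append(pending.pop(key))
    # Some names may live outside the scanned tree and never turn up;
    # keep whatever partial groups still contain at least two names.
    groups.extend(names for names in pending.values() if len(names) > 1)
    return groups
```

The pending table is bounded by the number of multiply-linked files
whose names have not all been seen yet, rather than by the size of
the whole tree.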
                    183: 
                    184: 
                    185: Command-line options:
                    186: 
                    187:   We have rather a lot at the moment.  We might get more if the tool
                    188:   becomes more flexible.  Do we need a .rc or configuration file?
                    189:   That wouldn't really fit with its pattern of use: cp and tar don't
                    190:   have them, though ssh does.
                    191: 
                    192: 
                    193: Scripting issues:
                    194: 
                    195:   - Perhaps support multiple scripting languages: candidates include
                    196:     Perl, Python, Tcl, Scheme (guile?), sh, ...
                    197: 
                    198:   - Simply running a subprocess and looking at its stdout/exit code
                    199:     might be sufficient, though it could also be pretty slow if it's
                    200:     called often.
                    201: 
                    202:   - There are security issues about running remote code, at least if
                    203:     it's not running in the user's own account.  So we can either
                    204:     disallow it, or use some kind of sandbox system.
                    205: 
                    206:   - Python is a good language, but the syntax is not so good for
                    207:     giving small fragments on the command line.
                    208: 
                    209:   - Tcl is broken Lisp.
                    210: 
                    211:   - Lots of sysadmins know Perl, though Perl can give some bizarre or
                    212:     confusing errors.  The built in stat operators and regexps might
                    213:     be useful.
                    214: 
                    215:   - Sadly probably not enough people know Scheme.
                    216: 
                    217:   - sh is hard to embed.
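
The "run a subprocess and look at its exit code" option might look
something like this in Python.  The hook convention used here (the
path is appended as the last argument, and exit status 0 means
"transfer this file") is an assumption made for the sketch, as is
the name hook_allows.

```python
import subprocess
import sys


def hook_allows(hook_argv, path):
    """Run the hook with path appended; exit status 0 means 'transfer it'."""
    result = subprocess.run(list(hook_argv) + [path],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


# Example hook: a one-line script that rejects object files.
SKIP_OBJECTS = [
    sys.executable, "-c",
    "import sys; sys.exit(1 if sys.argv[1].endswith('.o') else 0)",
]
```

hook_allows(SKIP_OBJECTS, "main.c") starts one process per file,
which is where the speed concern above comes from.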
                    218: 
                    219: 
                    220: Scripting hooks:
                    221: 
                    222:   - Whether to transfer a file
                    223: 
                    224:   - What basis file to use
                    225: 
                    226:   - Logging
                    227: 
                    228:   - Whether to allow transfers (for public servers)
                    229: 
                    230:   - Authentication
                    231: 
                    232:   - Locking
                    233: 
                    234:   - Cache
                    235: 
                    236:   - Generating backup path/name.
                    237: 
                    238:   - Post-processing of backups, e.g. to do compression.
                    239: 
                    240:   - After transfer, before replacement: so that we can spit out a diff
                    241:     of what was changed, or kick off some kind of reconciliation
                    242:     process.
                    243: 
                    244: 
                    245: VFS:
                    246: 
                    247:   Rather than talking straight to the filesystem, rsyncd talks through
                    248:   an internal API.  Samba has one.  Is it useful?
                    249: 
                    250:   - Could be a tidy way to implement cached signatures.
                    251: 
                    252:   - Keep files compressed on disk?
                    253: 
                    254: 
                    255: Interactive interface:
                    256: 
                    257:   - Something like ncFTP, or integration into GNOME-vfs.  Probably
                    258:     hold a single socket connection open.
                    259: 
                    260:   - Can either call us as a separate process, or as a library.
                    261: 
                    262:   - The standalone process needs to produce output in a form easily
                    263:     digestible by a calling program, like the --emacs feature some
                    264:     programs have.  Same goes for progress reports: rpm prints a series
                    265:     of hash symbols, which are easier for a GUI to handle than
                    266:     "\r30% complete" strings.
                    267: 
                    268:   - Yow!  emacs support.  (You could probably build that already, of
                    269:     course.)  I'd like to be able to write a simple script on a remote
                    270:     machine that rsyncs it to my workstation, edits it there, then
                    271:     pushes it back up.
                    272: 
                    273: 
                    274: Pie-in-the-sky features:
                    275: 
                    276:   These might have a severe impact on the protocol, and are not
                    277:   clearly in our core requirements.  It looks like in many of them
                    278:   having scripting hooks will allow us
                    279: 
                    280:   - Transport over UDP multicast.  The hard part is handling multiple
                    281:     destinations which have different basis files.  We can look at
                    282:     multicast-TFTP for inspiration.
                    283: 
                    284:   - Conflict resolution.  Possibly general scripting support will be
                    285:     sufficient.
                    286: 
                    287:   - Integrate with locking.  It's hard to see a good general solution,
                    288:     because Unix systems have several locking mechanisms, and grabbing
                    289:     the lock from programs that don't expect it could cause deadlocks,
                    290:     timeouts, or other problems.  Scripting support might help.
                    291: 
                    292:   - Replicate in place, rather than to a temporary file.  This is
                    293:     dangerous in the case of interruption, and it also means that the
                    294:     delta can't refer to blocks that have already been overwritten.
                    295:     On the other hand we could semi-trivially do this at first by
                    296:     simply generating a delta with no copy instructions.
                    297: 
                    298:   - Replicate block devices.  Most of the difficulties here are to do
                    299:     with replication in place, though on some systems we will also
                    300:     have to do I/O on block boundaries.
                    301: 
                    302:   - Peer to peer features.  Flavour of the year.  Can we think about
                    303:     ways for clients to smoothly and voluntarily become servers for
                    304:     content they receive?
                    305: 
                    306:   - Imagine a situation where the destination has a much faster link
                    307:     to the cloud than the source.  In this case, Mojo Nation downloads
                    308:     interleaved blocks from several slower servers.  The general
                    309:     situation might be a way for a master rsync process to farm out
                    310:     tasks to several subjobs.  In this particular case they'd need
                    311:     different sockets.  This might be related to multicast.
                    312: 
                    313: 
                    314: Unlikely features:
                    315: 
                    316:   - Allow remote source and destination.  If this can be cleanly
                    317:     designed into the protocol, perhaps with the remote machine acting
                    318:     as a kind of echo, then it's good.  It's uncommon enough that we
                    319:     don't want to shape the whole protocol around it, though.
                    320: 
                    321:     In fact, in a triangle of machines there are two possibilities:
                    322:     all traffic passes from remote1 to remote2 through local, or local
                    323:     just sets up the transfer and then remote1 talks to remote2.  FTP
                    324:     supports the second but it's not clearly good.  There are some
                    325:     security problems with being able to instruct one machine to open
                    326:     a connection to another.
                    327: 
                    328: 
                    329: In favour of evolving the protocol:
                    330: 
                    331:   - Keeping compatibility with existing rsync servers will help with
                    332:     adoption and testing.
                    333: 
                    334:   - We should at the very least be able to fall back to the old
                    335:     protocol.
                    336: 
                    337:   - Error handling is not so good.
                    338: 
                    339: 
                    340: In favour of using a new protocol:
                    341: 
                    342:   - Maintaining compatibility might soak up development time that
                    343:     would better go into improving a new protocol.
                    344: 
                    345:   - If we start from scratch, it can be documented as we go, and we
                    346:     can avoid design decisions that make the protocol complex or
                    347:     implementation-bound.
                    348: 
                    349: 
                    350: Error handling:
                    351: 
                    352:   - Errors should come back reliably, and be clearly associated with
                    353:     the particular file that caused the problem.
                    354: 
                    355:   - Some errors ought to cause the whole transfer to abort; some are
                    356:     just warnings.  If any errors have occurred, then rsync ought to
                    357:     return an error.
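
One possible shape for this, sketched in Python.  The exception name
FatalTransferError and the status values 0, 1 and 2 are assumptions
for illustration; the only property taken from the notes above is
that every error is tied to a file and that any error gives a
nonzero result.

```python
class FatalTransferError(Exception):
    """Raised by a per-file transfer when the whole run must stop."""


def run_transfer(paths, transfer_one):
    """Transfer each path; return (status, errors).

    errors is a list of (path, message) pairs, so every problem stays
    associated with the file that caused it.
    """
    errors = []
    for path in paths:
        try:
            transfer_one(path)
        except FatalTransferError as exc:
            errors.append((path, str(exc)))
            return 2, errors                 # abort: skip remaining files
        except OSError as exc:
            errors.append((path, str(exc)))  # warning: carry on
    return (1 if errors else 0), errors
```

Here transfer_one stands in for whatever does the real work on a
single file.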
                    358: 
                    359: 
                    360: Concurrency:
                    361: 
                    362:   - We want to keep the CPU, filesystem, and network as full as
                    363:     possible as much of the time as possible.
                    364: 
                    365:   - We can do nonblocking network IO, but not so for disk.
                    366: 
                    367:   - On the destination, it makes sense to generate signatures and
                    368:     apply patches at the same time.
                    369: 
                    370:   - We can structure this with nonblocking IO, threads, separate
                    371:     processes, etc.
                    372: 
                    373: 
                    374: Uses:
                    375: 
                    376:   - Mirroring software distributions
                    377: 
                    378:   - Synchronizing laptop and desktop
                    379: 
                    380:   - NFS filesystem migration/replication.  See
                    381:     http://www.ietf.org/proceedings/00jul/00july-133.htm#P24510_1276764
                    382: 
                    383:   - Sync with PDA
                    384: 
                    385:   - Network backup systems
                    386: 
                    387:   - CVS filemover
                    388: 
                    389: 
                    390: Conflict resolution:
                    391: 
                    392:   - Requires application-specific knowledge.  We want to provide
                    393:     mechanism, rather than policy.
                    394: 
                    395:   - Possibly allowing two-way migration across a single connection
                    396:     would be useful.
                    397: 
                    398: 
1.1.1.2 ! misho     399: Moved files:
1.1       misho     400: 
                    401:   - There's no trivial way to detect renamed files, especially if they
                    402:     move between directories.
                    403: 
                    404:   - If we had a picture of the remote directory from last time on
                    405:     either machine, then the inode numbers might give us a hint about
                    406:     files which may have been renamed.
                    407: 
                    408:   - Files that are renamed and not modified can be detected by
                    409:     examining the directory listing, looking for files with the same
                    410:     size/date as the origin.
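
The size/date heuristic from the last item could be sketched like
this in Python.  The listing format (a dict from relative path to a
(size, mtime) tuple) and the name guess_renames are assumptions made
for the example.

```python
def guess_renames(src_listing, dst_listing):
    """Suggest renames from two directory listings.

    Each listing maps a relative path to a (size, mtime) tuple.  The
    result maps each source-only path to the destination-only paths
    that have the same size and mtime.
    """
    orphans = {}   # (size, mtime) -> old names found only on destination
    for path, key in dst_listing.items():
        if path not in src_listing:
            orphans.setdefault(key, []).append(path)

    guesses = {}
    for path, key in src_listing.items():
        if path not in dst_listing and key in orphans:
            guesses[path] = sorted(orphans[key])
    return guesses
```

A match is only a hint: two unrelated files can share size and date,
so the candidate would still need to be confirmed, for example by
using it as the basis file and letting the checksums decide.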
                    411: 
                    412: 
                    413: Filesystem migration:
                    414: 
                    415:   NFSv4 probably wants to migrate file locks, but that's not really
                    416:   our problem.
                    417: 
                    418: 
                    419: Atomic updates:
                    420: 
                    421:   The NFSv4 working group wants atomic migration.  Most of the
                    422:   responsibility for this lies on the NFS server or OS.
                    423: 
                    424:   If migrating a whole tree, then we could do a nearly-atomic rename
                    425:   at the end.  This ties in to having separate basis and destination
                    426:   files.
                    427: 
                    428:   There's no way in Unix to replace a whole set of files atomically.
                    429:   However, if we get them all onto the destination machine and then do
                    430:   the updates quickly it would greatly reduce the window.
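
A sketch of the "nearly-atomic rename at the end" for a whole tree,
in Python.  It assumes the new tree has already been fully received
into a staging directory on the same filesystem as the live one; the
name replace_tree and the ".old" suffix are made up here.

```python
import os
import shutil


def replace_tree(staged, live):
    """Swap a fully staged tree into place with two renames."""
    old = live + ".old"
    shutil.rmtree(old, ignore_errors=True)  # leftovers from an earlier run
    had_live = os.path.isdir(live)
    if had_live:
        os.rename(live, old)     # window opens: 'live' is briefly missing
    os.rename(staged, live)      # window closes
    if had_live:
        shutil.rmtree(old)       # slow cleanup happens outside the window
```

Between the two renames the live path does not exist, which is why
this is only nearly atomic.  All the slow work (transfer and cleanup)
stays outside that window.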
                    431: 
                    432: 
                    433: Scalability:
                    434: 
                    435:   We should aim to work well on machines in use in a year or two.
                    436:   That probably means transfers of many millions of files in one
                    437:   batch, and gigabytes or terabytes of data.
                    438: 
                    439:   For argument's sake: at the low end, we want to sync ten files for a
                    440:   total of 10kB across a 1kB/s link.  At the high end, we want to sync
                    441:   1e9 files for 1TB of data across a 1GB/s link.
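
Rough arithmetic for those two cases, written out in Python.  It
assumes decimal units (1kB = 1000 bytes), that the whole data set is
sent with no delta savings, and that the link is the only
bottleneck, so the numbers are upper bounds on pure wire time and
nothing more.

```python
def transfer_seconds(total_bytes, bytes_per_second):
    """Time to push total_bytes through a link, ignoring all overhead."""
    return total_bytes / bytes_per_second


# Low end: ten files, 10 kB in total, over a 1 kB/s link.
low_end_seconds = transfer_seconds(10 * 1000, 1000)

# High end: 1e9 files, 1 TB in total, over a 1 GB/s link.
high_end_seconds = transfer_seconds(10**12, 10**9)

# To keep the fast link full, files must be processed at this rate.
high_end_files_per_second = 10**9 / high_end_seconds

# Average file size at the high end, in bytes.
high_end_mean_file_bytes = 10**12 / 10**9
```

This gives 10 seconds at the low end and 1000 seconds at the high
end.  The high end averages 1000 bytes per file, so filling the link
means handling about a million files per second, which puts the
pressure on per-file overhead rather than on raw bandwidth.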
                    442: 
                    443:   On the whole CPU usage is not normally a limiting factor, if only
                    444:   because running over SSH burns a lot of cycles on encryption.
                    445: 
                    446:   Perhaps have resource throttling without relying on rlimit.
                    447: 
                    448: 
                    449: Streaming:
                    450: 
                    451:   A big attraction of rsync is that there are few round-trip delays:
                    452:   basically only one to get started, and then everything is
                    453:   pipelined.  Round trips are a problem with FTP and NFS (at least up to
                    454:   v3).  NFSv4 can pipeline operations, but building on that is
                    455:   probably a bit complicated.
                    456: 
                    457: 
                    458: Related work:
                    459: 
1.1.1.2 ! misho     460:   - mirror.pl
1.1       misho     461: 
                    462:   - ProFTPd
                    463: 
                    464:   - Apache
                    465: 
                    466:   - BitTorrent -- p2p mirroring
                    467:     http://bitconjurer.org/BitTorrent/
