Wed Mar 17 00:32:36 2021 UTC (3 years, 3 months ago) by misho

rsync 3.2.3

-*- indented-text -*-

Notes towards a new version of rsync
Martin Pool <mbp@samba.org>, September 2001.


Good things about the current implementation:

  - Widely known and adopted.

  - Fast/efficient, especially for moderately small sets of files over
    slow links (transoceanic or modem.)

  - Fairly reliable.

  - The choice of running over a plain TCP socket or tunneling over
    ssh.

  - rsync operations are idempotent: you can always run the same
    command twice to make sure it worked properly without any fear.
    (Are there any exceptions?)

  - Small changes to files cause small deltas.

  - There is a way to evolve the protocol to some extent.

  - rdiff and rsync --write-batch allow generation of standalone patch
    sets. rsync+ is pretty cheesy, though. xdelta seems cleaner.

  - Process triangle is creative, but seems to provoke OS bugs.

  - "Morning-after property": you don't need to know anything on the
    local machine about the state of the remote machine, or about
    transfers that have been done in the past.

  - You can easily push or pull simply by switching the order of
    files.

  - The "modules" system has some neat features compared to
    e.g. Apache's per-directory configuration. In particular, because
    you can set a userid and chroot directory, there is strong
    protection between different modules. I haven't seen any calls
    for a more flexible system.


Bad things about the current implementation:

  - Persistent and hard-to-diagnose hang bugs remain

  - Protocol is sketchily documented, tied to this implementation, and
    hard to modify/extend

  - Both the program and the protocol assume a single non-interactive
    one-way transfer

  - A list of all files is held in memory for the entire transfer,
    which cripples scalability to large file trees

  - Opening a new socket for every operation causes problems,
    especially when running over SSH with password authentication.

  - Renamed files are not handled: the old file is removed, and the
    new file created from scratch.

  - The versioning approach assumes that future versions of the
    program know about all previous versions, and will do the right
    thing.

  - People always get confused about ':' vs '::'

  - Error messages can be cryptic.

  - Default behaviour is not intuitive: in too many cases rsync will
    happily do nothing. Perhaps -a should be the default?

  - People get confused by trailing slashes, though it's hard to think
    of another reasonable way to make this necessary distinction
    between a directory and its contents.


Protocol philosophy:

  *The* big difference between rsync and protocols like HTTP, FTP,
  and NFS is that their fundamental operations are "read this file",
  "delete this file", and "make this directory", whereas rsync's is
  "make this directory like this one".


Questionable features:

  These are neat, but not necessarily clean or worth preserving.

  - The remote rsync can be wrapped by some other program, such as in
    tridge's rsync-mail scripts. The general feature of sending and
    retrieving mail over rsync is good, but this is perhaps not the
    right way to implement it.


Desirable features:

  These don't really require architectural changes; they're just
  something to keep in mind.

  - Synchronize ACLs and extended attributes

  - Anonymous servers should be efficient

  - Code should be portable to non-UNIX systems

  - Should be possible to document the protocol in RFC form

  - --dry-run option

  - IPv6 support. Pretty straightforward.

  - Allow the basis and destination files to be different. For
    example, you could use this when you have a CD-ROM and want to
    download an updated image onto a hard drive.

  - Efficiently interrupt and restart a transfer. We can write a
    checkpoint file that says where we're up to in the filesystem.
    Alternatively, as long as transfers are idempotent, we can just
    restart the whole thing. [NFSv4]

  - Scripting support.

  - Propagate atimes and do not modify them. This is very ugly on
    Unix. It might be better to try to add O_NOATIME to kernels, and
    call that.

  - Unicode. Probably just use UTF-8 for everything.

  - Open authentication system. Can we use PAM? Is SASL an adequate
    mapping of PAM to the network, or useful in some other way?

  - Resume interrupted transfers without the --partial flag. We need
    to leave the temporary file behind, and then know to use it. This
    leaves a risk of large temporary files accumulating, which is not
    good. Perhaps it should be off by default.

  - tcpwrappers support. Should be trivial; can already be done
    through tcpd or inetd.

  - Socks support built in. It's not clear this is any better than
    just linking against the socks library, though.

  - When run over SSH, invoke with predictable command-line arguments,
    so that people can restrict what commands sshd will run. (Is this
    really required?)

  - Comparison mode: give a list of which files are new, gone, or
    different. Set return code depending on whether anything has
    changed.

  - Internationalized messages (gettext?)

  - Optionally use real regexps rather than globs?

  - Show overall progress. Pretty hard to do, especially if we insist
    on not scanning the directory tree up front.


Regression testing:

  - Support automatic testing.

  - Have hard internal timeouts against hangs.

  - Be deterministic.

  - Measure performance.


Hard links:

  At the moment, we can recreate hard links, but it's a bit
  inefficient: it depends on holding a list of all files in the tree.
  Every time we see a file with a linkcount > 1, we need to search for
  another known name that has the same (fsid,inum) tuple. We could do
  that more efficiently by keeping a list of only files with
  linkcount > 1, and removing files from that list as all their names
  become known.


Command-line options:

  We have rather a lot at the moment. We might get more if the tool
  becomes more flexible. Do we need a .rc or configuration file?
  That wouldn't really fit with its pattern of use: cp and tar don't
  have them, though ssh does.


Scripting issues:

  - Perhaps support multiple scripting languages: candidates include
    Perl, Python, Tcl, Scheme (guile?), sh, ...

  - Simply running a subprocess and looking at its stdout/exit code
    might be sufficient, though it could also be pretty slow if it's
    called often.

  - There are security issues about running remote code, at least if
    it's not running in the user's own account. So we can either
    disallow it, or use some kind of sandbox system.

  - Python is a good language, but the syntax is not so good for
    giving small fragments on the command line.

  - Tcl is broken Lisp.

  - Lots of sysadmins know Perl, though Perl can give some bizarre or
    confusing errors. The built in stat operators and regexps might
    be useful.

  - Sadly probably not enough people know Scheme.

  - sh is hard to embed.


Scripting hooks:

  - Whether to transfer a file

  - What basis file to use

  - Logging

  - Whether to allow transfers (for public servers)

  - Authentication

  - Locking

  - Cache

  - Generating backup path/name.

  - Post-processing of backups, e.g. to do compression.

  - After transfer, before replacement: so that we can spit out a diff
    of what was changed, or kick off some kind of reconciliation
    process.


VFS:

  Rather than talking straight to the filesystem, rsyncd talks through
  an internal API. Samba has one. Is it useful?

  - Could be a tidy way to implement cached signatures.

  - Keep files compressed on disk?


Interactive interface:

  - Something like ncFTP, or integration into GNOME-vfs. Probably
    hold a single socket connection open.

  - Can either call us as a separate process, or as a library.

  - The standalone process needs to produce output in a form easily
    digestible by a calling program, like the --emacs feature some
    have. Same goes for output: rpm outputs a series of hash symbols,
    which are easier for a GUI to handle than "\r30% complete"
    strings.

  - Yow! emacs support. (You could probably build that already, of
    course.) I'd like to be able to write a simple script on a remote
    machine that rsyncs it to my workstation, edits it there, then
    pushes it back up.


Pie-in-the-sky features:

  These might have a severe impact on the protocol, and are not
  clearly in our core requirements.
  It looks like in many of them
  having scripting hooks will allow us

  - Transport over UDP multicast. The hard part is handling multiple
    destinations which have different basis files. We can look at
    multicast-TFTP for inspiration.

  - Conflict resolution. Possibly general scripting support will be
    sufficient.

  - Integrate with locking. It's hard to see a good general solution,
    because Unix systems have several locking mechanisms, and grabbing
    the lock from programs that don't expect it could cause deadlocks,
    timeouts, or other problems. Scripting support might help.

  - Replicate in place, rather than to a temporary file. This is
    dangerous in the case of interruption, and it also means that the
    delta can't refer to blocks that have already been overwritten.
    On the other hand we could semi-trivially do this at first by
    simply generating a delta with no copy instructions.

  - Replicate block devices. Most of the difficulties here are to do
    with replication in place, though on some systems we will also
    have to do I/O on block boundaries.

  - Peer to peer features. Flavour of the year. Can we think about
    ways for clients to smoothly and voluntarily become servers for
    content they receive?

  - Imagine a situation where the destination has a much faster link
    to the cloud than the source. In this case, Mojo Nation downloads
    interleaved blocks from several slower servers. The general
    situation might be a way for a master rsync process to farm out
    tasks to several subjobs. In this particular case they'd need
    different sockets. This might be related to multicast.


Unlikely features:

  - Allow remote source and destination. If this can be cleanly
    designed into the protocol, perhaps with the remote machine acting
    as a kind of echo, then it's good.
    It's uncommon enough that we
    don't want to shape the whole protocol around it, though.

    In fact, in a triangle of machines there are two possibilities:
    all traffic passes from remote1 to remote2 through local, or local
    just sets up the transfer and then remote1 talks to remote2. FTP
    supports the second but it's not clearly good. There are some
    security problems with being able to instruct one machine to open
    a connection to another.


In favour of evolving the protocol:

  - Keeping compatibility with existing rsync servers will help with
    adoption and testing.

  - We should at the very least be able to fall back to the new
    protocol.

  - Error handling is not so good.


In favour of using a new protocol:

  - Maintaining compatibility might soak up development time that
    would better go into improving a new protocol.

  - If we start from scratch, it can be documented as we go, and we
    can avoid design decisions that make the protocol complex or
    implementation-bound.


Error handling:

  - Errors should come back reliably, and be clearly associated with
    the particular file that caused the problem.

  - Some errors ought to cause the whole transfer to abort; some are
    just warnings. If any errors have occurred, then rsync ought to
    return an error.


Concurrency:

  - We want to keep the CPU, filesystem, and network as full as
    possible as much of the time as possible.

  - We can do nonblocking network IO, but not so for disk.

  - On the destination, it makes sense to generate signatures and
    apply patches at the same time.

  - Can structure this with nonblocking IO, threads, separate
    processes, etc.
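
  As a rough illustration of the Concurrency points above, the sketch
  below uses two threads on the destination: one generates block
  signatures for the next basis file while the other applies a patch
  to the current one. Everything here is hypothetical: the function
  names, the fixed block size, the use of MD5, and the ("copy", ...) /
  ("data", ...) patch format are assumptions made for the example, not
  rsync's actual data structures, and threads are only one of the
  structuring options listed.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def block_signatures(data, block_size=4):
    """Return one checksum per fixed-size block (hypothetical signature format)."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def apply_patch(basis, patch):
    """Rebuild a file from a basis and a list of patch operations.

    Each operation is either ("copy", offset, length), which reuses bytes
    from the basis, or ("data", literal_bytes), which inserts new bytes.
    This patch format is an assumption made for the sketch.
    """
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            _, offset, length = op
            out += basis[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def sign_and_patch(next_basis, current_basis, current_patch):
    """Run signature generation and patch application at the same time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        signatures = pool.submit(block_signatures, next_basis)
        patched = pool.submit(apply_patch, current_basis, current_patch)
        # result() blocks until each task finishes and re-raises its errors.
        return signatures.result(), patched.result()


if __name__ == "__main__":
    sigs, rebuilt = sign_and_patch(
        b"abcdefgh",
        b"hello world",
        [("copy", 0, 6), ("data", b"rsync")],
    )
    print(len(sigs), rebuilt)
```

  A patch made only of ("data", ...) operations never reads the basis,
  which is the "delta with no copy instructions" case mentioned under
  replicating in place.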


Uses:

  - Mirroring software distributions:

  - Synchronizing laptop and desktop

  - NFS filesystem migration/replication. See
    http://www.ietf.org/proceedings/00jul/00july-133.htm#P24510_1276764

  - Sync with PDA

  - Network backup systems

  - CVS filemover


Conflict resolution:

  - Requires application-specific knowledge. We want to provide
    policy, rather than mechanism.

  - Possibly allowing two-way migration across a single connection
    would be useful.


Moved files:

  - There's no trivial way to detect renamed files, especially if they
    move between directories.

  - If we had a picture of the remote directory from last time on
    either machine, then the inode numbers might give us a hint about
    files which may have been renamed.

  - Files that are renamed and not modified can be detected by
    examining the directory listing, looking for files with the same
    size/date as the origin.


Filesystem migration:

  NFSv4 probably wants to migrate file locks, but that's not really
  our problem.


Atomic updates:

  The NFSv4 working group wants atomic migration. Most of the
  responsibility for this lies on the NFS server or OS.

  If migrating a whole tree, then we could do a nearly-atomic rename
  at the end. This ties in to having separate basis and destination
  files.

  There's no way in Unix to replace a whole set of files atomically.
  However, if we get them all onto the destination machine and then do
  the updates quickly it would greatly reduce the window.


Scalability:

  We should aim to work well on machines in use in a year or two.
  That probably means transfers of many millions of files in one
  batch, and gigabytes or terabytes of data.

  For argument's sake: at the low end, we want to sync ten files for a
  total of 10kB across a 1kB/s link. At the high end, we want to sync
  1e9 files for 1TB of data across a 1GB/s link.

  On the whole CPU usage is not normally a limiting factor, if only
  because running over SSH burns a lot of cycles on encryption.

  Perhaps have resource throttling without relying on rlimit.


Streaming:

  A big attraction of rsync is that there are few round-trip delays:
  basically only one to get started, and then everything is
  pipelined. This is a problem with FTP, and NFS (at least up to
  v3). NFSv4 can pipeline operations, but building on that is
  probably a bit complicated.


Related work:

  - mirror.pl

  - ProFTPd

  - Apache

  - BitTorrent -- p2p mirroring
    http://bitconjurer.org/BitTorrent/
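
  As a rough check on the low-end and high-end targets in the
  Scalability section, the arithmetic can be worked through as below.
  This is a back-of-envelope sketch, not a measurement. It assumes the
  low-end total means 10 kilobytes, uses decimal units, ignores
  protocol overhead, and assumes a hypothetical cost of 100 bytes per
  in-memory file-list entry; none of these figures come from the
  notes themselves.

```python
# Back-of-envelope figures for the two Scalability targets.
# ASSUMPTIONS (not from the notes): the low-end total is 10 kilobytes,
# units are decimal, protocol overhead is ignored, and each in-memory
# file-list entry costs 100 bytes.

KB, GB, TB = 10**3, 10**9, 10**12


def transfer_seconds(total_bytes, link_bytes_per_second):
    """Time to push total_bytes through the link at full speed."""
    return total_bytes / link_bytes_per_second


# Low end: ten files, 10 kB in total, over a 1 kB/s link.
low_seconds = transfer_seconds(10 * KB, 1 * KB)

# High end: 1e9 files, 1 TB in total, over a 1 GB/s link.
high_files = 10**9
high_seconds = transfer_seconds(1 * TB, 1 * GB)
average_file_bytes = (1 * TB) / high_files
files_per_second = high_files / high_seconds

# Memory needed to hold the whole file list at once (assumed entry size).
ASSUMED_BYTES_PER_ENTRY = 100
file_list_bytes = high_files * ASSUMED_BYTES_PER_ENTRY

if __name__ == "__main__":
    print("low end seconds:", low_seconds)
    print("high end seconds:", high_seconds)
    print("average file size in bytes:", average_file_bytes)
    print("files per second to saturate the link:", files_per_second)
    print("file list size in GB:", file_list_bytes / GB)
```

  Under these assumptions the high-end case has to handle about a
  million files per second to keep the link full, and a file list held
  entirely in memory would need on the order of 100 GB. This is
  consistent with the earlier complaint that keeping the whole list in
  memory cripples scalability.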