File:  [ELWIX - Embedded LightWeight unIX -] / embedaddon / rsync / TODO
Revision 1.1.1.2 (vendor branch)
Wed Mar 17 00:32:36 2021 UTC by misho
Branches: rsync, MAIN
CVS tags: v3_2_3, HEAD
rsync 3.2.3

    1: -*- indented-text -*-
    2: 
    3: FEATURES ------------------------------------------------------------
    4: Use chroot only if supported
    5: Allow supplementary groups in rsyncd.conf			2002/04/09
    6: Handling IPv6 on old machines
    7: Other IPv6 stuff
    8: Add ACL support							2001/12/02
    9: proxy authentication						2002/01/23
   10: SOCKS								2002/01/23
   11: FAT support
   12: --diff						david.e.sewell	2002/03/15
   13: Add daemon --no-fork option
   14: Create more granular verbosity					2003/05/15
   15: 
   16: DOCUMENTATION --------------------------------------------------------
   17: Keep list of open issues and todos on the web site
   18: Perhaps redo manual as SGML
   19: 
   20: LOGGING --------------------------------------------------------------
   21: Memory accounting
   22: Improve error messages
   23: Better statistics					Rasmus	2002/03/08
   24: Perhaps flush stdout like syslog
   25: Log child death on signal
   26: verbose output					David Stein	2001/12/20
   27: internationalization
   28: 
   29: DEVELOPMENT --------------------------------------------------------
   30: Handling duplicate names
   31: Use generic zlib						2002/02/25
   32: TDB								2002/03/12
   33: Splint								2002/03/12
   34: 
   35: PERFORMANCE ----------------------------------------------------------
   36: Traverse just one directory at a time
   37: Allow skipping MD4 file_sum					2002/04/08
   38: Accelerate MD4
   39: 
   40: TESTING --------------------------------------------------------------
   41: Torture test
   42: Cross-test versions						2001/08/22
   43: Test on kernel source
   44: Test large files
   45: Create mutator program for testing
   46: Create configure option to enable dangerous tests
   47: Create pipe program for testing
   48: Create test makefile target for some tests
   49: 
   50: RELATED PROJECTS -----------------------------------------------------
   51: rsyncsh
   52: https://rsync.samba.org/rsync-and-debian/
   53: rsyncable gzip patch
   54: rsyncsplit as alternative to real integration with gzip?
   55: reverse rsync over HTTP Range
   56: 
   57: 
   58: 
   59: FEATURES ------------------------------------------------------------
   60: 
   61: 
   62: Use chroot only if supported
   63: 
   64:   If the platform doesn't support it, then don't even try.
   65: 
   66:   If running as non-root, then don't fail, just give a warning.
   67:   (There was a thread about this a while ago?)
   68: 
   69:     https://lists.samba.org/pipermail/rsync/2001-August/thread.html
   70:     https://lists.samba.org/pipermail/rsync/2001-September/thread.html
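
A minimal sketch of the two rules above, in Python for brevity (rsync itself is C).  The helper maybe_chroot() and its parameters are made up for illustration; this is not rsync's actual code.

```python
import os
import sys


def maybe_chroot(path, chroot=getattr(os, "chroot", None), euid=None):
    """Try to chroot into `path`; return True only if the chroot happened.

    Hypothetical helper.  `chroot` and `euid` are injectable so the
    decision logic can be exercised without root privileges.
    """
    # Platform without chroot(): don't even try.
    if chroot is None:
        return False
    if euid is None:
        euid = os.geteuid()
    # Non-root callers cannot chroot: warn and carry on instead of failing.
    if euid != 0:
        print(f"warning: not root, skipping chroot to {path}", file=sys.stderr)
        return False
    chroot(path)
    os.chdir("/")  # a chroot is only effective once the cwd is inside it
    return True
```

The point of returning a flag rather than raising is that the caller can still log whether the module ended up confined or not.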
   71: 
   72:                       --          --
   73: 
   74: 
   75: Allow supplementary groups in rsyncd.conf			2002/04/09
   76: 
   77:   Perhaps allow supplementary groups to be specified in rsyncd.conf;
   78:   then make the first one the primary gid and all the rest be
   79:   supplementary gids.
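
One possible reading of this, sketched in Python.  The comma-separated syntax and the name split_groups() are assumptions, and the name-to-gid lookup is passed in so the example does not depend on the local group database.

```python
def split_groups(spec, lookup_gid):
    """Split a comma-separated group list into (primary_gid, supplementary_gids).

    `lookup_gid` maps a group name to a numeric gid; in real use it could be
    something like `lambda name: grp.getgrnam(name).gr_gid`.
    """
    names = [name.strip() for name in spec.split(",") if name.strip()]
    if not names:
        raise ValueError("no groups given")
    gids = [lookup_gid(name) for name in names]
    # The first group becomes the primary gid, the rest are supplementary.
    return gids[0], gids[1:]
```

A daemon would then call setgroups() with the supplementary list and setgid() with the primary gid, in that order, before dropping the uid.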
   80: 
   81:                       --          --
   82: 
   83: 
   84: Handling IPv6 on old machines
   85: 
   86:   The KAME IPv6 patch is nice in theory but has proved a bit of a
   87:   nightmare in practice.  The basic idea of their patch is that rsync
   88:   is rewritten to use the new getaddrinfo()/getnameinfo() interface,
   89:   rather than gethostbyname()/gethostbyaddr() as in rsync 2.4.6.
   90:   Systems that don't have the new interface are handled by providing
   91:   our own implementation in lib/, which is selectively linked in.
   92: 
   93:   The problem with this is that it is really hard to get right on
   94:   platforms that have a half-working implementation, so redefining
   95:   these functions clashes with system headers, and leaving them out
   96:   breaks.  This affects at least OSF/1, RedHat 5, and Cobalt, which
   97:   are moderately important.
   98: 
   99:   Perhaps the simplest solution would be to have two different files
  100:   implementing the same interface, and choose either the new or the
  101:   old API.  This is probably necessary for systems that e.g. have
  102:   IPv6, but gethostbyaddr() can't handle it.  The Linux manpage claims
  103:   this is currently the case.
  104: 
  105:   In fact, our internal sockets interface (things like
  106:   open_socket_out(), etc) is much narrower than the getaddrinfo()
  107:   interface, and so probably simpler to get right.  In addition, the
  108:   old code is known to work well on old machines.
  109: 
  110:   We could drop the rather large lib/getaddrinfo files.
  111: 
  112:                       --          --
  113: 
  114: 
  115: Other IPv6 stuff
  116:   
  117:   Implement suggestions from http://www.kame.net/newsletter/19980604/
  118:   and ftp://ftp.iij.ad.jp/pub/RFC/rfc2553.txt
  119: 
  120:   If a host has multiple addresses, then try to connect to each
  121:   in order until we get through.  (getaddrinfo may return multiple
  122:   addresses.)  This is kind of implemented already.
  123: 
  124:   Possibly also when starting as a server we may need to listen on
  125:   multiple passive addresses.  This might be a bit harder, because we
  126:   may need to select on all of them.  Hm.
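
The client-side loop over multiple addresses could look roughly like this.  It is a Python sketch, not a copy of rsync's open_socket_out(); the name connect_first() and the timeout default are assumptions.

```python
import socket


def connect_first(host, port, timeout=5.0):
    """Return a socket connected to the first address of `host` that accepts."""
    last_error = None
    # getaddrinfo() may return several (family, type, proto, name, sockaddr)
    # tuples, e.g. one IPv6 and one IPv4 address for the same host name.
    for family, socktype, proto, _name, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            # Remember the failure and move on to the next address.
            last_error = err
            if sock is not None:
                sock.close()
    raise last_error or OSError(f"no addresses found for {host}")
```

The server side (listening on several passive addresses and selecting on all of them) is not covered by this sketch.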
  127: 
  128:                       --          --
  129: 
  130: 
  131: Add ACL support							2001/12/02
  132: 
  133:   Transfer ACLs.  Need to think of a standard representation.
  134:   Probably better not to even try to convert between NT and POSIX.
  135:   Possibly can share some code with Samba.
  136:   NOTE: there is a patch that implements this in the "patches" subdir.
  137: 
  138:                       --          --
  139: 
  140: 
  141: proxy authentication						2002/01/23
  142: 
  143:   Allow RSYNC_PROXY to be http://user:pass@proxy.foo:3128/, and do
  144:   HTTP Basic Proxy-Authentication.
  145: 
  146:   Multiple schemes are possible, up to and including the insanity that
  147:   is NTLM, but Basic probably covers most cases.
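
A rough Python sketch of the Basic case.  The function name, the target host and the exact request layout are assumptions for illustration; only the RSYNC_PROXY URL form comes from the note above.

```python
import base64
from urllib.parse import urlsplit, unquote


def proxy_connect_request(proxy_url, target_host, target_port):
    """Return ((proxy_host, proxy_port), request_bytes) for an HTTP CONNECT."""
    parts = urlsplit(proxy_url)
    lines = [f"CONNECT {target_host}:{target_port} HTTP/1.0"]
    if parts.username is not None:
        # Basic credentials are base64("user:password").
        user = unquote(parts.username)
        password = unquote(parts.password or "")
        token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
        lines.append(f"Proxy-Authorization: Basic {token}")
    request = "\r\n".join(lines) + "\r\n\r\n"
    return (parts.hostname, parts.port), request.encode("ascii")
```

Without user:pass in the URL the header is simply left out, so existing RSYNC_PROXY settings would behave as before.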
  148: 
  149:                       --          --
  150: 
  151: 
  152: SOCKS								2002/01/23
  153: 
  154:   Add --with-socks, and then perhaps a command-line option to turn it
  155:   on or off.  This might be more reliable than LD_PRELOAD hacks.
  156: 
  157:                       --          --
  158: 
  159: 
  160: FAT support
  161: 
  162:   rsync to a FAT partition on a Unix machine doesn't work very well at
  163:   the moment.  I think we get errors about invalid filenames and
  164:   perhaps also trying to do atomic renames.
  165: 
  166:   I guess the code to do this is currently #ifdef'd on Windows;
  167:   perhaps we ought to intelligently fall back to it on Unix too.
  168: 
  169:                       --          --
  170: 
  171: 
  172: --diff						david.e.sewell	2002/03/15
  173: 
  174:   Allow people to specify the diff command.  (Might want to use wdiff,
  175:   gnudiff, etc.)
  176: 
  177:   Just diff the temporary file with the destination file, and delete
  178:   the tmp file rather than moving it into place.
  179: 
  180:   Interaction with --partial.
  181: 
  182:   Security interactions with daemon mode?
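
The "diff, then delete the tmp file" step might look like this in Python.  diff_and_discard() is a made-up name, and the diff command is passed as an argument list so that any tool (diff, wdiff, ...) can be plugged in.

```python
import os
import subprocess


def diff_and_discard(diff_cmd, dest_path, tmp_path):
    """Run `diff_cmd dest tmp`, delete the temporary file, return the output."""
    try:
        result = subprocess.run([*diff_cmd, dest_path, tmp_path],
                                capture_output=True, text=True)
    finally:
        # The destination is never replaced: the temporary file is removed
        # instead of being renamed into place.
        os.unlink(tmp_path)
    return result.stdout
```

The --partial and daemon-mode questions above are untouched by this sketch.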
  183: 
  184:                       --          --
  185: 
  186: 
  187: Add daemon --no-fork option
  188: 
  189:   Very useful for debugging.  Also good when running under a
  190:   daemon-monitoring process that tries to restart the service when the
  191:   parent exits.
  192: 
  193:                       --          --
  194: 
  195: 
  196: Create more granular verbosity					2003/05/15
  197: 
  198:   Control output with the --report option.
  199: 
  200:   The option takes a single argument (no whitespace): a
  201:   comma-delimited list of keywords.
  202: 
  203:   This would separate debugging from "logging", and would allow
  204:   fine-grained selection of statistical reporting and of which
  205:   actions are logged.
  206: 
  207:   https://lists.samba.org/archive/rsync/2003-May/006059.html
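
Parsing such an argument is simple; a Python sketch follows.  The keyword names are invented for the example, since no keyword set is defined here.

```python
KNOWN_KEYWORDS = {"stats", "deleted", "skipped", "debug"}  # hypothetical names


def parse_report(arg):
    """Parse a --report argument such as "stats,deleted" into a set of keywords."""
    if any(ch.isspace() for ch in arg):
        raise ValueError("--report takes a single argument without whitespace")
    keywords = set(arg.split(","))
    unknown = keywords - KNOWN_KEYWORDS
    if unknown:
        raise ValueError("unknown --report keyword(s): " + ", ".join(sorted(unknown)))
    return keywords
```

Each logging call site would then test for membership of its keyword instead of comparing against a single verbosity level.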
  208: 
  209:                       --          --
  210: 
  211: DOCUMENTATION --------------------------------------------------------
  212: 
  213: 
  214: Keep list of open issues and todos on the web site
  215: 
  216:                       --          --
  217: 
  218: 
  219: Perhaps redo manual as SGML
  220: 
  221:   The man page is getting rather large, and there is more information
  222:   that ought to be added.
  223: 
  224:   Texinfo source is probably a dying format.
  225: 
  226:   Linuxdoc looks like the most likely contender.  I know DocBook is
  227:   favoured by some people, but it's so bloody verbose, even with emacs
  228:   support.
  229: 
  230:                       --          --
  231: 
  232: LOGGING --------------------------------------------------------------
  233: 
  234: 
  235: Memory accounting
  236: 
  237:   At exit, show how much memory was used for the file list, etc.
  238: 
  239:   We also do a weird exponential-growth allocation in flist.c.  I'm
  240:   not sure this makes sense with modern mallocs.  At any rate it will
  241:   make us allocate a huge amount of memory for large file lists.
  242: 
  243:                       --          --
  244: 
  245: 
  246: Improve error messages
  247: 
  248:   If we hang or get SIGINT, then explain where we were up to.  Perhaps
  249:   have a static buffer that contains the current function name, or
  250:   some kind of description of what we were trying to do.  This is a
  251:   little easier on people than needing to run strace/truss.
  252: 
  253:   "The dungeon collapses!  You are killed."  Rather than "unexpected
  254:   eof" give a message that is more detailed if possible and also more
  255:   helpful.
  256: 
  257:   If we get an error writing to a socket, then we should perhaps
  258:   continue trying to read to see if an error message comes across
  259:   explaining why the socket is closed.  I'm not sure if this would
  260:   work, but it would certainly make our messages more helpful.
  261: 
  262:   What happens if a directory is missing -x attributes?  Do we lose
  263:   our load?  (Debian #28416)  Probably fixed now, but a test case would
  264:   be good.
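
The "static buffer describing what we were doing" idea from the first paragraph of this item, sketched in Python.  All names are hypothetical, and the exit status 130 is just the usual shell convention for SIGINT, not necessarily what rsync would use.

```python
import contextlib
import signal
import sys

_current_activity = "starting up"


@contextlib.contextmanager
def doing(description):
    """Record what we are doing for the duration of a `with` block."""
    global _current_activity
    previous, _current_activity = _current_activity, description
    try:
        yield
    finally:
        _current_activity = previous


def interrupted_message():
    """Describe where we were up to when the interruption arrived."""
    return f"interrupted while {_current_activity}"


def install_sigint_handler():
    """Report the current activity on SIGINT, then exit."""
    def handler(signum, frame):
        print(interrupted_message(), file=sys.stderr)
        sys.exit(130)
    signal.signal(signal.SIGINT, handler)
```

Usage would be along the lines of `with doing("receiving file list"): ...` around each long-running phase.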
  265: 
  266:                       --          --
  267: 
  268: 
  269: Better statistics					Rasmus	2002/03/08
  270: 
  271:   <Rasmus>
  272:       hey, how about an rsync option that just gives you the
  273:       summary without the list of files?  And perhaps gives
  274:       more information like the number of new files, number
  275:       of changed, deleted, etc. ?
  276: 
  277:   <mbp>
  278:       nice idea there is --stats but at the moment it's very
  279:       tridge-oriented rather than user-friendly it would be
  280:       nice to improve it that would also work well with
  281:       --dryrun
  282: 
  283:                       --          --
  284: 
  285: 
  286: Perhaps flush stdout like syslog
  287: 
  288:   Perhaps flush stdout after each filename, so that people trying to
  289:   monitor progress in a log file can do so more easily.  See
  290:   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=48108
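
In Python terms the change amounts to the following; log_filename() is a made-up name.

```python
import sys


def log_filename(name, out=sys.stdout):
    """Write one filename and flush, so a `tail -f` on the log sees it at once."""
    out.write(name + "\n")
    out.flush()
```

The cost is one extra write call per file when stdout is redirected to a file or pipe, which is probably negligible next to the transfer itself.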
  291: 
  292:                       --          --
  293: 
  294: 
  295: Log child death on signal
  296: 
  297:   If a child of the rsync daemon dies with a signal, we should notice
  298:   that when we reap it and log a message.
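
A sketch of the reaping logic using the POSIX wait-status macros as exposed by Python's os module.  The function names and the message wording are assumptions.

```python
import os
import signal


def describe_child_exit(pid, status):
    """Return a log line if the child was killed by a signal, else None."""
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"rsync child {pid} died on {name}"
    return None


def reap_children(log):
    """Reap every finished child without blocking and log signal deaths."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left
        if pid == 0:
            return  # children exist but none has exited yet
        message = describe_child_exit(pid, status)
        if message:
            log(message)
```

reap_children() would typically be called from the daemon's main loop after a SIGCHLD has been noted.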
  299: 
  300:                       --          --
  301: 
  302: 
  303: verbose output					David Stein	2001/12/20
  304:   
  305:   At end of transfer, show how many files were or were not transferred
  306:   correctly.
  307: 
  308:                       --          --
  309: 
  310: 
  311: internationalization
  312: 
  313:   Change to using gettext().  Probably need to ship this for platforms
  314:   that don't have it.
  315: 
  316:   Solicit translations.
  317: 
  318:   Does anyone care?  Before we bother modifying the code, we ought to
  319:   get the manual translated first, because that's possibly more useful
  320:   and at any rate demonstrates desire.
  321: 
  322:                       --          --
  323: 
  324: DEVELOPMENT --------------------------------------------------------
  325: 
  326: Handling duplicate names
  327: 
  328:   Some folks would like rsync to be deterministic in how it handles
  329:   duplicate names that come from merging multiple source directories
  330:   into a single destination directory; e.g. the last name wins.  We
  331:   could do this by switching our sort algorithm to one that will
  332:   guarantee that the names won't be reordered.  Alternately, we could
  333:   assign an ever-increasing number to each item as we insert it into
  334:   the list and then make sure that we leave the largest number when
  335:   cleaning the file list (see clean_flist()).  Another solution would
  336:   be to add a hash table, and thus never put any duplicate names into
  337:   the file list (and bump the protocol to handle this).
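
The "ever-increasing number" variant can be sketched in a few lines of Python.  This is a toy model of the file list, not rsync's clean_flist(); the entries are (name, source) pairs invented for the example.

```python
def clean_file_list(entries):
    """Sort (name, source) pairs by name, keeping the last-inserted duplicate."""
    # Tag each entry with an ever-increasing insertion number.
    numbered = [(name, seq, source) for seq, (name, source) in enumerate(entries)]
    # Sorting on (name, seq) makes the result independent of sort stability.
    numbered.sort(key=lambda entry: (entry[0], entry[1]))
    cleaned = []
    for name, _seq, source in numbered:
        if cleaned and cleaned[-1][0] == name:
            cleaned[-1] = (name, source)  # the larger number replaces the smaller
        else:
            cleaned.append((name, source))
    return cleaned
```

Because the insertion number is part of the sort key, "last name wins" holds even with a sort algorithm that reorders equal names.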
  338: 
  339:                       --          --
  340: 
  341: 
  342: Use generic zlib						2002/02/25
  343: 
  344:   Perhaps don't use our own zlib.
  345: 
  346:   Advantages:
  347:    
  348:     - will automatically be up to date with bugfixes in zlib
  349: 
  350:     - can leave it out for small rsync on e.g. recovery disks
  351: 
  352:     - can use a shared library
  353: 
  354:     - avoids people breaking rsync by trying to do this themselves and
  355:       messing up
  356: 
  357:   Should we ship zlib for systems that don't have it, or require
  358:   people to install it separately?
  359: 
  360:   Apparently this will make us incompatible with versions of rsync
  361:   that use the patched version of zlib.  Probably the simplest way to
  362:   handle this is to just disable gzip (with a warning) when talking to
  363:   old versions.
  364: 
  365:                       --          --
  366: 
  367: 
  368: Splint								2002/03/12
  369: 
  370:   Build rsync with SPLINT to try to find security holes.  Add
  371:   annotations as necessary.  Keep track of the number of warnings
  372:   found initially, and see how many of them are real bugs, or real
  373:   security bugs.  Knowing the percentage of likely hits would be
  374:   really interesting for other projects.
  375: 
  376:                       --          --
  377: 
  378: PERFORMANCE ----------------------------------------------------------
  379: 
  380: Allow skipping MD4 file_sum					2002/04/08
  381: 
  382:   If we're doing a local transfer, or using -W, then perhaps don't
  383:   send the file checksum.  If we're doing a local transfer, then
  384:   calculating MD4 checksums uses 90% of CPU and is unlikely to be
  385:   useful.
  386: 
  387:   We should not allow it to be disabled separately from -W, though,
  388:   as it is the only thing that lets us know when the rsync algorithm
  389:   got out of sync and messed the file up (i.e. if the basis file
  390:   changed between checksum generation and reception).
  391: 
  392:                       --          --
  393: 
  394: 
  395: Accelerate MD4
  396: 
  397:   Perhaps borrow an assembler MD4 from someone?
  398: 
  399:   Make sure we call MD4 with properly-sized blocks whenever possible
  400:   to avoid copying into the residue region?
  401: 
  402:                       --          --
  403: 
  404: TESTING --------------------------------------------------------------
  405: 
  406: Torture test
  407: 
  408:   Something that just keeps running rsync continuously over a data set
  409:   likely to generate problems.
  410: 
  411:                       --          --
  412: 
  413: 
  414: Cross-test versions						2001/08/22
  415: 
  416:   Part of the regression suite should be making sure that we
  417:   don't break backwards compatibility: old clients vs new
  418:   servers and so on.  Ideally we would test both up and down
  419:   from the current release to all old versions.
  420: 
  421:   Run current rsync versions against significant past releases.
  422: 
  423:   We might need to omit broken old versions, or versions in which
  424:   particular functionality is broken.
  425: 
  426:   It might be sufficient to test downloads from well-known public
  427:   rsync servers running different versions of rsync.  This will give
  428:   some testing and also be the most common case for having different
  429:   versions and not being able to upgrade.
  430: 
  431:   The new --protocol option may help in this.
  432: 
  433:                       --          --
  434: 
  435: 
  436: Test on kernel source
  437: 
  438:   Download all versions of kernel; unpack, sync between them.  Also
  439:   sync between uncompressed tarballs.  Compare directories after
  440:   transfer.
  441: 
  442:   Use local mode; ssh; daemon; --whole-file and --no-whole-file.
  443: 
  444:   Use awk to pull out the 'speedup' number for each transfer.  Make
  445:   sure it is >= x.
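
The same check could be done in Python instead of awk.  This sketch assumes the summary line contains the text "speedup is" followed by a number, possibly with thousands separators; the helper name is made up.

```python
import re

_SPEEDUP = re.compile(r"speedup is ([0-9][0-9,]*(?:\.[0-9]+)?)")


def speedup(output):
    """Extract the 'speedup' figure from captured rsync output."""
    match = _SPEEDUP.search(output)
    if match is None:
        raise ValueError("no 'speedup is' line in rsync output")
    # Drop thousands separators before converting.
    return float(match.group(1).replace(",", ""))
```

A test would then assert `speedup(output) >= x` for whatever threshold x is chosen per transfer.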
  446: 
  447:                       --          --
  448: 
  449: 
  450: Test large files
  451: 
  452:   Sparse and non-sparse
  453: 
  454:                       --          --
  455: 
  456: 
  457: Create mutator program for testing
  458: 
  459:   Insert bytes, delete bytes, swap blocks, ...
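
A small Python sketch of such a mutator.  The block size, the set of operations and the function name are assumptions; a seeded random.Random keeps failures reproducible.

```python
import random


def mutate(data, rng, block=16):
    """Apply one random edit to `data` (bytes); return (operation, new_data)."""
    op = rng.choice(["insert", "delete", "swap"])
    if op == "insert":
        # Insert `block` random bytes at a random position.
        pos = rng.randrange(len(data) + 1)
        extra = bytes(rng.randrange(256) for _ in range(block))
        return op, data[:pos] + extra + data[pos:]
    if op == "delete":
        # Delete up to `block` bytes starting at a random position.
        pos = rng.randrange(max(1, len(data) - block + 1))
        return op, data[:pos] + data[pos + block:]
    # Swap two adjacent blocks.
    pos = rng.randrange(max(1, len(data) - 2 * block + 1))
    first = data[pos:pos + block]
    second = data[pos + block:pos + 2 * block]
    return op, data[:pos] + second + first + data[pos + 2 * block:]
```

Running rsync between the original and the mutated file, then comparing the result with the mutated file, gives a cheap randomized test of the delta algorithm.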
  460: 
  461:                       --          --
  462: 
  463: 
  464: Create configure option to enable dangerous tests
  465: 
  466:                       --          --
  467: 
  468: 
  469: Create pipe program for testing
  470: 
  471:   Create pipe program that makes slow/jerky connections for
  472:   testing.  Versions of read() and write() that corrupt the
  473:   stream, or abruptly fail.
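
As an illustration of the write() side, here is a Python wrapper that delivers data in small delayed chunks and can corrupt one byte or fail part-way through.  The class and its parameters are hypothetical.

```python
import time


class JerkyWriter:
    """Wrap a binary stream; write in small delayed chunks, optionally misbehave."""

    def __init__(self, stream, chunk=7, delay=0.0, corrupt_at=None, fail_after=None):
        self.stream = stream
        self.chunk = chunk
        self.delay = delay
        self.corrupt_at = corrupt_at  # byte offset to flip, or None
        self.fail_after = fail_after  # total bytes allowed before failing, or None
        self.written = 0

    def write(self, data):
        for start in range(0, len(data), self.chunk):
            piece = bytearray(data[start:start + self.chunk])
            end = self.written + len(piece)
            if self.fail_after is not None and end > self.fail_after:
                raise BrokenPipeError("simulated abrupt failure")
            if self.corrupt_at is not None and self.written <= self.corrupt_at < end:
                piece[self.corrupt_at - self.written] ^= 0xFF  # flip one byte
            time.sleep(self.delay)
            self.stream.write(bytes(piece))
            self.written = end
        return len(data)
```

A standalone pipe program would do the same between its stdin and stdout, so it could be placed between two rsync processes.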
  474: 
  475:                       --          --
  476: 
  477: 
  478: Create test makefile target for some tests
  479: 
  480:   Separate makefile target to run rough tests -- or perhaps
  481:   just run them every time?
  482: 
  483:                       --          --
  484: 
  485: RELATED PROJECTS -----------------------------------------------------
  486: 
  487: rsyncsh
  488: 
  489:    Write a small emulation of interactive ftp as a Python program
  490:    that calls rsync.  Commands such as "cd", "ls", "ls *.c" etc map
  491:    fairly directly into rsync commands: it just needs to remember the
  492:    current host, directory and so on.  We can probably even do
  493:    completion of remote filenames.
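
A very rough Python sketch of the state such a shell has to keep.  The class and method names are invented, and the commands are only built as argument lists here, not executed.

```python
import posixpath


class RsyncShell:
    """Remember host and directory; turn ftp-like commands into rsync argv lists."""

    def __init__(self, host, directory="/"):
        self.host = host
        self.directory = directory

    def cd(self, path):
        # Resolve relative paths and ".." against the remembered directory.
        self.directory = posixpath.normpath(posixpath.join(self.directory, path))

    def ls(self, pattern=""):
        # rsync given a source and no destination lists the source.
        target = posixpath.join(self.directory, pattern)
        return ["rsync", f"{self.host}:{target}"]

    def get(self, name, local_dir="."):
        source = posixpath.join(self.directory, name)
        return ["rsync", "-t", f"{self.host}:{source}", local_dir]
```

An interactive front end could sit on top of this with Python's cmd module, and the output of ls() would be the natural source for remote filename completion.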
  494: 
  495:                       --          --
  496: 
  497: 
  498: https://rsync.samba.org/rsync-and-debian/
  499: 
  500: 
  501:                       --          --
  502: 
  503: 
  504: rsyncable gzip patch
  505: 
  506:   Exhaustive, torturous testing
  507: 
  508:   Cleanups?
  509: 
  510:                       --          --
  511: 
  512: 
  513: rsyncsplit as alternative to real integration with gzip?
  514: 
  515:                       --          --
  516: 
  517: 
  518: reverse rsync over HTTP Range
  519: 
  520:   Goswin Brederlow suggested this on Debian; I think tridge and I
  521:   talked about it previously in relation to rproxy.
  522: 
  523:   Addendum:  It looks like someone is working on a version of this:
  524: 
  525:     http://zsync.moria.org.uk/
  526: 
  527:                       --          --
  528: 
