Annotation of embedaddon/libiconv/DESIGN, revision 1.1
While some other iconv(3) implementations - like FreeBSD iconv(3) - choose
the "many small shared libraries" and dlopen(3) approach, this implementation
packs everything into a single shared library. Here is a comparison of the
two designs.

* Run-time efficiency
  1. A dlopen() based approach needs a cache of loaded shared libraries.
     Otherwise, every iconv_open() call will result in a call to dlopen()
     and thus in file-system-related system calls - which is prohibitive
     because some applications use the iconv_open/iconv/iconv_close sequence
     for every single filename, string, or piece of text.
  2. In terms of virtual memory use, both approaches are on par. Being shared
     libraries, the tables are shared between all processes that use them.
     And because of the demand loading used by Unix systems (and because
     libiconv does not have initialization functions), only those parts of
     the tables which are needed (typically only a few kilobytes) will be
     read from disk and paged into main memory.
  3. Even with a cache of loaded shared libraries, the dlopen() based approach
     makes more system calls, because it has to load one or two shared libraries
     for every encoding in use.
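
  The per-string calling pattern mentioned in point 1 can be sketched as
  follows. This is only an illustration, not code from this package: the
  helper name convert_once is hypothetical, and the sketch assumes an
  iconv() whose input argument has type char **. It shows why the cost of
  iconv_open() matters: a caller written this way pays it for every string.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any iconv implementation): converts the
   NUL-terminated string src from encoding `from` to encoding `to`, writing
   a NUL-terminated result into dst. Returns the number of bytes written
   (excluding the NUL), or (size_t)-1 on any failure. */
size_t convert_once(const char *from, const char *to,
                    const char *src, char *dst, size_t dstsize)
{
    if (dstsize == 0)
        return (size_t)-1;

    /* Note the argument order: target encoding first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;          /* assumption: iconv() takes char ** */
    size_t inleft = strlen(src);
    char *out = dst;
    size_t outleft = dstsize - 1;    /* keep room for the terminating NUL */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    if (rc != (size_t)-1)
        /* Flush any pending shift sequence of a stateful target encoding. */
        rc = iconv(cd, NULL, NULL, &out, &outleft);

    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    *out = '\0';
    return (size_t)(out - dst);
}
```

  A program that calls such a helper once per filename opens and closes a
  conversion descriptor each time, so anything iconv_open() does beyond a
  table lookup is multiplied by the number of strings converted.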

* Total size
  In the dlopen(3) approach, every shared library carries its own symbol
  table and relocation information. Altogether, FreeBSD iconv installs more
  than 200 shared libraries with a total size of 2.3 MB, whereas libiconv
  installs 0.45 MB.

* Extensibility
  The dlopen(3) approach is good for guaranteeing extensibility if the iconv
  implementation is distributed without source. (Or when, as in glibc, you
  cannot rebuild iconv without rebuilding your libc, thus possibly
  destabilizing your system.)
  The libiconv package achieves extensibility through the LGPL license:
  every user has access to the source of the package and can extend and
  replace just libiconv.so.
  The places which have to be modified when a new encoding is added are as
  follows: add an #include statement in iconv.c, add an entry in the table in
  iconv.c, and of course, update the README and the iconv_open.3 manual page.
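
  The "add an entry in the table" step can be pictured with the sketch
  below. It is not the layout used in iconv.c: the struct, the function
  pointer types, the return convention (negative means error), and the
  lookup function are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t ucs4_t;   /* assumption: holds one Unicode code point */

/* Hypothetical converter signatures; a negative return value means error. */
typedef int (*mbtowc_fn)(ucs4_t *pwc, const unsigned char *s, size_t n);
typedef int (*wctomb_fn)(unsigned char *r, ucs4_t wc, size_t n);

/* In a real source tree these two functions would sit in their own header,
   which the main file pulls in with an #include statement. */
static int ascii_mbtowc(ucs4_t *pwc, const unsigned char *s, size_t n)
{
    if (n < 1 || s[0] >= 0x80)
        return -1;
    *pwc = s[0];
    return 1;                       /* bytes consumed */
}

static int ascii_wctomb(unsigned char *r, ucs4_t wc, size_t n)
{
    if (n < 1 || wc >= 0x80)
        return -1;
    r[0] = (unsigned char)wc;
    return 1;                       /* bytes produced */
}

struct encoding {
    const char *name;
    mbtowc_fn   mbtowc;
    wctomb_fn   wctomb;
};

/* Adding an encoding means adding one row to this table. */
static const struct encoding encodings[] = {
    { "US-ASCII", ascii_mbtowc, ascii_wctomb },
};

/* Linear lookup by exact name; returns NULL for an unknown encoding. */
const struct encoding *find_encoding(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof encodings / sizeof encodings[0]; i++)
        if (strcmp(encodings[i].name, name) == 0)
            return &encodings[i];
    return NULL;
}
```

  With a table of this shape, no file system access is needed to find a
  converter: the lookup is a search in read-only data that is already part
  of the library.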

* Use within other packages
  If you want to incorporate an iconv implementation into another package
  (such as a mail user agent or web browser), the single library approach
  is easier, because:
  1. In the dlopen(3) approach you have to provide the right directory
     prefix, under which the shared libraries will be looked up at run time.
  2. Incorporating iconv as a static library into the executable is easy -
     it won't need dynamic loading. (This assumes that your package is under
     the LGPL or GPL license.)


All conversions go through Unicode. This is possible because most of the
world's characters have already been allocated in the Unicode standard.
Therefore each encoding has two functions:
  - For conversion from the encoding to Unicode, a function called xxx_mbtowc.
  - For conversion from Unicode to the encoding, a function called xxx_wctomb,
    and for stateful encodings, a function called xxx_reset which returns to
    the initial shift state.


All our functions operate on a single Unicode character at a time. This is
obviously less efficient than operating on an entire buffer of characters at
a time, but it makes the coding considerably easier and less bug-prone. Those
who want the best performance should install the Real Thing (TM): GNU libc 2.1
or newer.
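
The two design points above (Unicode as the pivot, one character at a time)
can be sketched together as follows. The signatures and return convention
are simplified assumptions, not the ones used in this package; the function
names merely follow the xxx_mbtowc / xxx_wctomb naming pattern. Both toy
encodings are stateless, so no xxx_reset function appears, and the UTF-8
encoder is deliberately limited to code points below U+0800.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ucs4_t;   /* assumption: holds one Unicode code point */

/* ISO-8859-1 to Unicode: each byte maps to the code point of equal value.
   Returns the number of bytes consumed, or -1 if no input is available. */
static int latin1_mbtowc(ucs4_t *pwc, const unsigned char *s, size_t n)
{
    if (n < 1)
        return -1;
    *pwc = s[0];
    return 1;
}

/* Unicode to UTF-8, restricted to U+0000..U+07FF to keep the sketch short.
   Returns the number of bytes produced, or -1 on error or lack of space. */
static int utf8_wctomb(unsigned char *r, ucs4_t wc, size_t n)
{
    if (wc < 0x80) {
        if (n < 1)
            return -1;
        r[0] = (unsigned char)wc;
        return 1;
    }
    if (wc < 0x800) {
        if (n < 2)
            return -1;
        r[0] = (unsigned char)(0xC0 | (wc >> 6));
        r[1] = (unsigned char)(0x80 | (wc & 0x3F));
        return 2;
    }
    return -1;              /* outside the range handled by this sketch */
}

/* Converts a buffer one character at a time: decode to Unicode, then encode.
   Returns the number of bytes written, or (size_t)-1 on failure. */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
                      unsigned char *out, size_t outlen)
{
    size_t ipos = 0, opos = 0;

    while (ipos < inlen) {
        ucs4_t wc;
        int used = latin1_mbtowc(&wc, in + ipos, inlen - ipos);
        if (used < 0)
            return (size_t)-1;
        int made = utf8_wctomb(out + opos, wc, outlen - opos);
        if (made < 0)
            return (size_t)-1;
        ipos += (size_t)used;
        opos += (size_t)made;
    }
    return opos;
}
```

Because every converter is written against Unicode only, N encodings need
2*N functions rather than one function per pair of encodings; the price is
one function call in each direction for every character.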