Annotation of elwix/tools/oldlzma/lzma.txt, revision 1.1

1.1     ! misho       1: LZMA SDK 4.17 
        !             2: -------------
        !             3: 
        !             4: LZMA SDK 4.17  Copyright (C) 1999-2005 Igor Pavlov
        !             5: 
        !             6: LZMA SDK provides developers with documentation, source code,
        !             7: and sample code necessary to write software that uses LZMA compression. 
        !             8: 
        !             9: LZMA is the default and general compression method of the 7z format
        !            10: in the 7-Zip compression program (www.7-zip.org). LZMA provides a high 
        !            11: compression ratio and very fast decompression.
        !            12: 
        !            13: LZMA is an improved version of the well-known LZ77 compression algorithm. 
        !            14: It was improved to maximize the compression ratio while
        !            15: keeping decompression fast and memory requirements low for 
        !            16: decompression.
        !            17: 
        !            18: 
        !            19: 
        !            20: LICENSE
        !            21: -------
        !            22: 
        !            23: LZMA SDK is licensed under two licenses:
        !            24: 
        !            25: 1) GNU Lesser General Public License (GNU LGPL)
        !            26: 2) Common Public License (CPL)
        !            27: 
        !            28: It means that you can select one of these two licenses and 
        !            29: follow rules of that license.
        !            30: 
        !            31: SPECIAL EXCEPTION
        !            32: Igor Pavlov, as the author of this code, expressly permits you 
        !            33: to statically or dynamically link your code (or bind by name) 
        !            34: to the files from LZMA SDK without subjecting your linked 
        !            35: code to the terms of the CPL or GNU LGPL. 
        !            36: Any modifications or additions to files from LZMA SDK, however, 
        !            37: are subject to the GNU LGPL or CPL terms.
        !            38: 
        !            39: 
        !            40: GNU LGPL and CPL licenses are pretty similar and both these
        !            41: licenses are classified as 
        !            42: 
        !            43: 1) "Free software licenses" at http://www.gnu.org/ 
        !            44: 2) "OSI-approved" at http://www.opensource.org/
        !            45: 
        !            46: 
        !            47: LZMA SDK can also be made available under a proprietary license for 
        !            48: those who cannot use the GNU LGPL or CPL in their code. To request
        !            49: such a proprietary license or any additional consultation,
        !            50: send an email message from this page:
        !            51: http://www.7-zip.org/support.html
        !            52: 
        !            53: 
        !            54: You should have received a copy of the GNU Lesser General Public
        !            55: License along with this library; if not, write to the Free Software
        !            56: Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
        !            57: 
        !            58: You should have received a copy of the Common Public License
        !            59: along with this library.
        !            60: 
        !            61: 
        !            62: LZMA SDK Contents
        !            63: -----------------
        !            64: 
        !            65: LZMA SDK includes:
        !            66: 
        !            67:   - C++ source code of LZMA Encoder and Decoder
        !            68:   - C++ source code for file->file LZMA compressing and decompressing
        !            69:   - ANSI-C compatible source code for LZMA decompressing
        !            70:   - Compiled file->file LZMA compressing/decompressing program for Windows system
        !            71: 
        !            72: ANSI-C LZMA decompression code was ported from original C++ sources to C.
        !            73: Also it was simplified and optimized for code size. 
        !            74: But it is fully compatible with LZMA from 7-Zip.
        !            75: 
        !            76: 
        !            77: UNIX/Linux version 
        !            78: ------------------
        !            79: To compile C++ version of file->file LZMA, go to directory
        !            80: SRC/7zip/Compress/LZMA_Alone 
        !            81: and type "make" or "make clean all" to recompile all.
        !            82: 
        !            83: In some UNIX/Linux versions you must compile LZMA with static libraries.
        !            84: To compile with static libraries, change string in makefile
        !            85: LIB = -lm
        !            86: to string  
        !            87: LIB = -lm -static
        !            88: 
        !            89: 
        !            90: Files
        !            91: -----
        !            92: SRC      - directory with source code
        !            93: lzma.txt - LZMA SDK description (this file)
        !            94: 7zFormat.txt - 7z Format description
        !            95: 7zC.txt  - 7z ANSI-C Decoder description
        !            96: methods.txt  - Compression method IDs for .7z
        !            97: LGPL.txt - GNU Lesser General Public License
        !            98: CPL.html - Common Public License
        !            99: lzma.exe - Compiled file->file LZMA encoder/decoder for Windows
        !           100: history.txt - history of the LZMA SDK
        !           101: 
        !           102: 
        !           103: Source code structure
        !           104: ---------------------
        !           105: 
        !           106: SRC
        !           107:   Common  - common files for C++ projects
        !           108:   Windows - common files for Windows related code
        !           109:   7zip    - files related to 7-Zip Project
        !           110:     Common   - common files for 7-Zip
        !           111:     Compress - files related to compression/decompression
        !           112:       LZ     - files related to LZ (Lempel-Ziv) compression algorithm
        !           113:         BinTree    - Binary Tree Match Finder for LZ algorithm
        !           114:         HashChain  - Hash Chain Match Finder for LZ algorithm
        !           115:         Patricia   - Patricia Match Finder for LZ algorithm
        !           116:       RangeCoder   - Range Coder (special code of compression/decompression)
        !           117:       LZMA         - LZMA compression/decompression on C++
        !           118:       LZMA_Alone   - file->file LZMA compression/decompression
        !           119:       LZMA_C       - ANSI-C compatible LZMA decompressor
        !           120:         LzmaDecode.h  - interface for LZMA decoding on ANSI-C
        !           121:         LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
        !           122:         LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
        !           123:         LzmaTest.c    - test application that decodes LZMA encoded file
        !           124:       Branch       - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
        !           125:     Archive - files related to archiving
        !           126:       7z_C     - 7z ANSI-C Decoder
        !           127: 
        !           128: Source code of LZMA SDK is only part of the large 7-Zip project. That is 
        !           129: why LZMA SDK uses such a complex source code structure. 
        !           130: 
        !           131: You can find the ANSI-C LZMA decompression code in folder 
        !           132:   SRC/7zip/Compress/LZMA_C
        !           133: 7-Zip doesn't use that ANSI-C LZMA code; it was developed 
        !           134: specially for this SDK. Files from LZMA_C do not need files from 
        !           135: other directories of the SDK for compiling.
        !           136: 
        !           137: 7-Zip source code can be downloaded from 7-Zip's SourceForge page:
        !           138: 
        !           139:   http://sourceforge.net/projects/sevenzip/
        !           140: 
        !           141: 
        !           142: LZMA Decompression features
        !           143: ---------------------------
        !           144:   - Variable dictionary size (up to 256 MB)
        !           145:   - Estimated compressing speed: about 500 KB/s on 1 GHz CPU
        !           146:   - Estimated decompressing speed: 
        !           147:       - 8-12 MB/s on 1 GHz Intel Pentium 3 or AMD Athlon
        !           148:       - 500-1000 KB/s on 100 MHz ARM, MIPS, PowerPC or other simple RISC
        !           149:   - Small memory requirements for decompressing (8-32 KB + DictionarySize)
        !           150:   - Small code size for decompressing: 2-8 KB (depending on 
        !           151:     speed optimizations) 
        !           152: 
        !           153: LZMA decoder uses only integer operations and can be 
        !           154: implemented on any modern 32-bit CPU (or on 16-bit CPU with some conditions).
        !           155: 
        !           156: Some critical operations that affect the speed of LZMA decompression:
        !           157:   1) 32*16 bit integer multiply
        !           158:   2) Mispredicted branches (penalty mostly depends on pipeline length)
        !           159:   3) 32-bit shift and arithmetic operations
        !           160: 
        !           161: Speed of LZMA decompression mostly depends on CPU speed.
        !           162: Memory speed matters little. But if your CPU has a small data cache, 
        !           163: the influence of memory speed will slightly increase.
        !           164: 
        !           165: 
        !           166: How To Use
        !           167: ----------
        !           168: 
        !           169: Using LZMA encoder/decoder executable
        !           170: --------------------------------------
        !           171: 
        !           172: Usage:  LZMA <e|d> inputFile outputFile [<switches>...]
        !           173: 
        !           174:   e: encode file
        !           175: 
        !           176:   d: decode file
        !           177: 
        !           178:   b: Benchmark. There are two tests: compressing and decompressing 
        !           179:      with LZMA method. Benchmark shows rating in MIPS (million 
        !           180:      instructions per second). Rating value is calculated from 
        !           181:      measured speed and it is normalized with AMD Athlon XP CPU
        !           182:      results. Also Benchmark checks possible hardware errors (RAM 
        !           183:      errors in most cases). Benchmark uses these settings:
        !           184:      (-a1, -d21, -fb32, -mfbt4). You can change only -d. Also you 
        !           185:      can change number of iterations. Example for 30 iterations:
        !           186:        LZMA b 30
        !           187:      Default number of iterations is 10.
        !           188: 
        !           189: <Switches>
        !           190:   
        !           191: 
        !           192:   -a{N}:  set compression mode 0 = fast, 1 = normal, 2 = max
        !           193:           default: 2 (max)
        !           194: 
        !           195:   -d{N}:  set dictionary size - [0, 28], default: 23 (8MB)
        !           196:           The maximum value for dictionary size is 256 MB = 2^28 bytes.
        !           197:           Dictionary size is calculated as DictionarySize = 2^N bytes. 
        !           198:           For decompressing file compressed by LZMA method with dictionary 
        !           199:           size D = 2^N you need about D bytes of memory (RAM).
        !           200: 
        !           201:   -fb{N}: set number of fast bytes - [5, 255], default: 128
        !           202:           Usually big number gives a little bit better compression ratio 
        !           203:           and slower compression process.
        !           204: 
        !           205:   -lc{N}: set number of literal context bits - [0, 8], default: 3
        !           206:           Sometimes lc=4 gives gain for big files.
        !           207: 
        !           208:   -lp{N}: set number of literal pos bits - [0, 4], default: 0
        !           209:           lp switch is intended for periodical data when the period is 
        !           210:           equal to 2^N. For example, for 32-bit (4 bytes) 
        !           211:           periodical data you can use lp=2. Often it's better to set lc0, 
        !           212:           if you change lp switch.
        !           213: 
        !           214:   -pb{N}: set number of pos bits - [0, 4], default: 2
        !           215:           pb switch is intended for periodical data 
        !           216:           when the period is equal to 2^N.
        !           217: 
        !           218:   -mf{MF_ID}: set Match Finder. Default: bt4. 
        !           219:               Compression ratio for all bt* and pat* is almost the same.
        !           220:               Algorithms from the hc* group don't provide a good compression 
        !           221:               ratio, but they often work pretty fast in combination with 
        !           222:               fast mode (-a0). Methods from bt* group require less memory 
        !           223:               than methods from pat* group. Usually bt4 works faster than 
        !           224:               any pat*, but for some types of files pat* can work faster. 
        !           225: 
        !           226:               Memory requirements depend on dictionary size 
        !           227:               (parameter "d" in table below). 
        !           228: 
        !           229:                MF_ID     Memory                   Description
        !           230: 
        !           231:                 bt2    d*9.5 +  1MB  Binary Tree with 2 bytes hashing.
        !           232:                 bt3    d*9.5 + 65MB  Binary Tree with 2-3(full) bytes hashing.
        !           233:                 bt4    d*9.5 +  6MB  Binary Tree with 2-3-4 bytes hashing.
        !           234:                 bt4b   d*9.5 + 34MB  Binary Tree with 2-3-4(big) bytes hashing.
        !           235:                 pat2r  d*26  +  1MB  Patricia Tree with 2-bits nodes, removing.
        !           236:                 pat2   d*38  +  1MB  Patricia Tree with 2-bits nodes.
        !           237:                 pat2h  d*38  + 77MB  Patricia Tree with 2-bits nodes, 2-3 bytes hashing.
        !           238:                 pat3h  d*62  + 85MB  Patricia Tree with 3-bits nodes, 2-3 bytes hashing.
        !           239:                 pat4h  d*110 +101MB  Patricia Tree with 4-bits nodes, 2-3 bytes hashing.
        !           240:                 hc3    d*5.5 +  1MB  Hash Chain with 2-3 bytes hashing.
        !           241:                 hc4    d*5.5 +  6MB  Hash Chain with 2-3-4 bytes hashing.
        !           242: 
        !           243:   -eos:   write End Of Stream marker. By default LZMA doesn't write 
        !           244:           eos marker, since LZMA decoder knows uncompressed size 
        !           245:           stored in .lzma file header.
        !           246: 
        !           247:   -si:    Read data from stdin (it will write End Of Stream marker).
        !           248:   -so:    Write data to stdout
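
The match finder table above can be turned into a quick estimate of encoder memory for a given -mf and -d combination. The following is a minimal sketch, not part of the SDK: the names mf_mem_row, mf_mem_table and estimate_encoder_memory are hypothetical, and it assumes that "MB" in the table means 2^20 bytes.

```c
#include <stddef.h>
#include <string.h>

/* One row of the match finder table: memory = d * factor + extra_mb MB. */
struct mf_mem_row {
    const char *id;     /* MF_ID as passed to -mf */
    double factor;      /* multiplier applied to the dictionary size d */
    unsigned extra_mb;  /* fixed overhead in MB (assumed to be 2^20 bytes) */
};

/* Values copied from the table in this document. */
static const struct mf_mem_row mf_mem_table[] = {
    {"bt2",    9.5,   1}, {"bt3",    9.5,  65}, {"bt4",    9.5,   6},
    {"bt4b",   9.5,  34}, {"pat2r", 26.0,   1}, {"pat2",  38.0,   1},
    {"pat2h", 38.0,  77}, {"pat3h", 62.0,  85}, {"pat4h", 110.0, 101},
    {"hc3",    5.5,   1}, {"hc4",    5.5,   6}
};

/* Returns the estimated encoder memory in bytes for dictionary size
   d = 2^dict_bits, or -1.0 if the MF_ID is unknown or dict_bits > 28. */
double estimate_encoder_memory(const char *mf_id, unsigned dict_bits)
{
    size_t i;
    double d;

    if (mf_id == NULL || dict_bits > 28)
        return -1.0;
    d = (double)(1UL << dict_bits);
    for (i = 0; i < sizeof(mf_mem_table) / sizeof(mf_mem_table[0]); i++) {
        if (strcmp(mf_mem_table[i].id, mf_id) == 0)
            return d * mf_mem_table[i].factor
                 + mf_mem_table[i].extra_mb * 1048576.0;
    }
    return -1.0;
}
```

Under these assumptions, estimate_encoder_memory("bt4", 23) gives 85983232 bytes (82 MB) for the default match finder and dictionary size.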
        !           249: 
        !           250: 
        !           251: Examples:
        !           252: 
        !           253: 1) LZMA e file.bin file.lzma -d16 -lc0 
        !           254: 
        !           255: compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)  
        !           256: and 0 literal context bits. -lc0 reduces memory requirements 
        !           257: for decompression.
        !           258: 
        !           259: 
        !           260: 2) LZMA e file.bin file.lzma -lc0 -lp2
        !           261: 
        !           262: compresses file.bin to file.lzma with settings suitable 
        !           263: for 32-bit periodical data (for example, ARM or MIPS code).
        !           264: 
        !           265: 3) LZMA d file.lzma file.bin
        !           266: 
        !           267: decompresses file.lzma to file.bin.
        !           268: 
        !           269: 
        !           270: Compression ratio hints
        !           271: -----------------------
        !           272: 
        !           273: Recommendations
        !           274: ---------------
        !           275: 
        !           276: To increase the compression ratio of LZMA, it's desirable 
        !           277: to have aligned data (if possible) and to arrange the
        !           278: data so that code is grouped in one place and data is 
        !           279: grouped in another (this is better than mixing them: code, data, code,
        !           280: data, ...).
        !           281: 
        !           282: 
        !           283: Using Filters
        !           284: -------------
        !           285: You can increase the compression ratio for some data types by using
        !           286: special filters before compressing. For example, it's possible to 
        !           287: increase the compression ratio by 5-10% for code for these CPU ISAs: 
        !           288: x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.
        !           289: 
        !           290: You can find C/C++ source code of such filters in folder "7zip/Compress/Branch"
        !           291: 
        !           292: You can check the compression ratio gain of these filters with the 
        !           293: following 7-Zip commands (example for ARM code):
        !           294: No filter:
        !           295:   7z a a1.7z a.bin -m0=lzma
        !           296: 
        !           297: With filter for little-endian ARM code:
        !           298:   7z a a2.7z a.bin -m0=bc_arm -m1=lzma        
        !           299: 
        !           300: With filter for big-endian ARM code (using additional Swap4 filter):
        !           301:   7z a a3.7z a.bin -m0=swap4 -m1=bc_arm -m2=lzma
        !           302: 
        !           303: It works as follows:
        !           304: Compressing    = Filter_encoding + LZMA_encoding
        !           305: Decompressing  = LZMA_decoding + Filter_decoding
        !           306: 
        !           307: Compressing and decompressing speed of such filters is very high,
        !           308: so it will not increase decompressing time too much.
        !           309: Moreover, it reduces decompression time for LZMA_decoding, 
        !           310: since compression ratio with filtering is higher.
        !           311: 
        !           312: These filters convert CALL (calling procedure) instructions 
        !           313: from relative offsets to absolute addresses, so such data becomes more 
        !           314: compressible. Source code of these CALL filters is pretty simple
        !           315: (about 20 lines of C++), so you can convert it from C++ version yourself.
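
The relative-to-absolute conversion described above can be illustrated for x86, where a CALL is the opcode byte 0xE8 followed by a 32-bit little-endian offset relative to the next instruction. This is a simplified sketch under stated assumptions, not the SDK's filter: the function name call_filter_sketch is hypothetical, every 0xE8 byte is treated as a CALL, and the buffer is assumed to start at address 0. The real filters in 7zip/Compress/Branch may apply additional checks.

```c
#include <stddef.h>

/* Read a 32-bit little-endian value. */
static unsigned long read_le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Write a 32-bit little-endian value. */
static void write_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* encoding != 0: relative offset -> absolute address (before LZMA encoding).
   encoding == 0: absolute address -> relative offset (after LZMA decoding).
   Both directions scan the same opcode positions, so decoding undoes encoding. */
void call_filter_sketch(unsigned char *buf, size_t size, int encoding)
{
    size_t i = 0;

    while (i + 5 <= size) {
        unsigned long value, next;

        if (buf[i] != 0xE8) {
            i++;
            continue;
        }
        value = read_le32(buf + i + 1);
        next = (unsigned long)(i + 5);  /* address of the next instruction */
        value = encoding ? value + next : value - next;
        write_le32(buf + i + 1, value & 0xFFFFFFFFUL);
        i += 5;                         /* skip opcode and its operand */
    }
}
```

Two calls to the same procedure from different places have different relative offsets but the same absolute address, so after filtering they become identical 5-byte sequences that LZMA can match.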
        !           316: 
        !           317: For some ISAs (for example, for MIPS) it's impossible to get gain from such filter.
        !           318: 
        !           319: 
        !           320: LZMA compressed file format
        !           321: ---------------------------
        !           322: Offset Size Description
        !           323:   0     1   Special LZMA properties for compressed data
        !           324:   1     4   Dictionary size (little endian)
        !           325:   5     8   Uncompressed size (little endian). -1 means unknown size
        !           326:  13         Compressed data
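
The header layout above can be read with a few lines of C. The following is a minimal sketch, not SDK code: struct lzma_header and parse_lzma_header are hypothetical names. The 8-byte uncompressed size is kept as two 32-bit halves, an assumption made to avoid requiring a 64-bit integer type.

```c
#include <stddef.h>

#define LZMA_HEADER_SIZE 13  /* 1 + 4 + 8 bytes, per the table above */

struct lzma_header {
    unsigned char props;      /* offset 0: LZMA properties byte */
    unsigned long dict_size;  /* offset 1: dictionary size */
    unsigned long size_low;   /* offset 5: low 32 bits of uncompressed size */
    unsigned long size_high;  /* offset 9: high 32 bits of uncompressed size */
    int size_known;           /* 0 if the size field is -1 (unknown) */
};

/* Read a 32-bit little-endian value. */
static unsigned long le32_at(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Returns 0 on success, 1 if fewer than 13 bytes are available. */
int parse_lzma_header(const unsigned char *buf, size_t len,
                      struct lzma_header *h)
{
    if (buf == NULL || h == NULL || len < LZMA_HEADER_SIZE)
        return 1;
    h->props = buf[0];
    h->dict_size = le32_at(buf + 1);
    h->size_low = le32_at(buf + 5);
    h->size_high = le32_at(buf + 9);
    /* -1 as a 64-bit value means all eight size bytes are 0xFF. */
    h->size_known = !(h->size_low == 0xFFFFFFFFUL &&
                      h->size_high == 0xFFFFFFFFUL);
    return 0;
}
```

Compressed data starts at buf + LZMA_HEADER_SIZE. When size_known is 0, the decoder has to rely on the End Of Stream marker (see the -eos switch).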
        !           327: 
        !           328: 
        !           329: ANSI-C LZMA Decoder
        !           330: ~~~~~~~~~~~~~~~~~~~
        !           331: 
        !           332: To use ANSI-C LZMA Decoder you need two files:
        !           333: LzmaDecode.h and one of the following two files:
        !           334: 1) LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
        !           335: 2) LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
        !           336: Use LzmaDecode.c if you need the fastest code.
        !           337: 
        !           338: 
        !           339: Memory requirements for LZMA decoding
        !           340: -------------------------------------
        !           341: 
        !           342: LZMA decoder doesn't allocate memory itself, so you must 
        !           343: calculate required memory, allocate it and send it to LZMA.
        !           344: 
        !           345: Stack usage of LZMA function for local variables is not 
        !           346: larger than 200 bytes.
        !           347: 
        !           348: Memory requirements for decompression depend 
        !           349: on the interface that you want to use:
        !           350: 
        !           351:   a) Memory to memory decompression:
        !           352:     
        !           353:     M1 = (inputSize + outputSize + lzmaInternalSize).
        !           354: 
        !           355:   b) Decompression with buffering:
        !           356: 
        !           357:     M2 = (inputBufferSize + outputBufferSize + dictionarySize + lzmaInternalSize)
        !           358: 
        !           359: 
        !           360: How To decompress data
        !           361: ----------------------
        !           362: 
        !           363: 1) Read first byte of properties for LZMA compressed stream, 
        !           364:    check that it has correct value and calculate three 
        !           365:    LZMA property variables:
        !           366: 
        !           367:   int lc, lp, pb;
        !           368:   unsigned char prop0 = properties[0];
        !           369:   if (prop0 >= (9*5*5))
        !           370:   {
        !           371:     sprintf(rs + strlen(rs), "\n properties error");
        !           372:     return 1;
        !           373:   }
        !           374:   for (pb = 0; prop0 >= (9 * 5); 
        !           375:     pb++, prop0 -= (9 * 5));
        !           376:   for (lp = 0; prop0 >= 9; 
        !           377:     lp++, prop0 -= 9);
        !           378:   lc = prop0;
        !           379: 
        !           380: 2) Calculate the required size of the LZMA internal buffer, lzmaInternalSize:
        !           381: 
        !           382:   lzmaInternalSize = (LZMA_BASE_SIZE + (LZMA_LIT_SIZE << (lc + lp))) * 
        !           383:      sizeof(CProb)
        !           384: 
        !           385:   LZMA_BASE_SIZE = 1846
        !           386:   LZMA_LIT_SIZE = 768
        !           387: 
        !           388:   LZMA decoder uses array of CProb variables as internal structure.
        !           389:   By default, CProb is (unsigned short)
        !           390:   But you can define _LZMA_PROB32 to make it (unsigned int)
        !           391:   It can increase speed on some 32-bit CPUs, but memory usage will 
        !           392:   be doubled in that case.
        !           393: 
        !           394: 
        !           395:   2b) If you use Decompression with buffering, add 100 bytes to 
        !           396:       lzmaInternalSize:
        !           397:      
        !           398:       #ifdef _LZMA_OUT_READ
        !           399:       lzmaInternalSize += 100;
        !           400:       #endif
        !           401: 
        !           402: 3) Allocate that memory with malloc or some other function:
        !           403: 
        !           404:   lzmaInternalData = malloc(lzmaInternalSize);
        !           405: 
        !           406: 
        !           407: 4) Decompress data:
        !           408: 
        !           409:   4a) If you use simple memory to memory decompression:
        !           410: 
        !           411:     int result = LzmaDecode(lzmaInternalData, lzmaInternalSize,
        !           412:         lc, lp, pb,
        !           413:         inStream, inSize,    /* unsigned char *, unsigned int */
        !           414:         outStream, outSize,  /* unsigned char *, unsigned int */
        !           415:         &outSizeProcessed);
        !           416: 
        !           417:   4b) If you use Decompression with buffering
        !           418: 
        !           419:     4.1) Read dictionary size from properties
        !           420: 
        !           421:       unsigned int dictionarySize = 0;
        !           422:       int i;
        !           423:       for (i = 0; i < 4; i++)
        !           424:         dictionarySize += (unsigned int)(properties[1 + i]) << (i * 8);
        !           425: 
        !           426:     4.2) Allocate memory for dictionary
        !           427: 
        !           428:       unsigned char *dictionary = malloc(dictionarySize);
        !           429: 
        !           430:     4.3) Initialize LZMA decoder:
        !           431: 
        !           432:     LzmaDecoderInit((unsigned char *)lzmaInternalData, lzmaInternalSize,
        !           433:         lc, lp, pb,
        !           434:         dictionary, dictionarySize,
        !           435:         &bo.ReadCallback);
        !           436: 
        !           437:     4.4) In loop call LzmaDecode function:
        !           438: 
        !           439:     for (nowPos = 0; nowPos < outSize;)
        !           440:     {
        !           441:       unsigned int blockSize = outSize - nowPos;
        !           442:       unsigned int kBlockSize = 0x10000;
        !           443:       if (blockSize > kBlockSize)
        !           444:         blockSize = kBlockSize;
        !           445:       res = LzmaDecode((unsigned char *)lzmaInternalData, 
        !           446:       ((unsigned char *)outStream) + nowPos, blockSize, &outSizeProcessed);
        !           447:       if (res != 0)
        !           448:       {
        !           449:         printf("\nerror = %d\n", res);
        !           450:         break;
        !           451:       }
        !           452:       nowPos += outSizeProcessed;
        !           453:       if (outSizeProcessed == 0)
        !           454:       {
        !           455:         outSize = nowPos;
        !           456:         break;
        !           457:       }
        !           458:     }
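
Steps 1 and 2 above can be wrapped into two small helpers. This is a minimal sketch rather than SDK code: lzma_decode_props and lzma_internal_size are hypothetical names, the division and remainder form is equivalent to the subtraction loops in step 1, and prob_size stands for sizeof(CProb) (2 by default, 4 with _LZMA_PROB32 on a compiler with 32-bit unsigned int).

```c
#include <stddef.h>

#define LZMA_BASE_SIZE 1846
#define LZMA_LIT_SIZE 768

/* Split the properties byte into lc, lp and pb.
   The byte is laid out as (pb * 5 + lp) * 9 + lc.
   Returns 0 on success, 1 if prop0 is out of range. */
int lzma_decode_props(unsigned char prop0, int *lc, int *lp, int *pb)
{
    if (prop0 >= 9 * 5 * 5)
        return 1;
    *lc = prop0 % 9;
    *lp = (prop0 / 9) % 5;
    *pb = prop0 / (9 * 5);
    return 0;
}

/* Compute lzmaInternalSize in bytes.
   prob_size: sizeof(CProb).
   out_read:  non-zero when decompression with buffering (_LZMA_OUT_READ)
              is used, which adds 100 bytes as described in step 2b. */
unsigned long lzma_internal_size(int lc, int lp, size_t prob_size,
                                 int out_read)
{
    unsigned long probs =
        LZMA_BASE_SIZE + ((unsigned long)LZMA_LIT_SIZE << (lc + lp));
    unsigned long size = probs * (unsigned long)prob_size;

    if (out_read)
        size += 100;
    return size;
}
```

For the default settings (lc=3, lp=0) with 2-byte CProb, this gives (1846 + 768 * 8) * 2 = 15980 bytes, which falls inside the 8-32 KB range quoted in the features list.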
        !           459: 
        !           460: 
        !           461: EXIT codes
        !           462: -----------
        !           463: 
        !           464: LZMA decoder can return one of the following codes:
        !           465: 
        !           466: #define LZMA_RESULT_OK 0
        !           467: #define LZMA_RESULT_DATA_ERROR 1
        !           468: #define LZMA_RESULT_NOT_ENOUGH_MEM 2
        !           469: 
        !           470: If you use callback function for input data and you return some 
        !           471: error code, LZMA Decoder also returns that code.
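
When reporting decoder results, it can help to map the codes above to text. The following is a small sketch with a hypothetical helper name, lzma_result_message. Any value other than the three listed codes is assumed to have come from the input callback, as described above.

```c
#define LZMA_RESULT_OK 0
#define LZMA_RESULT_DATA_ERROR 1
#define LZMA_RESULT_NOT_ENOUGH_MEM 2

/* Map a decoder result code to a short description. */
const char *lzma_result_message(int res)
{
    switch (res) {
    case LZMA_RESULT_OK:
        return "OK";
    case LZMA_RESULT_DATA_ERROR:
        return "data error";
    case LZMA_RESULT_NOT_ENOUGH_MEM:
        return "not enough memory";
    default:
        /* Assumption: unknown codes were passed through from the
           input callback. */
        return "error returned by input callback";
    }
}
```

If your callback returns its own error codes, choose values that do not collide with the three decoder codes so the two sources can be told apart.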
        !           472: 
        !           473: 
        !           474: 
        !           475: LZMA Defines
        !           476: ------------
        !           477: 
        !           478: _LZMA_IN_CB    - Use callback for input data
        !           479: 
        !           480: _LZMA_OUT_READ - Use read function for output data
        !           481: 
        !           482: _LZMA_LOC_OPT  - Enable local speed optimizations inside code.
        !           483:                  _LZMA_LOC_OPT is only for LzmaDecodeSize.c (size-optimized version).
        !           484:                  _LZMA_LOC_OPT doesn't affect LzmaDecode.c (speed-optimized version)
        !           485: 
        !           486: _LZMA_PROB32   - It can increase speed on some 32-bit CPUs, 
        !           487:                  but memory usage will be doubled in that case
        !           488: 
        !           489: _LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler
        !           490:                          and long is 32-bit.
        !           491: 
        !           492: 
        !           493: NOTES
        !           494: -----
        !           495: 1) Please note that LzmaTest.c doesn't free allocated memory in some cases. 
        !           496: But in your real applications you must free memory after decompression.
        !           497: 
        !           498: 2) All numbers above were calculated for the case when int is not more than 
        !           499:   32-bit in your compiler. If int is 64-bit or larger in your compiler, 
        !           500:   LZMA may require more memory for some structures.
        !           501: 
        !           502: 
        !           503: 
        !           504: C++ LZMA Encoder/Decoder 
        !           505: ~~~~~~~~~~~~~~~~~~~~~~~~
        !           506: C++ LZMA code uses COM-like interfaces. So if you want to use it, 
        !           507: you should study the basics of COM/OLE.
        !           508: 
        !           509: By default, LZMA Encoder contains all Match Finders.
        !           510: But for compressing it's enough to have just one of them.
        !           511: So for reducing size of compressing code you can define:
        !           512:   #define COMPRESS_MF_BT
        !           513:   #define COMPRESS_MF_BT4
        !           514: and it will use only bt4 match finder.
        !           515: 
        !           516: 
        !           517: ---
        !           518: 
        !           519: http://www.7-zip.org
        !           520: http://www.7-zip.org/support.html
        !           521: 
        !           522: 
