File:  [ELWIX - Embedded LightWeight unIX -] / elwix / tools / oldlzma / lzma.txt
Revision 1.1.1.1 (vendor branch)
Tue May 14 09:04:51 2013 UTC by misho
Branches: misho, elwix1_9_mips, MAIN
CVS tags: start, elwix2_8, elwix2_7, elwix2_6, elwix2_3, elwix2_2, HEAD, ELWIX2_7, ELWIX2_6, ELWIX2_5, ELWIX2_2p0
oldlzma needs for uboot

    1: LZMA SDK 4.17 
    2: -------------
    3: 
    4: LZMA SDK 4.17  Copyright (C) 1999-2005 Igor Pavlov
    5: 
    6: LZMA SDK provides developers with documentation, source code,
    7: and sample code necessary to write software that uses LZMA compression. 
    8: 
    9: LZMA is the default and general compression method of the 7z format
   10: in the 7-Zip compression program (www.7-zip.org). LZMA provides a high 
   11: compression ratio and very fast decompression.
   12: 
   13: LZMA is an improved version of the well-known LZ77 compression algorithm. 
   14: It was designed to maximize the compression ratio while keeping
   15: decompression speed high and memory requirements for decompression 
   16: low.
   17: 
   18: 
   19: 
   20: LICENSE
   21: -------
   22: 
   23: LZMA SDK is licensed under two licenses:
   24: 
   25: 1) GNU Lesser General Public License (GNU LGPL)
   26: 2) Common Public License (CPL)
   27: 
   28: It means that you can select one of these two licenses and 
   29: follow rules of that license.
   30: 
   31: SPECIAL EXCEPTION
   32: Igor Pavlov, as the author of this code, expressly permits you 
   33: to statically or dynamically link your code (or bind by name) 
   34: to the files from LZMA SDK without subjecting your linked 
   35: code to the terms of the CPL or GNU LGPL. 
   36: Any modifications or additions to files from LZMA SDK, however, 
   37: are subject to the GNU LGPL or CPL terms.
   38: 
   39: 
   40: GNU LGPL and CPL licenses are pretty similar and both these
   41: licenses are classified as 
   42: 
   43: 1) "Free software licenses" at http://www.gnu.org/ 
   44: 2) "OSI-approved" at http://www.opensource.org/
   45: 
   46: 
   47: LZMA SDK can also be made available under a proprietary license for 
   48: those who cannot use the GNU LGPL or CPL in their code. To request
   49: such a proprietary license or any additional consultation,
   50: send an email message from this page:
   51: http://www.7-zip.org/support.html
   52: 
   53: 
   54: You should have received a copy of the GNU Lesser General Public
   55: License along with this library; if not, write to the Free Software
   56: Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
   57: 
   58: You should have received a copy of the Common Public License
   59: along with this library.
   60: 
   61: 
   62: LZMA SDK Contents
   63: -----------------
   64: 
   65: LZMA SDK includes:
   66: 
   67:   - C++ source code of LZMA Encoder and Decoder
   68:   - C++ source code for file->file LZMA compressing and decompressing
   69:   - ANSI-C compatible source code for LZMA decompressing
   70:   - Compiled file->file LZMA compressing/decompressing program for Windows system
   71: 
   72: The ANSI-C LZMA decompression code was ported from the original C++ sources to C.
   73: It was also simplified and optimized for code size, 
   74: but it remains fully compatible with LZMA from 7-Zip.
   75: 
   76: 
   77: UNIX/Linux version 
   78: ------------------
   79: To compile C++ version of file->file LZMA, go to directory
   80: SRC/7zip/Compress/LZMA_Alone 
   81: and type "make" or "make clean all" to recompile all.
   82: 
   83: On some UNIX/Linux versions you must compile LZMA with static libraries.
   84: To compile with static libraries, change this line in the makefile
   85: LIB = -lm
   86: to
   87: LIB = -lm -static
   88: 
   89: 
   90: Files
   91: -----
   92: SRC      - directory with source code
   93: lzma.txt - LZMA SDK description (this file)
   94: 7zFormat.txt - 7z Format description
   95: 7zC.txt  - 7z ANSI-C Decoder description
   96: methods.txt  - Compression method IDs for .7z
   97: LGPL.txt - GNU Lesser General Public License
   98: CPL.html - Common Public License
   99: lzma.exe - Compiled file->file LZMA encoder/decoder for Windows
  100: history.txt - history of the LZMA SDK
  101: 
  102: 
  103: Source code structure
  104: ---------------------
  105: 
  106: SRC
  107:   Common  - common files for C++ projects
  108:   Windows - common files for Windows related code
  109:   7zip    - files related to 7-Zip Project
  110:     Common   - common files for 7-Zip
  111:     Compress - files related to compression/decompression
  112:       LZ     - files related to LZ (Lempel-Ziv) compression algorithm
  113:         BinTree    - Binary Tree Match Finder for LZ algorithm
  114:         HashChain  - Hash Chain Match Finder for LZ algorithm
  115:         Patricia   - Patricia Match Finder for LZ algorithm
  116:       RangeCoder   - Range Coder (special code of compression/decompression)
  117:       LZMA         - LZMA compression/decompression on C++
  118:       LZMA_Alone   - file->file LZMA compression/decompression
  119:       LZMA_C       - ANSI-C compatible LZMA decompressor
  120:         LzmaDecode.h  - interface for LZMA decoding on ANSI-C
  121:         LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
  122:         LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
  123:         LzmaTest.c    - test application that decodes LZMA encoded file
  124:       Branch       - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
  125:     Archive - files related to archiving
  126:       7z_C     - 7z ANSI-C Decoder
  127: 
  128: The source code of LZMA SDK is only part of the larger 7-Zip project. That is 
  129: why LZMA SDK uses such a complex source code structure. 
  130: 
  131: You can find the ANSI-C LZMA decompression code in the folder 
  132:   SRC/7zip/Compress/LZMA_C
  133: 7-Zip doesn't use that ANSI-C LZMA code; it was developed 
  134: specially for this SDK. Files from LZMA_C do not need files from 
  135: other directories of the SDK to compile.
  136: 
  137: 7-Zip source code can be downloaded from 7-Zip's SourceForge page:
  138: 
  139:   http://sourceforge.net/projects/sevenzip/
  140: 
  141: 
  142: LZMA Decompression features
  143: ---------------------------
  144:   - Variable dictionary size (up to 256 MB)
  145:   - Estimated compressing speed: about 500 KB/s on 1 GHz CPU
  146:   - Estimated decompressing speed: 
  147:       - 8-12 MB/s on 1 GHz Intel Pentium 3 or AMD Athlon
  148:       - 500-1000 KB/s on 100 MHz ARM, MIPS, PowerPC or other simple RISC
  149:   - Small memory requirements for decompressing (8-32 KB + DictionarySize)
  150:   - Small code size for decompressing: 2-8 KB (depending on 
  151:     speed optimizations) 
  152: 
  153: The LZMA decoder uses only integer operations and can be 
  154: implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).
  155: 
  156: Some critical operations that affect the speed of LZMA decompression:
  157:   1) 32*16 bit integer multiply
  158:   2) Mispredicted branches (penalty mostly depends on pipeline length)
  159:   3) 32-bit shift and arithmetic operations
  160: 
  161: The speed of LZMA decompression mostly depends on CPU speed.
  162: Memory speed matters little. But if your CPU has a small data cache, 
  163: the overall weight of memory speed will slightly increase.
  164: 
  165: 
  166: How To Use
  167: ----------
  168: 
  169: Using LZMA encoder/decoder executable
  170: --------------------------------------
  171: 
  172: Usage:  LZMA <e|d> inputFile outputFile [<switches>...]
  173: 
  174:   e: encode file
  175: 
  176:   d: decode file
  177: 
  178:   b: Benchmark. There are two tests: compressing and decompressing 
  179:      with LZMA method. Benchmark shows rating in MIPS (million 
  180:      instructions per second). Rating value is calculated from 
  181:      measured speed and it is normalized with AMD Athlon XP CPU
  182:      results. Also Benchmark checks possible hardware errors (RAM 
  183:      errors in most cases). Benchmark uses these settings:
  184:      (-a1, -d21, -fb32, -mfbt4). You can change only -d. Also you 
  185:      can change number of iterations. Example for 30 iterations:
  186: 	LZMA b 30
  187:      Default number of iterations is 10.
  188: 
  189: <Switches>
  190:   
  191: 
  192:   -a{N}:  set compression mode 0 = fast, 1 = normal, 2 = max
  193:           default: 2 (max)
  194: 
  195:   -d{N}:  set dictionary size - [0, 28], default: 23 (8MB)
  196:           The maximum value for dictionary size is 256 MB = 2^28 bytes.
  197:           Dictionary size is calculated as DictionarySize = 2^N bytes. 
  198:           To decompress a file compressed by LZMA with dictionary 
  199:           size D = 2^N you need about D bytes of memory (RAM).
  200: 
  201:   -fb{N}: set number of fast bytes - [5, 255], default: 128
  202:           Usually a bigger number gives a slightly better compression ratio 
  203:           and a slower compression process.
  204: 
  205:   -lc{N}: set number of literal context bits - [0, 8], default: 3
  206:           Sometimes lc=4 gives a gain for big files.
  207: 
  208:   -lp{N}: set number of literal pos bits - [0, 4], default: 0
  209:           The lp switch is intended for periodic data whose period is 
  210:           equal to 2^N. For example, for 32-bit (4 bytes) 
  211:           periodic data you can use lp=2. Often it's better to set lc0 
  212:           if you change the lp switch.
  213: 
  214:   -pb{N}: set number of pos bits - [0, 4], default: 2
  215:           The pb switch is intended for periodic data 
  216:           whose period is equal to 2^N.
  217: 
  218:   -mf{MF_ID}: set Match Finder. Default: bt4. 
  219:               Compression ratio for all bt* and pat* is almost the same.
  220:               Algorithms from the hc* group don't provide a good compression 
  221:               ratio, but they often work pretty fast in combination with 
  222:               fast mode (-a0). Methods from the bt* group require less memory 
  223:               than methods from the pat* group. Usually bt4 works faster than 
  224:               any pat*, but for some types of files pat* can work faster. 
  225: 
  226:               Memory requirements depend on dictionary size 
  227:               (parameter "d" in table below). 
  228: 
  229:                MF_ID     Memory                   Description
  230: 
  231:                 bt2    d*9.5 +  1MB  Binary Tree with 2 bytes hashing.
  232:                 bt3    d*9.5 + 65MB  Binary Tree with 2-3(full) bytes hashing.
  233:                 bt4    d*9.5 +  6MB  Binary Tree with 2-3-4 bytes hashing.
  234:                 bt4b   d*9.5 + 34MB  Binary Tree with 2-3-4(big) bytes hashing.
  235:                 pat2r  d*26  +  1MB  Patricia Tree with 2-bits nodes, removing.
  236:                 pat2   d*38  +  1MB  Patricia Tree with 2-bits nodes.
  237:                 pat2h  d*38  + 77MB  Patricia Tree with 2-bits nodes, 2-3 bytes hashing.
  238:                 pat3h  d*62  + 85MB  Patricia Tree with 3-bits nodes, 2-3 bytes hashing.
  239:                 pat4h  d*110 +101MB  Patricia Tree with 4-bits nodes, 2-3 bytes hashing.
  240:                 hc3    d*5.5 +  1MB  Hash Chain with 2-3 bytes hashing.
  241:                 hc4    d*5.5 +  6MB  Hash Chain with 2-3-4 bytes hashing.
  242: 
  243:   -eos:   write End Of Stream marker. By default LZMA doesn't write 
  244:           eos marker, since LZMA decoder knows uncompressed size 
  245:           stored in .lzma file header.
  246: 
  247:   -si:    Read data from stdin (it will write End Of Stream marker).
  248:   -so:    Write data to stdout
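
The memory table given for the -mf switch above can be turned into a quick estimate of encoder memory. The following is only a sketch: the names MatchFinderMem, kMatchFinders and lzma_encoder_mem are hypothetical and not part of the SDK, MB is taken as 2^20 bytes (consistent with "23 (8MB)" above), and the multipliers are stored doubled so that 9.5 and 5.5 stay integers. It uses unsigned long long, so it assumes a C99 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the match finder memory table (hypothetical helper type). */
typedef struct {
    const char *id;
    unsigned mult_x2;  /* multiplier of d, doubled so that 9.5 is stored as 19 */
    unsigned base_mb;  /* fixed part of the requirement, in MB */
} MatchFinderMem;

/* Values copied from the MF_ID table in this document. */
static const MatchFinderMem kMatchFinders[] = {
    { "bt2",    19,   1 }, { "bt3",    19,  65 }, { "bt4",    19,   6 },
    { "bt4b",   19,  34 }, { "pat2r",  52,   1 }, { "pat2",   76,   1 },
    { "pat2h",  76,  77 }, { "pat3h", 124,  85 }, { "pat4h", 220, 101 },
    { "hc3",    11,   1 }, { "hc4",    11,   6 }
};

/* Estimated encoder memory in bytes for dictionary size 2^dict_bits.
   Returns 0 if mf_id is not in the table. */
unsigned long long lzma_encoder_mem(const char *mf_id, unsigned dict_bits)
{
    size_t i;
    assert(mf_id != NULL && dict_bits <= 28);
    for (i = 0; i < sizeof(kMatchFinders) / sizeof(kMatchFinders[0]); i++) {
        if (strcmp(kMatchFinders[i].id, mf_id) == 0) {
            unsigned long long d = 1ULL << dict_bits;
            return d * kMatchFinders[i].mult_x2 / 2
                 + ((unsigned long long)kMatchFinders[i].base_mb << 20);
        }
    }
    return 0;
}
```

With the defaults (-mfbt4 -d23), lzma_encoder_mem("bt4", 23) gives 8 MB * 9.5 + 6 MB = 82 MB.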
  249: 
  250: 
  251: Examples:
  252: 
  253: 1) LZMA e file.bin file.lzma -d16 -lc0 
  254: 
  255: compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)
  256: and 0 literal context bits. -lc0 reduces the memory requirements 
  257: for decompression.
  258: 
  259: 
  260: 2) LZMA e file.bin file.lzma -lc0 -lp2
  261: 
  262: compresses file.bin to file.lzma with settings suitable 
  263: for 32-bit periodic data (for example, ARM or MIPS code).
  264: 
  265: 3) LZMA d file.lzma file.bin
  266: 
  267: decompresses file.lzma to file.bin.
  268: 
  269: 
  270: Compression ratio hints
  271: -----------------------
  272: 
  273: Recommendations
  274: ---------------
  275: 
  276: To increase the compression ratio of LZMA, it's desirable 
  277: to have aligned data (if possible) and to arrange the
  278: data so that code is grouped in one place and data is 
  279: grouped in another (this is better than mixing: code, data, code,
  280: data, ...).
  281: 
  282: 
  283: Using Filters
  284: -------------
  285: You can increase the compression ratio for some data types by using
  286: special filters before compressing. For example, it's possible to 
  287: increase the compression ratio by 5-10% for code for these CPU ISAs: 
  288: x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.
  289: 
  290: You can find C/C++ source code of such filters in folder "7zip/Compress/Branch"
  291: 
  292: You can check the compression ratio gain of these filters with the following 
  293: 7-Zip commands (example for ARM code):
  294: No filter:
  295:   7z a a1.7z a.bin -m0=lzma
  296: 
  297: With filter for little-endian ARM code:
  298:   7z a a2.7z a.bin -m0=bc_arm -m1=lzma        
  299: 
  300: With filter for big-endian ARM code (using additional Swap4 filter):
  301:   7z a a3.7z a.bin -m0=swap4 -m1=bc_arm -m2=lzma
  302: 
  303: It works as follows:
  304: Compressing    = Filter_encoding + LZMA_encoding
  305: Decompressing  = LZMA_decoding + Filter_decoding
  306: 
  307: The compressing and decompressing speed of such filters is very high,
  308: so they will not increase decompression time much.
  309: Moreover, filtering reduces decompression time for LZMA_decoding, 
  310: since the compression ratio with filtering is higher.
  311: 
  312: These filters convert CALL (calling procedure) instructions 
  313: from relative offsets to absolute addresses, so such data becomes more 
  314: compressible. The source code of these CALL filters is pretty simple
  315: (about 20 lines of C++), so you can convert it from the C++ version yourself.
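
To illustrate the relative-to-absolute conversion, here is a minimal sketch for little-endian ARM code. It is not copied from the Branch folder: the function name arm_bl_filter and its parameters are hypothetical. It assumes that a BL instruction is a 4-byte word whose top byte is 0xEB, that its low 24 bits hold a word offset relative to the instruction address plus 8, and that start_pos (the stream position of data[0]) is a multiple of 4. It uses stdint.h, so it needs a C99 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* encoding != 0: rewrite BL operands from relative offsets to absolute
   addresses (before LZMA encoding).
   encoding == 0: inverse transform (after LZMA decoding).
   Returns the number of bytes processed (a multiple of 4). */
size_t arm_bl_filter(unsigned char *data, size_t size,
                     uint32_t start_pos, int encoding)
{
    size_t i;
    assert(data != NULL || size == 0);
    assert((start_pos & 3) == 0);   /* assumption: word-aligned position */
    for (i = 0; i + 4 <= size; i += 4) {
        uint32_t src, dest, ip;
        if (data[i + 3] != 0xEB)    /* not an unconditional BL: leave as is */
            continue;
        /* 24-bit operand, stored little endian, counted in 4-byte words */
        src = ((uint32_t)data[i + 2] << 16) |
              ((uint32_t)data[i + 1] << 8) | data[i];
        src <<= 2;
        ip = start_pos + (uint32_t)i + 8;      /* ARM PC is 8 bytes ahead */
        dest = encoding ? ip + src : src - ip;
        dest >>= 2;
        data[i + 2] = (unsigned char)(dest >> 16);
        data[i + 1] = (unsigned char)(dest >> 8);
        data[i]     = (unsigned char)dest;
    }
    return i;
}
```

After encoding, two calls to the same procedure from different places carry the same operand bytes, which gives the LZ stage longer matches. Calling the function again with encoding == 0 and the same start_pos restores the original bytes.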
  316: 
  317: For some ISAs (for example, for MIPS) no gain can be obtained from such a filter.
  318: 
  319: 
  320: LZMA compressed file format
  321: ---------------------------
  322: Offset Size Description
  323:   0     1   Special LZMA properties for compressed data
  324:   1     4   Dictionary size (little endian)
  325:   5     8   Uncompressed size (little endian). -1 means unknown size
  326:  13         Compressed data
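
The 13-byte header described above can be parsed along these lines. This is a sketch under stated assumptions: the LzmaHeader type and the lzma_parse_header function are hypothetical names, not SDK API, and the code uses stdint.h (C99) for the 64-bit size field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZMA_HEADER_SIZE 13  /* 1 + 4 + 8 bytes, per the table above */

/* Hypothetical container for the parsed header fields. */
typedef struct {
    uint8_t  props;             /* offset 0: packed lc/lp/pb byte */
    uint32_t dictionary_size;   /* offset 1: 4 bytes, little endian */
    uint64_t uncompressed_size; /* offset 5: 8 bytes, little endian */
    int      size_known;        /* 0 when the size field is -1 (all bits set) */
} LzmaHeader;

/* Returns 0 on success, 1 if fewer than 13 bytes are available. */
int lzma_parse_header(const unsigned char *buf, size_t len, LzmaHeader *out)
{
    int i;
    assert(buf != NULL && out != NULL);
    if (len < LZMA_HEADER_SIZE)
        return 1;
    out->props = buf[0];
    out->dictionary_size = 0;
    for (i = 0; i < 4; i++)
        out->dictionary_size |= (uint32_t)buf[1 + i] << (i * 8);
    out->uncompressed_size = 0;
    for (i = 0; i < 8; i++)
        out->uncompressed_size |= (uint64_t)buf[5 + i] << (i * 8);
    out->size_known = (out->uncompressed_size != UINT64_MAX);
    return 0;
}
```

The compressed data then starts at buf + 13. When size_known is 0, the stream is expected to end with an End Of Stream marker (see the -eos switch).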
  327: 
  328: 
  329: ANSI-C LZMA Decoder
  330: ~~~~~~~~~~~~~~~~~~~
  331: 
  332: To use the ANSI-C LZMA Decoder you need two files:
  333: LzmaDecode.h and one of the following two files:
  334: 1) LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
  335: 2) LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
  336: Use LzmaDecode.c if you need the fastest code.
  337: 
  338: 
  339: Memory requirements for LZMA decoding
  340: -------------------------------------
  341: 
  342: The LZMA decoder doesn't allocate memory itself, so you must 
  343: calculate the required memory, allocate it and pass it to LZMA.
  344: 
  345: Stack usage of the LZMA function for local variables is not 
  346: larger than 200 bytes.
  347: 
  348: Memory requirements for decompression depend 
  349: on the interface that you want to use:
  350: 
  351:   a) Memory to memory decompression:
  352:     
  353:     M1 = (inputSize + outputSize + lzmaInternalSize).
  354: 
  355:   b) Decompression with buffering:
  356: 
  357:     M2 = (inputBufferSize + outputBufferSize + dictionarySize + lzmaInternalSize)
  358: 
  359: 
  360: How To decompress data
  361: ----------------------
  362: 
  363: 1) Read first byte of properties for LZMA compressed stream, 
  364:    check that it has correct value and calculate three 
  365:    LZMA property variables:
  366: 
  367:   int lc, lp, pb;
  368:   unsigned char prop0 = properties[0];
  369:   if (prop0 >= (9*5*5))
  370:   {
  371:     sprintf(rs + strlen(rs), "\n properties error");
  372:     return 1;
  373:   }
  374:   for (pb = 0; prop0 >= (9 * 5); 
  375:     pb++, prop0 -= (9 * 5));
  376:   for (lp = 0; prop0 >= 9; 
  377:     lp++, prop0 -= 9);
  378:   lc = prop0;
  379: 
  380: 2) Calculate the required amount of memory, lzmaInternalSize:
  381: 
  382:   lzmaInternalSize = (LZMA_BASE_SIZE + (LZMA_LIT_SIZE << (lc + lp))) * 
  383:      sizeof(CProb)
  384: 
  385:   LZMA_BASE_SIZE = 1846
  386:   LZMA_LIT_SIZE = 768
  387: 
  388:   The LZMA decoder uses an array of CProb variables as its internal structure.
  389:   By default, CProb is (unsigned short).
  390:   But you can define _LZMA_PROB32 to make it (unsigned int).
  391:   This can increase speed on some 32-bit CPUs, but memory usage will 
  392:   be doubled in that case.
  393: 
  394: 
  395:   2b) If you use Decompression with buffering, add 100 bytes to 
  396:       lzmaInternalSize:
  397:      
  398:       #ifdef _LZMA_OUT_READ
  399:       lzmaInternalSize += 100;
  400:       #endif
  401: 
  402: 3) Allocate that memory with malloc or some other function:
  403: 
  404:   lzmaInternalData = malloc(lzmaInternalSize);
  405: 
  406: 
  407: 4) Decompress data:
  408: 
  409:   4a) If you use simple memory to memory decompression:
  410: 
  411:     int result = LzmaDecode(lzmaInternalData, lzmaInternalSize,
  412:         lc, lp, pb,
  413:         inStream, inSize,     /* unsigned char *, unsigned int */
  414:         outStream, outSize,   /* unsigned char *, unsigned int */
  415:         &outSizeProcessed);
  416: 
  417:   4b) If you use Decompression with buffering
  418: 
  419:     4.1) Read dictionary size from properties
  420: 
  421:       unsigned int dictionarySize = 0;
  422:       int i;
  423:       for (i = 0; i < 4; i++)  /* header bytes 1..4, little endian */
  424:         dictionarySize += (unsigned int)(properties[1 + i]) << (i * 8);
  425: 
  426:     4.2) Allocate memory for dictionary
  427: 
  428:       unsigned char *dictionary = malloc(dictionarySize);
  429: 
  430:     4.3) Initialize LZMA decoder:
  431: 
  432:     LzmaDecoderInit((unsigned char *)lzmaInternalData, lzmaInternalSize,
  433:         lc, lp, pb,
  434:         dictionary, dictionarySize,
  435:         &bo.ReadCallback);
  436: 
  437:     4.4) Call the LzmaDecode function in a loop:
  438: 
  439:     for (nowPos = 0; nowPos < outSize;)
  440:     {
  441:       unsigned int blockSize = outSize - nowPos;
  442:       unsigned int kBlockSize = 0x10000;
  443:       if (blockSize > kBlockSize)
  444:         blockSize = kBlockSize;
  445:       res = LzmaDecode((unsigned char *)lzmaInternalData, 
  446:       ((unsigned char *)outStream) + nowPos, blockSize, &outSizeProcessed);
  447:       if (res != 0)
  448:       {
  449:         printf("\nerror = %d\n", res);
  450:         break;
  451:       }
  452:       nowPos += outSizeProcessed;
  453:       if (outSizeProcessed == 0)
  454:       {
  455:         outSize = nowPos;
  456:         break;
  457:       }
  458:     }
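
Steps 1, 2 and 2b above can be condensed into two small helpers. This is a sketch, not SDK code: the names lzma_decode_props and lzma_internal_size are hypothetical. The division and modulo operations give the same lc, lp and pb as the subtraction loops in step 1, because the properties byte is packed as (pb * 5 + lp) * 9 + lc. CProb is assumed to be the default unsigned short (no _LZMA_PROB32).

```c
#include <assert.h>
#include <stddef.h>

#define LZMA_BASE_SIZE 1846
#define LZMA_LIT_SIZE  768

/* Default CProb type from the text; _LZMA_PROB32 would make it unsigned int. */
typedef unsigned short CProb;

/* Step 1: split the properties byte into lc, lp and pb.
   Returns 0 on success, 1 if the byte is out of range. */
int lzma_decode_props(unsigned char prop0, int *lc, int *lp, int *pb)
{
    assert(lc != NULL && lp != NULL && pb != NULL);
    if (prop0 >= 9 * 5 * 5)
        return 1;
    *lc = prop0 % 9;
    *lp = (prop0 / 9) % 5;
    *pb = prop0 / (9 * 5);
    return 0;
}

/* Steps 2 and 2b: size in bytes of the decoder's internal buffer.
   Pass buffered != 0 when decompressing with buffering (_LZMA_OUT_READ). */
size_t lzma_internal_size(int lc, int lp, int buffered)
{
    size_t n;
    assert(lc >= 0 && lp >= 0 && lc + lp <= 12);
    n = ((size_t)LZMA_BASE_SIZE + ((size_t)LZMA_LIT_SIZE << (lc + lp)))
        * sizeof(CProb);
    if (buffered)
        n += 100;
    return n;
}
```

For the default switches (lc=3, lp=0, pb=2) the properties byte is 0x5D, and the internal buffer holds 1846 + 768 * 8 = 7990 CProb entries. That is 15980 bytes when CProb is a 2-byte unsigned short.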
  459: 
  460: 
  461: EXIT codes
  462: -----------
  463: 
  464: LZMA decoder can return one of the following codes:
  465: 
  466: #define LZMA_RESULT_OK 0
  467: #define LZMA_RESULT_DATA_ERROR 1
  468: #define LZMA_RESULT_NOT_ENOUGH_MEM 2
  469: 
  470: If you use a callback function for input data and it returns some 
  471: error code, the LZMA Decoder also returns that code.
  472: 
  473: 
  474: 
  475: LZMA Defines
  476: ------------
  477: 
  478: _LZMA_IN_CB    - Use callback for input data
  479: 
  480: _LZMA_OUT_READ - Use read function for output data
  481: 
  482: _LZMA_LOC_OPT  - Enable local speed optimizations inside code.
  483:                  _LZMA_LOC_OPT is only for LzmaDecodeSize.c (size-optimized version).
  484:                  _LZMA_LOC_OPT doesn't affect LzmaDecode.c (speed-optimized version)
  485: 
  486: _LZMA_PROB32   - It can increase speed on some 32-bit CPUs, 
  487:                  but memory usage will be doubled in that case
  488: 
  489: _LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler
  490:                          and long is 32-bit.
  491: 
  492: 
  493: NOTES
  494: -----
  495: 1) Please note that LzmaTest.c doesn't free allocated memory in some cases. 
  496: But in your real applications you must free memory after decompression.
  497: 
  498: 2) All numbers above were calculated for the case when int is not more than 
  499:   32-bit in your compiler. If int is 64-bit or larger in your compiler, 
  500:   LZMA will probably require more memory for some structures.
  501: 
  502: 
  503: 
  504: C++ LZMA Encoder/Decoder 
  505: ~~~~~~~~~~~~~~~~~~~~~~~~
  506: The C++ LZMA code uses COM-like interfaces. So if you want to use it, 
  507: you may want to study the basics of COM/OLE.
  508: 
  509: By default, the LZMA Encoder contains all Match Finders.
  510: But for compressing it's enough to have just one of them.
  511: So to reduce the size of the compression code you can define:
  512:   #define COMPRESS_MF_BT
  513:   #define COMPRESS_MF_BT4
  514: and it will use only bt4 match finder.
  515: 
  516: 
  517: ---
  518: 
  519: http://www.7-zip.org
  520: http://www.7-zip.org/support.html
  521: 
  522: 
