File: [ELWIX - Embedded LightWeight unIX -] / elwix / tools / oldlzma / lzma.txt
Revision 1.1.1.1 (vendor branch)
Tue May 14 09:04:51 2013 UTC (11 years, 1 month ago) by misho
Branches: misho, elwix1_9_mips, MAIN
CVS tags: start, elwix2_8, elwix2_7, elwix2_6, elwix2_3, elwix2_2, HEAD, ELWIX2_7, ELWIX2_6, ELWIX2_5, ELWIX2_2p0

oldlzma needs for uboot

LZMA SDK 4.17
-------------

LZMA SDK 4.17  Copyright (C) 1999-2005 Igor Pavlov

LZMA SDK provides developers with documentation, source code,
and sample code necessary to write software that uses LZMA compression.

LZMA is the default and general compression method of the 7z format
in the 7-Zip compression program (www.7-zip.org). LZMA provides a high
compression ratio and very fast decompression.

LZMA is an improved version of the famous LZ77 compression algorithm.
It was improved to maximize the compression ratio while
keeping high decompression speed and low memory requirements for
decompressing.


LICENSE
-------

LZMA SDK is licensed under two licenses:

1) GNU Lesser General Public License (GNU LGPL)
2) Common Public License (CPL)

It means that you can select one of these two licenses and
follow rules of that license.

SPECIAL EXCEPTION
Igor Pavlov, as the author of this code, expressly permits you
to statically or dynamically link your code (or bind by name)
to the files from LZMA SDK without subjecting your linked
code to the terms of the CPL or GNU LGPL.
Any modifications or additions to files from LZMA SDK, however,
are subject to the GNU LGPL or CPL terms.


GNU LGPL and CPL licenses are pretty similar and both these
licenses are classified as

1) "Free software licenses" at http://www.gnu.org/
2) "OSI-approved" at http://www.opensource.org/


LZMA SDK also can be available under a proprietary license for
those who cannot use the GNU LGPL or CPL in their code.
To request such proprietary license or any additional consultations,
send email message from that page:
http://www.7-zip.org/support.html


You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

You should have received a copy of the Common Public License
along with this library.


LZMA SDK Contents
-----------------

LZMA SDK includes:

  - C++ source code of LZMA Encoder and Decoder
  - C++ source code for file->file LZMA compressing and decompressing
  - ANSI-C compatible source code for LZMA decompressing
  - Compiled file->file LZMA compressing/decompressing program for Windows system

ANSI-C LZMA decompression code was ported from original C++ sources to C.
Also it was simplified and optimized for code size.
But it is fully compatible with LZMA from 7-Zip.


UNIX/Linux version
------------------
To compile C++ version of file->file LZMA, go to directory
SRC/7zip/Compress/LZMA_Alone
and type "make" or "make clean all" to recompile all.

In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, change the string in the makefile
  LIB = -lm
to
  LIB = -lm -static


Files
-----
SRC          - directory with source code
lzma.txt     - LZMA SDK description (this file)
7zFormat.txt - 7z Format description
7zC.txt      - 7z ANSI-C Decoder description
methods.txt  - Compression method IDs for .7z
LGPL.txt     - GNU Lesser General Public License
CPL.html     - Common Public License
lzma.exe     - Compiled file->file LZMA encoder/decoder for Windows
history.txt  - history of the LZMA SDK


Source code structure
---------------------

SRC
  Common  - common files for C++ projects
  Windows - common files for Windows related code
  7zip    - files related to 7-Zip Project
    Common   - common files for 7-Zip
    Compress - files related to compression/decompression
      LZ     - files related to LZ (Lempel-Ziv) compression algorithm
        BinTree    - Binary Tree Match Finder for LZ algorithm
        HashChain  - Hash Chain Match Finder for LZ algorithm
        Patricia   - Patricia Match Finder for LZ algorithm
      RangeCoder   - Range Coder (special code of compression/decompression)
      LZMA         - LZMA compression/decompression on C++
      LZMA_Alone   - file->file LZMA compression/decompression
      LZMA_C       - ANSI-C compatible LZMA decompressor
        LzmaDecode.h      - interface for LZMA decoding on ANSI-C
        LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
        LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
        LzmaTest.c        - test application that decodes LZMA encoded file
      Branch       - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
    Archive - files related to archiving
      7z_C     - 7z ANSI-C Decoder

Source code of LZMA SDK is only part of the big 7-Zip project. That is
why LZMA SDK uses such a complex source code structure.

You can find ANSI-C LZMA decompressing code in folder
  SRC/7zip/Compress/LZMA_C
7-Zip doesn't use that ANSI-C LZMA code; it was developed
specially for this SDK. Files from LZMA_C do not need files from
other directories of the SDK for compiling.

7-Zip source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/


LZMA Decompression features
---------------------------
  - Variable dictionary size (up to 256 MB)
  - Estimated compressing speed: about 500 KB/s on 1 GHz CPU
  - Estimated decompressing speed:
      - 8-12 MB/s on 1 GHz Intel Pentium 3 or AMD Athlon
      - 500-1000 KB/s on 100 MHz ARM, MIPS, PowerPC or other simple RISC
  - Small memory requirements for decompressing (8-32 KB + DictionarySize)
  - Small code size for decompressing: 2-8 KB (depending on
    speed optimizations)

LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

Speed of LZMA decompression mostly depends on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
--------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with LZMA method. Benchmark shows rating in MIPS (million
     instructions per second).
     Rating value is calculated from
     measured speed and it is normalized with AMD Athlon XP CPU
     results. Also Benchmark checks possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only -d. Also you
     can change number of iterations. Example for 30 iterations:
       LZMA b 30
     Default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal, 2 = max
          default: 2 (max)

  -d{N}:  set dictionary size - [0, 28], default: 23 (8MB)
          The maximum value for dictionary size is 256 MB = 2^28 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          For decompressing a file compressed by LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 255], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          lp switch is intended for periodical data when period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0,
          if you change lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          pb switch is intended for periodical data
          when period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Compression ratio for all bt* and pat* is almost the same.
              Algorithms from hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0). Methods from bt* group require less memory
              than methods from pat* group.
              Usually bt4 works faster than
              any pat*, but for some types of files pat* can work faster.

              Memory requirements depend on dictionary size
              (parameter "d" in table below).

               MF_ID     Memory                   Description

                bt2    d*9.5  +   1MB  Binary Tree with 2 bytes hashing.
                bt3    d*9.5  +  65MB  Binary Tree with 2-3(full) bytes hashing.
                bt4    d*9.5  +   6MB  Binary Tree with 2-3-4 bytes hashing.
                bt4b   d*9.5  +  34MB  Binary Tree with 2-3-4(big) bytes hashing.
                pat2r  d*26   +   1MB  Patricia Tree with 2-bits nodes, removing.
                pat2   d*38   +   1MB  Patricia Tree with 2-bits nodes.
                pat2h  d*38   +  77MB  Patricia Tree with 2-bits nodes, 2-3 bytes hashing.
                pat3h  d*62   +  85MB  Patricia Tree with 3-bits nodes, 2-3 bytes hashing.
                pat4h  d*110  + 101MB  Patricia Tree with 4-bits nodes, 2-3 bytes hashing.
                hc3    d*5.5  +   1MB  Hash Chain with 2-3 bytes hashing.
                hc4    d*5.5  +   6MB  Hash Chain with 2-3-4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          eos marker, since LZMA decoder knows uncompressed size
          stored in .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
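
The match finder memory table above can be turned into a quick estimate.
The following is only an illustrative sketch, not part of the SDK: the
function name lzma_encoder_memory and the table layout are hypothetical,
and "MB" is assumed to mean 2^20 bytes.

```c
#include <assert.h>
#include <string.h>

/* One row of the match finder table: memory = d * factor + fixed_mb MB. */
struct mf_cost
{
    const char *id;
    double factor;
    unsigned fixed_mb;
};

/* Values copied from the MF_ID table in the text. */
static const struct mf_cost mf_table[] = {
    {"bt2", 9.5, 1},   {"bt3", 9.5, 65},  {"bt4", 9.5, 6},
    {"bt4b", 9.5, 34}, {"pat2r", 26, 1},  {"pat2", 38, 1},
    {"pat2h", 38, 77}, {"pat3h", 62, 85}, {"pat4h", 110, 101},
    {"hc3", 5.5, 1},   {"hc4", 5.5, 6}
};

/* Estimated encoder memory in bytes for match finder mf_id and
   dictionary size d = 2^dict_bits (the -d{N} switch).
   Returns -1.0 for an unknown MF_ID or dict_bits outside [0, 28].
   Assumption: 1 MB = 1048576 bytes. */
double lzma_encoder_memory(const char *mf_id, int dict_bits)
{
    size_t i;
    assert(mf_id != NULL);
    if (dict_bits < 0 || dict_bits > 28)
        return -1.0;
    for (i = 0; i < sizeof(mf_table) / sizeof(mf_table[0]); i++)
    {
        if (strcmp(mf_table[i].id, mf_id) == 0)
            return (double)(1UL << dict_bits) * mf_table[i].factor
                 + (double)mf_table[i].fixed_mb * 1048576.0;
    }
    return -1.0;
}
```

Under these assumptions, the defaults (-mfbt4 -d23) give
8 MB * 9.5 + 6 MB = 82 MB for the encoder, while decoding the same
file needs only about the 8 MB dictionary.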


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compressing it's desirable
to have aligned data (if possible) and also to order the data
so that code is grouped in one place and data is
grouped in another place (it's better than such mixing: code, data, code,
data, ...).


Using Filters
-------------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find C/C++ source code of such filters in folder "7zip/Compress/Branch"

You can check the compression ratio gain of these filters with these
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=bc_arm -m1=lzma

With filter for big-endian ARM code (using additional Swap4 filter):
  7z a a3.7z a.bin -m0=swap4 -m1=bc_arm -m2=lzma

It works in this manner:
Compressing   = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

Compressing and decompressing speed of such filters is very high,
so it will not increase decompressing time too much.
Moreover, it reduces decompression time for LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible. Source code of these CALL filters is pretty simple
(about 20 lines of C++), so you can convert it from the C++ version yourself.

For some ISAs (for example, for MIPS) it's impossible to get gain from such a filter.
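
To illustrate the relative-to-absolute CALL conversion described above,
here is a deliberately simplified sketch for x86 (opcode 0xE8 followed by
a 32-bit little-endian relative offset). It is not the code from
7zip/Compress/Branch; the function name call_filter_x86 is hypothetical,
and the SDK's real filters may apply additional checks before converting
an operand.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified x86 CALL filter (illustrative only).
   encoding != 0: relative offset  -> absolute address (before LZMA_encoding)
   encoding == 0: absolute address -> relative offset  (after LZMA_decoding)
   The opcode byte itself is never changed, so both directions visit
   exactly the same positions and the transform is reversible. */
void call_filter_x86(unsigned char *buf, size_t size, int encoding)
{
    size_t i = 0;
    assert(buf != NULL || size == 0);
    while (i + 5 <= size)
    {
        unsigned long value, next;
        if (buf[i] != 0xE8)
        {
            i++;
            continue;
        }
        /* read the 32-bit little-endian operand */
        value = (unsigned long)buf[i + 1]
              | ((unsigned long)buf[i + 2] << 8)
              | ((unsigned long)buf[i + 3] << 16)
              | ((unsigned long)buf[i + 4] << 24);
        /* offset of the instruction that follows the CALL */
        next = (unsigned long)(i + 5);
        value = encoding ? value + next : value - next;
        value &= 0xFFFFFFFFUL;   /* keep 32-bit wrap-around arithmetic */
        buf[i + 1] = (unsigned char)(value & 0xFF);
        buf[i + 2] = (unsigned char)((value >> 8) & 0xFF);
        buf[i + 3] = (unsigned char)((value >> 16) & 0xFF);
        buf[i + 4] = (unsigned char)((value >> 24) & 0xFF);
        i += 5;
    }
}
```

After encoding, two CALLs to the same procedure from different places
carry identical operand bytes, which gives the LZ match finder more
repeated data to work with.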


LZMA compressed file format
---------------------------
Offset Size Description
  0     1   Special LZMA properties for compressed data
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data


ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~

To use ANSI-C LZMA Decoder you need two files:
LzmaDecode.h and one of the following two files:
1) LzmaDecode.c      - LZMA decoding on ANSI-C (new fastest version)
2) LzmaDecodeSize.c  - LZMA decoding on ANSI-C (old size-optimized version)
Use LzmaDecode.c if you need the fastest code.


Memory requirements for LZMA decoding
-------------------------------------

LZMA decoder doesn't allocate memory itself, so you must
calculate the required memory, allocate it and pass it to LZMA.

Stack usage of LZMA function for local variables is not
larger than 200 bytes.

Memory requirements for decompression depend
on the interface that you want to use:

  a) Memory to memory decompression:

    M1 = (inputSize + outputSize + lzmaInternalSize).

  b) Decompression with buffering:

    M2 = (inputBufferSize + outputBufferSize + dictionarySize + lzmaInternalSize)


How To decompress data
----------------------

1) Read first byte of properties for LZMA compressed stream,
   check that it has correct value and calculate three
   LZMA property variables:

  int lc, lp, pb;
  unsigned char prop0 = properties[0];
  if (prop0 >= (9*5*5))
  {
    sprintf(rs + strlen(rs), "\n properties error");
    return 1;
  }
  for (pb = 0; prop0 >= (9 * 5);
    pb++, prop0 -= (9 * 5));
  for (lp = 0; prop0 >= 9;
    lp++, prop0 -= 9);
  lc = prop0;

2) Calculate required amount for LZMA lzmaInternalSize:

  lzmaInternalSize = (LZMA_BASE_SIZE + (LZMA_LIT_SIZE << (lc + lp))) *
     sizeof(CProb)

  LZMA_BASE_SIZE = 1846
  LZMA_LIT_SIZE = 768

  LZMA decoder uses array of CProb variables as internal structure.
  By default, CProb is (unsigned short).
  But you can define _LZMA_PROB32 to make it (unsigned int).
  It can increase speed on some 32-bit CPUs, but memory usage will
  be doubled in that case.
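
Steps 1 and 2, together with the header layout from "LZMA compressed
file format", can be sketched as one standalone helper. This is an
illustration only and does not include LzmaDecode.h: the struct
lzma_header_info and the function lzma_parse_header are hypothetical
names, the constants are copied from the text, and CProb is assumed to
be the default (unsigned short).

```c
#include <assert.h>
#include <stddef.h>

#define LZMA_BASE_SIZE 1846
#define LZMA_LIT_SIZE 768

typedef unsigned short CProb;   /* default type according to the text */

struct lzma_header_info
{
    int lc, lp, pb;
    unsigned long dictionarySize;     /* header bytes 1..4, little endian */
    unsigned long lzmaInternalSize;   /* bytes to allocate for the decoder */
};

/* header points to at least the first 5 bytes of a .lzma file.
   Returns 0 on success, 1 if the properties byte is out of range. */
int lzma_parse_header(const unsigned char *header,
                      struct lzma_header_info *info)
{
    unsigned char prop0;
    int i;
    assert(header != NULL && info != NULL);

    prop0 = header[0];
    if (prop0 >= (9 * 5 * 5))
        return 1;
    /* prop0 = (pb * 5 + lp) * 9 + lc */
    for (info->pb = 0; prop0 >= (9 * 5); info->pb++, prop0 -= (9 * 5))
        ;
    for (info->lp = 0; prop0 >= 9; info->lp++, prop0 -= 9)
        ;
    info->lc = prop0;

    info->dictionarySize = 0;
    for (i = 0; i < 4; i++)
        info->dictionarySize += (unsigned long)header[1 + i] << (i * 8);

    info->lzmaInternalSize =
        (LZMA_BASE_SIZE +
         ((unsigned long)LZMA_LIT_SIZE << (info->lc + info->lp)))
        * sizeof(CProb);
    return 0;
}
```

For example, a properties byte of 0x5D decodes to lc=3, lp=0, pb=2
(the encoder defaults), which gives (1846 + 768*8) = 7990 CProb
entries of internal data.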


2b) If you use Decompression with buffering, add 100 bytes to
    lzmaInternalSize:

  #ifdef _LZMA_OUT_READ
  lzmaInternalSize += 100;
  #endif

3) Allocate that memory with malloc or some other function:

  lzmaInternalData = malloc(lzmaInternalSize);


4) Decompress data:

  4a) If you use simple memory to memory decompression:

    /* inStream, outStream: unsigned char *; inSize, outSize: unsigned int */
    int result = LzmaDecode(lzmaInternalData, lzmaInternalSize,
        lc, lp, pb,
        inStream, inSize,
        outStream, outSize,
        &outSizeProcessed);

  4b) If you use Decompression with buffering

    4.1) Read dictionary size from properties

      /* assumes properties[] holds the first 5 bytes of the header;
         the dictionary size is stored at offset 1, little endian */
      unsigned int dictionarySize = 0;
      int i;
      for (i = 0; i < 4; i++)
        dictionarySize += (unsigned int)(properties[1 + i]) << (i * 8);

    4.2) Allocate memory for dictionary

      unsigned char *dictionary = malloc(dictionarySize);

    4.3) Initialize LZMA decoder:

    LzmaDecoderInit((unsigned char *)lzmaInternalData, lzmaInternalSize,
        lc, lp, pb,
        dictionary, dictionarySize,
        &bo.ReadCallback);

    4.4) In loop call LzmaDecode function:

    for (nowPos = 0; nowPos < outSize;)
    {
      unsigned int blockSize = outSize - nowPos;
      unsigned int kBlockSize = 0x10000;
      if (blockSize > kBlockSize)
        blockSize = kBlockSize;
      res = LzmaDecode((unsigned char *)lzmaInternalData,
        ((unsigned char *)outStream) + nowPos, blockSize, &outSizeProcessed);
      if (res != 0)
      {
        printf("\nerror = %d\n", res);
        break;
      }
      nowPos += outSizeProcessed;
      if (outSizeProcessed == 0)
      {
        /* end of stream reached before outSize bytes */
        outSize = nowPos;
        break;
      }
    }


EXIT codes
----------

LZMA decoder can return one of the following codes:

#define LZMA_RESULT_OK 0
#define LZMA_RESULT_DATA_ERROR 1
#define LZMA_RESULT_NOT_ENOUGH_MEM 2

If you use a callback
function for input data and you return some
error code, LZMA Decoder also returns that code.



LZMA Defines
------------

_LZMA_IN_CB    - Use callback for input data

_LZMA_OUT_READ - Use read function for output data

_LZMA_LOC_OPT  - Enable local speed optimizations inside code.
                 _LZMA_LOC_OPT is only for LzmaDecodeSize.c (size-optimized version).
                 _LZMA_LOC_OPT doesn't affect LzmaDecode.c (speed-optimized version)

_LZMA_PROB32   - It can increase speed on some 32-bit CPUs,
                 but memory usage will be doubled in that case

_LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler
                         and long is 32-bit.


NOTES
-----
1) Please note that LzmaTest.c doesn't free allocated memory in some cases.
   But in your real applications you must free memory after decompression.

2) All numbers above were calculated for the case when int is not more than
   32-bit in your compiler. If int is 64-bit or larger in your compiler,
   LZMA can probably require more memory for some structures.



C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code uses COM-like interfaces. So if you want to use it,
you can study the basics of COM/OLE.

By default, LZMA Encoder contains all Match Finders.
But for compressing it's enough to have just one of them.
So to reduce the size of the compressing code you can define:
  #define COMPRESS_MF_BT
  #define COMPRESS_MF_BT4
and it will use only the bt4 match finder.


---

http://www.7-zip.org
http://www.7-zip.org/support.html