.lzma Test Files
----------------

0. Introduction

    This directory contains a bunch of files to test handling of .lzma
    files in .lzma decoder implementations. Many of the files have been
    created by hand with a hex editor, thus there is no better "source
    code" than the files themselves. All the test files (*.lzma) and
    this README have been put into the public domain.


1. File Types

    Good files (good-*.lzma) must decode successfully without requiring
    a lot of CPU time or RAM.

    Unsupported files (unsupported-*.lzma) are good files, but headers
    indicate features not supported by the current file format
    specification.

    Bad files (bad-*.lzma) must cause the decoder to give an error. Like
    with the good files, these files must not require a lot of CPU time
    or RAM before they get detected to be broken.


2. Descriptions of Individual Files

2.1. Good Files

    good-0-empty.lzma has one Stream with no Blocks.

    good-0pad-empty.lzma has one Stream with no Blocks followed by
    four-byte Stream Padding.

    good-0cat-empty.lzma has two zero-Block Streams concatenated without
    Stream Padding.

    good-0catpad-empty.lzma has two zero-Block Streams concatenated with
    four-byte Stream Padding between the Streams.

    good-1-check-none.lzma has one Stream with one Block with two
    uncompressed LZMA2 chunks and no integrity check.

    good-1-check-crc32.lzma has one Stream with one Block with two
    uncompressed LZMA2 chunks and CRC32 check.

    good-1-check-crc64.lzma is like good-1-check-crc32.lzma but with
    CRC64.

    good-1-check-sha256.lzma is like good-1-check-crc32.lzma but with
    SHA256.

    good-2-lzma2.lzma has one Stream with two Blocks with one
    uncompressed LZMA2 chunk in each Block.

    good-1-block_header-1.lzma has both Compressed Size and Uncompressed
    Size in the Block Header. This has also four extra bytes of Header
    Padding.

    good-1-block_header-2.lzma has known Compressed Size.

    good-1-block_header-3.lzma has known Uncompressed Size.

    good-1-delta-lzma2.tiff.lzma is an image file that compresses
    better with Delta+LZMA2 than with plain LZMA2.

    good-1-x86-lzma2.lzma uses the x86 filter (BCJ) and LZMA2. The
    uncompressed file is compress_prepared_bcj_x86 found from the tests
    directory.

    good-1-sparc-lzma2.lzma uses the SPARC filter and LZMA2. The
    uncompressed file is compress_prepared_bcj_sparc found from the
    tests directory.

    good-1-lzma2-1.lzma has two LZMA2 chunks, of which the second sets
    new properties.

    good-1-lzma2-2.lzma has two LZMA2 chunks, of which the second resets
    the state without specifying new properties.

    good-1-lzma2-3.lzma has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets
    dictionary and the second sets new properties.

    good-1-3delta-lzma2.lzma has three Delta filters and LZMA2.


2.2. Unsupported Files

    unsupported-check.lzma uses Check ID 0x02 which isn't supported by
    the current version of the file format. It is implementation-defined
    how this file is handled (the decoder may reject it, or decode it
    possibly with a warning).

    unsupported-block_header.lzma has a non-nul byte in Header Padding,
    which may indicate presence of a new unsupported field.

    unsupported-filter_flags-1.lzma has unsupported Filter ID 0x7F.

    unsupported-filter_flags-2.lzma specifies only Delta filter in the
    List of Filter Flags, but Delta isn't allowed as the last filter in
    the chain. It could be a little more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in case of liblzma.

    unsupported-filter_flags-3.lzma specifies two LZMA2 filters in the
    List of Filter Flags. LZMA2 is allowed only as the last filter in
    the chain. It could be a little more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in case of liblzma.


2.3. Bad Files

    bad-0pad-empty.lzma has one Stream with no Blocks followed by
    five-byte Stream Padding.
    Stream Padding must be a multiple of four bytes, thus this file is
    corrupt.

    bad-0catpad-empty.lzma has two zero-Block Streams concatenated with
    five-byte Stream Padding between the Streams.

    bad-0cat-alone.lzma is good-0-empty.lzma concatenated with an empty
    LZMA_Alone file.

    bad-0cat-header_magic.lzma is good-0cat-empty.lzma but with one byte
    wrong in the Header Magic Bytes field of the second Stream. liblzma
    gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
    the first Stream of a file has invalid Header Magic Bytes.)

    bad-0-header_magic.lzma is good-0-empty.lzma but with one byte wrong
    in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
    this.

    bad-0-footer_magic.lzma is good-0-empty.lzma but with one byte wrong
    in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
    this.

    bad-0-empty-truncated.lzma is good-0-empty.lzma without the last
    byte of the file.

    bad-0-nonempty_index.lzma has no Blocks but Index claims that there
    is one Block.

    bad-0-backward_size.lzma has wrong Backward Size in Stream Footer.

    bad-1-stream_flags-1.lzma has different Stream Flags in Stream
    Header and Stream Footer.

    bad-1-stream_flags-2.lzma has wrong CRC32 in Stream Header.

    bad-1-stream_flags-3.lzma has wrong CRC32 in Stream Footer.

    bad-1-vli-1.lzma has two-byte variable-length integer in the
    Uncompressed Size field in Block Header while one-byte would be
    enough for that value. It's important that the file gets rejected
    due to too big integer encoding instead of due to Uncompressed Size
    not matching the value stored in the Block Header. That is, the
    decoder must not try to decode the Compressed Data field.

    bad-1-vli-2.lzma has ten-byte variable-length integer as
    Uncompressed Size in Block Header. It's important that the file gets
    rejected due to too big integer encoding instead of due to
    Uncompressed Size not matching the value stored in the Block Header.
    That is, the decoder must not try to decode the Compressed Data
    field.

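    The two variable-length integer cases above can be illustrated with
    a small decoder sketch. This README doesn't spell out the encoding,
    so the sketch assumes the scheme used by the .xz specification:
    seven payload bits per byte, least significant group first, the high
    bit as a continuation flag, and at most nine bytes. The names
    decode_vli and VLIError are hypothetical and are not part of liblzma.

```python
class VLIError(ValueError):
    """Raised when a variable-length integer is encoded incorrectly."""


def decode_vli(buf, max_len=9):
    """Decode one variable-length integer from the start of buf.

    Assumption: 7 bits per byte, little-endian groups, high bit set on
    every byte except the last, and at most max_len bytes in total.
    Returns (value, number_of_bytes_consumed).
    """
    value = 0
    for i in range(max_len):
        if i >= len(buf):
            raise VLIError("truncated integer")
        byte = buf[i]
        # A final byte of 0x00 after the first byte adds no bits, so a
        # shorter encoding would have been enough (the bad-1-vli-1 case).
        if byte == 0x00 and i > 0:
            raise VLIError("integer is not encoded in minimal form")
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            return value, i + 1
    # Continuation bit still set after max_len bytes (the bad-1-vli-2 case).
    raise VLIError("integer encoding is too long")
```

    Because the error is raised while the field itself is being parsed,
    a decoder built this way rejects the file before it ever looks at
    the Compressed Data field, which is the behavior the two test files
    are meant to verify.
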
    bad-1-block_header-1.lzma has Block Header that ends in the middle
    of the Filter Flags field.

    bad-1-block_header-2.lzma has Block Header that has Compressed Size
    and Uncompressed Size but no List of Filter Flags field.

    bad-1-block_header-3.lzma has wrong CRC32 in Block Header.

    bad-1-block_header-4.lzma has too big Compressed Size (2^63 bytes
    while maximum is 2^63 - 4 bytes) in Block Header. It's important
    that the file gets rejected due to invalid Compressed Size value;
    the decoder must not try decoding the Compressed Data field.

    bad-2-index-1.lzma has wrong Total Sizes in Index.

    bad-2-index-2.lzma has wrong Uncompressed Sizes in Index.

    bad-2-index-3.lzma has non-nul byte in Index Padding.

    bad-2-index-4.lzma has wrong CRC32 in Index.

    bad-2-compressed_data_padding.lzma has non-nul byte in the padding
    of the Compressed Data field of the first Block.

    bad-1-check-crc32.lzma has wrong Check (CRC32).

    bad-1-check-crc64.lzma has wrong Check (CRC64).

    bad-1-check-sha256.lzma has wrong Check (SHA-256).

    bad-1-lzma2-1.lzma has LZMA2 stream whose first chunk (uncompressed)
    doesn't reset the dictionary.

    bad-1-lzma2-2.lzma has two LZMA2 chunks, of which the second chunk
    indicates dictionary reset, but the LZMA compressed data tries to
    repeat data from the previous chunk.

    bad-1-lzma2-3.lzma sets new invalid properties (lc=8, lp=0, pb=0) in
    the middle of Block.

    bad-1-lzma2-4.lzma has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets
    dictionary as it should, but the second chunk tries to reset state
    without specifying properties for LZMA.

    bad-1-lzma2-5.lzma is like bad-1-lzma2-4.lzma but doesn't try to
    reset anything in the header of the second chunk.

    bad-1-lzma2-6.lzma has reserved LZMA2 control byte value (0x03).

    bad-1-lzma2-7.lzma has EOPM at LZMA level.
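
    The naming convention from section 1 can drive a simple test
    harness. The following is only a sketch: expected_outcome and
    check_directory are hypothetical names, and decode stands for
    whatever wrapper a project writes around its own decoder. The
    wrapper is assumed to take the file contents as bytes and return one
    of the strings "ok", "unsupported" or "error".

```python
from pathlib import Path

# File name prefix -> outcome required by section 1 of this README.
EXPECTED_BY_PREFIX = {
    "good-": "ok",
    "unsupported-": "unsupported",
    "bad-": "error",
}


def expected_outcome(filename):
    """Map a test file name to the outcome its prefix demands."""
    name = Path(filename).name
    if not name.endswith(".lzma"):
        raise ValueError("not a .lzma test file: " + name)
    for prefix, outcome in EXPECTED_BY_PREFIX.items():
        if name.startswith(prefix):
            return outcome
    raise ValueError("unknown test file category: " + name)


def check_directory(directory, decode):
    """Run decode on every *.lzma file and collect mismatches.

    decode is a caller-supplied (hypothetical) function that returns
    "ok", "unsupported" or "error" for the given bytes.
    Returns a list of (file name, expected, actual) tuples.
    """
    failures = []
    for path in sorted(Path(directory).glob("*.lzma")):
        expected = expected_outcome(path.name)
        actual = decode(path.read_bytes())
        if actual != expected:
            failures.append((path.name, expected, actual))
    return failures
```

    An empty list from check_directory means every file behaved as its
    name requires. A real harness would probably special-case
    unsupported-check.lzma, since its handling is implementation-defined,
    and would also put a limit on CPU time and memory, because both good
    and bad files must be handled without needing a lot of either.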