This directory contains a bunch of files to test handling of .lzma files
in .lzma decoder implementations. Many of the files have been created
by hand with a hex editor, so there is no better "source code" than
the files themselves. All the test files (*.lzma) and this README have
been put into the public domain.

Good files (good-*.lzma) must decode successfully without requiring
a lot of CPU time or RAM. If the decoder supports only Single-Block
Streams, then good-multi-*.lzma won't decode, of course.

Bad files (bad-*.lzma) must cause the decoder to give an error. As
with the good files, these files must not require a lot of CPU time
or RAM before they are detected as broken.
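
The naming rules above can be turned into a small test driver. The
following is only a sketch: `decode` stands for whatever decoder is
under test, and `DecodeError` is a hypothetical exception type that the
decoder is assumed to raise on broken input. Neither name comes from
this README or from any real library.

```python
from pathlib import Path
from typing import Callable, List


class DecodeError(Exception):
    """Hypothetical error raised by the decoder under test."""


def check_file(name: str, data: bytes,
               decode: Callable[[bytes], bytes]) -> bool:
    """Return True if the decoder behaved as the file name requires."""
    try:
        decode(data)
        decoded_ok = True
    except DecodeError:
        decoded_ok = False

    if name.startswith("good-"):
        return decoded_ok          # good files must decode
    if name.startswith("bad-"):
        return not decoded_ok      # bad files must give an error
    raise ValueError("no pass/fail rule for " + name)


def check_directory(directory: str,
                    decode: Callable[[bytes], bytes]) -> List[str]:
    """Return the names of good-*/bad-* files that misbehaved."""
    failures = []
    for path in sorted(Path(directory).glob("*.lzma")):
        if path.name.startswith("malicious-"):
            continue               # these need resource limits, not pass/fail
        if not check_file(path.name, path.read_bytes(), decode):
            failures.append(path.name)
    return failures
```

A decoder that supports only Single-Block Streams would additionally
have to skip good-multi-*.lzma, which this sketch does not do.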

Malicious files (malicious-*.lzma) are good in terms of the file format
specification, but try to trigger excessive CPU, RAM or disk usage in
the decoder. To prevent malicious files from putting the decoder into
an infinite loop (*) or eating all available RAM or disk space, decoders
should have internal limiters that catch these situations.

(*) Strictly speaking not infinite, but if decoding of a small file
would take a few weeks or even years, it's an infinite loop in
practice.
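
One simple kind of internal limiter is an output budget. The sketch
below is an illustration only, not how any particular decoder does it:
`limited_output` and `LimitExceeded` are hypothetical names, and the
decoder is assumed to hand out its output as an iterable of byte chunks.

```python
from typing import Iterable, Iterator


class LimitExceeded(Exception):
    """Raised when the decoder produces more output than allowed."""


def limited_output(chunks: Iterable[bytes],
                   max_bytes: int) -> Iterator[bytes]:
    """Pass chunks through until the total would exceed max_bytes."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise LimitExceeded(
                "output exceeds limit of %d bytes" % max_bytes)
        yield chunk


def repeated(byte: bytes, count: int, chunk_size: int) -> Iterator[bytes]:
    """Lazily produce `count` copies of `byte`, a stand-in for a
    run-length decoder that expands a tiny input into a huge output."""
    remaining = count
    while remaining > 0:
        n = min(chunk_size, remaining)
        yield byte * n
        remaining -= n
```

An output budget does not help against files that burn a lot of CPU
time per decoded byte. Catching those would need a separate counter of
work done, which is not shown here.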

2. Descriptions of Individual Files

good-single-none.lzma uses implicit Copy filter with known Uncompressed

good-single-none-pad.lzma is good-single-none.lzma with Footer Padding.

good-cat-single-none-pad.lzma is two good-single-none-pad.lzma files
concatenated as is. Fully decoding this file requires that the decoder
supports decoding concatenated files.

good-single-subblock_implicit.lzma uses implicit Subblock filter.

good-single-lzma.lzma is an LZMA-compressed file with EOPM.

good-single-subblock-lzma.lzma has a basic combination of Subblock and

good-single-none-empty_1.lzma is an empty file with implicit Copy
filter and no integrity Check.

good-single-none-empty_2.lzma is an empty file with implicit Copy
filter and CRC32 as Check.

good-single-none-empty_3.lzma is an empty file with implicit Copy
filter, known Compressed Size, and no integrity Check.

good-single-lzma-empty.lzma is an empty file with LZMA filter and no

good-single-subblock_rle.lzma takes advantage of Subblock filter's

good-single-delta-lzma.tiff.lzma is an image file that compresses
better with Delta+LZMA than with plain LZMA.

good-single-lzma-flush_1.lzma has a flush marker in the middle of
the file, and no EOPM.

good-single-lzma-flush_2.lzma has a flush marker in the middle of
the file and just before EOPM.

bad-single-none-truncated.lzma is good-single-none.lzma without the
last byte of the file.

bad-cat-single-none-pad_garbage_1.lzma is good-cat-single-none-pad.lzma
with 0xFE appended to the end of the file. 0xFE doesn't begin a .lzma
or LZMA_Alone format file.

bad-cat-single-none-pad_garbage_2.lzma is good-cat-single-none-pad.lzma
with 0xFF appended to the end of the file. 0xFF begins a .lzma format
file, so the decoder has to detect that the file is incomplete.

bad-cat-single-none-pad_garbage_3.lzma is good-cat-single-none-pad.lzma
with 0x5D appended to the end of the file. 0x5D is the most common
first byte of an LZMA_Alone format file.
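
The three garbage files differ only in what the appended byte looks
like to a decoder that supports concatenated files. The sketch below
merely restates the three cases as code; `classify_trailing_byte` is a
hypothetical helper, and it knows nothing about the formats beyond the
byte values named in this README.

```python
def classify_trailing_byte(value: int) -> str:
    """Describe what a single byte after the last Stream resembles."""
    if not 0 <= value <= 0xFF:
        raise ValueError("not a byte value")
    if value == 0xFF:
        # Begins a .lzma format file, so more input is expected.
        return "start of .lzma stream"
    if value == 0x5D:
        # Most common first byte of an LZMA_Alone file; other first
        # bytes are possible, so this is only a hint.
        return "likely start of LZMA_Alone file"
    return "garbage"


for byte in (0xFE, 0xFF, 0x5D):
    print(hex(byte), "->", classify_trailing_byte(byte))
```

Whatever the classification, all three files end right after that one
byte, so a correct decoder has to report an error for each of them.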

bad-single-none-footer_filter_flags.lzma has different Stream Flags
in Stream Footer than in Stream Header.

bad-single-none-too_long_vli.lzma has a 10-byte variable-length integer.
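
A decoder can reject an overlong variable-length integer while reading
it. The sketch below ASSUMES the encoding used by the later .xz format
(seven payload bits per byte, least significant group first, high bit
set on every byte except the last, at most nine bytes). The draft
format these test files target may encode integers differently, so
treat the constants and the function name `decode_vli` as illustrative.

```python
from typing import Tuple

VLI_MAX_BYTES = 9  # assumption taken from the .xz format, see above


class VliError(ValueError):
    """Raised for truncated or overlong variable-length integers."""


def decode_vli(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer at buf[pos:], return (value, next_pos)."""
    value = 0
    for i in range(VLI_MAX_BYTES):
        if pos + i >= len(buf):
            raise VliError("input ends inside the integer")
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:       # high bit clear: last byte
            return value, pos + i + 1
    # Nine bytes consumed and the continuation bit is still set.
    raise VliError("integer is longer than %d bytes" % VLI_MAX_BYTES)
```

With these assumptions, a 10-byte integer such as the one in
bad-single-none-too_long_vli.lzma fails after the ninth byte without
the decoder having to look any further.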

bad-single-none-empty.lzma is like good-single-none-empty_3.lzma but
with a non-zero value in the Compressed Size field.

bad-single-data_after_eopm_1.lzma has LZMA+Subblock, where the Subblock
filter gives one byte of data to LZMA after LZMA has detected EOPM.

bad-single-data_after_eopm_2.lzma is like
bad-single-data_after_eopm_1.lzma but Subblock gives 256 MiB of data
to LZMA after LZMA has detected EOPM.

bad-single-subblock_subblock.lzma has Subblock+Subblock, where the
Subblock decoder is given End of Input in the middle of a Subblock.

bad-single-subblock-padding_loop.lzma contains a huge amount of
consecutive Padding bytes, which isn't allowed by the Subblock filter
format. If it were allowed, this file would hang the decoder for a very
long time (weeks to years).

bad-single-subblock1023-slow.lzma is similar to
malicious-single-subblock31-slow.lzma except that this uses 1023 bytes
of Padding in every place instead of 31 bytes. The Subblock filter
format specification allows only 31-byte Paddings, so this file must
be detected as bad without producing any output. Allowing larger
Padding than 31 bytes was considered (so this test file was created),
but it seemed to be a bad idea since it would increase worst-case CPU

bad-single-lzma-flush_beginning.lzma has a flush marker in the beginning

bad-single-lzma-flush_twice.lzma has two flush markers with no data

malicious-single-subblock31-slow.lzma requires quite a bit of CPU time
per decoded byte. It contains LZMA-compressed Subblock filter data that
has as much Padding as the specification allows. LZMA is also used as
a Subfilter, to further slow down the decoder. Every Subfilter instance
produces only one byte of output. If you can create a file that wastes
notably more CPU cycles than this file, please contact Lasse Collin.

malicious-single-subblock-256MiB.lzma is a tiny file that produces
256 MiB of output. It uses Subblock filter's run-length encoding

malicious-single-subblock-64PiB.lzma is a tiny file that produces
64 PiB of output (if you have the patience to wait). This is done by
chaining two Subblock filters and using their run-length encoders.

malicious-multi-metadata-64PiB.lzma is like
malicious-single-subblock-64PiB.lzma but the huge amount of output
is in a Metadata Block. Trying to decode this file may take years
unless the decoder catches that the Metadata has unreasonable size.
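
The sizes above can be put into perspective with a little arithmetic.
The sketch below is illustrative: the throughput of 1 GiB/s is an
assumed figure, not a measurement, and `years_to_decode` is a
hypothetical helper. It also shows that 256 MiB squared equals 64 PiB,
which is numerically consistent with two chained run-length stages each
expanding by 256 MiB, although this README does not spell out the
per-stage factors.

```python
MIB = 2 ** 20
GIB = 2 ** 30
PIB = 2 ** 50

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_decode(total_bytes: int, bytes_per_second: float) -> float:
    """Time needed to produce total_bytes at a constant output rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_YEAR


one_stage = 256 * MIB              # 2**28 bytes
two_stages = one_stage * one_stage # 2**56 bytes

print(two_stages == 64 * PIB)
print(round(years_to_decode(64 * PIB, 1 * GIB), 2), "years at 1 GiB/s")
```

Even at that optimistic rate the output takes on the order of two
years to produce, which is why a decoder needs a limit rather than
patience.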