Topic created on: January 11, 2011 16:39 CST by logik .
I have an example binary that generates some compressed text. Fortunately it dumps the original text to disk first, so I do have copies of both.
The header for the compressed text is in the following format:
LL MM 40 00 00
LL and MM are the LSB and MSB of a 16-bit little-endian value giving the total length of the compressed data. The third byte is always 0x40. The trailing 00 00 might be the first "command".
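To make the header layout concrete, here is a minimal Python sketch that reads the five bytes described above. The function name `parse_header` and the returned field names are my own placeholders, and whether the length counts the header itself is not known, so the sketch only reports the raw value.

```python
import struct

def parse_header(buf: bytes) -> dict:
    """Split the 5-byte header LL MM 40 00 00 into its fields.

    Assumption: LL MM is a little-endian 16-bit length. Whether that
    length includes these 5 header bytes is unknown.
    """
    if len(buf) < 5:
        raise ValueError("need at least 5 bytes for the header")
    # "<" = little-endian, no padding; H = 16-bit length, B = marker byte,
    # 2s = the two trailing bytes (possibly the first "command").
    length, marker, tail = struct.unpack_from("<HB2s", buf, 0)
    return {
        "length": length,
        "marker": marker,
        "marker_ok": marker == 0x40,  # observed to be fixed at 0x40
        "tail": tail,
    }
```

For example, the bytes 34 12 40 00 00 would give a length of 0x1234.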
After the header, portions of plaintext begin. Other common "tokens" in the output, which are obviously related to compression rather than plaintext, include:
04 xx - a 2-byte pattern; presumably the first byte is a command and the second byte is some sort of length or offset
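One way to test the "command plus length or offset" guess is to tally which values follow each 04 byte. The sketch below is only a naive scan, not a decoder: the name `count_04_tokens` is a placeholder, it assumes the data starts right after the 5-byte header, and it will miscount if 04 also appears as literal data or as the second byte of another token.

```python
from collections import Counter

def count_04_tokens(data: bytes, start: int = 5) -> Counter:
    """Tally the byte that follows every 0x04 at or after `start`.

    Assumption: `start` skips the 5-byte header. This is a blind scan,
    so a 0x04 that is really literal data or an operand is counted too.
    """
    counts = Counter()
    for i in range(start, len(data) - 1):
        if data[i] == 0x04:
            counts[data[i + 1]] += 1
    return counts
```

If the second byte clusters around small values it is more likely a length; values that grow with position in the block would hint at a back-reference offset.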
The data is compressed in 16 KiB blocks. Literal plaintext is included in the output and is fairly easy to spot, in runs ranging from around 15 characters all the way down to single characters.
Can anybody identify whether this is a common compression algorithm that has been reused? I'd like to be able to decompress the data and get back the original text.