compression algorithm!

Philip Newton philip.newton at gmail.com
Wed Apr 8 09:56:06 BST 2009


On Wed, Apr 8, 2009 at 07:08, abhishek jain <abhishek.netjain at gmail.com> wrote:
> Hi Friends, I have a task to discover or search for a compression algorithm
> which compresses even 300 - 400 characters to about at least 200-300%
> compression reducing them to 150 characters.

What kind of text are we talking about?

If it's random data (i.e. all 256 byte values are equally likely),
then what you are asking is, of course, impossible: no lossless
scheme can shrink every such input.
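
A quick way to see this for yourself is to feed some pseudo-random
bytes to a general-purpose compressor. The sketch below uses Python's
zlib module purely as an illustration; the seed and the 400-byte
length are my own choices, not anything from the original question.

```python
import random
import zlib

# Assumption: 400 pseudo-random bytes stand in for "random data".
# The seed is fixed only so that repeated runs see the same input.
rng = random.Random(2009)
data = bytes(rng.getrandbits(8) for _ in range(400))

# Level 9 asks zlib for its best effort.
packed = zlib.compress(data, 9)

# Compare the sizes; the compressed form carries header and checksum
# overhead, so for input like this it should not come out smaller.
print(len(data), len(packed))
```

The counting argument behind it: there are 256**400 possible inputs
of that length but far fewer shorter outputs, so a lossless scheme
cannot map them all to something smaller.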

If it's English text but otherwise not especially repetitive, 50%
compression of a message this short is probably hard, if not
impossible, to achieve with a general-purpose compression algorithm.
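
To get a feel for the numbers, you could measure what a stock
compressor does with a few hundred characters of ordinary prose. This
is only a rough sketch: the sample text is made up for the purpose,
and zlib is just one general-purpose algorithm among many, so treat
the ratio it prints as one data point rather than a limit.

```python
import zlib

# Assumption: an invented, non-repetitive English sample of a few
# hundred characters, standing in for the text in question.
sample = (
    "The committee met on Tuesday to review the budget for next year. "
    "Several members raised questions about travel costs, and the "
    "treasurer promised a fuller breakdown by Friday. Nobody objected "
    "to the plan for a new website, although a few people wondered who "
    "would keep it up to date once the volunteers who built it had "
    "moved on to other things."
)

raw = sample.encode("ascii")
packed = zlib.compress(raw, 9)  # level 9: best compression zlib offers

# Ratio of compressed size to original size; 0.5 would mean "halved".
ratio = len(packed) / len(raw)
print(len(raw), len(packed), round(ratio, 2))
```

If the ratio on your real messages stays well above 0.5, that points
towards something tailored to the data rather than an off-the-shelf
algorithm.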

If it's something repetitive (say, status reports which always start
the same way or always contain certain fixed phrases), then a custom
codebook may be the way to go. (For example, "Server 'indigo' has
failed due to: case temperature exceeded maximum permissible
temperature" might compress to "sift" given { s => "Server ", i =>
"'indigo'", f => " has failed due to: ", t => "case temperature
exceeded maximum permissible temperature" }.)
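
The codebook idea can be sketched in a few lines. The code below is
only an illustration in Python: the function names encode and decode
are hypothetical, and it assumes every message is built entirely from
codebook phrases (anything else raises an error instead of being
passed through).

```python
# Codebook taken from the example above: one-letter code -> phrase.
CODEBOOK = {
    "s": "Server ",
    "i": "'indigo'",
    "f": " has failed due to: ",
    "t": "case temperature exceeded maximum permissible temperature",
}


def encode(text, codebook=CODEBOOK):
    """Replace each codebook phrase in text with its one-letter code."""
    # Try longer phrases first so a short phrase never shadows a longer one.
    phrases = sorted(codebook.items(), key=lambda kv: len(kv[1]), reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        for code, phrase in phrases:
            if text.startswith(phrase, pos):
                out.append(code)
                pos += len(phrase)
                break
        else:
            # Assumption of this sketch: no fallback for unknown text.
            raise ValueError("no codebook entry matches at offset %d" % pos)
    return "".join(out)


def decode(codes, codebook=CODEBOOK):
    """Expand a string of one-letter codes back into the full message."""
    return "".join(codebook[c] for c in codes)


message = ("Server 'indigo' has failed due to: "
           "case temperature exceeded maximum permissible temperature")
print(encode(message))            # sift
print(decode("sift") == message)  # True
```

A real scheme would also need an escape mechanism for text that is
not in the codebook, and both ends have to share the same codebook.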

Cheers,
-- 
Philip Newton <philip.newton at gmail.com>


More information about the london.pm mailing list