Page 2 of 4
Results 21 to 40 of 64
  1. #21
    Secondary Hivemind Nexus Boris's Avatar
    Join Date
    Apr 2012
    Location
    Netherlands
    Posts
    1,616
    Quote Originally Posted by MeltdownInteractiveMedia View Post
    Writing a software tool that does this and selling it for cash is more my idea, and my forte.
    How exactly does publishing a paper make one money?
    Because if you've figured out how to compress random data, there's probably a big prize waiting for you. Lots of fields would be changed (like cryptography) and there are loads of parties in the world who would pay very big salaries to have someone with that skill set.

    To be completely honest, I'm saying this to illustrate what this would mean. You'd turn upside down a few big areas in computing. Or you made a mistake somewhere and it's not lossless. Which is where my money is, I'm sorry to say.

  2. #22
    You are correct Boris.

    The compression was so simple that I thought decompression would be a breeze. While it was, the decompression revealed a fatal flaw: lots of data was missing in the compressed version.

    I really thought I was onto something :(

  3. #23
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Boris View Post
    To be completely honest, I'm saying this to illustrate what this would mean. You'd turn upside down a few big areas in computing. Or you made a mistake somewhere and it's not lossless. Which is where my money is, I'm sorry to say.
    Basically this. I too would be interested in trying this out, keep us posted.
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?

  4. #24
    Secondary Hivemind Nexus Boris's Avatar
    Join Date
    Apr 2012
    Location
    Netherlands
    Posts
    1,616
    Quote Originally Posted by somini View Post
    Basically this. I too would be interested in trying this out, keep us posted.
    Maybe I was too subtle, but what the guy was claiming is pretty much impossible.

  5. #25
    Moderator QuantaCat's Avatar
    Join Date
    Jun 2011
    Location
    Vienna, Austria
    Posts
    7,141
    Lossless compression is a big business. FLAC is a fantastic example of this.

    As for compression software, I suggest iSkysoft Ultimate Convertor. It works great for non-professional compression.
    - Tom De Roeck.

    verse publications & The Shopkeeper, an interactive short.

    "Quantacat's name is still recognised even if he watches on with detached eyes like Peter Molyneux over a cube in 3D space, staring at it with tears in his eyes, softly whispering... Someday they'll get it."

  6. #26
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Boris View Post
    Maybe I was too subtle, but what the guy was claiming is pretty much impossible.
    Or a breakthrough that will change computer science forever.
    Probably it's nothing, but I'm excited nonetheless. FOR SCIENCE!
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?

  7. #27
    Secondary Hivemind Nexus Boris's Avatar
    Join Date
    Apr 2012
    Location
    Netherlands
    Posts
    1,616
    Quote Originally Posted by somini View Post
    Or a breakthrough that will change computer science forever.
    Probably it's nothing, but I'm excited nonetheless. FOR SCIENCE!
    Yeah. Which is why I worded it the way I did. However, the chance is small that someone with the smarts to turn upside down communications theory, compression algorithms and encryption (pretty much some of the biggest fields in computer science right now) is going to publish it on the Tech Help section of a gaming website.

    A person with such a discovery would be in line for a Nobel Prize. And since a lot of really smart people have been trying to do what he claimed and failed, it seems reasonable to assume that it does not work until proven otherwise.

  8. #28
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Boris View Post
    Yeah. Which is why I worded it the way I did. However, the chance is small that someone with the smarts to turn upside down communications theory, compression algorithms and encryption (pretty much some of the biggest fields in computer science right now) is going to publish it on the Tech Help section of a gaming website.

    A person with such a discovery would be in line for a Nobel Prize. And since a lot of really smart people have been trying to do what he claimed and failed, it seems reasonable to assume that it does not work until proven otherwise.
    Sure, I assume it doesn't work. Cautious optimism.
    But you never know. Grigori Perelman might be lurking around here. ;)
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?

  9. #29
    Quote Originally Posted by Boris View Post
    Yeah. Which is why I worded it the way I did. However, the chance is small that someone with the smarts to turn upside down communications theory, compression algorithms and encryption (pretty much some of the biggest fields in computer science right now) is going to publish it on the Tech Help section of a gaming website.

    A person with such a discovery would be in line for a Nobel Prize. And since a lot of really smart people have been trying to do what he claimed and failed, it seems reasonable to assume that it does not work until proven otherwise.
    An algorithm that compresses arbitrary data is actually slightly harder than you make it sound, by which I mean mathematically impossible. If you have an algorithm that can take any file that is X bytes large and produce a file that is at most X-1 bytes, then you have lost information: there are more possible source files than possible target files, so at least two source files must map to the same target and the operation cannot be reversed. It follows that there is no such thing as a general lossless compression algorithm.

    Practical lossless compression works by being tailored to the most common patterns in the data you typically feed the compressor. Zip, rar, Lagarith, FLAC and all the rest pay for this by producing files larger than the source for some types of data. They're still useful because we only care about compressing specific types of data like conversations, music and film, so the algorithms are optimized for that, and the fact that they produce larger files when fed random noise isn't really a problem in practice.

    So, if at any point in life someone believes themselves to have found a general lossless compression method: nope, they haven't. Nor will anyone else do so at any point ever. Sorry.
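    If you want to convince yourself of the counting argument above, it only takes a few lines. Here is a minimal Python sketch (the function names and the toy size of 12 bits are just for illustration) comparing how many distinct files of exactly n bits exist with how many distinct shorter files exist:

        # Pigeonhole check: can every n-bit file map to a strictly shorter file losslessly?
        def files_of_length(bits):
            # Each bit is 0 or 1, so there are 2**bits distinct files of this length.
            return 2 ** bits

        def files_shorter_than(bits):
            # All files of length 0, 1, ..., bits-1 combined: 2**bits - 1 in total.
            return sum(2 ** k for k in range(bits))

        n = 12                              # toy "files" of 12 bits
        print(files_of_length(n))           # 4096 possible inputs
        print(files_shorter_than(n))        # only 4095 possible shorter outputs
        # More inputs than outputs means at least two inputs must share an output,
        # so no decompressor can always recover the original.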

  10. #30
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Xerophyte View Post
    An algorithm that compresses arbitrary data is actually slightly harder than you make it sound, by which I mean mathematically impossible. If you have an algorithm that can take any file that is X bytes large and produce a file that is at most X-1 bytes, then you have lost information: there are more possible source files than possible target files, so at least two source files must map to the same target and the operation cannot be reversed. It follows that there is no such thing as a general lossless compression algorithm.

    Practical lossless compression works by being tailored to the most common patterns in the data you typically feed the compressor. Zip, rar, Lagarith, FLAC and all the rest pay for this by producing files larger than the source for some types of data. They're still useful because we only care about compressing specific types of data like conversations, music and film, so the algorithms are optimized for that, and the fact that they produce larger files when fed random noise isn't really a problem in practice.

    So, if at any point in life someone believes themselves to have found a general lossless compression method: nope, they haven't. Nor will anyone else do so at any point ever. Sorry.
    There goes that Nobel prize...
    Physics is why we can't have nice things. :)
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?

  11. #31
    Secondary Hivemind Nexus Boris's Avatar
    Join Date
    Apr 2012
    Location
    Netherlands
    Posts
    1,616
    Quote Originally Posted by Xerophyte View Post
    An algorithm that compresses arbitrary data is actually slightly harder than you make it sound, by which I mean mathematically impossible. If you have an algorithm that can take any file that is X bytes large and produce a file that is at most X-1 bytes, then you have lost information: there are more possible source files than possible target files, so at least two source files must map to the same target and the operation cannot be reversed. It follows that there is no such thing as a general lossless compression algorithm.

    Practical lossless compression works by being tailored to the most common patterns in the data you typically feed the compressor. Zip, rar, Lagarith, FLAC and all the rest pay for this by producing files larger than the source for some types of data. They're still useful because we only care about compressing specific types of data like conversations, music and film, so the algorithms are optimized for that, and the fact that they produce larger files when fed random noise isn't really a problem in practice.

    So, if at any point in life someone believes themselves to have found a general lossless compression method: nope, they haven't. Nor will anyone else do so at any point ever. Sorry.
    I know this, I was just being polite.

  12. #32
    You can't compress them anymore - mp3 is a heavily compressed file and it won't compress anymore. Same with wmv, aac, avi, m4p, mpeg, etc.

    All a zip or rar utility will do is bundle them into one package, but you can't make the package any smaller.

  13. #33
    Moderator QuantaCat's Avatar
    Join Date
    Jun 2011
    Location
    Vienna, Austria
    Posts
    7,141
    Did everyone miss the post where the original poster said his code had a flaw?
    - Tom De Roeck.

    verse publications & The Shopkeeper, an interactive short.

    "Quantacat's name is still recognised even if he watches on with detached eyes like Peter Molyneux over a cube in 3D space, staring at it with tears in his eyes, softly whispering... Someday they'll get it."

  14. #34
    Isn't Variable-Length coding a general lossless encoding technique? (https://en.wikipedia.org/wiki/Variable-length_code)
    Or am I missing something from your post?

  15. #35
    Quote Originally Posted by vincent509 View Post
    Isn't Variable-Length coding a general lossless encoding technique? (https://en.wikipedia.org/wiki/Variable-length_code)
    Or am I missing something from your post?
    No, it is not. For example, take the example prefix code from the wiki article:

    Symbol Code
    a 0
    b 10
    c 110
    d 111
    The "natural" binary coding of those 4 symbols would be

    Symbol Code
    a 00
    b 01
    c 10
    d 11
    The variable-length coding assumes that 'a' is more common than 'c' and 'd' together, but for an uncommon message like "dc" the proposed variable-length encoding is "111110", which is larger than the natural encoding "1110". It's still a very useful approach, since in a lot of real-world applications you can safely assume that certain symbols are far more common than others and tailor the coding accordingly. Such an algorithm will unavoidably fail to compress in cases where that assumption doesn't hold.

    There is literally no such thing as a universal compression algorithm and it is literally impossible to make one. You cannot compress 1000 possible input messages to 500 possible output messages (i.e. a single bit reduction in message size) and expect to always reconstruct the original message from output -- at least one output will have more than one input that could cause it.
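    If you want to play with it, here is a small Python sketch of the same comparison (the dictionaries just mirror the two tables above; nothing else is implied):

        # Encode a message with the example prefix code vs. the fixed 2-bit code.
        PREFIX_CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
        FIXED_CODE = {"a": "00", "b": "01", "c": "10", "d": "11"}

        def encode(message, code):
            return "".join(code[symbol] for symbol in message)

        print(encode("dc", PREFIX_CODE))    # '111110' -> 6 bits, larger
        print(encode("dc", FIXED_CODE))     # '1110'   -> 4 bits
        print(encode("aaab", PREFIX_CODE))  # '00010'  -> 5 bits, beats the fixed code's 8 bits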
    Last edited by Xerophyte; 24-05-2013 at 08:20 PM.

  16. #36
    I see your point now, and I must confess that I did misunderstand what you said, specifically the part where it is impossible for an algorithm to encode ANY file to a smaller size. Thinking a little more about this, it is easy to imagine at least one case where this would be impossible: a data set with no correlation at all between the bits, in other words complete noise.
    Although, if the data set is large enough, I would be surprised if an entropy-based encoding did not yield some results.

    By 'general' I thus meant general-purpose rather than covering every case.

    Your second point still has me a bit confused; I am pretty confident that lossless encoding won't produce ambiguous results. Are you talking about lossy compression, which always removes some data? In that case I didn't know it had the property of not being able to guarantee unambiguous results, although it makes sense if you think about it.

  17. #37
    Lossless encodings, like the one from the wiki above, won't produce ambiguous results, but they pay for it by producing output that is larger than the input for certain inputs, e.g. "dc" in the example. This is not a problem in practice -- assuming whoever made the encoding knew what they were doing -- since the inputs that are enlarged by your lossless encoding are unlikely to the point of impossibility while the inputs that are significantly compressed are very common. If you count them, though, there will be at least as many inputs that the lossless algorithm enlarges as inputs that it compresses.

    I'm not saying that it's impossible to encode any file to a smaller size, obviously we can losslessly compress a lot of things. What is impossible is finding an algorithm that losslessly encodes all files to a smaller size. For every file you make smaller, there must be another file you make larger.

    Also, yes, if you allow loss then you can guarantee that all inputs are compressed, but that's no fun... :)

    E: Ah. I realize I may have misunderstood you initially. Variable length coding is certainly a lossless encoding technique but there are plenty of those. Gray code, bit parity transfer techniques, etc. What it is not is a general (lossless) compression technique -- those don't exist and cannot be constructed.
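    To make the "at least as many enlarged as compressed" point above concrete, here is a small Python sketch that exhaustively checks the example prefix code against the fixed 2-bit code over every 3-symbol message (the toy message length of 3 is arbitrary):

        from itertools import product

        # Code lengths in bits: the prefix code from the wiki example vs. the fixed 2-bit code.
        PREFIX_BITS = {"a": 1, "b": 2, "c": 3, "d": 3}
        FIXED_BITS = 2

        shrunk = grown = unchanged = 0
        for message in product("abcd", repeat=3):   # all 64 three-symbol messages
            delta = sum(PREFIX_BITS[s] for s in message) - FIXED_BITS * len(message)
            if delta < 0:
                shrunk += 1
            elif delta > 0:
                grown += 1
            else:
                unchanged += 1

        print(shrunk, grown, unchanged)   # 13 shrunk, 38 grown, 13 unchanged
        # The prefix code only wins on average if 'a'-heavy messages really are more likely.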
    Last edited by Xerophyte; 25-05-2013 at 01:31 AM.

  18. #38
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Xerophyte View Post
    Lossless encodings, like the one from the wiki above, won't produce ambiguous results, but they pay for it by producing output that is larger than the input for certain inputs, e.g. "dc" in the example. This is not a problem in practice -- assuming whoever made the encoding knew what they were doing -- since the inputs that are enlarged by your lossless encoding are unlikely to the point of impossibility while the inputs that are significantly compressed are very common. If you count them, though, there will be at least as many inputs that the lossless algorithm enlarges as inputs that it compresses.

    I'm not saying that it's impossible to encode any file to a smaller size, obviously we can losslessly compress a lot of things. What is impossible is finding an algorithm that losslessly encodes all files to a smaller size. For every file you make smaller, there must be another file you make larger.

    Also, yes, if you allow loss then you can guarantee that all inputs are compressed, but that's no fun... :)

    E: Ah. I realize I may have misunderstood you initially. Variable length coding is certainly a lossless encoding technique but there are plenty of those. Gray code, bit parity transfer techniques, etc. What it is not is a general (lossless) compression technique -- those don't exist and cannot be constructed.
    I'm just studying variable-length encoding now and it's a really neat technique.

    The solution for the problem of "compressed" files being bigger than the input is having the format support the inclusion of the original uncompressed file.
    For example, the compressed file has 1 bit that indicates if the file is compressed or not.
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?

  19. #39
    Quote Originally Posted by somini View Post
    The solution for the problem of "compressed" files being bigger than the input is having the format support the inclusion of the original uncompressed file.
    For example, the compressed file has 1 bit that indicates if the file is compressed or not.
    Which, of course, means that it'll be one bit larger after compression. QED. :)

    But, yes, this is in effect what a lot of the general lossless compression programs do: the archive is divided into chunks, each chunk in the archive has a header, the header specifies the type of compression for the chunk (among other things) and one of the possible compression types is none. You can see Wikipedia on the Zip format for an example.

    I'm not saying that lossless compression programs are somehow bad, just that it's a mathematical certainty that for every file such an algorithm or program will compress, there must exist a corresponding file that it will instead enlarge. The size changes do not have to be of the same magnitude. I've seen a surprising number of people on various programming forums working on a method that they're sure will compress anything by 30%, or 80%, or whatever, just as soon as they fix some small niggling bugs with the decompression -- the point is that this is simply not possible. Lossless compression is hard and never universal.
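    As a rough sketch of that chunk-header idea (zlib standing in for whatever codec a real format uses; the one-byte flag and the pack/unpack names are just for illustration):

        import zlib

        def pack(data):
            # Flag byte 0x01: payload is zlib-compressed; 0x00: payload stored as-is.
            compressed = zlib.compress(data)
            if len(compressed) < len(data):
                return b"\x01" + compressed
            return b"\x00" + data

        def unpack(blob):
            flag, payload = blob[:1], blob[1:]
            return zlib.decompress(payload) if flag == b"\x01" else payload

        text = b"to be or not to be, that is the question. " * 100   # repetitive, compresses well
        noise = bytes(range(256))                                    # no repetition to exploit
        assert unpack(pack(text)) == text and unpack(pack(noise)) == noise
        print(len(text), len(pack(text)))      # big win on the repetitive text
        print(len(noise), len(pack(noise)))    # the awkward input grows, but only by the flag byte

    The flag rescues the worst case at the cost of one byte, which is exactly the trade-off discussed above.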

  20. #40
    Secondary Hivemind Nexus somini's Avatar
    Join Date
    Jun 2011
    Location
    NEuro Troika Franchulate #3
    Posts
    3,990
    Quote Originally Posted by Xerophyte View Post
    Which, of course, means that it'll be one bit larger after compression. QED. :)
    Curses, foiled again! :)

    Yeah, universal compression isn't easy, but for each file type there are some really advanced algorithms, even lossless ones.
    Using headers is a simple enough workaround for general purposes. It could be taken to the next level by splitting the archive by file and then compressing each file depending on its data type. The problem is that most files people try to compress are already compressed: MP3s, videos, binary files.

    As for general purpose, use 7-Zip: 7z uses LZMA (MOAR compression), the program isn't shareware crap, and it supports zip anyway.
    Steam(shots), Imgur, Flickr, Bak'laag, why do you forsake me?
