Thread: How does chat encoding/decoding work?

  1. #1 How does chat encoding/decoding work? 
    ???

    funkE's Avatar
    Join Date
    Feb 2008
    Posts
    2,612
    Thanks given
    255
    Thanks received
    989
    Rep Power
    1366
    https://github.com/Rabrg/refactored-...atEncoder.java

    Can anyone break down the code and explain it? I'm trying to understand it but I'm having trouble. Just explain it as you would explain it to yourself, no need to dumb it down.

    What's the theory behind all of this?

    Thanks for reading.

  3. #2  
    Renown Programmer
    Method's Avatar
    Join Date
    Feb 2009
    Posts
    1,455
    Thanks given
    0
    Thanks received
    845
    Rep Power
    3019
    The basic idea is to take advantage of the fact that some characters appear more frequently than others in English text. The client uses this knowledge to encode the 13 most commonly used characters using only 4 bits instead of 8 bits. This saves a good deal of space in the general case, and the most when the message consists mostly of the common characters. Messages happen to be limited to 80 characters, though, so the absolute savings aren't all that great in this case (e.g. 40 bytes vs. 80 bytes for a message consisting entirely of the most frequent characters).
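
    To put rough numbers on that, here is a minimal sketch that only counts how many nibbles a message would cost. The table below is a shortened, hypothetical stand-in for the client's table (the real one also covers digits and punctuation), and mapping unknown characters to a space is my own assumption, so treat the class and method names as illustrative only.

```java
final class NibbleCost {
    // Hypothetical, shortened frequency-ordered table (assumption, not the client's full table).
    static final String TABLE = " etaoihnsrdlumwcyfgpbvkxjqz";

    // Number of 4-bit units needed for the message.
    static int nibbles(String message) {
        int total = 0;
        for (char c : message.toCharArray()) {
            int index = TABLE.indexOf(c);
            if (index < 0) {
                index = 0; // assumption: unknown characters are treated as a space
            }
            // The 13 most frequent characters (indexes 0..12) cost one nibble, the rest two.
            total += index < 13 ? 1 : 2;
        }
        return total;
    }

    // Two nibbles fit in one byte; an odd nibble count is rounded up.
    static int encodedBytes(String message) {
        return (nibbles(message) + 1) / 2;
    }

    public static void main(String[] args) {
        StringBuilder common = new StringBuilder();
        for (int i = 0; i < 80; i++) {
            common.append('e');
        }
        System.out.println(encodedBytes(common.toString())); // prints 40
    }
}
```

    A message made only of rarer characters costs two nibbles each, i.e. the same one byte per character as plain 8-bit text, so under these assumptions the scheme never does worse than the unencoded size.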

    Most of the code is dedicated to packing the encoded characters into bytes while handling nibbles that overflow into the next byte, and isn't too important.
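
    For the curious, the packing part could be sketched roughly like this. It is not the linked code: the table is a shortened, hypothetical one, and the offset of 195 is an assumption picked so that a two-nibble code always starts with a nibble of 13 or more (13 + 195 = 208 = 0xD0), which keeps it distinguishable from the one-nibble codes 0..12.

```java
import java.io.ByteArrayOutputStream;

final class NibblePacker {
    // Hypothetical, shortened frequency-ordered table (assumption).
    static final String TABLE = " etaoihnsrdlumwcyfgpbvkxjqz";
    // Assumed offset: pushes indexes >= 13 into the range 0xD0..0xFF.
    static final int OFFSET = 195;

    static byte[] encode(String message) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pending = -1; // high nibble waiting for a partner, or -1 if none
        for (char c : message.toCharArray()) {
            int index = TABLE.indexOf(c);
            if (index < 0) {
                index = 0; // assumption: unknown characters become a space
            }
            if (index < 13) {
                // Frequent character: a single nibble.
                pending = push(out, pending, index);
            } else {
                // Rarer character: two nibbles, high one first.
                int code = index + OFFSET;
                pending = push(out, pending, code >> 4);
                pending = push(out, pending, code & 0xF);
            }
        }
        if (pending != -1) {
            out.write(pending << 4); // odd nibble count: pad the low nibble with 0
        }
        return out.toByteArray();
    }

    // Either stores the nibble as a pending high half or completes a byte with it.
    // Returns the new pending value.
    static int push(ByteArrayOutputStream out, int pending, int nibble) {
        if (pending == -1) {
            return nibble;
        }
        out.write((pending << 4) | nibble);
        return -1;
    }
}
```

    With this sketch, `encode("et")` gives the single byte 0x12, while `encode("em")` gives 0x1D 0x00: the two-nibble code for `m` straddles the byte boundary, which is the "overflow" case the packing code has to handle.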

    In general, making use of things like character frequencies and other statistics about the text can help save a lot of space. Here are some related pages on Wikipedia that might be interesting:

    Letter frequency - Wikipedia, the free encyclopedia
    Information theory - Wikipedia, the free encyclopedia

    In addition, if you're interested, the newer clients make use of Huffman coding for text compression. It's pretty standard in more general compression algorithms too.
    :-)

  4. #3  
    ???

    funkE's Avatar
    Thanks for the information. I'm sure other people will find it interesting as well.

  5. #4  
    Renown Programmer
    veer's Avatar
    Join Date
    Nov 2007
    Posts
    3,746
    Thanks given
    354
    Thanks received
    1,370
    Rep Power
    3032
    hasn't this been answered many times before on this forum

  6. #5  
    ???

    funkE's Avatar
    Quote Originally Posted by veer View Post
    hasn't this been answered many times before on this forum
    no .

  7. #6  
    Renown Programmer
    veer's Avatar
    Join Date
    Nov 2007
    Posts
    3,746
    Thanks given
    354
    Thanks received
    1,370
    Rep Power
    3032
    well supah fly, as graham said, you can think of it as a basic variable-width encoding which gives priority (i.e. most frequent) characters singleton 4-bit codes while using 2 such codes to encode *less* frequent characters... it's a basic means of special-purpose text compression. in your code specifically, `characterBit` is actually the next nibble of input, while `validCharacterIndex` is sorta the lead unit, the first code used to encode one of the less frequent characters. if we have read a lead unit, it knows to read a trail unit and combine them (hence `(validCharacterIndex << 4) + characterBit`)

    if you look at the order of the table, you should recognize immediately it's based on letter frequency analysis. pretty similar to etaoin shrdlu (see the Wikipedia article) eh
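
    a rough sketch of that lead/trail logic, in case it helps. this is not the linked code: the table is a shortened hypothetical one, the offset of 195 is an assumption chosen so that lead nibbles are always 13 or more, and the names `lead` and `nibble` stand in for `validCharacterIndex` and `characterBit`.

```java
final class NibbleDecoder {
    // Hypothetical, shortened frequency-ordered table (assumption).
    static final String TABLE = " etaoihnsrdlumwcyfgpbvkxjqz";
    // Assumed offset that was added to two-nibble codes when encoding.
    static final int OFFSET = 195;

    static String decode(byte[] data) {
        StringBuilder text = new StringBuilder();
        int lead = -1; // lead unit read so far, or -1 if we are at a code boundary
        for (byte b : data) {
            // Split the byte into its high and low nibble, high first.
            int[] pair = { (b >> 4) & 0xF, b & 0xF };
            for (int nibble : pair) {
                if (lead == -1) {
                    if (nibble < 13) {
                        // Singleton code: one of the 13 most frequent characters.
                        text.append(TABLE.charAt(nibble));
                    } else {
                        // Lead unit: remember it and wait for the trail unit.
                        lead = nibble;
                    }
                } else {
                    // Trail unit: combine with the lead unit and undo the offset.
                    int index = (lead << 4) + nibble - OFFSET;
                    if (index < TABLE.length()) {
                        text.append(TABLE.charAt(index)); // skip codes outside the short table
                    }
                    lead = -1;
                }
            }
        }
        return text.toString();
    }
}
```

    note that a zero padding nibble at the end decodes as a trailing space in this sketch (0x1D 0x00 comes out as "em "), so a real decoder needs the message length or a trim step to deal with that.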

  8. #7  
    ???

    funkE's Avatar
    graham? haha

    anyway, yeah. thanks for the explanation. it helps to understand what goes on in the client.

  9. #8  
    Renown Programmer
    veer's Avatar
    Join Date
    Nov 2007
    Posts
    3,746
    Thanks given
    354
    Thanks received
    1,370
    Rep Power
    3032
    oops totally thought that was graham

