Search code examples
javascriptnode.jsstringbinarycompression

Reliably compress a string of 1s and 0s into a smaller string, and convert it back to the Exact same string. Everytime


I was wondering if there is a better way to do this in NodeJS or plain javascript.

I wrote some functions that can do this by counting how many times they repeat, but i was wondering if there is a better way, or if theres any improvements to be made.

The input will always be a 256 bit number string consisting only of 1s and 0s ('00001110100101.....')

function toSmallString(x) { // {{{
  return new Promise((resolve, reject) => {
    // always start with a 0 or a 1 to show bigstring where to start
    var nn = `${x[0]}`;
    var ar = x.split('')
    var n = 0;
    var lv = ar[0];

    // a blank item to iterate, catches the last set of numbers
    ar.push('')

    // 0 means ten and continue since we will always be above 1
    for (const v of ar) {
      // if flipping bits, store number and start over
      if (lv !== v) { nn += n ;lv = v; n = 1 }

      // increment if last value is same as this value, if value becomes 10, add a 0 to nn and start over
      else { n = n + 1;if(n === 10){nn += '0';n = 0} }
    }
    resolve(nn)
  })
}

function toBigString(x) {
  return new Promise((resolve, reject) => {
    var ar = x.split('')
    var sn = parseInt(ar.shift())
    var nn = ''

        // 0 means 10 zeros and continue without bitflipping
        // any other number will expand and then flip bits

    for (const v of ar) {
      var n = parseInt(v)
      var i = 1
      if(n === 0){
        nn += sn.toString().repeat(10)
      } else {
        while(i <= n){
          nn += sn
          i = i + 1
        }
        // bit flip
        sn = (sn === 0) ? 1 : 0;
      }
    }
    resolve(nn)
  })
}

I tried a couple things, im very inexperienced (hexadecimal, scientific notation) but the values would change form when trying to convert it back to the original 256 bits of 1s and 0s.

each 1 and 0 is important, their placing and order is vital to preserve.

My plan is to store hundreds of thousands of these 256 bit strings, so a form of string conpression would be the way to go.

Edit: Here is a list of ~7k Input Strings


Solution

  • A simple solution could be to use a BigInt (which can safely represent a 256-bit number), then convert it to a hex string.

    function toSmallString(x) {
      return BigInt('0b' + x).toString(16); // bin => hex
    }
    function toBigString(x) {
      return BigInt('0x' + x).toString(2).padStart(256, '0'); // hex => bin
    }