-------------------------- Update -----------------------------
Some more context:
The actual situation is that I have this long string in environment A and need to copy it to environment B.
Unfortunately, environments A and B are not connected (no mutual access), so I'm looking for a way to encode the string into a shorter representation and decode it on the other side. Otherwise, for more files, I would have to type each string in by hand, which is slow and not reproducible.
Any suggestions or tool recommendations? Many thanks!
I'm facing an odd problem: encoding a very long binary string into a much shorter form, ideally just a handful of characters.
Say there is a long string consisting only of 1s and 0s, e.g. "110...011", with a length of 1,000 to 100,000 digits or even more. I would like to encode this string into something with fewer digits/characters, and then be able to decode it back to the original string.
Currently I am using Python's hex / int conversions to 'compress' the string and 'decompress' it back to its original form.
An example:
Input string: '110011110110011'
'''
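def Bi_to_Hex_Int(input_str, method):
    """Convert a string of 0s and 1s to a hex string ('hex') or an int ('int')."""
    input_two = str(input_str)
    if method == 'hex':
        # base 2 -> base 16
        result = hex(int(input_two, 2))
    elif method == 'int':
        # base 2 -> base 10
        result = int(input_two, 2)
    else:
        raise ValueError("method must be 'hex' or 'int'")
    # Note: on recent Python versions, str() of a very large int can raise
    # ValueError because of the default int/str conversion digit limit.
    print("input_bi length", len(input_two), "\n output length", len(str(result)), '\n method: {}'.format(method))
    return result

gene = '110011110110011'  # the input string from above
res_16 = Bi_to_Hex_Int(gene, 'hex')
# == '0x67b3'
res_10 = Bi_to_Hex_Int(gene, 'int')
# == 26547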
'''
Then I can reverse it back:
'''
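def HexInt_to_bi(input_str, method):
    """Convert a hex string ('hex') or a decimal int/str ('int') back to a string of 0s and 1s."""
    # Note: leading zeros of the original string are not restored by this round trip.
    if method == 'hex':
        # base 16 -> base 2, dropping the '0b' prefix
        back_two = bin(int(input_str, 16))[2:]
    elif method == 'int':
        # base 10 -> base 2, dropping the '0b' prefix
        back_two = bin(int(input_str))[2:]
    else:
        raise ValueError("method must be 'hex' or 'int'")
    print("input length", len(str(input_str)), "\n output bi length", len(back_two))
    return back_two

hexback_two = HexInt_to_bi(res_16, 'hex')
intback_two = HexInt_to_bi(res_10, 'int')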
'''
But this has a problem. I tried a string of around 500 digits, 101010...0001, and the best 'compressed' result was around 127 characters using hex (4 binary digits per hex character, plus the '0x' prefix).
So is there a better way to 'compress' the string further, to fewer characters?
**Say, compress a 5,000-digit string of 1s and 0s to something like 50 or 100 digits/characters (or even fewer)?**
If you want it that simple: one hex character encodes 4 binary digits (2 ^ 4 = 16). The compression ratio you want is about 50 to 100 times. For a ratio of 50, you need to pack 50 binary digits into 1 character, which means an alphabet of 2 ^ 50 different characters to cover every combination. That is quite a lot.
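To make that arithmetic concrete, here is a small sketch. The helper names `chars_needed` and `bits_per_char_needed` are made up for illustration; they only apply the rule that an alphabet of 2 ^ k symbols carries k bits per character.

```python
import math

def chars_needed(n_bits: int, alphabet_size: int) -> int:
    """Minimum number of characters that can represent ANY n_bits-long bit string."""
    bits_per_char = math.log2(alphabet_size)
    return math.ceil(n_bits / bits_per_char)

def bits_per_char_needed(n_bits: int, n_chars: int) -> int:
    """Bits each character must carry to fit n_bits into n_chars characters."""
    return math.ceil(n_bits / n_chars)

print(chars_needed(5000, 16))           # hex alphabet
print(chars_needed(5000, 64))           # base64 alphabet
print(bits_per_char_needed(5000, 100))  # k, i.e. an alphabet of 2 ** k symbols
```

For 5,000 binary digits this gives 1250 hex characters, 834 base64 characters, and 50 bits per character if you insist on 100 characters. That bound holds for arbitrary input; only strings with some regularity can be squeezed below it.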
If you can accept a lower ratio, you may try base64 as described here. Its compression ratio is 6 to 1, since each base64 character carries 6 binary digits.
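A minimal sketch of the base64 route, using only the standard `base64` module. The function names and the `<length>:<payload>` text format are assumptions made for this example; the length prefix is there so that leading zeros survive the round trip.

```python
import base64

def bits_to_b64(bits: str) -> str:
    """Encode a string of '0'/'1' as '<length>:<base64 text>'."""
    n = len(bits)
    value = int(bits, 2) if n else 0
    raw = value.to_bytes((n + 7) // 8, "big")  # pack 8 binary digits per byte
    return f"{n}:{base64.b64encode(raw).decode('ascii')}"

def b64_to_bits(text: str) -> str:
    """Invert bits_to_b64, restoring leading zeros from the stored length."""
    length_str, payload = text.split(":", 1)
    n = int(length_str)
    value = int.from_bytes(base64.b64decode(payload), "big")
    return format(value, "b").zfill(n) if n else ""

encoded = bits_to_b64("110011110110011")
assert b64_to_bits(encoded) == "110011110110011"
```

Because of the padding to whole bytes and the base64 `=` padding, the result is slightly longer than the ideal one character per 6 binary digits.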
Otherwise you have to come up with a more complex algorithm: splitting your string into blocks, looking for similar blocks among them, encoding them with different symbols, building a map of those symbols, and so on.
It is probably easier to compress your string with an archiver, then return a base64 representation of the result.
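One way this could look, assuming `zlib` from the standard library stands in for the archiver. The function names and the `<length>:<payload>` format are again invented for the example.

```python
import base64
import zlib

def compress_bits(bits: str) -> str:
    """Pack the bits into bytes, zlib-compress them, and return '<length>:<base64 text>'."""
    n = len(bits)
    raw = int(bits, 2).to_bytes((n + 7) // 8, "big") if n else b""
    packed = zlib.compress(raw, 9)  # 9 = highest compression level
    return f"{n}:{base64.b64encode(packed).decode('ascii')}"

def decompress_bits(text: str) -> str:
    """Invert compress_bits, restoring leading zeros from the stored length."""
    length_str, payload = text.split(":", 1)
    n = int(length_str)
    raw = zlib.decompress(base64.b64decode(payload))
    return format(int.from_bytes(raw, "big"), "b").zfill(n) if n else ""

sample = "10" * 2500  # a highly repetitive 5,000-digit string
short = compress_bits(sample)
assert decompress_bits(short) == sample
print(len(sample), "->", len(short))
```

How much this helps depends entirely on the data. A repetitive string like the sample shrinks dramatically, while a random-looking one will not get noticeably shorter than plain base64 and may even grow a little.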
If the task allows, you may store the whole strings somewhere and give them short unique names, so that instead of compressing and decompressing you store and retrieve strings by name.
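A rough sketch of the store-by-name idea. Everything here is hypothetical: the in-memory `store` dict stands in for whatever shared storage both sides can reach, and the name is simply the first 12 hex characters of a SHA-256 digest.

```python
import hashlib

store = {}  # hypothetical stand-in for a file, database, or other shared storage

def put(bits: str) -> str:
    """Save the string and return a short name derived from its content."""
    name = hashlib.sha256(bits.encode("ascii")).hexdigest()[:12]
    store[name] = bits
    return name

def get(name: str) -> str:
    """Look the original string up by its short name."""
    return store[name]

name = put("110011110110011")
assert get(name) == "110011110110011"
```

This only works if both environments can access the same storage, so it may not fit a setup where the two sides are fully disconnected.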