bash encoding character-encoding base64

Encode to Base64 with the UTF-8 character set in Linux


I am trying to encode the contents of a VQL file to Base64 using the UTF-8 character set.

I tried base64 file-name, but the encoded output differs from what my Java code produces:

String text = new String(Files.readAllBytes(Paths.get("C:\\Documents\\pull.vql")), StandardCharsets.UTF_8);
String encodedString = Base64.getEncoder().encodeToString(text.getBytes());
System.out.println(encodedString);

I have also tried the following, without any luck:

base64 pull.vql -d | iconv -f utf8 -t iso8859-5

Part of the input is the following:

# REQUIRES-PROPERTIES-FILE - # Do not remove this comment!
# 
# Generated with Platform 8.0 update 20220815.

ENTER SINGLE USER MODE;
# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


# 0 ====================================================================

# #######################################
# DATABASE
# #######################################
CREATE OR REPLACE DATABASE ci_cd_test 'db to implement and test ci-cd workflow';

CONNECT DATABASE ci_cd_test;

# #######################################
# FOLDERS
# #######################################
CREATE OR REPLACE FOLDER '/1 - Connectivity' ;

CREATE OR REPLACE FOLDER '/1 - connectivity/1 - data sources' ;

CREATE OR REPLACE FOLDER '/1 - connectivity/2 - base views' ;

Desired output:

IyBSRVFVSVJFUy1QUk9QRVJUSUVTLUZJTEUgLSAjIERvIG5vdCByZW1vdmUgdGhpcyBjb21tZW50IQ0KIyANCiMgR2VuZXJhdGVkIHdpdGggUGxhdGZvcm0gOC4wIHVwZGF0ZSAyMDIyMDgxNS4NCg0KRU5URVIgU0lOR0xFIFVTRVIgTU9ERTsNCiMgKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrDQoNCg0KIyAwID09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09DQoNCiMgIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjDQojIERBVEFCQVNFDQojICMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIw0KQ1JFQVRFIE9SIFJFUExBQ0UgREFUQUJBU0UgY2lfY2RfdGVzdCAnZGIgdG8gaW1wbGVtZW50IGFuZCB0ZXN0IGNpLWNkIHdvcmtmbG93JzsNCg0KQ09OTkVDVCBEQVRBQkFTRSBjaV9jZF90ZXN0Ow0KDQojICMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIw0KIyBGT0xERVJTDQojICMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIw0KQ1JFQVRFIE9SIFJFUExBQ0UgRk9MREVSICcvMSAtIENvbm5lY3Rpdml0eScgOw0KDQpDUkVBVEUgT1IgUkVQTEFDRSBGT0xERVIgJy8xIC0gY29ubmVjdGl2aXR5LzEgLSBkYXRhIHNvdXJjZXMnIDsNCg0KQ1JFQVRFIE9SIFJFUExBQ0UgRk9MREVSICcvMSAtIGNvbm5lY3Rpdml0eS8yIC0gYmFzZSB2aWV3cycgOw==

The outputs from Java and bash are completely different. Any suggestions would be great. Thanks.


Solution

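  • If you process the file on Linux but want the same output as on Windows, first convert the Linux line endings (LF) to Windows line endings (CRLF), e.g. with sed. The desired output decodes to text whose lines end in CRLF (the recurring 0K / DQo groups are the bytes 0d 0a), which is why encoding the LF version of the file gives a different string.

    So sed 's/$/\r/g' pull.vql | base64 -w 0 should give the expected result; alternatively, convert the file's line endings before encoding. The trailing iconv -f utf8 -t iso8859-5 step from the question can be dropped, because Base64 output is plain ASCII and iconv leaves it unchanged.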
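
The line-ending conversion followed by Base64 encoding can be sketched as a small shell function. This is a minimal sketch, not a definitive implementation: it assumes GNU sed (for \r in sed expressions) and GNU coreutils base64 (for the -w 0 flag), and the function name to_crlf_base64 is hypothetical. Unlike the one-liner in the answer, it first strips any existing CR so that a file already using CRLF does not end up with doubled carriage returns.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name not from the original answer):
# Base64-encode a file as if it had Windows (CRLF) line endings.
# Assumes GNU sed and GNU coreutils base64.
to_crlf_base64() {
    # 1st expression: drop an existing CR at end of line (avoids CR CR LF).
    # 2nd expression: append a CR to every line, so LF becomes CRLF.
    # base64 -w 0 disables line wrapping, giving one continuous string.
    sed -e 's/\r$//' -e 's/$/\r/' "$1" | base64 -w 0
}
```

Usage would be to_crlf_base64 pull.vql. As a quick check, a file containing the two lines a and b (each terminated by LF) encodes to YQ0KYg0K, the same string produced for the CRLF version of that file. Note that the function also appends a CR to a final line that has no trailing newline, so the last few characters can still differ from output generated on Windows if the original file does not end with a line break.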