https://nodejs.org/docs/latest/api/process.html#processargv https://www.golinuxcloud.com/pass-arguments-to-npm-script/
I am passing a parameter to a script invoked from package.json, as follows:
--pathToFile=./ESMM/Parametrização_Dezembro_PS1_2022.xlsx
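In the code, I retrieve that parameter from process.argv:
const value = process.argv.find((element) => element.startsWith("--pathToFile="));
// Fall back to an empty string when the flag is missing, instead of throwing on undefined.
const pathToFile = value ? value.replace("--pathToFile=", "") : "";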
The string obtained seems to be in the wrong format/encoding:
./ESMM/Parametriza├º├úo_Dezembro_PS1_2022.xlsx
I tried converting to latin1 (other past issues were fixed with this encoding):
const buffer = require("buffer");
const latin1Buffer = buffer.transcode(Buffer.from(pathToFile), "utf8", "latin1");
const latin1String = latin1Buffer.toString("latin1");
but I still don't get the string in the correct encoding:
./ESMM/Parametriza?º?úo_Dezembro_PS1_2022.xlsx
My package.json is in UTF-8.
My current locale is (chcp): Active code page: 850
OS: Windows
This seems to be related to:
I will try those configurations.
const min = parseInt("0xD800", 16), max = parseInt("0xDFFF", 16);
console.log(min); //55296
console.log(max); //57343
let textFiltered = "", specialChars = 0;
for (let charAux of pathToFile) {
  const hexChar = Buffer.from(charAux, 'utf8').toString('hex');
  console.log(hexChar);
  const intChar = parseInt(hexChar, 16);
  if (hexChar.length > 2) {
    //if(intChar>min && intChar<max){
    //console.log(Buffer.from(charAux, 'utf8').toString('hex'))
    specialChars++;
    console.log(`specialChars(${specialChars}): ${hexChar}`);
  } else {
    textFiltered += String.fromCharCode(intChar);
  }
}
console.log(textFiltered); //normal characters
./ESMM/Parametrizao_Dezembro_PS1_2022.xlsx
console.log(`specialChars(${specialChars}): ${hexChar}`); //specialCharacters
specialChars(1): e2949c
specialChars(2): c2ba
specialChars(3): e2949c
specialChars(4): c3ba
It seems that the hex value e2949c marks a special character, since it repeats. Ideally 0xc2ba should convert to "ç" and 0xc3ba to "ã"; I am still trying to figure that out.
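To see what those logged hex values actually are, a minimal sketch can decode each one back to its character and compare with the UTF-8 bytes of the characters that were expected. The names loggedHex, decoded and expectedHex are only illustrative. Assuming the console really is on code page 850, the result is consistent with the UTF-8 bytes c3 a7 ("ç") and c3 a3 ("ã") having been read one byte at a time as CP850, where 0xC3 is "├", 0xA7 is "º" and 0xA3 is "ú".

```javascript
// Hex values copied from the log output above.
const loggedHex = ["e2949c", "c2ba", "e2949c", "c3ba"];

// Decode each value as UTF-8 and show the character with its code point.
const decoded = loggedHex.map((hex) => {
  const char = Buffer.from(hex, "hex").toString("utf8");
  const codePoint = char.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  return `${hex} -> ${char} (U+${codePoint})`;
});
console.log(decoded.join("\n"));

// For comparison: the UTF-8 bytes of the characters that were expected.
const expectedHex = ["ç", "ã"].map((c) => Buffer.from(c, "utf8").toString("hex"));
console.log(expectedHex); // [ 'c3a7', 'c3a3' ]
```

So each expected character arrives as a pair: "├" followed by a second character that carries the information needed to tell "ç" and "ã" apart.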
As @JosefZ indicated (but for Python), in my case I am going to use a direct conversion, since the parameter will always contain the keyword "Parametrização".
The problem I encountered in this case is that my package.json and my script are in the correct format, UTF-8, as stated by @tripleee (thanks for the help provided), but process.argv returns a <string[]>, which is basically UTF-16. So my solution is to deal with the ├, which in hex is "e2949c", and retrieve the correct characters:
const UTF8_Character = "e2949c"; //├
// for these cases, use this object, which maps each hex value to the correct character
const personalized_encoding = {
  "c2ba": "ç",
  "c3ba": "ã"
};
let textFiltered = "", specialChars = 0;
for (let charAux of pathToFile) {
  const hexChar = Buffer.from(charAux, 'utf8').toString('hex');
  //console.log(hexChar)
  const intChar = parseInt(hexChar, 16);
  if (hexChar.length > 2) {
    // skip the ├ marker that precedes each special character
    if (hexChar === UTF8_Character) continue;
    specialChars++;
    //console.log(`specialChars(${specialChars}): ${hexChar}`);
    // keep the original character when there is no mapping, instead of appending "undefined"
    textFiltered += personalized_encoding[hexChar] ?? charAux;
  } else {
    textFiltered += String.fromCharCode(intChar);
  }
}
console.log(textFiltered);
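A possible generalization of the lookup above, sketched under the assumption that the garbling comes from UTF-8 bytes being decoded as code page 850: map every character back to its CP850 byte and then decode the resulting bytes as UTF-8. The names CP850_BYTES and repairCp850Mojibake are hypothetical, and the table is deliberately partial; it lists only the three non-ASCII characters seen in this path.

```javascript
// Partial CP850 table (assumption: only the characters seen in this path are listed).
const CP850_BYTES = new Map([
  ["├", 0xc3],
  ["º", 0xa7],
  ["ú", 0xa3],
]);

// Hypothetical helper: re-encode the garbled text as CP850 bytes, then read them as UTF-8.
function repairCp850Mojibake(text) {
  const bytes = [];
  for (const char of text) {
    const codePoint = char.codePointAt(0);
    if (codePoint < 0x80) {
      bytes.push(codePoint); // ASCII is identical in CP850 and UTF-8
    } else if (CP850_BYTES.has(char)) {
      bytes.push(CP850_BYTES.get(char));
    } else {
      throw new Error(`No CP850 mapping listed for ${JSON.stringify(char)}`);
    }
  }
  return Buffer.from(bytes).toString("utf8");
}

const garbled = "./ESMM/Parametriza├º├úo_Dezembro_PS1_2022.xlsx";
const repaired = repairCp850Mojibake(garbled);
console.log(repaired);
```

Unlike a table keyed on the second character only, this keeps the "├" byte and lets the UTF-8 decoder rebuild each character, so other accented letters would only need their CP850 entries added. A library with a complete CP850 table (iconv-lite is one that supports cp850) could replace the hand-written map.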