I'm having trouble deleting the subsequence \u000 from my string.
First, I read a byte[] from my file into a string with String str = new String(bytes, "UTF8"); and get a str equal to \u0004Word, which means 4Word, where 4 is the length of the word Word. Now I need to convert it to a regular 4Word. Calls such as replaceAll("\u000", "") and replaceAll("\\\\u000", "") don't work. How can I do that?
void FillingStorage() throws Exception {
    Path path = Paths.get(System.getProperty("db.file")); // that's my file
    byte[] data = Files.readAllBytes(path);
    String str = new String(data, "UTF8");
    System.out.println(str);
    String res = str.replaceAll("I don't know what to write here cos nothing I've tried works");
    return;
}
UPDATE:
First, I fill my HashMap with Key -> Value and Key1 -> Value1, then I write it to a file as bytes.
When I convert it back to a string and print it, I see Key Value Key1 Value1 instead of 3Key 5Value 4Key1 6Value1. Surprisingly, if you inspect the string that I print, you will see something like \u0003Key \u0005Value and so on, so it looks like my string does contain these numbers but Java can't print them.
This is how I write my bytes to the file:
DataOutputStream stream = new DataOutputStream(new FileOutputStream(System.getProperty("db.file"), true));
for (Map.Entry<String, String> entry : storage.entrySet()) {
    byte[] bytesKey = entry.getKey().getBytes(StandardCharsets.UTF_8);
    stream.write((int)bytesKey.length); // it disappears!
    stream.write(bytesKey);
    byte[] bytesVal = entry.getValue().getBytes(StandardCharsets.UTF_8);
    stream.write((Integer)bytesVal.length); // disappears too!
    stream.write(bytesVal);
}
stream.close();
First of all, your requirement does not call for regular expressions, so you should have used replace() instead.
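As a minimal sketch of that point, assuming the string really does start with the single control character U+0004 (the class and method names here are made up for illustration), replace() treats its argument as literal text, so no regex escaping is involved:

```java
class ReplaceDemo {
    // Hypothetical helper: removes every U+0004 control character.
    // replace() matches literal text, so no regular expression is needed.
    static String stripLengthMarker(String s) {
        return s.replace("\u0004", "");
    }

    public static void main(String[] args) {
        // One control character followed by the four letters of "Word".
        String str = "\u0004Word";
        System.out.println(str.length());           // 5
        System.out.println(stripLengthMarker(str)); // Word
    }
}
```

This only handles the one marker value shown in the question; a different word length produces a different control character.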
Second, \uxxxx is the syntax for a Unicode escape in Java source code, so it is not clear that you actually have the characters \, u, 0, 0, 0 in your string; it is much more likely that your byte array simply starts with a single byte equal to 4, which is the string length.
In that case you should simply discard the initial byte from the array when converting to String, using the constructor which accepts offset and len arguments.
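A rough illustration of that constructor, assuming the array holds exactly one length byte followed by the UTF-8 bytes of the word (the sample data and the helper name are invented for this sketch):

```java
import java.nio.charset.StandardCharsets;

class SkipPrefixDemo {
    // Hypothetical helper: reads the leading length byte, then decodes
    // that many bytes starting at offset 1.
    static String decodeSkippingLength(byte[] data) {
        int len = data[0] & 0xFF; // treat the byte as unsigned
        return new String(data, 1, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed sample: length 4, then the bytes of "Word".
        byte[] data = {4, 'W', 'o', 'r', 'd'};
        System.out.println(decodeSkippingLength(data)); // Word
    }
}
```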
If you do happen to have all those chars in the string, simply using substring to get rid of the initial 6 characters should be all you need.
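For the file layout described in the question's update (a length byte, the key bytes, a length byte, the value bytes, repeated), one possible way to read everything back is sketched below. The class and method names are hypothetical, and the sketch assumes each key and value is at most 255 bytes, because write(int) on a DataOutputStream stores only the low eight bits of the length.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class RecordReaderDemo {
    // Hypothetical reader for alternating length-prefixed keys and values.
    static Map<String, String> readAll(byte[] data) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                String key = readField(in);
                String value = readField(in);
                result.put(key, value);
            }
        }
        return result;
    }

    // Reads one unsigned length byte, then exactly that many UTF-8 bytes.
    static String readField(DataInputStream in) throws IOException {
        int len = in.readUnsignedByte();
        byte[] buf = new byte[len];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample: 3 "Key" 5 "Value", as the question's writer would emit.
        byte[] data = {3, 'K', 'e', 'y', 5, 'V', 'a', 'l', 'u', 'e'};
        System.out.println(readAll(data)); // {Key=Value}
    }
}
```

With real data, the byte array would come from Files.readAllBytes(path) as in the question; the length bytes are consumed as numbers instead of being decoded as invisible control characters.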