Tags: java, split, emoticons

String split in Java does not work?


I split this String on spaces: String input = ":-) :) :o) :] :3 :c) :> =] 8) =) :} :^)"; (there is a space between the emoticons)

And the result is:

:-)?:)?:o)?:]?:3?:c)?:>
=]

8)

=)?:}?:^)

There are some strange characters in the results. I don't know why. Please help me.

Here is the code:

fileReader = new BufferedReader(new FileReader("emoticon.txt"));
String line = "";
while ((line = fileReader.readLine()) != null){
    String[] icons = parts[0].split("\\s+");
    ....
}

Thanks for any advice. Here is the emoticon file:
https://www.dropbox.com/s/6ovz0aupqo1utrx/emoticon.txt


Solution

  • String input = ":-) :) :o) :] :3 :c) :> =] 8) =) :} :^)";
    String[] similies = input.split(" ");
    for (String simili : similies) {
        System.out.println(simili);
    }
    

    This works fine. Output:

    :-)
    :)
    :o)
    :]
    :3
    :c)
    :>
    =]
    8)
    =)
    :}
    :^)
    

    If the input may contain tabs, newlines, or runs of several spaces and you want to split on all of them, you can use

    input.split("\\s+"); 
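
    One detail worth checking: by default, Java's `\s` covers only the ASCII whitespace characters, so a no-break space (U+00A0) is not treated as a separator. The sketch below illustrates the difference; the class and method names are made up for this example.

    ```java
    import java.util.Arrays;

    public class WhitespaceSplitDemo {
        // Splits on ASCII whitespace only; a no-break space is not matched by \s here.
        static String[] splitAscii(String s) {
            return s.split("\\s+");
        }

        // Splits on ASCII whitespace and on the no-break space (U+00A0).
        static String[] splitWithNbsp(String s) {
            return s.split("[\\s\\u00A0]+");
        }

        public static void main(String[] args) {
            // A no-break space sits between the first two emoticons, a normal space before the third.
            String input = ":-)\u00A0:) :o)";
            System.out.println(Arrays.toString(splitAscii(input)));
            System.out.println(Arrays.toString(splitWithNbsp(input)));
        }
    }
    ```

    With this input, `splitAscii` returns two pieces (the first still contains the no-break space) and `splitWithNbsp` returns three.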
    

    In your example the file contains a few more characters, such as Â and non-breaking spaces, so you have to handle these characters explicitly. Here is the code:

    import java.io.BufferedReader;
    import java.io.FileReader;

    public static void main(final String[] args) throws Exception {
        // try-with-resources closes the reader once the loop is done
        try (BufferedReader fileReader = new BufferedReader(new FileReader("emoticon.txt"))) {
            String line;
            while ((line = fileReader.readLine()) != null) {
                // Remove the stray "Â" characters
                line = line.replace("Â", "");
                // Turn non-breaking spaces (char 160, U+00A0) into ordinary spaces
                line = line.replace((char) 160, ' ');
                System.out.println("line: " + line);
                String[] icons = line.split("\\s+");
                for (String icon : icons) {
                    System.out.println(icon);
                }
                System.out.println("=======================");
            }
        }
    }
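
    A stray "Â" in front of a non-breaking space is typically what you see when UTF-8 text is decoded with a single-byte charset such as ISO-8859-1, because U+00A0 is stored in UTF-8 as the two bytes C2 A0. Assuming emoticon.txt really is UTF-8 (not verified here), an alternative sketch is to read it with an explicit charset and split on the no-break space directly. `EmoticonReader` and `readIcons` are hypothetical names used only for illustration.

    ```java
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;

    public class EmoticonReader {
        // Reads the file as UTF-8 (an assumption about the file) and splits each
        // line on ASCII whitespace or no-break spaces (U+00A0).
        static List<String[]> readIcons(Path file) throws IOException {
            List<String[]> result = new ArrayList<>();
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    result.add(line.trim().split("[\\s\\u00A0]+"));
                }
            }
            return result;
        }

        public static void main(String[] args) throws IOException {
            for (String[] icons : readIcons(Paths.get("emoticon.txt"))) {
                for (String icon : icons) {
                    System.out.println(icon);
                }
                System.out.println("=======================");
            }
        }
    }
    ```

    If the file turns out not to be valid UTF-8, `Files.newBufferedReader` will throw a `MalformedInputException` while reading, which is a hint that a different charset is needed.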