
Problems with German umlauts over TCP/IP from SWI-Prolog to Java


I am trying to implement a TCP/IP connection between a Prolog server application and a Java client application. It works, but the German umlauts (ä, ü, etc.) do not arrive correctly. This is what I did:

In Java, on the client side:

inputreader = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),"UTF-8" ));

...

StringBuilder sb = new StringBuilder();
String responseLine;
while ((responseLine = inputreader.readLine()) != null) {
    // System.out.println("Server: " + responseLine);

    // Character.toString ((char) i);
    sb.append(responseLine);
    if( responseLine.indexOf( ".") != -1 && responseLine.length() == 1 ){
       break;
    }
}
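The loop above stops at a line that consists only of ".", which the server appends as its end-of-message marker. A minimal sketch of the same framing as a reusable method follows; the class name FramedReader and the method readMessage are hypothetical, and a StringReader stands in for the socket stream. Unlike the loop above, it leaves the marker out of the result and keeps the line breaks between lines, which is an assumption about what the client wants.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the original client.
class FramedReader {

    /**
     * Reads lines until a line consisting only of "." (the end-of-message
     * marker) or end of stream, and returns the lines before it joined by '\n'.
     */
    static String readMessage(BufferedReader in) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.equals(".")) {
                    break; // marker reached; it is not part of the payload
                }
                if (!first) {
                    sb.append('\n'); // restore the line break readLine() removed
                }
                sb.append(line);
                first = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the socket input: one framed message, then more data.
        BufferedReader in = new BufferedReader(
                new StringReader("<reply>\nGem\u00fcse\n</reply>\n.\nnext message"));
        System.out.println(readMessage(in));
    }
}
```

Comparing the whole line with equals(".") is equivalent to the indexOf/length test in the original loop and is easier to read.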

In SWI-Prolog, on the server side:

:- set_prolog_flag(encoding,utf8).

% based on http://swi-prolog.996271.n3.nabble.com/socket-communication-td1653.html
server(PortNumber) :-
    setup_call_cleanup(tcp_socket(S),  % no leaks, please
           (true; fail),
           tcp_close_socket(S)),
    tcp_bind(S, PortNumber),
    tcp_listen(S, 5),
    format('listen to portnumber ~w~n', [PortNumber]),
    server_loop(S).

server_loop(S) :-
    tcp_accept(S, S1, From),
    format('receiving traffic from: ~q~n', [From]),
    setup_call_cleanup(tcp_open_socket(S1, In, Out),
           server_operation(In, Out),
           (  writeln('closing...'),
              close(In),
              close(Out))), !,
    server_loop(S).

server_operation(In, Out) :-
    \+at_end_of_stream(In),
    read_pending_input(In, Codes, []), 
    atom_codes(Text,Codes),
    write('received from client: '),write(Text),nl,

    % job which has to be done in Prolog
    extract_fct_call( Text, Fname, ListOfIname, ListOfIcontent ),
    call_fct( Fname, ListOfIname, ListOfIcontent, XMLreply ),

    atom_codes( XMLreply, CodeReply ),
    % append defined EOM as \n.\n
    append( CodeReply, [10,46,10], CodeMessage ),
    format(Out, '~s', [CodeMessage]),
    flush_output(Out),
    server_operation(In, Out).

server_operation(_In, _Out).

I thought that setting UTF-8 encoding on both sides would be enough, but it is not. I send "Gemüse" from the SWI-Prolog side and receive "Gemxse" on the Java side. I also tried sending ASCII codes by using

atom_codes( Message, CodeMessage ),
format(Out, '~s', [CodeMessage]),

and on the Java side

inputreader = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),"US-ASCII" ));

or

inputreader = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),"ASCII" ));

but the result does not change. What can I do?

Many thanks in advance!

SOLUTION:

With the help of CapelliC, the problem was solved by adding

set_stream(Stream, encoding(utf8))

to

server_operation(In, Out) :-
    set_stream(In, encoding(utf8)),
    set_stream(Out, encoding(utf8)),
    \+at_end_of_stream(In),
    ...

With this change, the umlauts arrive correctly in the Java application (make sure that the text encoding in the Java project's properties is also set to UTF-8).
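To illustrate why both ends have to agree, here is a small Java sketch of what a decoder does when the bytes on the wire are not in the encoding it expects. It assumes, purely for illustration, that the Prolog socket stream was writing a single-byte encoding such as ISO-8859-1 before set_stream was added; the question does not confirm which encoding was actually in use, and the class name EncodingMismatchDemo is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only uses in-memory byte arrays, no sockets.
class EncodingMismatchDemo {

    /** Decodes raw bytes with the given charset, as InputStreamReader would. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String word = "Gem\u00fcse"; // "Gemüse"

        // Assumption: the sender uses a single-byte encoding (u-umlaut = 0xFC).
        byte[] latin1 = word.getBytes(StandardCharsets.ISO_8859_1);
        // After the fix: the sender uses UTF-8 (u-umlaut = 0xC3 0xBC).
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        // Mismatch: 0xFC is not valid UTF-8, so the umlaut is replaced.
        System.out.println(decode(latin1, StandardCharsets.UTF_8).equals(word));   // false
        // US-ASCII on the reader cannot help either: it has no umlauts at all.
        System.out.println(decode(latin1, StandardCharsets.US_ASCII).equals(word)); // false
        // Match: both sides use UTF-8, the text survives.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(word));     // true
    }
}
```

In this sketch the mismatched cases produce the replacement character U+FFFD in place of the umlaut, which differs from the "Gemxse" reported above; the exact garbling depends on the encodings and the console involved.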


Solution

  • stream_property(Stream, encoding(Encoding)) lets you inspect the encoding on the Prolog side, and set_stream(Stream, encoding(Encoding)) lets you change it. Once you have verified that Prolog actually defaults to UTF-8 and you are happy with that, you may need to adjust the Java side instead, for example by specifying the encoding on every I/O operation:

    byte[] utf8Bytes = s.getBytes("UTF-8");
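As a sketch of what "specifying the encoding on every I/O operation" can look like on the Java side, the example below passes StandardCharsets.UTF_8 to both the writer and the reader instead of relying on the platform default. The class name Utf8RoundTrip is hypothetical, and in-memory byte streams stand in for clientSocket.getOutputStream() and clientSocket.getInputStream().

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; byte arrays replace the real socket streams.
class Utf8RoundTrip {

    /** Writes one line as UTF-8 and reads it back as UTF-8. */
    static String roundTrip(String text) {
        try {
            // Stand-in for clientSocket.getOutputStream().
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (Writer out = new BufferedWriter(
                    new OutputStreamWriter(wire, StandardCharsets.UTF_8))) {
                out.write(text);
                out.write("\n");
            }
            // Stand-in for clientSocket.getInputStream().
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    new ByteArrayInputStream(wire.toByteArray()),
                    StandardCharsets.UTF_8))) {
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String word = "Gem\u00fcse"; // "Gemüse"
        System.out.println(roundTrip(word).equals(word)); // true
    }
}
```

Using the StandardCharsets constants rather than the string "UTF-8" avoids the checked UnsupportedEncodingException and any risk of a misspelled charset name.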