java gson telegram-bot tamil

com.google.code.gson cannot parse Tamil results


I'm trying to fetch JSON results from https://api-thirukkural.vercel.app/api?num=1139 using Java-Telegram-Bot-Api and send them to Telegram. I use the com.google.code.gson dependency for parsing the JSON.

The expected results from API:

{"number":1139,"sect_tam":"காமத்துப்பால்","chapgrp_tam":"களவியல்","chap_tam":"நாணுத் துறவுரைத்தல்","line1":"அறிகிலார் எல்லாரும் என்றேஎன் காமம்","line2":"மறுகின் மறுகும் மருண்டு.","tam_exp":"என்னைத் தவிர யாரும் அறியவில்லை என்பதற்காக என் காதல் தெருவில் பரவி மயங்கித் திரிகின்றது போலும்!","sect_eng":"Love","chapgrp_eng":"The Pre-marital love","chap_eng":"Declaration of Love's special Excellence","eng":"My perplexed love roves public street Believing that none knows its secret","eng_exp":"And thus, in public ways, perturbed will rove"}

Here is a piece of my Java code:

        String results = "";
        Random random = new Random();
        SendMessage message = new SendMessage();
        String apiUrl = "https://api-thirukkural.vercel.app/api?num=" + random.nextInt(1329 + 1);
        try {
            URL url = new URL(apiUrl);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            Scanner sc = new Scanner(url.openStream());
            while (sc.hasNext()) {
                results += sc.nextLine();
            }
            sc.close();
            JSONArray jsonArray = new JSONArray("[" + results + "]");
            JSONObject object = jsonArray.getJSONObject(0);
            message.setChatId(update.getMessage().getChatId().toString());
            message.setText("Number: " + object.getInt("number") + "\n\n" + object.getString("line1") + "\n"
                    + object.getString("line2") + "\n\n" + object.getString("tam_exp") + "\n\n" + object.getString("eng_exp"));
            conn.disconnect();
            execute(message);
        } catch (Exception e) {
            e.printStackTrace();
        }

The result in Telegram:

Number: 1139

அறிகிலார� எல�லார�ம� என�றேஎன� காமம�
மற�கின� மற�க�ம� மர�ண�ட�.

என�னைத� தவிர யார�ம� அறியவில�லை என�பதற�காக என� காதல� தெர�வில� பரவி மயங�கித� திரிகின�றத� போல�ம�!

And thus, in public ways, perturbed will rove

Is this a problem with the gson dependency? Can someone help me fix this? Thanks.


Solution

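  • You need to specify the charset on the Scanner; that is probably the problem. When no charset is given, Scanner decodes the bytes with the platform's default charset, which may not be UTF-8, so the JSON parser likely receives text that is already corrupted.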

    Example:

    new Scanner(url.openStream(), StandardCharsets.UTF_8.name());
    

    Use the charset that matches the encoding of the response. JSON is normally sent as UTF-8, so UTF-8 is the likely fit here.
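
The effect of the charset argument can be reproduced without a network call. The following is a minimal sketch, not the asker's code: the class name CharsetDemo and the helper readAll are hypothetical, and a ByteArrayInputStream stands in for url.openStream(). It decodes the same UTF-8 bytes twice, once with the matching charset and once with a deliberately wrong one (ISO-8859-1, chosen only for illustration).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class CharsetDemo {
    // Hypothetical helper: reads the whole stream as text, decoding the
    // bytes with the named charset instead of the platform default.
    public static String readAll(InputStream in, String charsetName) {
        StringBuilder sb = new StringBuilder();
        try (Scanner sc = new Scanner(in, charsetName)) {
            while (sc.hasNextLine()) {
                sb.append(sc.nextLine());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A short Tamil word written with Unicode escapes, so the result
        // does not depend on the encoding of this source file.
        String original = "\u0B95\u0BBE\u0BAE\u0BAE\u0BCD";

        // Stand-in for the bytes a server would send for a UTF-8 response.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset round-trips the text.
        String ok = readAll(new ByteArrayInputStream(utf8),
                StandardCharsets.UTF_8.name());

        // Decoding the same bytes with a mismatched charset does not.
        String bad = readAll(new ByteArrayInputStream(utf8),
                StandardCharsets.ISO_8859_1.name());

        System.out.println("UTF-8 matches original:      " + original.equals(ok));
        System.out.println("ISO-8859-1 matches original: " + original.equals(bad));
    }
}
```

In the question's code, the equivalent change would be passing StandardCharsets.UTF_8.name() as the second argument to the Scanner constructor, as in the example above.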