Tags: java, performance, hashmap

How can I reduce the time of the search? (edit1)


I have a problem with my code: it takes too long to run the program when I call the method somme_2, and I want to reduce the run time. By the way, the txt file that I'm using in this program contains almost 500_000 lines. Do you have any idea how to fix it?

This is my main:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Main {
    public static void main(String[] args) throws IOException {
        Somme2 somme2 = new Somme2("src/dico.txt");
        //somme2.remove(alphabeticalOrder("Amsterdam" + "ADN"), "adn");
        //somme2.remove(alphabeticalOrder("Amsterdam" + "ADN"), "riflassent");
        somme2.somme_2(alphabeticalOrder("volontiers" + "tranquillement"));

    }

    private static String alphabeticalOrder(String word) {
        word = word.toLowerCase();
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < word.length(); i++) {
            list.add(word.charAt(i));
        }
        Collections.sort(list);
        String string = "";
        for (int i = 0; i < list.size(); i++) {
            string = string + list.get(i);
        }
        return string;
    }
}

And this is my class, which contains the somme_2 function:

import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.*;

public class Somme2 {
    private String path;

    public Somme2(String path) {
        this.path = path;
    }

    /***
     * values() returns a list which contains every word in our file.
     * @return List<String>
     * @throws FileNotFoundException
     */
    private List<String> values() throws IOException {
        return Files.readAllLines(Path.of(path));
    }

    /***
     * alphabeticalOrder() returns the word in the parameter, alphabetically
     * @param word
     * @return String
     */
    private String alphabeticalOrder(String word) {
        word = word.toLowerCase();
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    /***
     * addToHashMap() returns a hashMap, which uses as a key alphabetically words, and as value classical words
     * @return HashMap<String, List<String>>
     * @throws FileNotFoundException
     */
    private HashMap<String, List<String>> addToHashMap() throws IOException {
        HashMap<String, List<String>> hashMap = new HashMap<>();
        for (String value : values()) {
            String word = alphabeticalOrder(value);
            if (hashMap.containsKey(word)) {
                hashMap.get(word).add(value);
            }
            else {
                List<String> list = new ArrayList<>();
                list.add(value);
                hashMap.put(word, list);
            }
        }
        return hashMap;
    }

    /***
     * findTheLetter return the index of the letter
     * @param letter
     * @param word
     * @return
     */
    private int findTheLetter(char letter, char[] word) {
        for (int i = 0; i < word.length; i++) {
            if (word[i] == letter) {
                return i;
            }
        }
        return 0;
    }

    private char[] removeLetter(char letter, char[] word) {
        int x = findTheLetter(letter, word);
        char[] newWord;
        if (word.length == 1) {
            newWord = new char[word.length];
        } else {
            newWord = new char[word.length - 1];
        }
        for (int i = 0; i < word.length - 1; i++) {
            if (i < x) {
                newWord[i] = word[i];
            }
            else {
                newWord[i] = word[i + 1];
            }
        }
        return newWord;
    }

    public String remove(String word, String string) {
        char[] myWord = string.toCharArray();
        char[] words = word.toCharArray();
        for (int i = 0; i < string.length(); i++) {
            words = removeLetter(myWord[i], words);
        }
        return new String(words);
    }


    /***
     * somme_2
     * @param alph
     * @throws FileNotFoundException
     */
    public void somme_2(String alph) throws IOException {
        HashMap<String, List<String >> hashMap = addToHashMap();
        List<String> strings = new ArrayList<>();
        strings.addAll(hashMap.keySet());
        String alphWord = "" + alph;
        for (String string : strings) {
            if (hashMap.containsKey(remove(alphabeticalOrder(alphWord), string))) {
                System.out.println("\"" + hashMap.get(string) + "\" and \"" + hashMap.get(remove(alphWord, string)) + "\" give \"" + alphWord + "\"");
                return;
            }
        }
        System.out.println("There are no words");
    }
}

In case you want to know, this is a part of the txt file:

A
ABS
ADN
ADNc
ADP
ADSL
AIEA
ARN
ARNm
ASBL
ASC
ASCII
AUD
Aarhus
Aaron
Aarschot
Abbeville
Abd
Abdelkader
Abel
Abidjan
Abitibi-Témiscamingue
Abkhazie
Abraham
Abu
Abuja
Abymes
Abyssinie
Acadie
Acapulco
Accra
Achaïe
Achgabat

I just fixed the time problem, but now my program sometimes shows a result like: "[constitutionnaliserions]" and "[V, v]" give "aeeeiilllmnnnooqrrstttuv"

which is wrong, because "constitutionnaliserions" + "V" alphabetically ordered doesn't give "aeeeiilllmnnnooqrrstttuv".

Thank you for your help!


Solution

  • There are a lot of ways to improve your code.

    Let's start with the most important part: the algorithm.

    You iterate over all possible pairs and check the condition, but you can instead iterate over all the words only once and, for each word, try to find a "complementary" word.

    Example: alph = 'aabbcc', and on the current iteration over all the words we have word = 'acc'. We want to find another word in the file (complementaryWord) which, combined with word, gives us alph. Obviously, this complementaryWord should be 'abb' (abb + acc = aabbcc), and it's easy to find, because you already have a HashMap keyed by the sorted letters.

    Overall, this will improve the complexity from O(n^2) to O(n).
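
    As a minimal, self-contained sketch of that single-pass lookup (complementOf and findPair are illustrative names, not from the original code; it assumes alph and the HashMap keys are already alphabetically sorted, which they are after alphabeticalOrder()):

    import java.util.List;
    import java.util.Map;

    public class ComplementSearch {

        // Hypothetical helper: returns alph with the letters of key removed,
        // or null if key needs a letter that alph does not contain.
        // Both strings must already be sorted alphabetically.
        static String complementOf(String alph, String key) {
            StringBuilder rest = new StringBuilder();
            int i = 0, j = 0;
            while (i < alph.length() && j < key.length()) {
                char a = alph.charAt(i);
                char k = key.charAt(j);
                if (a == k) {            // letter consumed by key
                    i++;
                    j++;
                } else if (a < k) {      // letter left over for the complement
                    rest.append(a);
                    i++;
                } else {                 // key has a letter alph does not have
                    return null;
                }
            }
            if (j < key.length()) {
                return null;             // unmatched letters left in key
            }
            return rest.append(alph, i, alph.length()).toString();
        }

        // Single pass: for each key, compute its complement and look it up directly.
        static void findPair(Map<String, List<String>> hashMap, String alph) {
            for (String key : hashMap.keySet()) {
                String complement = complementOf(alph, key);
                if (complement != null && hashMap.containsKey(complement)) {
                    System.out.println(hashMap.get(key) + " and "
                            + hashMap.get(complement) + " give " + alph);
                    return;
                }
            }
            System.out.println("There are no words");
        }
    }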

    Code improvements

    • alphabeticalOrder is doing a lot of unnecessary object allocations and overall doesn't look good. Try working directly with char[] here. Also, if you know that words can only contain a specific set of letters (e.g. only Latin letters, or some other fixed alphabet), you can use a bucket/counting sort to improve the time complexity (a sketch follows this list).

    • Scanner is slow for big input files. Use, for example, a BufferedReader, or read the lines with the newer Files.readAllLines().

    • Remove duplicated computations: for example, compute alphabeticalOrder(value) once and store the result in a variable.

    • In addToHashMap(), use computeIfAbsent() to make it shorter and clearer (a sketch follows this list as well).

    • Use equals() to compare strings instead of hashCode(). As noted in the comments, s1.hashCode() == s2.hashCode() does not mean that s1.equals(s2).
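
    For the alphabeticalOrder() point above, a sketch of the bucket/counting-sort idea as a drop-in replacement for the method in Somme2 (Somme2 already imports java.util.*, so Arrays is available; the assumption is that most words are plain a-z after toLowerCase(), and anything else, such as the accented or hyphenated entries in the file, falls back to the general sort):

    private String alphabeticalOrder(String word) {
        word = word.toLowerCase();
        int[] counts = new int[26];
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c < 'a' || c > 'z') {
                // accented letter, hyphen, etc.: fall back to the general sort
                char[] chars = word.toCharArray();
                Arrays.sort(chars);
                return new String(chars);
            }
            counts[c - 'a']++;
        }
        // rebuild the word in alphabetical order from the letter counts
        StringBuilder sorted = new StringBuilder(word.length());
        for (char c = 'a'; c <= 'z'; c++) {
            for (int n = 0; n < counts[c - 'a']; n++) {
                sorted.append(c);
            }
        }
        return sorted.toString();
    }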
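
    And for the computeIfAbsent() point, a sketch of addToHashMap() where the sorted key is computed once per word and the list creation is folded into a single call:

    private HashMap<String, List<String>> addToHashMap() throws IOException {
        HashMap<String, List<String>> hashMap = new HashMap<>();
        for (String value : values()) {
            String key = alphabeticalOrder(value);  // computed once per word
            hashMap.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return hashMap;
    }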