I have a problem with my code, it takes too long to run the program when I call the method somme_2, and I want to reduce the run time. By the way the txt file that I'm using in this program contains almost 500_000 lines. Do you have any idea how to fix it ?
This is my main
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class Main {
public static void main(String[] args) throws IOException {
Somme2 somme2 = new Somme2("src/dico.txt");
//somme2.remove(alphabeticalOrder("Amsterdam" + "ADN"), "adn");
//somme2.remove(alphabeticalOrder("Amsterdam" + "ADN"), "riflassent");
somme2.somme_2(alphabeticalOrder("volontiers" + "tranquillement"));
}
private static String alphabeticalOrder(String word) {
word = word.toLowerCase();
List<Character> list = new ArrayList<>();
for (int i = 0; i < word.length(); i++) {
list.add(word.charAt(i));
}
Collections.sort(list);
String string = "";
for (int i = 0; i < list.size(); i++) {
string = string + list.get(i);
}
return string;
}
}
And this is my class which contain the somme_2 function :
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.*;
public class Somme2 {
private String path;
public Somme2(String path) {
this.path = path;
}
/***
* values() returns a list which contains every word in our file.
* @return List<String>
* @throws FileNotFoundException
*/
private List<String> values() throws IOException {
return Files.readAllLines(Path.of(path));
}
/***
* alphabeticalOrder() returns the word in the parameter, alphabetically
* @param word
* @return String
*/
private String alphabeticalOrder(String word) {
word = word.toLowerCase();
char[] chars = word.toCharArray();
Arrays.sort(chars);
return new String(chars);
}
/***
* addToHashMap() returns a hashMap, which uses as a key alphabetically words, and as value classical words
* @return HashMap<String, List<String>>
* @throws FileNotFoundException
*/
private HashMap<String, List<String>> addToHashMap() throws IOException {
HashMap<String, List<String>> hashMap = new HashMap<>();
for (String value : values()) {
String word = alphabeticalOrder(value);
if (hashMap.containsKey(word)) {
hashMap.get(word).add(value);
}
else {
List<String> list = new ArrayList<>();
list.add(value);
hashMap.put(word, list);
}
}
return hashMap;
}
/***
* findTheLetter return the index of the letter
* @param letter
* @param word
* @return
*/
private int findTheLetter(char letter, char[] word) {
for (int i = 0; i < word.length; i++) {
if (word[i] == letter) {
return i;
}
}
return 0;
}
private char[] removeLetter(char letter, char[] word) {
int x = findTheLetter(letter, word);
char[] newWord;
if (word.length == 1) {
newWord = new char[word.length];
} else {
newWord = new char[word.length - 1];
}
for (int i = 0; i < word.length - 1; i++) {
if (i < x) {
newWord[i] = word[i];
}
else {
newWord[i] = word[i + 1];
}
}
return newWord;
}
public String remove(String word, String string) {
char[] myWord = string.toCharArray();
char[] words = word.toCharArray();
for (int i = 0; i < string.length(); i++) {
words = removeLetter(myWord[i], words);
}
return new String(words);
}
/***
* somme_2
* @param alph
* @throws FileNotFoundException
*/
public void somme_2(String alph) throws IOException {
HashMap<String, List<String >> hashMap = addToHashMap();
List<String> strings = new ArrayList<>();
strings.addAll(hashMap.keySet());
String alphWord = "" + alph;
for (String string : strings) {
if (hashMap.containsKey(remove(alphabeticalOrder(alphWord), string))) {
System.out.println("\"" + hashMap.get(string) + "\" and \"" + hashMap.get(remove(alphWord, string)) + "\" give \"" + alphWord + "\"");
return;
}
}
System.out.println("There are no words");
}
}
in case you wanna know, this is a part of the txt file :
A
ABS
ADN
ADNc
ADP
ADSL
AIEA
ARN
ARNm
ASBL
ASC
ASCII
AUD
Aarhus
Aaron
Aarschot
Abbeville
Abd
Abdelkader
Abel
Abidjan
Abitibi-Témiscamingue
Abkhazie
Abraham
Abu
Abuja
Abymes
Abyssinie
Acadie
Acapulco
Accra
Achaïe
Achgabat
I just fix the time problem, but now my program sometimes shows result like : "[constitutionnaliserions]" and "[V, v]" give "aeeeiilllmnnnooqrrstttuv"
which is wrong because "constitutionnaliserions" + "V" alphabetically ordered don't give "aeeeiilllmnnnooqrrstttuv".
Thank you for your help !
There are a lot of ways to improve your code.
You iterate over all possible pairs and check the condition, but you can iterate only 1 time over all the words and for each word try to find "complementary" word.
Example: alph
= 'aabbcc' and on current iteration over all words we have word
= 'acc'. And we want to find another word in our file (complementaryWord
), which combined with word
will give us alph
. Obviously, this complementaryWord
should be abb
(abb
+ acc
= aabbcc
), and its easy to find this word, because you already have hashmap with them.
Overall, it will improve complexity from O(n^2)
to O(n)
.
alphabeticalOrder
is doing a lot of unnecessary object allocaions and overall is doesn't look good. Try working here directly with char[]
. Also, if you know that words could contain only specific set of letters (like only latin or some other alphabet) you can use bucket sort
to improve time complexity.
Scanner
is slow for big input files. Use, for example, BufferedRead
or read lines using newer Files.readAllLines()
.
remove duplicated computations: for example, compute alphabeticalOrder(value)
once and store result in variable.
in addToHashMap()
use computeIfAbsent()to make it shorter and clearer
use equals()
to compare string instead of hashCode()
. As noted in the comments, if s1.hashCode() == s2.hashCode()
it doesnt mean that s1.equals(s2)
.