
How can I retrieve the values in a HashMap stored in an ArrayList of HashMaps?


I am a beginner in Java. I have loaded each text document and stored its individual words in a HashMap. After that, I stored all the HashMaps in an ArrayList. Now I am stuck on how to retrieve all the words from the HashMaps that are in the ArrayList!

 private static long numOfWords = 0;
 private String userInputString;
 private static long wordCount(String data) {
    long words = 0;
    int index = 0;
    boolean prevWhiteSpace = true;
    while (index < data.length()) {
        //Initialise character variable that will be checked.
        char c = data.charAt(index++);
        //Determine whether it is a space.
        boolean currWhiteSpace = Character.isWhitespace(c);

        //If previous is a space and character checked is not a space,
        if (prevWhiteSpace && !currWhiteSpace) {
            words++;
        }
        //Assign current character's determination of whether it is a spacing as previous.
        prevWhiteSpace = currWhiteSpace;
    }
    return words;
} //
public static ArrayList StoreLoadedFiles()throws Exception{
final File f1 = new     File   ("C:/Users/Admin/Desktop/dataFiles/"); //specify the directory to load files
 String data=""; //reset the words stored
  ArrayList<HashMap> hmArr = new ArrayList<HashMap>(); //array of hashmap


   for (final File fileEntry : f1.listFiles()) {
   Scanner input = new Scanner(fileEntry); //load files
     while (input.hasNext()) { //while there are still words in the document, continue to load all the words in a file

            data += input.next();
            input.useDelimiter("\t"); //similar to split function 

        } //while loop       
     String  textWords = data.replaceAll("\\s+", " "); //remove all found whitespaces

 HashMap<String, Integer> hm = new HashMap<String, Integer>();  //Creates a Hashmap that would be renewed when next document is loaded.

    String[] words = textWords.split(" "); //store individual words into a String array
     for (int j = 0; j < numOfWords; j++) {
                int wordAppearCount = 0;
                if (hm.containsKey(words[j].toLowerCase().replaceAll("\\W", ""))) { //replace non-word characters
                    wordAppearCount = hm.get(words[j].toLowerCase().replaceAll("\\W", "")); //remove  non-word character and retrieve the index of the word
                }
                if (!words[j].toLowerCase().replaceAll("\\W", "").equals("")) {
                    //Words stored in hashmap are in lower case and have special characters removed.
                    hm.put(words[j].toLowerCase().replaceAll("\\W", ""), ++wordAppearCount);//index of word and string word stored in hashmap
                }
   }
      hmArr.add(hm);//stores every single hashmap inside an ArrayList of hashmap
   } //end of for loop
   return hmArr; //return hashmap ArrayList
}
    public static void LoadAllHashmapWords(ArrayList m){

for(int i=0;i<m.size();i++){
m.get(i); //stuck here!

   }

Solution

  • First, your logic won't work correctly. In the StoreLoadedFiles() method you iterate through the words with for (int j = 0; j < numOfWords; j++) {. The numOfWords field is initialized to zero and is never updated in the code shown, so this loop won't execute at all. You should use the length of the words array (words.length) as the loop bound instead.

    Having said that, to retrieve the values from a list of HashMaps, first iterate through the list, and for each HashMap take its entry set. A Map.Entry is the key-value pair that you store in the HashMap. When you invoke map.entrySet(), it returns a java.util.Set<Map.Entry<K, V>>. A Set is returned because the keys are unique, so no two entries can be equal.
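The nested iteration described above can be sketched in isolation as follows. This is a minimal illustration, not the asker's program: the class name EntrySetDemo, the helper describeAll and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names and sample data are illustrative only.
class EntrySetDemo {

    // Collects a "word - count" line for every entry of every map in the list.
    static List<String> describeAll(List<HashMap<String, Integer>> maps) {
        List<String> lines = new ArrayList<>();
        for (HashMap<String, Integer> map : maps) {                     // outer loop: each map
            for (Map.Entry<String, Integer> entry : map.entrySet()) {   // inner loop: each pair
                lines.add(entry.getKey() + " - " + entry.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> first = new HashMap<>();
        first.put("java", 2);
        HashMap<String, Integer> second = new HashMap<>();
        second.put("list", 1);

        List<HashMap<String, Integer>> maps = new ArrayList<>();
        maps.add(first);
        maps.add(second);

        for (String line : describeAll(maps)) {
            System.out.println(line);
        }
    }
}
```

Note that a HashMap makes no promise about the order of its entries, so with more than one word per map the lines may come out in any order.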

    So a complete program will look like this:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map.Entry;
    import java.util.Scanner;
    
    public class FileWordCounter {
    
        public static List<HashMap<String, Integer>> storeLoadedFiles()  {
            final File directory = new File("C:/Users/Admin/Desktop/dataFiles/"); 
            List<HashMap<String, Integer>> listOfWordCountMap = new ArrayList<HashMap<String, Integer>>(); 
            Scanner input = null;
            StringBuilder data; 
            try {
                for (final File fileEntry : directory.listFiles()) {
                    input = new Scanner(fileEntry);
                    input.useDelimiter("\t");
                    data = new StringBuilder();
                    while (input.hasNext()) { 
                        data.append(input.next()).append(' '); // keep a space so words on either side of a tab are not fused
                    }
                    input.close();
                    String wordsInFile = data.toString().replaceAll("\\s+", " "); 
                    HashMap<String, Integer> wordCountMap = new HashMap<String, Integer>(); 
    
                    for(String word : wordsInFile.split(" ")){
                        String strippedWord = word.toLowerCase().replaceAll("\\W", "");
                        int wordAppearCount = 0;
                        if(strippedWord.length() > 0){
                            if(wordCountMap.containsKey(strippedWord)){
                                wordAppearCount = wordCountMap.get(strippedWord);
                            }
                            wordCountMap.put(strippedWord, ++wordAppearCount);
                        }
                    }
                    listOfWordCountMap.add(wordCountMap);
                } 
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } finally {
                if(input != null) {
                    input.close();
                }
            }
            return listOfWordCountMap; 
        }
    
        public static void loadAllHashmapWords(List<HashMap<String, Integer>> listOfWordCountMap) {
            for(HashMap<String, Integer> wordCountMap : listOfWordCountMap){
                for(Entry<String, Integer> wordCountEntry : wordCountMap.entrySet()){
                    System.out.println(wordCountEntry.getKey() + " - " + wordCountEntry.getValue());
                }
            }
        }
    
        public static void main(String[] args) {
            List<HashMap<String, Integer>> listOfWordCountMap = storeLoadedFiles();
            loadAllHashmapWords(listOfWordCountMap);
        }
    }
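The program above prints every entry. If the goal is instead to look up one particular word in each file's map, a sketch along these lines may help. It assumes the keys were stored in lower case, as in the program above; the class name WordLookupDemo, the helper countsFor and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical demo class; the names and sample data are illustrative only.
class WordLookupDemo {

    // Returns the count of the given word in each map, in list order (0 when absent).
    static List<Integer> countsFor(String word, List<HashMap<String, Integer>> maps) {
        String key = word.toLowerCase(); // assumption: keys were stored in lower case
        List<Integer> counts = new ArrayList<>();
        for (HashMap<String, Integer> map : maps) {
            Integer count = map.get(key); // get() returns null when the key is missing
            counts.add(count == null ? 0 : count);
        }
        return counts;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> first = new HashMap<>();
        first.put("java", 3);
        HashMap<String, Integer> second = new HashMap<>();
        second.put("list", 1);

        List<HashMap<String, Integer>> maps = new ArrayList<>();
        maps.add(first);
        maps.add(second);

        System.out.println(countsFor("Java", maps));
    }
}
```

The null check matters because map.get(key) returns null for a missing word, and unboxing null into an int would throw a NullPointerException.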
    

    Since you are a beginner in Java programming, I would like to point out a few best practices that you could start using from the beginning.

    1. Closing resources: In the loop that reads the files you open a Scanner with Scanner input = new Scanner(fileEntry); but you never close it. This leaks the underlying file handle. You should always close resources, either in the finally block of a try-catch-finally or, from Java 7 onwards, with a try-with-resources statement.

    2. Avoid unnecessary redundant calls: If an operation gives the same result on every iteration of a loop, move it outside the loop. In your case, for example, setting the scanner delimiter with input.useDelimiter("\t"); is a one-time operation after the scanner is created, so you can move it out of the while loop.

    3. Use StringBuilder instead of String: Repeated string manipulations such as concatenation in a loop should be done with a StringBuilder (or a StringBuffer when you need synchronization) instead of += or +. This is because String is immutable, meaning its value cannot be changed, so each concatenation creates a new String object. This leaves a lot of unused instances in memory, whereas a StringBuilder is mutable and is modified in place.

    4. Naming convention: The usual Java convention for method and variable names is to start with a lower-case letter and capitalize the first letter of each following word. So it's standard practice to name a method storeLoadedFiles as opposed to StoreLoadedFiles. (This could be opinion based ;))

    5. Give descriptive names: It's good practice to give descriptive names, since it helps with later code maintenance. For example, wordCountMap is a better name than hm. Anyone who reads your code in the future will understand it better and faster with descriptive names. Again, opinion based.

    6. Use generics as much as possible: This lets the compiler check element types and avoids explicit casts when you read values back out of a collection.

    7. Avoid repetition: Similar to point 2, if an operation produces the same output and is needed several times, store the result in a variable and reuse the variable. In your case you were calling words[j].toLowerCase().replaceAll("\\W", "") multiple times. The result is the same every time, but each call creates unnecessary instances and repeats code. So you could assign it to a String once and use that String elsewhere.

    8. Try using a for-each loop wherever possible: This relieves you from managing an index.
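Points 1 to 3 can be combined in one small sketch. It uses try-with-resources (Java 7 and later) as an alternative to closing the Scanner in a finally block, which is a different technique from the one used in the program above. The class name ScannerReadDemo and the helper readAll are hypothetical, and a StringReader stands in for a file so the example runs without any files on disk.

```java
import java.io.StringReader;
import java.util.Scanner;

// Hypothetical demo class; the names and sample text are illustrative only.
class ScannerReadDemo {

    // Reads all tab-delimited tokens from the source and joins them with single spaces.
    static String readAll(Readable source) {
        StringBuilder data = new StringBuilder(); // mutable buffer instead of String +=
        // try-with-resources closes the Scanner automatically,
        // even if an exception is thrown inside the block.
        try (Scanner input = new Scanner(source)) {
            input.useDelimiter("\t"); // set once, before the loop
            while (input.hasNext()) {
                data.append(input.next()).append(' ');
            }
        }
        return data.toString().trim();
    }

    public static void main(String[] args) {
        // A Scanner built from a File is closed in the same way.
        System.out.println(readAll(new StringReader("one\ttwo three\tfour")));
    }
}
```

Because the Scanner is declared in the try header, no explicit close() call or null check is needed.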

    These are just suggestions. I tried to apply most of them in my code, but I won't say it's perfect. Since you are a beginner, if you adopt these practices now they'll become ingrained. Happy coding.. :)