Search code examples
javatrie

Trie datastructure in Java


So I built myself a trie data structure in java but instead of arrays containing LinkedLists to each node's children. But I am having some problems. The first word gets added just fine, but on the second word, it is always comparing the wrong prefixes. For example, I start off by adding "at". This works. Then I add "Hello" and this is the result:

    adding 'at'
CURRENT CHAR IS: a
List is empty, can't iterate
List is empty, can't iterate
Can't find this char, adding node...
Returning child
CURRENT CHAR IS: t
List is empty, can't iterate
List is empty, can't iterate
Can't find this char, adding node...
END OF LINE; SET IT TO TRUE--------------
Returning child
adding 'Hello'
CURRENT CHAR IS: H
List is NOT empty
char H lista a?
false
List is empty, can't iterate
List is NOT empty
char H lista a?
false
List is empty, can't iterate
Can't find this char, adding node...
Returning child
CURRENT CHAR IS: e
List is NOT empty
char e lista at?
false
List is empty, can't iterate
List is NOT empty
char e lista at?
false
List is empty, can't iterate
Can't find this char, adding node...
Returning child
CURRENT CHAR IS: l
List is empty, can't iterate
List is empty, can't iterate
Can't find this char, adding node...
Returning child
CURRENT CHAR IS: l
List is empty, can't iterate
List is empty, can't iterate
Can't find this char, adding node...
Returning child
CURRENT CHAR IS: o
List is empty, can't iterate
List is empty, can't iterate
Can't find this char, adding node...
END OF LINE; SET IT TO TRUE--------------

Here is my code(sorry for all the prints and comments, have been debugging for a couple of hours) Trie

public class Trie {

    private Node root;
    private int size;

    /**
     * Creates the root node and sets the size to 0.
     */
    public Trie() {
        root = new Node();
        size = 0;
    }

    /**
     * Adds a new word to the trie.
     * 
     * @param word
     * @return
     */
    //vi lägger in "Hello"
    public boolean add(String word) {
        System.out.println("adding " + word);
        Node curr = root;
        if (curr == null || word == null) {
            return false;
        }
        int i = 0;
        char[] chars = word.toCharArray();

        // Loop through all letters in the word.
        while (i < chars.length) {
            System.out.println("lengt = " + chars.length);
            LinkedList<Node> children = curr.getChildren();
            // If the children nodes does not contain the letter at i.
            System.out.println("BEFORE CURRENT CHAR IS: " + chars[i]);



            if (!childContain(children, String.valueOf(chars[i]))) {//TODO? listan är tom.
                // Insert a new node
                insertNode(curr, chars[i]);
                System.out.println("Can't find this word, adding...");
                // if we have traversed all letters.
                if (i == chars.length - 1) {
                    System.out.println("END OF LINE; SET IT TO TRUE--------------");
                    // Get the child and set word to true ie we have found it.
                    getChild(curr, chars[i]).setWord(true);
                    size++;
                    return true;
                }

            }
            // get the current child.
            curr = getChild(curr, chars[i]); //NOT SURE ABOUT THIS!
            // If the current childs text is equal the word or not the word.
            if (curr.getText().equals(word) && !curr.isWord()) {
                // set the current word to true.
                curr.setWord(true);
                size++;
                return true;
            }
            i++;
        }
        return false;
    }

    /**
     * Returns true if the given string is found.
     * TODO: FIX!
     * @param str
     * @return
     */

    public boolean find(String str) {
        System.out.println(str);
        System.out.println("-----------------------------------------");

        LinkedList<Node> children = root.getChildren();
        Node node = null;
        char[] chars = str.toCharArray();
        //Loop over all letters.
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            //If child contains c.
            if (childContain(children, String.valueOf(c))) {
                System.out.println("DET FINNS");
                //get the child and it's children.
                node = getChild(root, c);
                children = node.getChildren();

            } else {
                System.out.println("DET FINNS INGET");
                return false;
            }
        }
        return true;
    }

    /**
     * Inserts a new Node with a char.
     * 
     * @param node
     * @param c
     * @return
     */
    private Node insertNode(Node node, Character c) {
        if (childContain(node.getChildren(), String.valueOf(c))) {
            return null;
        }

        Node next = new Node(node.getText() + c.toString());
        node.getChildren().add(next);
        return next;
    }

    /**
     * Retrieves a given node's child with the character c
     * 
     * @param trie
     * @param c
     * @return
     */
    private Node getChild(Node node, Character c) {
        LinkedList<Node> list = node.getChildren();
        Iterator<Node> iter = list.iterator();
        while (iter.hasNext()) {
            Node n = iter.next();
            if (n.getText().equals(String.valueOf(c)));
            {
                System.out.println("Returning child");
                return n;
            }
        }
        System.out.println("Returning null"); // This could never happen
        return null;
    }

    /**
     * Checks if any child contains the char c.
     * 
     * @param list
     * @param c
     * @return
     */
    private boolean childContain(LinkedList<Node> list, String c) {
        Iterator<Node> iter = list.iterator();

        while (iter.hasNext()) {
            System.out.println("List is NOT empty");
            Node n = iter.next();

            //GÖr en substräng av den existerande noden

            System.out.println("char " + (c) +" lista " + n.getText() + "?");
            System.out.println(n.getText().equals(c));


            if (n.getText().equals(c))
            {
                return true;
            }
        }
        System.out.println("List is empty, can't iterate");
        return false;
    }

    /**
     * Returns the trie's size.
     * 
     * @return
     */
    public int getSize() {
        return size;
    }
}

Node:

public class Node {

    private boolean isWord;
    private String text;
    private LinkedList<Node> children;

    public Node() {
        children = new LinkedList<Node>();
        text = "";
        isWord = false;
    }

    public Node(String text) {
        this();
        this.text = text;
    }

    public LinkedList<Node> getChildren(){
        return children;
    }
    public boolean isWord() {
        return isWord;
    }

    public void setWord(boolean isWord) {
        this.isWord = isWord;
    }

    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }

    @Override
    public String toString(){
        return text;
    }
}

Solution

  • I found 3 bugs.

    Frist, this getChild() method has misplaced semicolon that is causing your method to return at the first node:

    if (n.getText().equals(String.valueOf(c)))/*; - remove this semicolon*/
            {
                System.out.println("Returning child");
                return n;
            }
    

    Second, I'm assuming you want each node in your trie to only contain a single letter. Therefore, you must correct the insertNode() method like so:

    Node next = new Node(/*node.getText() +  - remove this*/c.toString());
    

    If you run that code, you will get a NullPointerException trying to find words that you add. This is because in your find method has a few bugs. I rewrote it and added some comments that explain the changes. Please let me know if it's unclear:

    public boolean find(String str) {
    
        LinkedList<Node> children = root.getChildren();
        // start the node at the root
        Node node = root;
        char[] chars = str.toCharArray();
        //Loop over all letters.
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            //If child contains c.
            if (childContain(children, String.valueOf(c))) {
                //get the child *of the node, not root* and it's children.
                node = getChild(node, c);
    
                // there are better ways to handle this, but I think this explicitly shows what the situations is
                if (node == null) {
                    // we have reached a node that does not have children
                    if (i == chars.length - 1) {
                        // we are at the end of the word - it is found
                        return true;
                    } else {
                        // we are in the middle of the word - it is not present
                        return false;
                    }
                }
    
                // if we have reached the end of the word this will cause NullPointer
                children = node.getChildren();
    
            } else {
                return false;
            }
        }
        return true;
    }
    

    When I run this snippet:

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.add("at");
        trie.add("Hello");
        System.out.println(trie.find("at"));
        System.out.println(trie.find("Hello"));
        System.out.println(trie.find("yea"));
    }
    

    I get "true", "true", "false"