Monday, February 7, 2011

Ternary Search Tree implementation

The ternary search tree contains three types of links. First, there are pointers that correspond to links in the corresponding trie, shown as dashed down-arrows. Traversing a down-link corresponds to “matching” the character from which the arrow starts. The left- and right- links are traversed when the current character does not match the desired character at the current position. We take the left-link if the character we are looking for is alphabetically before the character in the current node, and the right-link in the opposite case.

Ternary tree data structure solves the memory problem of tries which we discussed in the earlier post in a more clever way. To avoid the memory occupied by unnecessary pointers, each trie node is represented as a tree-within-a-tree rather than as an array. Each non-null pointer in the trie node gets its own node in a ternary search tree.

Each node in a Ternary search tree could be implemented like this in java:

class TNode
{
    char m_char;
    TNode left, center, right;
    boolean wordEnd;

    public TNode(char ch, boolean wordEnd)
    {
        m_char = ch;
        wordEnd = wordEnd;
    }
}

Here is a Ternary search tree that stores words AB, ABBA, ABCD, and BCD. Nodes that terminate words are marked yellow:

image_thumb[7]

Here is a simple Ternary search tree implementation in java:


public class TernarySearchTree {

    TNode root = null;
   
    public TernarySearchTree()
    {
        this.root = null;
    }

    private void insert(String key, int pos, TNode node)
    {
        char s[] = key.toCharArray();
        if (node == null) { node = new TNode(s[pos], false); }

        if (s[pos] < node.m_char) {
            insert(key, pos, node.left);
            }
        else if (s[pos] > node.m_char) {
            insert(key, pos, node.right);
            }
        else
        {
            if (pos + 1 == key.length()) { node.wordEnd = true; }
            else { insert(key, pos + 1, node.center); }
        }
    }

    public void insert(String s)
    {
        if (s == null || s == "") throw new IllegalArgumentException();

        insert(s, 0, this.root);
    }

    public boolean containsKey(String key)
    {
        if (key == null || key == "") throw new IllegalArgumentException();

        int pos = 0;
        TNode node = this.root;
        char s[] = key.toCharArray();
        while (node != null)
        {

            if (s[pos] < node.m_char) { node = node.left; }
            else if (s[pos] > node.m_char) { node = node.right; }
            else
            {
                if (++pos == key.length()) return node.wordEnd;
                node = node.center;
            }
        }

        return false;
    }
   
    public static void main(String[] args)
    {
        TernarySearchTree tree = new TernarySearchTree();
        tree.insert("AB");
        tree.insert("ABBA");
        tree.insert("ABCD");
        tree.insert("BCD");
       
        boolean found = tree.containsKey("AB");
       
        if(found)
            System.out.println("AB is found in the tree");
        else
            System.out.println("AB is not found");
       
        found = tree.containsKey("ABCD");
       
        if(found)
            System.out.println("ABCD is found in the tree");
        else
            System.out.println("ABCD is not found");

    }
}

class TNode
{
    char m_char;
    TNode left, center, right;
    boolean wordEnd;

    public TNode(char ch, boolean wordEnd)
    {
        m_char = ch;
        wordEnd = wordEnd;
    }
}

Trie implementation

A trie is a tree-like data structure in which each node contains an array of pointers, one pointer for each character in the alphabet. Starting at the root node, we can trace a word by following pointers corresponding to the letters in the target word.

Some of the uses of this Trie data structure is

1) Auto completion of the words or sentences like in Google Suggest.

2) Dictionary etc..

Each node in a trie could be implemented like this in java:

class Node {
    Node[] children;
    boolean isKey;

    public Node() {
        isKey = false;
        children = new Node[26];
    }

    public Node(boolean key) {
        isKey = key;
        children = new Node[26];
    }

}

 

Here is a trie that stores words AB, ABBA, ABCD, and BCD. Nodes that terminate words are marked yellow:

 

image

 

Here is a simple Trie search tree implementation in java:

public class Trie {

    Node root;

    public Trie() {
        this.root = new Node();
    }

    /**
     * Method to insert a string to Node and its children
     *
     * @param key
     *            the string to insert (the string is assumed to be uppercase)
     * @return true if the node or one of its children is changed, false
     *         otherwise
     */
    public boolean insert(String key) {
        return root.insert(key.toUpperCase());
    }

    /**
     * Returns whether key is a valid prefix for certain key in this trie. For
     * example: if key "hello" is in this trie, tests with all prefixes "hel",
     * "hell", "hello" return true
     *
     * @param prefix
     *            the prefix to check
     * @return true if the prefix is valid, false otherwise
     */
    public boolean containPrefix(String prefix) {
        return root.containPrefix(prefix.toUpperCase());
    }

    /**
     * Returns whether key is a valid key in this trie. For example: if key
     * "hello" is in this trie, tests with all prefixes "hel", "hell" return
     * false
     *
     * @param key
     *            the key to check
     * @return true if the key is valid, false otherwise
     */
    public boolean containKey(String key) {
        return root.containKey(key.toUpperCase());
    }
   
    public static void main(String[] args)
    {
       
        Trie trie = new Trie();
        trie.insert("AB");
        trie.insert("ABBA");
        trie.insert("ABCD");
        trie.insert("BCD");
       
        boolean found = trie.containKey("AB");
       
        if(found)
            System.out.println("AB is found in the trie");
        else
            System.out.println("AB is not found");
       
        found = trie.containPrefix("ABC");
       
        if(found)
            System.out.println("ABC prefix is found in the trie");
        else
            System.out.println("ABC prefix is not found");

    }
}

class Node {
    Node[] children;
    boolean isKey;

    public Node() {
        isKey = false;
        children = new Node[26];
    }

    public Node(boolean key) {
        isKey = key;
        children = new Node[26];
    }

    /**
     * Method to insert a string to Node and its children
     *
     * @param key
     *            the string to insert (the string is assumed to be uppercase)
     * @return true if the node or one of its children is changed, false
     *         otherwise
     */
    public boolean insert(String key) {
        // If the key is empty, this node is a key
        if (key.length() == 0) {
            if (isKey)
                return false;
            else {
                isKey = true;
                return true;
            }
        } else {// otherwise, insert in one of its child

            int childNodePosition = key.charAt(0) - 'A';
            if (children[childNodePosition] == null) {
                children[childNodePosition] = new Node();
                children[childNodePosition].insert(key.substring(1));
                return true;
            } else {
                return children[childNodePosition].insert(key.substring(1));
            }
        }
    }

    /**
     * Returns whether key is a valid prefix for certain key in this trie. For
     * example: if key "hello" is in this trie, tests with all prefixes "hel",
     * "hell", "hello" return true
     *
     * @param prefix
     *            the prefix to check
     * @return true if the prefix is valid, false otherwise
     */
    public boolean containPrefix(String prefix) {
        // If the prefix is empty, return true
        if (prefix.length() == 0) {
            return true;
        } else {// otherwise, check in one of its child
            int childNodePosition = prefix.charAt(0) - 'A';
            return children[childNodePosition] != null
                    && children[childNodePosition].containPrefix(prefix
                            .substring(1));
        }
    }

    /**
     * Returns whether key is a valid key in this trie. For example: if key
     * "hello" is in this trie, tests with all prefixes "hel", "hell" return
     * false
     *
     * @param key
     *            the key to check
     * @return true if the key is valid, false otherwise
     */
    public boolean containKey(String key) {
        // If the prefix is empty, return true
        if (key.length() == 0) {
            return isKey;
        } else {// otherwise, check in one of its child
            int childNodePosition = key.charAt(0) - 'A';
            return children[childNodePosition] != null
                    && children[childNodePosition].containKey(key.substring(1));
        }
    }

    public boolean isKey() {
        return isKey;
    }

    public void setKey(boolean key) {
        isKey = key;
    }
}