Monday, February 7, 2011

Ternary Search Tree implementation

The ternary search tree contains three types of links. First, there are pointers that correspond to links in the corresponding trie, shown as dashed down-arrows. Traversing a down-link corresponds to “matching” the character from which the arrow starts. The left- and right- links are traversed when the current character does not match the desired character at the current position. We take the left-link if the character we are looking for is alphabetically before the character in the current node, and the right-link in the opposite case.

Ternary tree data structure solves the memory problem of tries which we discussed in the earlier post in a more clever way. To avoid the memory occupied by unnecessary pointers, each trie node is represented as a tree-within-a-tree rather than as an array. Each non-null pointer in the trie node gets its own node in a ternary search tree.

Each node in a Ternary search tree could be implemented like this in java:

class TNode
{
    char m_char;
    TNode left, center, right;
    boolean wordEnd;

    public TNode(char ch, boolean wordEnd)
    {
        m_char = ch;
        wordEnd = wordEnd;
    }
}

Here is a Ternary search tree that stores words AB, ABBA, ABCD, and BCD. Nodes that terminate words are marked yellow:

image_thumb[7]

Here is a simple Ternary search tree implementation in java:


public class TernarySearchTree {

    TNode root = null;
   
    public TernarySearchTree()
    {
        this.root = null;
    }

    private void insert(String key, int pos, TNode node)
    {
        char s[] = key.toCharArray();
        if (node == null) { node = new TNode(s[pos], false); }

        if (s[pos] < node.m_char) {
            insert(key, pos, node.left);
            }
        else if (s[pos] > node.m_char) {
            insert(key, pos, node.right);
            }
        else
        {
            if (pos + 1 == key.length()) { node.wordEnd = true; }
            else { insert(key, pos + 1, node.center); }
        }
    }

    public void insert(String s)
    {
        if (s == null || s == "") throw new IllegalArgumentException();

        insert(s, 0, this.root);
    }

    public boolean containsKey(String key)
    {
        if (key == null || key == "") throw new IllegalArgumentException();

        int pos = 0;
        TNode node = this.root;
        char s[] = key.toCharArray();
        while (node != null)
        {

            if (s[pos] < node.m_char) { node = node.left; }
            else if (s[pos] > node.m_char) { node = node.right; }
            else
            {
                if (++pos == key.length()) return node.wordEnd;
                node = node.center;
            }
        }

        return false;
    }
   
    public static void main(String[] args)
    {
        TernarySearchTree tree = new TernarySearchTree();
        tree.insert("AB");
        tree.insert("ABBA");
        tree.insert("ABCD");
        tree.insert("BCD");
       
        boolean found = tree.containsKey("AB");
       
        if(found)
            System.out.println("AB is found in the tree");
        else
            System.out.println("AB is not found");
       
        found = tree.containsKey("ABCD");
       
        if(found)
            System.out.println("ABCD is found in the tree");
        else
            System.out.println("ABCD is not found");

    }
}

class TNode
{
    char m_char;
    TNode left, center, right;
    boolean wordEnd;

    public TNode(char ch, boolean wordEnd)
    {
        m_char = ch;
        wordEnd = wordEnd;
    }
}

5 comments:

eggmatters said...

Won't containsKey() always return false?

Prog47 said...

It should be noted that this code is actually WRONG.

Larry M. Lemons said...

The code is mostly right. With some minor changes (such as returning the object and storing it upon an insert), it can be fixed.

Larry M. Lemons said...

The code is mostly correct. With some minor changes, such as returning the object upon an insert, it can be fixed.

Unknown said...

This code is correct with modification. In the insert function
if (s[pos] < node.m_char)
{
insert(key, pos, node.left);
}

Here it is calling a new insert function with null node(node.left). Here node.left is a null node. So in each call of insert function a node is created but it is not linked with its parent node. So initialize a node(left/right/center) first then call insert function.

Modified part for left node:
if (s[pos] < node.m_char)
{
if(node.left == null) node.left = new Node(s[pos], false);
insert(key, pos, node.left);
}

So left node are linked with its parent. change for the right and center node.