Data Structure and Algorithm
Data Structure
Trie
Introduction to Trie

Trie Data Structure (Introduction)

A Trie (derived from the word "retrieval") is an efficient information retrieval data structure. It is also known as a Prefix Tree or Search Tree.

Using Trie, search complexities can be brought to an optimal limit (key length). If we store keys in a binary search tree, a well-balanced BST will need time proportional to M * log N, where M is the maximum string length and N is the number of keys in the tree. Using Trie, we can search the key in O(M) time. However, the penalty is on Trie storage requirements.


Need for Trie Data Structure

While search/insert operations in balanced trees (like AVL or Red-Black Trees) take O(M * log N) and hash tables average O(M), hash tables can have collisions, leading to O(N) worst-case lookups.

Advantages of Trie over Hash Table:

  1. Predictable Lookups: Trie provides a guaranteed O(M) lookup time complexity, where M is the length of the string.
  2. Prefix Search / Auto-Complete: Trie naturally supports prefix-based queries (e.g., finding all words starting with "and").
  3. No Collision Handling: Unlike hash tables, there are no collision issues or need for hash functions.
  4. Alphabetical Ordering: Trie keys are stored in a structured, lexicographical manner.

Properties of a Trie

  1. There is a single root node in each Trie, which typically does not represent any character.
  2. Each node (except the root) represents a single character.
  3. Node paths from the root to any node represent prefix/words.
  4. Each node contains a pointer array pointing to its child nodes (typically of size 26 for English alphabets) and a boolean flag isEndOfWord to indicate if the node marks the end of a complete word.

Representation of a Trie Node

In Javascript, a Trie Node can be represented as a class:

class TrieNode {
    constructor() {
        // Pointer array to child nodes (size 26 for English lowercase letters)
        this.children = new Array(26).fill(null);
        // Flag to mark the end of a word
        this.isEndOfWord = false;
    }
}

Basic Operations

1. Insertion

To insert a word into the Trie:

  1. Start at the root node.
  2. For each character in the word, calculate its index (e.g., char.charCodeAt(0) - 'a'.charCodeAt(0)).
  3. If the child node for that index doesn't exist, create a new TrieNode.
  4. Move the pointer to that child node.
  5. After processing all characters, set isEndOfWord to true on the final node.
function insert(root, key) {
    let curr = root;
 
    for (let i = 0; i < key.length; i++) {
        let index = key[i].charCodeAt(0) - 'a'.charCodeAt(0);
 
        if (curr.children[index] === null) {
            curr.children[index] = new TrieNode();
        }
        curr = curr.children[index];
    }
 
    // Mark last node as leaf
    curr.isEndOfWord = true;
}

2. Search

To search for a word in the Trie:

  1. Start at the root node.
  2. For each character, check if the corresponding child node exists.
  3. If at any point the child node is null, the word does not exist in the Trie.
  4. Move the pointer to the child node.
  5. After checking all characters, return true if isEndOfWord is true (meaning it's a complete word) and false otherwise.
function search(root, key) {
    let curr = root;
 
    for (let i = 0; i < key.length; i++) {
        let index = key[i].charCodeAt(0) - 'a'.charCodeAt(0);
 
        if (curr.children[index] === null) {
            return false;
        }
        curr = curr.children[index];
    }
 
    return curr.isEndOfWord;
}

Complexity Analysis

OperationTime ComplexityAuxiliary Space
InsertionO(M)O(M * N)
SearchingO(M)O(1)
  • M: Length of the word.
  • N: Number of words inserted (in worst-case, each character requires a new node allocation).