Trie Data Structure (Introduction)
A Trie (derived from the word "retrieval") is an efficient information retrieval data structure. It is also known as a Prefix Tree or Search Tree.
Using Trie, search complexities can be brought to an optimal limit (key length). If we store keys in a binary search tree, a well-balanced BST will need time proportional to M * log N, where M is the maximum string length and N is the number of keys in the tree. Using Trie, we can search the key in O(M) time. However, the penalty is on Trie storage requirements.
Need for Trie Data Structure
While search/insert operations in balanced trees (like AVL or Red-Black Trees) take O(M * log N) and hash tables average O(M), hash tables can have collisions, leading to O(N) worst-case lookups.
Advantages of Trie over Hash Table:
- Predictable Lookups: Trie provides a guaranteed
O(M)lookup time complexity, whereMis the length of the string. - Prefix Search / Auto-Complete: Trie naturally supports prefix-based queries (e.g., finding all words starting with "and").
- No Collision Handling: Unlike hash tables, there are no collision issues or need for hash functions.
- Alphabetical Ordering: Trie keys are stored in a structured, lexicographical manner.
Properties of a Trie
- There is a single root node in each Trie, which typically does not represent any character.
- Each node (except the root) represents a single character.
- Node paths from the root to any node represent prefix/words.
- Each node contains a pointer array pointing to its child nodes (typically of size 26 for English alphabets) and a boolean flag
isEndOfWordto indicate if the node marks the end of a complete word.
Representation of a Trie Node
In Javascript, a Trie Node can be represented as a class:
class TrieNode {
constructor() {
// Pointer array to child nodes (size 26 for English lowercase letters)
this.children = new Array(26).fill(null);
// Flag to mark the end of a word
this.isEndOfWord = false;
}
}Basic Operations
1. Insertion
To insert a word into the Trie:
- Start at the root node.
- For each character in the word, calculate its index (e.g.,
char.charCodeAt(0) - 'a'.charCodeAt(0)). - If the child node for that index doesn't exist, create a new
TrieNode. - Move the pointer to that child node.
- After processing all characters, set
isEndOfWordtotrueon the final node.
function insert(root, key) {
let curr = root;
for (let i = 0; i < key.length; i++) {
let index = key[i].charCodeAt(0) - 'a'.charCodeAt(0);
if (curr.children[index] === null) {
curr.children[index] = new TrieNode();
}
curr = curr.children[index];
}
// Mark last node as leaf
curr.isEndOfWord = true;
}2. Search
To search for a word in the Trie:
- Start at the root node.
- For each character, check if the corresponding child node exists.
- If at any point the child node is
null, the word does not exist in the Trie. - Move the pointer to the child node.
- After checking all characters, return
trueifisEndOfWordistrue(meaning it's a complete word) andfalseotherwise.
function search(root, key) {
let curr = root;
for (let i = 0; i < key.length; i++) {
let index = key[i].charCodeAt(0) - 'a'.charCodeAt(0);
if (curr.children[index] === null) {
return false;
}
curr = curr.children[index];
}
return curr.isEndOfWord;
}Complexity Analysis
| Operation | Time Complexity | Auxiliary Space |
|---|---|---|
| Insertion | O(M) | O(M * N) |
| Searching | O(M) | O(1) |
- M: Length of the word.
- N: Number of words inserted (in worst-case, each character requires a new node allocation).