String Algorithms & Techniques

Searching for a pattern in text by checking every position is O(n·m) — and on a repetitive text like "AAAAAAAAB" it does almost the same comparisons over and over. KMP does the same search in O(n + m) by never re-examining a character it already matched. How can it possibly know what to skip? That trick — and a handful of others — is what separates people who dread string problems from people who finish them in minutes.

Strings call for a set of techniques beyond what you saw with arrays. This page covers the core ones: pattern matching, two pointers, the sliding window on characters, and character frequency analysis.

Operation	Time	Space	Notes
Naive Pattern Match	O(n*m)	O(1)	Check every position
KMP Algorithm	O(n+m)	O(m)	Preprocess pattern with failure function
Rabin-Karp	O(n+m)	O(1)	Rolling hash; O(n*m) worst case
Two Pointers (palindrome)	O(n)	O(1)	Compare from both ends
Sliding Window	O(n)	O(k)	k = character set size
Character Frequency	O(n)	O(1)	Fixed 26 or 128 array

Naive Pattern Matching

The simplest way to find a pattern in a text: try every position and check character by character.

Analogy: You lost your house key somewhere in a row of boxes. You open each box, check if the key pattern matches — if not, move to the next box.

Pattern Matching: "ABD" in "ABCABCABD"

Text:    A B C A B C A B D
Pattern: A B D

Step 1 of 9. Position 0: Compare text[0]='A' with pattern[0]='A'. Match! Continue.

#include <vector>
#include <string>
using namespace std;
 
vector<int> naiveSearch(const string& text, const string& pattern) {
    vector<int> positions;
    int n = text.length(), m = pattern.length();
 
    for (int i = 0; i <= n - m; i++) {
        int j = 0;
        while (j < m && text[i + j] == pattern[j]) {
            j++;
        }
        if (j == m) {
            positions.push_back(i);  // Pattern found at index i
        }
    }
    return positions;
}

Time Complexity: O((n - m + 1) * m), which simplifies to O(n * m)
Space Complexity: O(1) extra (excluding output)
When to use: Small inputs, quick prototyping, or when the interviewer says “start with brute force”
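To make the O(n * m) cost concrete, here is a minimal sketch that counts character comparisons instead of collecting matches. The helper name countNaiveComparisons is hypothetical and exists only for this illustration.

```cpp
#include <string>
using namespace std;

// Counts how many text-vs-pattern character comparisons the naive scan makes.
// Hypothetical helper for illustration; it mirrors the loop structure above.
long long countNaiveComparisons(const string& text, const string& pattern) {
    int n = static_cast<int>(text.length());
    int m = static_cast<int>(pattern.length());
    long long comparisons = 0;

    for (int i = 0; i + m <= n; i++) {          // every alignment of the pattern
        for (int j = 0; j < m; j++) {
            comparisons++;                      // one character comparison
            if (text[i + j] != pattern[j]) break;
        }
    }
    return comparisons;
}
```

On a text of eight A's followed by B, searching for "AAAB" takes 6 alignments of 4 comparisons each, 24 in total, even though n + m is only 13. Repetitive inputs like this are where the naive scan wastes the most work.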

KMP (Knuth-Morris-Pratt) Algorithm

KMP avoids re-examining characters by preprocessing the pattern to build a “failure function” (also called the prefix table or LPS array). When a mismatch occurs, instead of going back to the start, we skip ahead using the failure function.

Analogy: Imagine you are comparing two rulers. When you find a mismatch, instead of starting over completely, you notice that part of what you already matched is also a prefix of the pattern — so you slide the pattern forward just enough to align that repeated prefix.

The LPS (Longest Proper Prefix which is also Suffix) Array:

For pattern "ABCABD", the LPS array is [0, 0, 0, 1, 2, 0].

LPS[3] = 1 because "ABCA" has prefix "A" that equals suffix "A" (length 1)
LPS[4] = 2 because "ABCAB" has prefix "AB" that equals suffix "AB" (length 2)
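One way to check a hand-computed LPS array is to apply the definition literally. The sketch below does that by brute force; lpsByDefinition is a hypothetical helper, it is much slower than the linear-time construction, and it is meant only for verifying small examples.

```cpp
#include <string>
#include <vector>
using namespace std;

// Brute-force LPS taken straight from the definition: for each prefix
// pattern[0..i], find the longest proper prefix that is also a suffix.
// Hypothetical checking aid, not the efficient construction.
vector<int> lpsByDefinition(const string& pattern) {
    int m = static_cast<int>(pattern.length());
    vector<int> lps(m, 0);

    for (int i = 0; i < m; i++) {
        // "Proper" means shorter than the whole prefix, so len is at most i.
        for (int len = i; len > 0; len--) {
            // Compare the first len characters with the last len characters.
            if (pattern.compare(0, len, pattern, i - len + 1, len) == 0) {
                lps[i] = len;
                break;
            }
        }
    }
    return lps;
}
```

For "ABCABD" this returns [0, 0, 0, 1, 2, 0], matching the values listed above.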

Building LPS array for pattern ABCABD

Index:   0 1 2 3 4 5
Pattern: A B C A B D

Step 1 of 6. LPS[0] is always 0. A pattern prefix of length 1 has no proper prefix that is also a suffix.

#include <vector>
#include <string>
using namespace std;
 
vector<int> buildLPS(const string& pattern) {
    int m = pattern.length();
    vector<int> lps(m, 0);
    int len = 0;  // Length of previous longest prefix suffix
    int i = 1;
 
    while (i < m) {
        if (pattern[i] == pattern[len]) {
            len++;
            lps[i] = len;
            i++;
        } else {
            if (len != 0) {
                len = lps[len - 1];  // Fall back (don't increment i)
            } else {
                lps[i] = 0;
                i++;
            }
        }
    }
    return lps;
}
 
vector<int> kmpSearch(const string& text, const string& pattern) {
    int n = text.length(), m = pattern.length();
    vector<int> positions;
    if (m == 0) return positions;  // Guard: lps[j - 1] below assumes m >= 1
    vector<int> lps = buildLPS(pattern);
 
    int i = 0, j = 0;
    while (i < n) {
        if (text[i] == pattern[j]) {
            i++;
            j++;
        }
        if (j == m) {
            positions.push_back(i - j);
            j = lps[j - 1];
        } else if (i < n && text[i] != pattern[j]) {
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return positions;
}

Time Complexity: O(n + m) — linear!
Space Complexity: O(m) for the LPS array
When to use: Large texts with repeated patterns, when O(n*m) is too slow

Interview reality: You rarely need to code KMP from scratch in interviews. But understanding why it works (the LPS array) is frequently asked. Focus on being able to explain the concept and compute the LPS array by hand.

Rabin-Karp Algorithm

Rabin-Karp uses a rolling hash to quickly compare the pattern with substrings of the text. Instead of comparing characters one by one, it compares hash values. Only when hashes match does it verify character by character.

Analogy: Instead of reading every book title to find a match, you compare the number of letters first (a quick hash). Only if the count matches do you actually read the title.

Rolling Hash Formula:

For a window of text t[i..i+m-1], the hash is:

hash = (t[i] * d^(m-1) + t[i+1] * d^(m-2) + ... + t[i+m-1]) mod q

When sliding the window by one position:

new_hash = (d * (old_hash - t[i] * d^(m-1)) + t[i+m]) mod q

This recalculation takes O(1) — that is the power of rolling hash.
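The two formulas above can be checked against each other with a small sketch. The names directHash and rollHash are hypothetical helpers; the code assumes h has been precomputed as d^(m-1) mod q and treats characters as unsigned bytes.

```cpp
#include <string>
using namespace std;

// Hash of s[start .. start+m-1], computed from scratch in O(m).
long long directHash(const string& s, int start, int m, long long d, long long q) {
    long long hash = 0;
    for (int k = 0; k < m; k++)
        hash = (d * hash + static_cast<unsigned char>(s[start + k])) % q;
    return hash;
}

// O(1) update: remove the outgoing character, shift by d, append the incoming one.
// Assumes h == d^(m-1) mod q.
long long rollHash(long long oldHash, char outgoing, char incoming,
                   long long h, long long d, long long q) {
    long long out = static_cast<unsigned char>(outgoing);
    long long in = static_cast<unsigned char>(incoming);
    long long without = ((oldHash - out * h) % q + q) % q;  // keep it non-negative
    return (d * without + in) % q;
}
```

Rolling the hash of "ABC" forward by dropping 'A' and appending 'D' gives the same value as hashing "BCD" directly, which is exactly what lets the search avoid rehashing every window.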

#include <vector>
#include <string>
using namespace std;
 
vector<int> rabinKarp(const string& text, const string& pattern) {
    int n = text.length(), m = pattern.length();
    int d = 256;      // Number of characters in alphabet
    int q = 101;      // A prime number for hashing
    vector<int> result;
    if (m == 0 || m > n) return result;  // Guard: the initial hash reads text[0..m-1]
 
    int h = 1;  // d^(m-1) % q
    for (int i = 0; i < m - 1; i++)
        h = (h * d) % q;
 
    // Calculate initial hashes
    // Treat bytes as unsigned so non-ASCII characters hash consistently
    auto val = [](char c) { return static_cast<int>(static_cast<unsigned char>(c)); };
    int patternHash = 0, textHash = 0;
    for (int i = 0; i < m; i++) {
        patternHash = (d * patternHash + val(pattern[i])) % q;
        textHash = (d * textHash + val(text[i])) % q;
    }
 
    // Slide the pattern over text
    for (int i = 0; i <= n - m; i++) {
        if (patternHash == textHash) {
            // Hash match -- verify characters (compare avoids a temporary substring)
            if (text.compare(i, m, pattern) == 0) {
                result.push_back(i);
            }
        }
        // Calculate hash for next window
        if (i < n - m) {
            textHash = (d * (textHash - val(text[i]) * h) + val(text[i + m])) % q;
            if (textHash < 0) textHash += q;
        }
    }
    return result;
}

Time Complexity: O(n + m) average, O(n * m) worst case (hash collisions)
Space Complexity: O(1) extra
When to use: Searching for multiple patterns, plagiarism detection, DNA sequence matching

Two Pointers for Palindrome Checking

The most common string technique in interviews: start from both ends and walk inward, comparing characters.

Analogy: Two people reading the same word from opposite ends of a banner. If they always see the same letter, it is a palindrome.

Palindrome check: Is 'racecar' a palindrome?

Index: 0 1 2 3 4 5 6
Chars: r a c e c a r

Step 1 of 4. left=0, right=6. s[0]='r' == s[6]='r'. Match! Move inward.

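#include <cctype>
#include <string>
using namespace std;
 
bool isPalindrome(const string& s) {
    int left = 0, right = static_cast<int>(s.length()) - 1;  // right is -1 for ""
    while (left < right) {
        if (s[left] != s[right]) return false;
        left++;
        right--;
    }
    return true;
}
 
// With alphanumeric filtering (LeetCode-style)
bool isPalindromeClean(const string& s) {
    // <cctype> functions require a non-negative value, so convert first
    auto uc = [](char c) { return static_cast<unsigned char>(c); };
    int left = 0, right = static_cast<int>(s.length()) - 1;
    while (left < right) {
        while (left < right && !isalnum(uc(s[left]))) left++;
        while (left < right && !isalnum(uc(s[right]))) right--;
        if (tolower(uc(s[left])) != tolower(uc(s[right]))) return false;
        left++;
        right--;
    }
    return true;
}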

Time Complexity: O(n) — single pass from both ends
Space Complexity: O(1)

Recall the palindrome check

Two pointers, walk inward, compare. Fill in the one comparison that decides it:


def is_palindrome(s):
  left, right = 0, len(s) - 1
  while left < right:
      # ??? the comparison that rejects a non-palindrome
          return False
      left += 1
      right -= 1
  return True

Sliding Window on Strings

The sliding window technique from arrays works even better on strings. It is the go-to approach for substring problems.

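Key pattern: Maintain a window [left, right] (both ends inclusive, as in the code below) and a record of the characters inside it, such as a set or a frequency map. Expand right to include more characters, and shrink from the left when a condition is violated.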

Example: Longest Substring Without Repeating Characters

Predict first

Sliding a no-repeats window across 'abcabcbb' — what's the length of the longest substring with all-distinct characters?

Sliding window on "abcabcbb"

Index: 0 1 2 3 4 5 6 7
Chars: a b c a b c b b

Step 1 of 8. Window: 'a'. Set: {a}. Length=1. Max=1.

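#include <algorithm>
#include <string>
#include <unordered_set>
using namespace std;
 
int lengthOfLongestSubstring(const string& s) {
    unordered_set<char> window;  // Characters currently inside [left, right]
    int left = 0, maxLen = 0;
    int n = s.length();
 
    for (int right = 0; right < n; right++) {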
        while (window.count(s[right])) {
            window.erase(s[left]);
            left++;
        }
        window.insert(s[right]);
        maxLen = max(maxLen, right - left + 1);
    }
    return maxLen;
}

Time Complexity: O(n) — each character is added and removed at most once
Space Complexity: O(min(n, k)) where k is the character set size
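The same expand-and-shrink loop also works with a frequency map instead of a set. As a hedged sketch, here is one possible variant that finds the longest substring with at most k distinct characters. The problem choice and the name longestWithAtMostKDistinct are illustrative assumptions, not something this page defines.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
using namespace std;

// Longest substring containing at most k distinct characters.
// Illustrative variant of the window pattern; the name is hypothetical.
int longestWithAtMostKDistinct(const string& s, int k) {
    unordered_map<char, int> freq;   // counts of characters inside [left, right]
    int left = 0, maxLen = 0;
    int n = static_cast<int>(s.length());

    for (int right = 0; right < n; right++) {
        freq[s[right]]++;                               // expand the window
        while (static_cast<int>(freq.size()) > k) {     // too many distinct chars
            if (--freq[s[left]] == 0) freq.erase(s[left]);
            left++;                                     // shrink from the left
        }
        maxLen = max(maxLen, right - left + 1);
    }
    return maxLen;
}
```

Only the violation test changes between window problems: here it is "more than k distinct characters", while the no-repeats problem uses "the incoming character is already in the window".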

Character Frequency Counting

The bread and butter of string problems. Counting character frequencies with an array or hash map unlocks solutions to anagrams, permutations, and character-based comparisons.

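#include <string>
using namespace std;
 
// Check if two strings are anagrams (assumes lowercase 'a'-'z' only;
// any other character would index outside freq)
bool isAnagram(const string& s, const string& t) {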
    if (s.length() != t.length()) return false;
 
    int freq[26] = {0};
    for (char c : s) freq[c - 'a']++;
    for (char c : t) freq[c - 'a']--;
 
    for (int i = 0; i < 26; i++) {
        if (freq[i] != 0) return false;
    }
    return true;
}

Time Complexity: O(n) — single pass to count, single pass to compare
Space Complexity: O(1) — fixed 26-element array for lowercase English
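When the input is not limited to lowercase English letters, the hash map option mentioned above is the safer choice. A minimal sketch follows; the name isAnagramAnyChars is hypothetical, and it compares raw char values, so it is byte-based rather than Unicode-aware.

```cpp
#include <string>
#include <unordered_map>
using namespace std;

// Anagram check for arbitrary char values using a hash map of counts.
// Hypothetical variant of isAnagram; compares bytes, not Unicode code points.
bool isAnagramAnyChars(const string& s, const string& t) {
    if (s.length() != t.length()) return false;

    unordered_map<char, int> freq;
    for (char c : s) freq[c]++;          // count every character of s
    for (char c : t) {
        auto it = freq.find(c);
        if (it == freq.end() || it->second == 0) return false;  // t has an extra c
        it->second--;
    }
    return true;  // equal lengths and no deficit, so every count is zero
}
```

The trade-off is O(k) space for k distinct characters and some hashing overhead, in exchange for dropping the lowercase-only assumption.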

Choosing the Right Technique

Problem Pattern	Technique	Time
“Find pattern in text”	Naive / KMP / Rabin-Karp	O(n*m) / O(n+m)
“Is it a palindrome?”	Two Pointers from ends	O(n)
“Longest/shortest substring with property X”	Sliding Window	O(n)
“Are these anagrams?”	Character Frequency Array	O(n)
“Group similar strings”	Sort each string or frequency key	O(n * k log k)
“All permutations of pattern in text”	Sliding Window + Frequency	O(n)

Mental model: If the problem is about a substring (contiguous), think Sliding Window. If it is about character composition (order does not matter), think Frequency Counting. If it is about matching from both ends, think Two Pointers.
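The "Sliding Window + Frequency" row combines two of these ideas. Below is a minimal sketch that assumes lowercase 'a'-'z' input, like isAnagram above. The name findAnagrams is hypothetical, and comparing the two 26-element arrays at each step is a simplification that keeps the code short while staying O(n) for a fixed alphabet.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Start indices of every window of text that is a permutation of pattern.
// Assumes lowercase 'a'-'z' input; hypothetical helper for illustration.
vector<int> findAnagrams(const string& text, const string& pattern) {
    vector<int> result;
    int n = static_cast<int>(text.length());
    int m = static_cast<int>(pattern.length());
    if (m == 0 || m > n) return result;

    int need[26] = {0}, have[26] = {0};
    for (char c : pattern) need[c - 'a']++;

    for (int right = 0; right < n; right++) {
        have[text[right] - 'a']++;                      // character enters the window
        if (right >= m) have[text[right - m] - 'a']--;  // character leaves the window
        // Once the window has length m, compare the two frequency arrays.
        if (right >= m - 1 && equal(need, need + 26, have))
            result.push_back(right - m + 1);
    }
    return result;
}
```

For text "cbaebabacd" and pattern "abc" this returns [0, 6], the windows "cba" and "bac". The window is contiguous, so it slides, but only the character counts matter inside it, so a frequency array decides each match.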

Quick Check

What does the LPS array in KMP represent?

You need to find all anagrams of a pattern (length m) in a text (length n). What is the optimal approach?

These techniques get their own full chapters later — pattern matching deepens in Day 27 (Advanced Strings), and the window/two-pointer ideas anchor Day 24 and Day 25. Next: the string practice questions, where you’ll choose among them under interview pressure. That wraps Day 2 — onward to Day 3 — Linked Lists, where giving up O(1) indexing buys O(1) insertion anywhere.
