Skip to main content

3 posts tagged with "java"

View All Tags

LeetCode 28: String Matching using KMP Algorithm

· 3 min read
Asif Rehan
Senior Software Engineer, ML Infra @ Ford Pro

Understanding the Problem

String matching is a fundamental problem in computer science. Given two strings - a "haystack" and a "needle" - we need to find if and where the needle appears in the haystack.

Naive Approach

The naive solution uses two pointers:

  • One for the "needle" string
  • One for the "haystack" string

Starting from each character in the haystack, we compare every character in the needle with the corresponding character in the haystack. If all characters match, we've found our solution. If any character doesn't match, we move to the next character in the haystack and restart from the beginning of the needle string.

This results in an O(mn) time complexity, where:

  • m = length of haystack string
  • n = length of needle string

The KMP Algorithm Solution

We can do better using the Knuth-Morris-Pratt (KMP) algorithm. This algorithm uses a clever technique called the prefix-suffix array, which helps skip parts of the haystack string that we already know won't match based on the pattern in the needle.

Key Benefits

  • Achieves linear O(m+n) time complexity
  • Uses extra memory for the prefix array
  • Significantly faster for large strings with repeating patterns

For a visual understanding of the algorithm, check out this excellent tutorial with animations on YouTube.

Java Implementation

Here's a complete implementation of the KMP algorithm:

public int strStr(String haystack, String needle) {
if (haystack == null || needle == null) return 0;
char[] text = haystack.toCharArray();
char[] pattern = needle.toCharArray();

if (pattern.length == 0) return 0;
if (text.length == 0) return -1;

int[] prefixSuffix = prefixSuffix(pattern);
int i = 0;
int j = 0;

while (i < text.length && j < pattern.length) {
if (text[i] == text[j]) {
i++;
j++;
} else {
if (j == 0) {
i++;
} else {
j = prefixSuffix[j-1];
}
}
}

if (j == pattern.length) {
return true;
}
return false;
}

int[] prefixSuffix(char[] pattern) {
int[] pre = new int[pattern.length];
pre[0] = 0;

int j = 0;
for (int i = 1; i < pattern.length; i++) {
while (j > 0 && pattern[i] != pattern[j]) {
j = pre[j-1];
}
if (pattern[i] == pattern[j]) {
j++;
}
pre[i] = j;
}
return pre;
}

Understanding the Implementation

Main Algorithm (strStr method)

  1. Initial null checks and edge cases
  2. Convert strings to char arrays for easier manipulation
  3. Generate the prefix-suffix array for the pattern
  4. Use two pointers to traverse the text and pattern
  5. Return true if the entire pattern is found, false otherwise

Prefix-Suffix Array Generation (prefixSuffix method)

  1. Creates an array to store prefix-suffix values
  2. Uses two pointers to track matching prefixes and suffixes
  3. Builds the array incrementally based on pattern matches
  4. Returns the completed prefix-suffix array

Time and Space Complexity

  • Time Complexity: O(m + n)
    • m = length of haystack string
    • n = length of needle string
  • Space Complexity: O(n)
    • Additional space needed for the prefix-suffix array

Conclusion

The KMP algorithm provides an efficient solution to the string matching problem by utilizing pattern information to avoid unnecessary comparisons. While it requires additional space and is more complex to implement than the naive approach, its linear time complexity makes it significantly more efficient for large strings, especially when dealing with patterns that have repeating elements.

Understanding QuickSort and QuickSelect using Hoare Partition

· 3 min read
Asif Rehan
Senior Software Engineer, ML Infra @ Ford Pro

This article explores the implementation of the QuickSort algorithm using Hoare's Partitioning scheme. While slightly more challenging to implement, this partitioning approach proves more efficient than Lomuto's scheme, performing three times fewer swaps on average [Wikipedia].

Algorithm Complexity

  • Average Time Complexity: O(n log n)
  • Worst Time Complexity: O(n²)
  • Space Complexity: O(1)

Understanding Hoare Partition Intuitively

The key distinction between Hoare and Lomuto partitioning schemes lies in their approach to array division. As explained by Kartikey Gupta on StackExchange:

Lomuto divides the array into three parts: elements smaller than the pivot, the pivot itself, and elements greater than the pivot. Hoare takes a different approach by dividing the array into two parts: elements smaller than the pivot, and elements equal to or greater than the pivot.

A common misconception arises from the fact that Lomuto's partition returns the final position of the pivot, whereas Hoare's partition returns the position just before the final position of the pivot. This explains why Lomuto quicksorts from lo to p-1, while Hoare quicksorts from lo to p.

Implementation in Java

Here's a complete implementation of QuickSort using Hoare's partitioning scheme:

public class QuickSortHoare {
int[] arr;

public QuickSortHoare(int[] arr) {
this.arr = arr;
}

private void sort(int i, int j) {
if (i < j) {
int p = partition(i, j);
sort(i, p);
sort(p + 1, j);
}
}

private void sort() {
sort(0, arr.length - 1);
}

private int partition(int left, int right) {
int pivot = (left + right) / 2;
int pivotValue = arr[pivot];
int i = left - 1;
int j = right + 1;

while (true) {
do {
i++;
} while (arr[i] < pivotValue);

do {
j--;
} while (arr[j] > pivotValue);

if (i < j) {
swap(i, j);
} else {
return j;
}
}
}

private void swap(int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
}

Key Implementation Details

  1. The partition method uses the middle element as the pivot
  2. Two pointers (i and j) move towards each other until they find elements that are on the wrong side
  3. The algorithm swaps elements when both pointers find elements to swap
  4. The process continues until the pointers meet or cross each other

Additional Resources

For more information about Hoare's partitioning scheme, check out Michael Cotterell's implementation. Note that his version returns the low pointer from partitioning, requiring recursive sorting on [low: partition-1] and [partition+1: high] intervals.

Conclusion

Hoare's partitioning scheme offers a more efficient approach to QuickSort implementation, despite its slightly more complex nature. Understanding the key differences between Hoare and Lomuto partitioning schemes is crucial for implementing and optimizing sorting algorithms in practice.

LeetCode: Kth Largest Element in an Array using QuickSelect with Hoare Partition

· 3 min read
Asif Rehan
Senior Software Engineer, ML Infra @ Ford Pro

Finding the Kth Largest Element in an Array is a problem that can be solved using approaches like QuickSort in (O(n logn) time and then picking the k-th element in the sorted array. A faster way is to use QuickSelect, which solves it in O(n) time. The solution below outperforms 100% of Java submissions on LeetCode for runtime.

QuickSelect

QuickSelect is a sorting algorithm that uses partitioning schemes like Hoare’s or Lomuto’s. It finds the (k)-th smallest value in (O(n)) time by employing a divide-and-conquer strategy. It recursively partitions and processes only the relevant section of the array.

private void quickSelect(int left, int right){
if (left < right) {
int p = partition(left, right);
if (kSmallest <= p) quickSelect(left, p);
else {
quickSelect(p + 1, right);
}
}
}

Notice if kSmallest <= p, then we know the nums array is partitioned from left to p which has values smaller than the values on the right portion from indices p+1 to right. So we only recursively sort the left.

If that is not the case, then we can sort on the right. But we cannot stop if p==kSmallest like in Lomuto’s partition because Hoare’s does not ensure that the value at the partition is the final position of the number.

Hoare’s Partitioning Scheme

Hoare’s partitioning is shown below. This partitioning scheme is much faster than Lomuto’s. There are plenty of online resources on this most of them are confusing in the one on Wikipedia. I wrote piece here on QuickSort that could help.

private int partition(int left, int right){

int p = (left+right)/2;
int pVal = nums[p];
int i = left - 1;
int j = right + 1;
while(true){
do {
i++;
} while( nums[i] < pVal);
do {
j--;
} while (nums[j] > pVal);
if(i<j){
swap(i,j);
} else{
return j;
}
}
}

Solution: Kth Largest Element in an Array

The solution below beats 100% Java submissions on LeetCode for runtime.

public class KthLargestElement {
int kSmallest;
int[] nums;

public int findKthLargest(int[] nums, int k) {
this.nums = nums;
this.kSmallest = nums.length-k;
quickSelect(0, nums.length-1);
return nums[kSmallest];
}

private void quickSelect(int left, int right){
if (left < right) {
int p = partition(left, right);

if (kSmallest <= p) quickSelect(left, p);
else {
quickSelect(p+1, right);
}
}
}
private int partition(int left, int right){

int p = (left+right)/2;
int pVal = nums[p];
int i = left - 1;
int j = right + 1;
while(true){
do {
i++;
} while( nums[i] < pVal);
do {
j--;
} while (nums[j] > pVal);
if(i<j){
swap(i,j);
} else{
return j;
}
}
}

private void swap(int i, int j){
int temp = nums[i];
nums[i] = nums[j];
nums[j] = temp;
}
}