3 posts tagged with "java"

LeetCode 28: String Matching using KMP Algorithm

November 7, 2021 · 3 min read

Senior Software Engineer, ML Infra @ Ford Pro

Understanding the Problem

String matching is a fundamental problem in computer science. Given two strings - a "haystack" and a "needle" - we need to find if and where the needle appears in the haystack.

Naive Approach

The naive solution uses two pointers:

One for the "needle" string
One for the "haystack" string

Starting from each character in the haystack, we compare every character in the needle with the corresponding character in the haystack. If all characters match, we've found our solution. If any character doesn't match, we move to the next character in the haystack and restart from the beginning of the needle string.

This results in an O(mn) time complexity, where:

m = length of haystack string
n = length of needle string

The KMP Algorithm Solution

We can do better using the Knuth-Morris-Pratt (KMP) algorithm. This algorithm uses a clever technique called the prefix-suffix array, which helps skip parts of the haystack string that we already know won't match based on the pattern in the needle.

Key Benefits

Achieves linear O(m+n) time complexity
Uses extra memory for the prefix array
Significantly faster for large strings with repeating patterns

For a visual understanding of the algorithm, check out this excellent tutorial with animations on YouTube.

Java Implementation

Here's a complete implementation of the KMP algorithm:

public int strStr(String haystack, String needle) {
    if (haystack == null || needle == null) return 0;
    char[] text = haystack.toCharArray();
    char[] pattern = needle.toCharArray();

    if (pattern.length == 0) return 0;
    if (text.length == 0) return -1;

    int[] prefixSuffix = prefixSuffix(pattern);
    int i = 0;
    int j = 0;
    
    while (i < text.length && j < pattern.length) {
        if (text[i] == text[j]) {
            i++;
            j++;
        } else {
            if (j == 0) {
                i++;
            } else {
                j = prefixSuffix[j-1];
            }
        }
    }
    
    if (j == pattern.length) {
        return true;
    }
    return false;
}

int[] prefixSuffix(char[] pattern) {
    int[] pre = new int[pattern.length];
    pre[0] = 0;
    
    int j = 0;
    for (int i = 1; i < pattern.length; i++) {
        while (j > 0 && pattern[i] != pattern[j]) {
            j = pre[j-1];
        }
        if (pattern[i] == pattern[j]) {
            j++;
        }
        pre[i] = j;
    }
    return pre;
}

Understanding the Implementation

Main Algorithm (`strStr` method)

Initial null checks and edge cases
Convert strings to char arrays for easier manipulation
Generate the prefix-suffix array for the pattern
Use two pointers to traverse the text and pattern
Return true if the entire pattern is found, false otherwise

Prefix-Suffix Array Generation (`prefixSuffix` method)

Creates an array to store prefix-suffix values
Uses two pointers to track matching prefixes and suffixes
Builds the array incrementally based on pattern matches
Returns the completed prefix-suffix array

Time and Space Complexity

Time Complexity: O(m + n)
- m = length of haystack string
- n = length of needle string
Space Complexity: O(n)
- Additional space needed for the prefix-suffix array

Conclusion

The KMP algorithm provides an efficient solution to the string matching problem by utilizing pattern information to avoid unnecessary comparisons. While it requires additional space and is more complex to implement than the naive approach, its linear time complexity makes it significantly more efficient for large strings, especially when dealing with patterns that have repeating elements.

Understanding QuickSort and QuickSelect using Hoare Partition

April 19, 2021 · 3 min read

Asif Rehan

Senior Software Engineer, ML Infra @ Ford Pro

This article explores the implementation of the QuickSort algorithm using Hoare's Partitioning scheme. While slightly more challenging to implement, this partitioning approach proves more efficient than Lomuto's scheme, performing three times fewer swaps on average [Wikipedia].

Algorithm Complexity

Average Time Complexity: O(n log n)
Worst Time Complexity: O(n²)
Space Complexity: O(1)

Understanding Hoare Partition Intuitively

The key distinction between Hoare and Lomuto partitioning schemes lies in their approach to array division. As explained by Kartikey Gupta on StackExchange:

Lomuto divides the array into three parts: elements smaller than the pivot, the pivot itself, and elements greater than the pivot. Hoare takes a different approach by dividing the array into two parts: elements smaller than the pivot, and elements equal to or greater than the pivot.

A common misconception arises from the fact that Lomuto's partition returns the final position of the pivot, whereas Hoare's partition returns the position just before the final position of the pivot. This explains why Lomuto quicksorts from lo to p-1, while Hoare quicksorts from lo to p.

Implementation in Java

Here's a complete implementation of QuickSort using Hoare's partitioning scheme:

public class QuickSortHoare {
    int[] arr;
    
    public QuickSortHoare(int[] arr) {
        this.arr = arr;
    }

    private void sort(int i, int j) {
        if (i < j) {     
            int p = partition(i, j);
            sort(i, p);
            sort(p + 1, j);
        }
    }

    private void sort() {
        sort(0, arr.length - 1);
    }
    
    private int partition(int left, int right) {
        int pivot = (left + right) / 2;
        int pivotValue = arr[pivot];
        int i = left - 1;
        int j = right + 1;
        
        while (true) {
            do {
                i++;
            } while (arr[i] < pivotValue);
            
            do {
                j--;
            } while (arr[j] > pivotValue);

            if (i < j) {
                swap(i, j);
            } else {
                return j;
            }
        }
    }

    private void swap(int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }
}

Key Implementation Details

The partition method uses the middle element as the pivot
Two pointers (i and j) move towards each other until they find elements that are on the wrong side
The algorithm swaps elements when both pointers find elements to swap
The process continues until the pointers meet or cross each other

Additional Resources

For more information about Hoare's partitioning scheme, check out Michael Cotterell's implementation. Note that his version returns the low pointer from partitioning, requiring recursive sorting on [low: partition-1] and [partition+1: high] intervals.

Conclusion

Hoare's partitioning scheme offers a more efficient approach to QuickSort implementation, despite its slightly more complex nature. Understanding the key differences between Hoare and Lomuto partitioning schemes is crucial for implementing and optimizing sorting algorithms in practice.

LeetCode: Kth Largest Element in an Array using QuickSelect with Hoare Partition

April 18, 2021 · 3 min read

Asif Rehan

Senior Software Engineer, ML Infra @ Ford Pro

Finding the Kth Largest Element in an Array is a problem that can be solved using approaches like QuickSort in (O(n logn) time and then picking the k-th element in the sorted array. A faster way is to use QuickSelect, which solves it in O(n) time. The solution below outperforms 100% of Java submissions on LeetCode for runtime.

QuickSelect

QuickSelect is a sorting algorithm that uses partitioning schemes like Hoare’s or Lomuto’s. It finds the (k)-th smallest value in (O(n)) time by employing a divide-and-conquer strategy. It recursively partitions and processes only the relevant section of the array.

private void quickSelect(int left, int right){
    if (left < right) {
        int p = partition(left, right);
        if (kSmallest <= p) quickSelect(left, p);
        else {
            quickSelect(p + 1, right);
        }    
    }
}

Notice if kSmallest <= p, then we know the nums array is partitioned from left to p which has values smaller than the values on the right portion from indices p+1 to right. So we only recursively sort the left.

If that is not the case, then we can sort on the right. But we cannot stop if p==kSmallest like in Lomuto’s partition because Hoare’s does not ensure that the value at the partition is the final position of the number.

Hoare’s Partitioning Scheme

Hoare’s partitioning is shown below. This partitioning scheme is much faster than Lomuto’s. There are plenty of online resources on this most of them are confusing in the one on Wikipedia. I wrote piece here on QuickSort that could help.

private int partition(int left, int right){

        int p = (left+right)/2;
        int pVal = nums[p];
        int i = left - 1;
        int j = right + 1;
        while(true){
            do {
                i++;
            } while( nums[i] < pVal);
            do {
                j--;
            } while (nums[j] > pVal);
            if(i<j){
                swap(i,j);
            } else{
                return j;
            }
        }
    }

Solution: Kth Largest Element in an Array

The solution below beats 100% Java submissions on LeetCode for runtime.

public class KthLargestElement {
    int kSmallest;
    int[] nums;

    public int findKthLargest(int[] nums, int k) {
        this.nums = nums;
        this.kSmallest = nums.length-k;
        quickSelect(0, nums.length-1);
        return nums[kSmallest];
    }

    private void quickSelect(int left, int right){
        if (left < right) {
            int p = partition(left, right);
            
            if (kSmallest <= p) quickSelect(left, p);
            else {
                quickSelect(p+1, right);
            }    
        }
    }
    private int partition(int left, int right){

        int p = (left+right)/2;
        int pVal = nums[p];
        int i = left - 1;
        int j = right + 1;
        while(true){
            do {
                i++;
            } while( nums[i] < pVal);
            do {
                j--;
            } while (nums[j] > pVal);
            if(i<j){
                swap(i,j);
            } else{
                return j;
            }
        }
    }

    private void swap(int i, int j){
        int temp = nums[i];
        nums[i] = nums[j];
        nums[j] = temp;
    }
}

Understanding the Problem​

Naive Approach​

The KMP Algorithm Solution​

Key Benefits​

Java Implementation​

Understanding the Implementation​

Main Algorithm (strStr method)​

Prefix-Suffix Array Generation (prefixSuffix method)​

Time and Space Complexity​

Conclusion​

Algorithm Complexity​

Understanding Hoare Partition Intuitively​

Implementation in Java​

Key Implementation Details​

Additional Resources​

Conclusion​

QuickSelect​

Hoare’s Partitioning Scheme​

Solution: Kth Largest Element in an Array​

Understanding the Problem

Naive Approach

The KMP Algorithm Solution

Key Benefits

Java Implementation

Understanding the Implementation

Main Algorithm (`strStr` method)

Prefix-Suffix Array Generation (`prefixSuffix` method)

Time and Space Complexity

Conclusion

Algorithm Complexity

Understanding Hoare Partition Intuitively

Implementation in Java

Key Implementation Details

Additional Resources

Conclusion

QuickSelect

Hoare’s Partitioning Scheme

Solution: Kth Largest Element in an Array