Some thoughts on Contains Duplicate III problem in leetcode

 Today I come across a "medium" problem in leetcode, which there're only around 20% acceptance rate, and also have ridiculous high number of dislike. 

Here is the problem description:

Given an integer array nums and two integers k and t, return true if there are two distinct indices i and j in the array such that abs(nums[i] - nums[j]) <= t and abs(i - j) <= k.

 Some sample input for this problem:

Input: nums = [1,2,3,1], k = 3, t = 0
Output: true
Input: nums = [1,5,9,1,5,9], k = 2, t = 3
Output: false

At first, this looks like a typical sliding window problem with size k (based on condition abs(i-j) <= k), we can use brute force approach for this one, but it is not optimum and can't pass all the test cases. We will 2 nested loops over the nums array, and try to find the pair satisfied the condition.

for (int i = 0; i < nums.length; i++ ) {

    for(int j = i + 1; j <= i + k; j++) {

         // check if abs(nums[j] - nums[i]) <= t, then return true;

    }

}

// otherwise, return false if could not find any pair i, j satisfied the condition.

This algorithm should work, however, the time complexity if O(n*k), and it would becoming bad once k is big (such as k >= n/4).

Another approach I am trying to tackle is to use the HashMap to store  all items in rang [nums[i]-t, nums[i] + t] to it, and then we only need to check if this HashMap containing the next item nums[j];

HashMap<Long> map =  new HashMap<Long>();

    int start = 0;

    while(start < nums.length) {

        if(map.contains(nums[start]) && map.get(nums[start]) > 0) return true;

        addToMap(map, nums[start], t);

        if(start >= k){

            removeFromMap(map, nums[start - k], t);

        }

        start ++;

    }

    return false;

This algorithm does not really perform better than the brute force once, as the time complexity is O(n * t) , and the space complexity is O(k * t), which is bad when t becomes big enough.

Dig deeper into the problem statement, we can see that this is kind of a range search in each of sliding window. Which data structure is one of best choices for searching in range? It should be B+ tree.

In Java core language, there're 2 classes that implements similar to B tree, are TreeMap and TreeSet. So, I just pick TreeSet as the problem is only for exist checking, not counting.

The approach would be almost similar with the HashMap approach above, we would use sliding window technique, in each sliding window, we will need to check if the there's an item in the bTree with value within [nums[j] - t, nums[j] + t] inclusively. In order to do this check, the TreeSet class has 2 methods that we can use: 

- higher(E entry) which returns the least element in this set strictly greater than the given element

- lower(E entry) which returns the greatest element in this set strictly less than the given element

So, if we want to check if there is item in bTree has value  in [nums[j] - t, nums[j] + t] then we can either:

- get least element which is strictly higher than nums[j] - t and check if this element is smaller or equal to  nums[j] + t

- get greatest element which is strictly less than nums[j] + t and check if this element is greater or equal to  nums[j] - t

The two checking here are almost identical, so I would pick the first one. Here is the algorithm:

       TreeSet<Long> btree = new TreeSet<Long>();

        int start = 0;

        while(start < nums.length) {

            if(btree.contains((long)nums[start] - t)) return true;

            Long leastHigher = btree.higher((long)nums[start] - t);

            if(leastHigher != null && leastHigher <= (long) nums[start] + t) {

                return true;

            }

            btree.add((long)nums[start]);

            if(start >= k) {

                btree.remove((long)nums[start - k]);

            }

            start ++;

        }

        return false;

This algorithm is better than the first two, as the cost of each method in TreeMap for add/ remove/ higher is O(log(k)), so the time complexity of this algorithm is O(n * log(k)) and space complexity is O(k). And as a result, this algorithm pass all test cases.

Happy coding.

Note: originally I tried to tackle this problem using javascript language, however, in javascript language there's no native data structure for B-tree, so I switch to java language. I am empathy with the frustration causing by this problem, first is it a real 'medium' problem? and secondly, it seems this problem required some advance data structure, which unfortunately not all language supports on its own.  



Comments

Popular Posts