Data Structure 1 - Binary Search
Summary of Binary Search (also called half-interval search)
Given an array A = [a0, a1,….an], where elements are in increasing order, that is, a1< a2 <…< an. We are going to find the index of an element ak inside the array.
Let consider the array below, where a0=1, a1=3…
index | 0 | 1 | … | k | … | n |
---|---|---|---|---|---|---|
a[i] | a[0] = 1 | a[1] =3 | … | a[k] = ak =10 | … | a[n]=an= 20 |
let set left variable L store index of a1 and right variable R stores index of an. Then
- Let L=0, R=n
- Check if a[L]=ak or a[R]=ak. If yes, return index of ak. Otherwise, step3
- Find middle index M =$\frac{(L+R)}{2}$ and check if a[M]=ak. If no, continue
- if a[M]>ak, let R=M and search subarray between index L~ M
Reason:
we know array is in increasing order and elements between M and R must be larger than a[M] and ak. Hence we only need to search elements between L and M - if a[M]<ak, let L=M search subarray between index M~ R
The logic is similar to step 4 - Back to Step 3 until ak is found, Or R = L+1, that is, ak is not found.
Analysis
- Computational Complexity: O($log_2(n)$) since it makes $log_2 n$ iterations.
- Space complexity of binary search is O(1)
My Thoughts
Its main idea is to reduce the searching space by half each time by using mid-intervel
Assumption
- It requires we know L and R
- It assumes the array has been sorted. If the array has not been sorted, we can use Binary-Search-Tree to sort and store data and then use in-order traversal to obtain index of each element
When to Apply
- It is good to search specific values that can be found by comparison of values
- However, in my opinion, It is not good enough to explore the combination of elements. For example, Finding the max sum of N numbers in an array. Then the searching space would be too large to find when using binary search.
Extension
- Extensive Problem 1:
Given a very very large (may be infinite) array and we don’t know how large it is. Use binary search to find the index of a given element ak.
Example: Given an array like this,
a1 | a2 | … | ak | …(we don’t know the end of the array) |
---|---|---|---|---|
1 | 3 | … | a[k] | … |
Analysis:
- since we don’t know the size of array, we are hardly to find the right boundary R
- we only know the left boundary L=0
- Idea:
- Jump Out to find R: We can first find the smallest integer X, such that a[$z^X$]$\geq$ ak, where z is positive integer, maybe 2, 10…
- Set R= $z^X$
- Jump in to find ak: Apply binary search on the region between $z^{X-1}$ and $z^X$.
- Evaluate
Time Complexity- since jump out step is $z^X$, then the time required to find the region is O($log_Z$(n)).
- Time required to search ak insdie $z^{X-1}$ and $z^X$ is O($log_2$(n))
- Total complexity is O($log_Z$(n) + $log_2$(n))
- Compare the value of Z to find the best performance. __Draw the picture of $log_z(X)$ !__
Reference
[1] “https://www.cdn.geeksforgeeks.org/wp-content/uploads/binary-tree-to-DLL.png"