Searching data involves determining whether a search key is present in the data and, if so, finding its location.

Sorting involves arranging data into order.

The linear search algorithm searches each element in the vector sequentially until it finds the search key, then returns its location. If the key is not in the vector, the algorithm tests every element and, when the end of the vector is reached, informs the user that the key is not present.

A major difference among searching algorithms is the amount of effort they require in order to return a result.

One way to describe the efficiency of an algorithm is with Big O notation (O), which indicates how hard an algorithm may have to work to solve a problem.

For searching and sorting algorithms, Big O describes how the amount of effort of a particular algorithm varies depending on how many elements are in the data.

An algorithm that is O(1) is said to have a constant runtime. This does not mean that the algorithm requires only one comparison. It just means that the number of comparisons does not grow as the size of the vector increases.

An O(n) algorithm is referred to as having a linear runtime.

Big O is designed to highlight dominant factors and to ignore terms that become unimportant for large values of n.

Big O notation is concerned with the growth rate of algorithm runtimes, so constants are ignored.

The linear search algorithm runs in O(n) time.

The worst case in linear search is that every element must be checked to determine whether the search element exists. This occurs if the search key is the last element in the vector or is not present.

The binary search algorithm is more efficient than the linear search algorithm, but it requires that the vector first be sorted. This is worthwhile only when the vector, once sorted, will be searched a great many times or when the searching application has stringent performance requirements.

The first iteration of binary search tests the middle element in the vector. If this is the search key, the algorithm returns its location. If the search key is less than the middle element, binary search continues with the first half of the vector. If the search key is greater than the middle element, binary search continues with the second half of the vector. Each iteration of binary search tests the middle value of the remaining vector and, if the element is not found, eliminates half of the remaining elements.

Binary search is more efficient than linear search because, with each comparison, it eliminates from consideration half of the remaining elements in the vector.

Binary search runs in O(log n) time because each step removes half of the remaining elements from consideration.

If the size of the vector is doubled, binary search requires only one extra comparison to complete successfully.

Selection sort is a simple, but inefficient, sorting algorithm.

The first iteration of selection sort selects the smallest element in the vector and swaps it with the first element. The second iteration of selection sort selects the second-smallest element (which is the smallest remaining element) and swaps it with the second element. Selection sort continues until the last iteration selects the second-largest element and swaps it with the second-to-last index, leaving the largest element in the last index. After the ith iteration of selection sort, the smallest i elements of the whole vector are sorted into the first i indices.

The selection sort algorithm runs in O(n²) time.

The first iteration of insertion sort takes the second element in the vector and, if it is less than the first element, swaps it with the first element. The second iteration of insertion sort looks at the third element and inserts it in the correct position with respect to the first two elements. After the ith iteration of insertion sort, the first i elements in the original vector are sorted. Only n − 1 iterations are required.

The insertion sort algorithm runs in O(n²) time.

Merge sort is a sorting algorithm that is faster, but more complex to implement, than selection sort and insertion sort.

The merge sort algorithm sorts a vector by splitting the vector into two equal-sized subvectors, sorting each subvector and merging the subvectors into one larger vector.

Merge sort's base case is a vector with one element. A one-element vector is already sorted, so merge sort immediately returns when it is called with a one-element vector. The merge part of merge sort takes two sorted vectors (these could be one-element vectors) and combines them into one larger sorted vector.

Merge sort performs the merge by looking at the first element in each subvector, which is the smallest element in that subvector. Merge sort takes the smaller of these two and places it in the first element of the larger, sorted vector. If there are still elements in that subvector, merge sort looks at its second element (which is now the smallest element remaining) and compares it to the first element in the other subvector. Merge sort continues this process until the larger vector is filled.

In the worst case, the first call to merge sort has to make O(n) comparisons to fill the n slots in the final vector.

The merging portion of the merge sort algorithm is performed on two subvectors, each of approximately size n/2. Creating each of these subvectors requires n/2 − 1 comparisons for each subvector, or O(n) comparisons total. This pattern continues, as each level works on twice as many vectors, but each is half the size of the previous vector.

Similar to binary search, this halving results in log n levels, each level requiring O(n) comparisons, for a total efficiency of O(n log n).