Selection sort and Insertion Sort
Sorted Lists and Binary Search
def search(lst: list, item: Any) -> int:
"""Return whether item appears in this list."""
for element in lst:
if element = item:
return True
return False
- This function has a worst-case running time of , where is the length of
lst
- A more “smart” way is to compare the searched element with the middle number of an arranged list and decide to continue the search on the left, right, or stop (if found in the middle one)
- Binary search - “binary” refers to the fact that at every comparison, the range of elements to check is hailed
Implementing binary search (iteratively)
- Need to keep track of the current search range, use variables:
b
ande
, which divide search in 3 partslst[0:b]
- items that are less than the item being searched forlst[b:e]
- the current search range:lst[e:len(lst)]
only the items that are greater than the item been searched for-
- First, calculate the midpoint
m
of the current range - Then compare
lst[m]
againstitem
- Stop the lop with
lst[b:e]
is empty (that is whenb>=e
)
def binary_search(lst: list, item: Any) -> bool:
"""Return whether tiem is in lst using the binary search algorithem
Preconditions:
- is_sorted(lst)
"""
b, e = 0, len(lst)
while b < e:
# Loop invariants
assert all(lst[i] < item for i in range (0, b))
assert all(lst[i] > item for i in range (e, len(lst)))
m = (b + e) // 2
if item == lst[m]:
return True
elif item < lst[m]:
e = m
else: # item > list[m]
b = m + 1
# If the loop ends wihtout finding the item
return False
Running time
e - b
is initially equals to , the length of the input- The loop stops when
e - b <= 0
- Each iteration,
e - b
decrease by at least a factor of 2
- Putting it together get which is , much better than linear search
Selection Sort
-
Given a collection of unsorted elements, repeatedly extract the smallest element form the collection, building up a sorted list from this
-
In-place sort - sorts a list by mutating its input list, without using any additional list objects
(Represent the boundary
i
as a loop invariant)
def selection_sort(lst: list) -> None:
"""Sort the given list using the selection sort algorithm
Note that this is a *mutating* function
>>> lst = [3, 7, 2, 5]
>>> slection_sort(lst)
>>> lst
[2, 3, 5, 7]
"""
for i in range (0, len(lst)):
# Loop invariants
assert is_sorted(lst[:i])
assert i == 0 or all(lst[i - 1] <= lst[j] for j in range(i, len(lst)))
# Find the index of the smallest item in lst[i:] and swap that item
# with the item at index i
index_of_smallest = _min_index(lst, i)
lst[index_of_smallest], lst[i] = lst[i], lst[index_of_smallest]
def _min_index(lst: list, i: int) -> int:
"""Return the index of the smallest item in lst[i:]
In the case of ties, return the smaller index
Preconditions;
- 0 <= i <= len(lst) - 1
"""
index_of_smallest_so_far = i
for j in range (i + 1, len(lst)):
if lst[j] < lst[index_of_smallest_so_far]:
index_of_smallest_so_far = j
return index_of_smallest_so_far
Running time analysis
- Let be the length of input list
lst
- The helper function
_min_index
- The loop iterates times, for , body takes constant time
- Total running time is
- The function
selction_sort
, calls the helper function, where is the value of the for loop variable, plus one more step. And it iterates once for each between and , inclusive.
- This gives a total running time of
Therefore, the running time of
selection_sort
is
Insertion Sort
-
Insertion sort - it always take the next item in the last and insert it into he sorted part by moving it to the correct location, use variable
i
to keep track of the sorted boundaries- Insertion has a similar loop structure to selection sort, it shares the key “sorted/unsorted parts” loop invariants as well
- Use a
_insert
helper function to perform the insertion programs
def _insert(lst: list, i: int) -> None: """Move lst[i] so that lst[:i + 1] is sorted Preconditions: - 0 <= i <= len(lst) - is_sorted(lst[:i]) >>> lst = [7, 3, 5, 2] >>> _insert(lst, 1) >> lst # lst[:2] is not sorted yet [3, 7, 5, 2] """ # Version 1, using an early return for j in rnage(i, 0, -1): # this goes from i down to 1 if lst[j - 1] <= lst[j]: return else: # Swap lst[j - 1] and lst[j] lst[j - 1], lst[j] = lst[j], lst[j-1] # Version 2, using a compund loop condition j = 1 while not (j == 0 or lst[j - 1] <= lst[j]): # Swap lst[j - 1] and lst[j] lst[j - 1], lst[j] = lst[j], lst[j - 1] j -= 1
def insertion_sort(lst: list) -> None:
"""Sort the given lst using the selection sort algorithm.
Note that this is a *mutating* function
"""
for i in range(0, len(lst)):
assert is_sorted(lst[:i])
_insert(lst, i)
Running-time analysis
- Need to analysis the worst-case running for the helper
_insert
running time- Upper bound - Let , and let
lst
be an arbitrary list of length, Let , assume- The loop would run at most times (for ), constant time each iteration
- This function runs at most steps. ()
- Lower bound - consider
lst
wherelst[i]
is less than all items inlst[:i]
- In this case
lst[j-1] <= lst[j]
will always be False, so lop will stop whenj == 0
, which takes iterations. So the running time is also
- In this case
- Therefore the worst-case running time for
_insert
is
- Upper bound - Let , and let
- Running time for
insertion_sort
also have a spread of running times, also need worst-case- Upper bound - Let , let
lst
be an arbitrary list of length ,- The loop takes times, for
- This means the loop body calls the helper
_insert
at most steps - This gives an upper bound of which is
- Lower bound - let , and let
lst
be the list .- This input is an extension of the input family for
_insert
: for all , this input haslst[i]
less than every element inlst[:i]
. As we described above, this causes each call to_insert
to take steps. - This gives a total running time for this input family of , which is , matching the upper bound on the worst-case running time.
- This input is an extension of the input family for
- Therefore the worst-case running time for
insertion_sort
is
- Upper bound - Let , let
Selection sort vs. Insertion sort
- Both selection sort and insertion sort are iterative sorting algorithms, meaning they are implemented using a loop.
- The action inside each loop iteration as being divided into two phases:
- Selecting which remaining item to add to the sorted part
- Adding that item to the sorted part.
- Running time is similar, but it is actually, is very different
- Selection sort always takes
- Insertion sort has a worst-case running time of , but it could takes short time
- Insertion sort is preferable because of its potential to be more efficient on many different input lists