Bubble sort
Encyclopedia
Bubble sort, also known as sinking sort, is a simple sorting algorithm
that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swap
ping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort
. Although the algorithm is simple, it is not efficient for sorting large lists; other algorithms are better.
, tend to have better performance than bubble sort. Therefore, bubble sort is not a practical sorting algorithm when n is large.
The only significant advantage that bubble sort has over most other implementations, even quicksort, but not insertion sort
, is that the ability to detect that the list is sorted is efficiently built into the algorithm. Performance of bubble sort over an already-sorted list (best-case) is O(n). By contrast, most other algorithms, even those with better average-case complexity
, perform their entire sorting process on the set and thus are more complex. However, not only does insertion sort
have this mechanism too, but it also performs better on a list that is substantially sorted (having a small number of inversions).
Various efforts have been made to eliminate turtles to improve upon the speed of bubble sort. Cocktail sort
achieves this goal fairly well, but it retains O(n2)
worst-case complexity. Comb sort
compares elements separated by large gaps, and can move turtles extremely quickly before proceeding to smaller and smaller gaps to smooth out the list. Its average speed is comparable to faster algorithms like quicksort.
Three passes will be required.
First Pass:
( 5 1 4 2 8 ) ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps them.
( 1 5 4 2 8 ) ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) ( 1 4 2 5 8 )
( 1 4 2 5 8 ) ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
More generally, it can happen that more than one element is placed in their final position on a single pass. In particular, after every pass, all elements after the last swap are sorted, and do not need to be checked again. This allows us to skip over a lot of the elements, resulting in about a worst case 50% improvement in comparison count (though no improvement in swap counts), and adds very little complexity because the new code subsumes the "swapped" variable:
To accomplish this in pseudocode
we write the following:
Alternate modifications, such as the cocktail shaker sort attempt to improve on the bubble sort performance while keeping the same idea of repeatedly comparing and swapping adjacent items.
complexity means it is far too inefficient for use on lists having more than a few elements. Even among simple O(n2) sorting algorithms, algorithms like insertion sort
are usually considerably more efficient.
Due to its simplicity, bubble sort is often used to introduce the concept of an algorithm, or a sorting algorithm, to introductory computer science
students. However, some researchers such as Owen Astrachan
have gone to great lengths to disparage bubble sort and its continued popularity in computer science education, recommending that it no longer even be taught.
The Jargon file
, which famously calls bogosort
"the archetypical perversely awful algorithm", also calls bubble sort "the generic bad algorithm". Donald Knuth
, in his famous book The Art of Computer Programming
, concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he then discusses.
Bubble sort is asymptotically equivalent in running time to insertion sort
in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Experimental results such as those of Astrachan have also shown that insertion sort
performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.
Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort
and 40% slower than selection sort
.
In computer graphics it is popular for its capability to detect a very small error (like swap of just two elements) in almost-sorted arrays and fix it with just linear complexity (2n). For example, it is used in a polygon filling algorithm, where bounding lines are sorted by their x coordinate at a specific scan line (a line parallel to x axis) and with incrementing y their order changes (two elements are swapped) only at intersections of two lines.
To clarify we can also observe the behaviour of the two algorithms. In bubble sort, the larger bubbles (higher values) bubble up displacing the smaller bubbles (lower values). Insertion on the other hand, sinks each successive value down to its correct location in the sorted portion of the collection.
Sorting algorithm
In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order...
that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swap
Swap (computer science)
In computer programming, the act of swapping two variables refers to mutually exchanging the values of the variables. Usually, this is done with the data in memory...
ping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort
Comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list...
. Although the algorithm is simple, it is not efficient for sorting large lists; other algorithms are better.
Analysis
Performance
Bubble sort has worst-case and average complexity both О(n2), where n is the number of items being sorted. There exist many sorting algorithms with substantially better worst-case or average complexity of O(n log n). Even other О(n2) sorting algorithms, such as insertion sortInsertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
, tend to have better performance than bubble sort. Therefore, bubble sort is not a practical sorting algorithm when n is large.
The only significant advantage that bubble sort has over most other implementations, even quicksort, but not insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
, is that the ability to detect that the list is sorted is efficiently built into the algorithm. Performance of bubble sort over an already-sorted list (best-case) is O(n). By contrast, most other algorithms, even those with better average-case complexity
Average-case complexity
Average-case complexity is a subfield of computational complexity theory that studies the complexity of algorithms on random inputs.The study of average-case complexity has applications in the theory of cryptography....
, perform their entire sorting process on the set and thus are more complex. However, not only does insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
have this mechanism too, but it also performs better on a list that is substantially sorted (having a small number of inversions).
Rabbits and turtles
The positions of the elements in bubble sort will play a large part in determining its performance. Large elements at the beginning of the list do not pose a problem, as they are quickly swapped. Small elements towards the end, however, move to the beginning extremely slowly. This has led to these types of elements being named rabbits and turtles, respectively.Various efforts have been made to eliminate turtles to improve upon the speed of bubble sort. Cocktail sort
Cocktail sort
Cocktail sort, also known as bidirectional bubble sort, cocktail shaker sort, shaker sort , ripple sort, shuffle sort, shuttle sort or happy hour sort, is a variation of bubble sort that is both a stable sorting algorithm and a comparison sort...
achieves this goal fairly well, but it retains O(n2)
Big O notation
In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...
worst-case complexity. Comb sort
Comb sort
Comb sort is a relatively simplistic sorting algorithm originally designed by Włodzimierz Dobosiewicz in 1980. Later it was rediscovered and popularized by Stephen Lacey and Richard Box with a Byte Magazine . Comb sort improves on bubble sort, and rivals algorithms like Quicksort...
compares elements separated by large gaps, and can move turtles extremely quickly before proceeding to smaller and smaller gaps to smooth out the list. Its average speed is comparable to faster algorithms like quicksort.
Step-by-step example
Let us take the array of numbers "5 1 4 2 8", and sort the array from lowest number to greatest number using bubble sort algorithm. In each step, elements written in bold are being compared.Three passes will be required.
First Pass:
( 5 1 4 2 8 ) ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps them.
( 1 5 4 2 8 ) ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) ( 1 4 2 5 8 )
( 1 4 2 5 8 ) ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
( 1 2 4 5 8 ) ( 1 2 4 5 8 )
Pseudocode implementation
The algorithm can be expressed as:
procedure bubbleSort( A : list of sortable items )
repeat
swapped = false
for i = 1 to length(A) - 1 inclusive do:
if A[i-1] > A[i] then
swap( A[i-1], A[i] )
swapped = true
end if
end for
until not swapped
end procedure
Optimizing bubble sort
The bubble sort algorithm can be easily optimized by observing that the n-th pass finds the n-th largest element and puts it into its final place. So, the inner loop can avoid looking at the last n-1 items when running for the n-th time:
procedure bubbleSort( A : list of sortable items )
n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
if A[i-1] > A[i] then
swap(A[i-1], A[i])
swapped = true
end if
end for
n = n - 1
until not swapped
end procedure
More generally, it can happen that more than one element is placed in their final position on a single pass. In particular, after every pass, all elements after the last swap are sorted, and do not need to be checked again. This allows us to skip over a lot of the elements, resulting in about a worst case 50% improvement in comparison count (though no improvement in swap counts), and adds very little complexity because the new code subsumes the "swapped" variable:
To accomplish this in pseudocode
Pseudocode
In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading...
we write the following:
procedure bubbleSort( A : list of sortable items )
n = length(A)
repeat
newn = 0
for i = 1 to n-1 inclusive do
if A[i-1] > A[i] then
swap(A[i-1], A[i])
newn = i
end if
end for
n = newn
until n = 0
end procedure
Alternate modifications, such as the cocktail shaker sort attempt to improve on the bubble sort performance while keeping the same idea of repeatedly comparing and swapping adjacent items.
In practice
Although bubble sort is one of the simplest sorting algorithms to understand and implement, its O(n2)Big O notation
In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...
complexity means it is far too inefficient for use on lists having more than a few elements. Even among simple O(n2) sorting algorithms, algorithms like insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
are usually considerably more efficient.
Due to its simplicity, bubble sort is often used to introduce the concept of an algorithm, or a sorting algorithm, to introductory computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
students. However, some researchers such as Owen Astrachan
Owen Astrachan
Owen Astrachan is Professor of the Practice of Computer Science at Duke University where he is also the department's director of undergraduate studies...
have gone to great lengths to disparage bubble sort and its continued popularity in computer science education, recommending that it no longer even be taught.
The Jargon file
Jargon File
The Jargon File is a glossary of computer programmer slang. The original Jargon File was a collection of terms from technical cultures such as the MIT AI Lab, the Stanford AI Lab and others of the old ARPANET AI/LISP/PDP-10 communities, including Bolt, Beranek and Newman, Carnegie Mellon...
, which famously calls bogosort
Bogosort
In computer science, bogosort is a particularly ineffective sorting algorithm based on the generate and test paradigm. It is not useful for sorting, but may be used for educational purposes, to contrast it with other more realistic algorithms; it has also been used as an example in logic programming...
"the archetypical perversely awful algorithm", also calls bubble sort "the generic bad algorithm". Donald Knuth
Donald Knuth
Donald Ervin Knuth is a computer scientist and Professor Emeritus at Stanford University.He is the author of the seminal multi-volume work The Art of Computer Programming. Knuth has been called the "father" of the analysis of algorithms...
, in his famous book The Art of Computer Programming
The Art of Computer Programming
The Art of Computer Programming is a comprehensive monograph written by Donald Knuth that covers many kinds of programming algorithms and their analysis....
, concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he then discusses.
Bubble sort is asymptotically equivalent in running time to insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Experimental results such as those of Astrachan have also shown that insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.
Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...
and 40% slower than selection sort
Selection sort
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O time complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort...
.
In computer graphics it is popular for its capability to detect a very small error (like swap of just two elements) in almost-sorted arrays and fix it with just linear complexity (2n). For example, it is used in a polygon filling algorithm, where bounding lines are sorted by their x coordinate at a specific scan line (a line parallel to x axis) and with incrementing y their order changes (two elements are swapped) only at intersections of two lines.
Variations
- Odd-even sortOdd-even sortIn computing, an odd–even sort or odd–even transposition sort is a relatively simple sorting algorithm, developed originally for use on parallel processors with local interconnections. It is a comparison sort related to bubble sort, with which it shares many characteristics...
is a parallel version of bubble sort, for message passing systems. - Cocktail sortCocktail sortCocktail sort, also known as bidirectional bubble sort, cocktail shaker sort, shaker sort , ripple sort, shuffle sort, shuttle sort or happy hour sort, is a variation of bubble sort that is both a stable sorting algorithm and a comparison sort...
is another parallel version of the bubble sort - In some cases, the sort works from right to left (the opposite direction), which is more appropriate for partially sorted lists, or lists with unsorted items added to the end.
Misnomer
Bubble sort has incorrectly been called sinking sort. Sinking sort is correctly an alias for insertion sort. This error is largely due to the National Institute of Standards and Technology listing sinking sort as an alias for bubble sort. http://xlinux.nist.gov/dads/HTML/bubblesort.html In Donald Knuth's The Art of Computer Programming, Volume 3: Sorting and Searching he states in section 5.2.1 'Sorting by Insertion', that [the value] "settles to its proper level" this method of sorting has often been called the sifting or sinking technique.To clarify we can also observe the behaviour of the two algorithms. In bubble sort, the larger bubbles (higher values) bubble up displacing the smaller bubbles (lower values). Insertion on the other hand, sinks each successive value down to its correct location in the sorted portion of the collection.
External links
- Bubble Sort implemented in 34 languages
- Softpanorama bubblesort page
- Illustrated explanation of Bubble sort
- Pictorial step by step procedure of Bubble sort
- Animated Sorting Algorithms: Bubble Sort – graphical demonstration and discussion of bubble sort
- Lafore's Bubble Sort (Java applet animation)
- A colored graphical Java applet which allows experimentation with initial state and shows statistics