Quicksort algorithm: how data distribution affect execution time (Golang & PHP research)
Quicksort is a commonly used algorithm for sorting data. The algorithm was created by British scientist Tony Hoare in 1961. It is typically one of the best algorithms in terms of performance.
In the real world, we can face data with a different distribution, so there is a good question: is quicksort executes with the same duration for different data distributions?
This article provides the answer to the question based on the PHP and Golang implementation of quicksort.
Which data distributions are used in the research?
To do this research I used three kinds of distributions, which, in my opinion, in general, describe the most possible way in which data can be distributed. Let’s take a closer look at them.
1) Normal distribution:
2) Chi-square distribution (or similar distributions):
3) Beta distribution:
Please don’t pay attention to the quality of the above images, I’m not an
artist :D
How the distribution affects execution time?
The answer is YES, the data distribution affects the execution time!
Based on the results of running scripts on PHP and Golang, I can say that sorting of normal distribution requires more time than other distributions.
The best results in terms of performance usually shows Chi-square distribution.
As I mentioned earlier I did the research based on PHP & Golang scripts I build myself. You can find them here:
1) Golang:
https://github.com/OleksiiFedorchak/quicksort/blob/master/quick.go
2) PHP:
https://github.com/OleksiiFedorchak/quicksort/blob/master/quick.php
References:
1) https://en.wikipedia.org/wiki/Quicksort