Performance Analysis of Sorting Algorithms in Big Data Environments: Efficiency, Scalability, and Practical Applications

Authors

  • Muhammad Rayyan Zikri, Software Engineering, Monash University Malaysia, Malaysia

Keywords:

Sorting Algorithms, Big Data, Performance Analysis, Distributed Computing, Scalability

Abstract

The exponential growth of big data has underscored the need for efficient sorting algorithms capable of handling massive datasets in diverse and distributed environments. Sorting, a fundamental operation in computer science, plays a critical role in data organization, retrieval, and analysis. However, traditional studies of sorting algorithms focus primarily on theoretical efficiency and often neglect the practical challenges posed by big data, such as scalability, parallelization, and adaptability to heterogeneous datasets. This research analyzes the performance of widely used sorting algorithms, including QuickSort, MergeSort, HeapSort, and RadixSort, in the context of big data. The study evaluates these algorithms on metrics such as execution time, memory usage, and scalability across varying dataset sizes, distributions, and types. It also analyzes their performance on distributed computing platforms, such as Apache Hadoop and Spark, to assess their compatibility with modern data processing frameworks. The findings highlight the strengths and limitations of each algorithm and indicate which big data applications each one suits. By bridging the gap between theoretical analysis and real-world performance, this research contributes to the development of optimized sorting solutions tailored for big data environments. The results offer guidance for practitioners and researchers in selecting or designing sorting algorithms that meet the demands of contemporary data-driven industries.
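To make the evaluation metrics named in the abstract concrete, the following is a minimal single-machine sketch in Python that records execution time and peak memory for two of the listed algorithms across dataset sizes and input distributions. It is not the study's benchmark harness: the function names (`make_dataset`, `measure`, `run_benchmark`), the chosen sizes, and the three distributions are assumptions made for illustration, and Python's built-in `sorted` is included only as a baseline.

```python
import heapq
import random
import time
import tracemalloc


def merge_sort(values):
    """Top-down MergeSort that returns a new sorted list."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def heap_sort(values):
    """HeapSort built on the standard-library binary heap."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def make_dataset(size, distribution, seed=0):
    """Hypothetical helper: build an integer dataset with a given input order."""
    rng = random.Random(seed)
    if distribution == "random":
        return [rng.randint(0, size) for _ in range(size)]
    if distribution == "sorted":
        return list(range(size))
    if distribution == "reversed":
        return list(range(size, 0, -1))
    raise ValueError(f"unknown distribution: {distribution}")


def measure(sort_fn, data):
    """Return (sorted output, elapsed seconds, peak bytes allocated while sorting)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = sort_fn(data)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Assumed algorithm set for this sketch; built-in sorted() is a baseline only.
ALGORITHMS = {"MergeSort": merge_sort, "HeapSort": heap_sort, "built-in sorted": sorted}


def run_benchmark(sizes=(1_000, 10_000), distributions=("random", "sorted", "reversed")):
    """Measure every algorithm on every (size, distribution) combination."""
    rows = []
    for size in sizes:
        for distribution in distributions:
            data = make_dataset(size, distribution)
            for name, sort_fn in ALGORITHMS.items():
                _, seconds, peak = measure(sort_fn, data)
                rows.append({"algorithm": name, "size": size,
                             "distribution": distribution,
                             "seconds": seconds, "peak_bytes": peak})
    return rows


if __name__ == "__main__":
    for row in run_benchmark():
        print(row)
```

Because `tracemalloc` adds overhead, the timings in this sketch are useful only for relative comparison; a more careful setup would measure time and memory in separate runs and repeat each measurement several times. Scalability on Hadoop or Spark, which the abstract also covers, is outside what a single-process sketch like this can show.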

Published

2023-10-30

How to Cite

Zikri, M. R. (2023). Performance Analysis of Sorting Algorithms in Big Data Environments: Efficiency, Scalability, and Practical Applications. Idea: Future Research, 1(3), 132–139. Retrieved from https://idea.ristek.or.id/index.php/idea/article/view/8