General information
Assistant Professor
Nader KOLSI is an Assistant Professor at the higher school of commerce of Tunis. He holds a PhD in computer science for management from the higher school of management of Tunis. Since 2004, he has taught at some of the most prestigious universities in Tunisia, where he has developed extensive expertise in databases, business intelligence, data warehousing, and artificial intelligence. His research focuses on data management and intelligent systems, with applications in decision support and scalable computing. He has published numerous research papers on big data, multi-criteria decision making, databases and data warehousing, multi-agent systems and mobile agents, and distributed and scalable data structures. His academic work combines theoretical contributions with practical applications in information systems and artificial intelligence. He also took part in founding the research laboratory SOIE (now SMART).
Teams
Research areas
Publications
2022 Emna Hosni, Wided Lejouad Chaari, Nader Kolsi, Khaled Ghedira
Effective Resource Utilization in Heterogeneous Hadoop Environment Through a Dynamic Inter-cluster and Intra-cluster Load Balancing
Asian Conference on Intelligent Information and Database Systems (ACIIDS), Ho Chi Minh City, Vietnam, Part 2, pp. 669-681, 2022
Abstract
Apache Hadoop is one of the most popular distributed computing systems, widely used for big data analysis and processing. A Hadoop cluster hosts multiple parallel workloads with varying resource requirements (CPU, RAM, etc.). In practice, in heterogeneous Hadoop environments, resource-intensive tasks may be allocated to lower-performing nodes, causing load imbalance between and within clusters and high data transfer costs. These weaknesses degrade the performance of the Hadoop system and delay the completion of all submitted jobs. To overcome these challenges, this paper proposes an efficient and dynamic load balancing policy for a heterogeneous Hadoop YARN cluster. This novel load balancing model clusters nodes into subgroups of similar performance and then allocates jobs to these subgroups using a multi-criteria ranking. The policy ensures the most accurate match between resource demands and available resources in real time, which decreases data transfer in the cluster. The experimental results show that the introduced approach noticeably reduces completion time, by 42% and 11% compared with H-fair and a load balancing approach, respectively. Thus, Hadoop can rapidly release resources for the next job, which enhances the overall performance of the distributed computing system. The findings also reveal that the approach optimizes the use of available resources and avoids cluster overload in real time.
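The two-step idea described in the abstract (group nodes of similar performance, then rank candidates on several criteria) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the `Node` fields, the capacity formula, the equal-size tiering, and the weighted-sum ranking are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_cores: int  # total cores (hypothetical field)
    ram_gb: int     # total RAM in GB (hypothetical field)
    free_cpu: int   # cores currently free
    free_ram: int   # RAM currently free, in GB


def capacity(node):
    # Assumed single-number performance proxy; the weighting is arbitrary.
    return node.cpu_cores + node.ram_gb / 4


def group_by_performance(nodes, k):
    """Split nodes into at most k tiers of similar capacity, strongest first."""
    ordered = sorted(nodes, key=capacity, reverse=True)
    size = -(-len(ordered) // k)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def rank_nodes(group, cpu_demand, ram_demand, w_cpu=0.5, w_ram=0.5):
    """Weighted-sum ranking of the nodes that can host the job, best first."""
    feasible = [n for n in group
                if n.free_cpu >= cpu_demand and n.free_ram >= ram_demand]

    def score(n):
        # Relative headroom left on each resource after placing the job.
        return (w_cpu * (n.free_cpu - cpu_demand) / n.cpu_cores
                + w_ram * (n.free_ram - ram_demand) / n.ram_gb)

    return sorted(feasible, key=score, reverse=True)


def place_job(groups, cpu_demand, ram_demand):
    """Try tiers from strongest to weakest; return the chosen node name or None."""
    for group in groups:
        ranked = rank_nodes(group, cpu_demand, ram_demand)
        if ranked:
            chosen = ranked[0]
            chosen.free_cpu -= cpu_demand
            chosen.free_ram -= ram_demand
            return chosen.name
    return None


nodes = [
    Node("a", 16, 64, 16, 64),
    Node("b", 16, 64, 4, 8),
    Node("c", 4, 8, 4, 8),
    Node("d", 4, 8, 2, 4),
]
groups = group_by_performance(nodes, k=2)
first_choice = place_job(groups, cpu_demand=8, ram_demand=32)
```

Because free resources are updated after each placement, repeated calls to `place_job` spread later jobs across the remaining capacity. A real YARN scheduler would draw these figures from live cluster metrics rather than from static fields.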


