Marwa Chabbouh

General information

Rank

Assistant Professor (Maître Assistant)

Short biography

Assistant Professor of Business Computing at the Institut Supérieur d’Administration des Entreprises de Gafsa (ISAEG) and researcher at the SMART laboratory (Strategies for Modeling and Artificial Intelligence), Université de Tunis – ISG-Campus. Her research covers data mining, artificial intelligence, machine learning, and imbalanced classification problems. She has published several articles in leading international journals, notably Swarm and Evolutionary Computation, where she co-authored "Multi-objective evolution of oblique decision trees for imbalanced data binary classification" (2019); Neural Computing and Applications, with "Imbalanced multi-label data classification as a bi-level optimization problem: application to miRNA-related diseases diagnosis" (2023); and Journal of Heuristics, with "Evolutionary optimization of the area under precision-recall curve for classifying imbalanced multi-class data" (2025).

Publications

  • 2025
    Marwa Chabbouh, Slim Bechikh, Lamjed Ben Said, Efrén Mezura-Montes

    Evolutionary optimization of the area under precision-recall curve for classifying imbalanced multi-class data

    J. Heuristics 31(1): 9 (2025), 2025

    Abstract

    Classification of imbalanced multi-class data remains one of the most challenging issues in machine learning and data mining. The task becomes harder when classes containing fewer instances are located in overlapping regions. Several approaches have been proposed in the literature to deal with these two issues, such as the use of decomposition, the design of ensembles, the employment of misclassification costs, and the development of ad-hoc strategies. Despite these efforts, the number of existing works dealing with imbalance in multi-class data is much smaller than for binary classification. Moreover, existing approaches still suffer from many limitations. These include difficulties in handling imbalances across multiple classes, challenges in adapting sampling techniques, limitations of certain classifiers, the need for specialized evaluation metrics, the complexity of data representation, and increased computational costs. Motivated by these observations, we propose a multi-objective evolutionary induction approach that evolves a population of NLM-DTs (Non-Linear Multivariate Decision Trees) using the Θ-NSGA-III (Θ-Non-dominated Sorting Genetic Algorithm-III) as a search engine. The resulting algorithm is termed EMO-NLM-DT (Evolutionary Multi-objective Optimization of NLM-DTs) and is designed to optimize the construction of NLM-DTs for imbalanced multi-class data classification by simultaneously maximizing both the Macro-Average-Precision and the Macro-Average-Recall as two possibly conflicting objectives. The choice of these two measures as objective functions is motivated by a recent study on the appropriateness of performance metrics for imbalanced data classification, which suggests that the mAURPC (mean Area Under Recall Precision Curve) satisfies all necessary conditions for imbalanced multi-class classification. Moreover, adopting the NLM-DT as the baseline classifier to be optimized allows the generation of non-linear hyperplanes that are well adapted to the geometrical shapes of the class boundaries. The statistical analysis of the comparative experimental results on more than twenty imbalanced multi-class data sets shows that EMO-NLM-DT outperforms seven relevant and recent state-of-the-art methods in building NLM-DTs that are highly effective in classifying imbalanced multi-class data.
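
    As an illustration of the two objectives named in this abstract, the sketch below computes macro-averaged precision and recall from predicted labels and compares two candidate classifiers by plain Pareto dominance. It is a minimal sketch, not the authors' implementation: the function names, the choice to average over every class seen in either label list, and the convention that an undefined ratio counts as 0.0 are all assumptions, and plain Pareto dominance stands in for the Θ-dominance used by Θ-NSGA-III.

```python
from typing import Hashable, List, Sequence, Tuple


def macro_precision_recall(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Tuple[float, float]:
    """Unweighted mean of per-class precision and per-class recall."""
    # Assumption: average over every class seen in y_true or y_pred.
    classes: List[Hashable] = list(dict.fromkeys(list(y_true) + list(y_pred)))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Assumption: an undefined ratio (zero denominator) counts as 0.0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Plain Pareto dominance for maximization (not Θ-dominance)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


# Toy data: class 0 is the majority; the single class-1 instance is missed.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 2]
macro_p, macro_r = macro_precision_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(macro_p, macro_r, accuracy)
```

    On this toy data accuracy is about 0.83, while macro-averaged precision and recall are about 0.60 and 0.67, because the missed minority class weighs as much as the majority class in both averages.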

  • Marwa Chabbouh, Slim Bechikh, Lamjed Ben Said, Efrén Mezura-Montes

    Imbalanced multi-label data classification as a bi-level optimization problem: application to miRNA-related diseases diagnosis

    Neural Comput. Appl. 35(22): 16285-16303 (2023), 2023

    Abstract

    In multi-label classification, each instance can be assigned multiple labels at the same time. In such a situation, the relationships between labels and the class imbalance are two serious issues that should be addressed. Despite the large number of existing multi-label classification methods, the widespread class imbalance among labels has not been adequately addressed. Two main issues should be solved to come up with an effective classifier for imbalanced multi-label data. On the one hand, the imbalance can occur between labels and/or within a label. The "between-labels imbalance" occurs when some labels are much more frequent than others, whereas the "within-label imbalance" occurs when the imbalance lies within a label itself, and it can affect multiple labels. On the other hand, the labels’ processing order heavily influences the quality of a multi-label classifier. To deal with these challenges, we propose in this paper a bi-level evolutionary approach for the optimized induction of multivariate decision trees, where the upper level designs the classifiers while the lower level approximates the optimal label ordering for each classifier. Our proposed method, named BIMLC-GA (Bi-level Imbalanced Multi-Label Classification Genetic Algorithm), is compared to several state-of-the-art methods across a variety of imbalanced multi-label data sets from several application fields and then applied to the miRNA-related diseases case study. The statistical analysis of the obtained results shows the merits of our proposal.
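
    The two kinds of imbalance described in this abstract can be made concrete on a small binary label matrix. The sketch below is only one possible way to quantify them and is not taken from the paper: the function name `label_imbalance` and both ratio definitions (most frequent label count over each label's count, and majority over minority count inside each label) are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def label_imbalance(Y: Sequence[Sequence[int]]) -> Tuple[List[float], List[float]]:
    """Return (between_label_ratios, within_label_ratios) for a 0/1 label matrix.

    Y[i][j] == 1 means instance i carries label j.
    """
    n_instances = len(Y)
    n_labels = len(Y[0])
    # Number of positive instances for each label.
    positives = [sum(row[j] for row in Y) for j in range(n_labels)]
    most_frequent = max(positives)

    # Assumed between-label measure: how rare each label is compared with
    # the most frequent label (1.0 for the most frequent label itself).
    between = [most_frequent / c if c else float("inf") for c in positives]

    # Assumed within-label measure: majority count over minority count,
    # looking only at the positive/negative split inside one label.
    within = []
    for c in positives:
        minority = min(c, n_instances - c)
        majority = max(c, n_instances - c)
        within.append(majority / minority if minority else float("inf"))
    return between, within


# Toy label matrix: 6 instances, 3 labels.
Y = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [1, 0, 0],
]
between, within = label_imbalance(Y)
print(between, within)
```

    In this toy matrix the first label is frequent across labels (between-label ratio 1.0) yet still imbalanced internally (five positives against one negative), which shows why the two notions are treated separately.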

  • Marwa Chabbouh, Slim Bechikh, Lamjed Ben Said, Chih-Cheng Hung

    Multi-objective evolution of oblique decision trees for imbalanced data binary classification

    Swarm Evol. Comput. 49: 1-22 (2019), 2019

    Abstract

    Imbalanced data classification is one of the most challenging problems in data mining. In this kind of problem, we have two types of classes: the majority class and the minority one. The former has a relatively high number of instances while the latter contains a much smaller number of instances. As most traditional classifiers assume that data is evenly distributed across classes, they may largely fail to recognize instances of the minority class. Several interesting approaches have been proposed in the literature to handle the class imbalance issue, and the Oblique Decision Tree (ODT) is one of them. Nevertheless, most standard ODT construction algorithms use a greedy search process, and the very few works that have addressed this induction problem with an evolutionary approach do so without really considering the class imbalance issue. To cope with this limitation, we propose in this paper a multi-objective evolutionary approach to find optimized ODTs for imbalanced binary classification. Our approach, called ODT-Θ-NSGA-III (ODT-based-Θ-Nondominated Sorting Genetic Algorithm-III), is motivated by its abilities: (a) to escape local optima in the ODT search space and (b) to maximize simultaneously both Precision and Recall. Thanks to these two features, ODT-Θ-NSGA-III provides competitive or better results when compared to many state-of-the-art classification algorithms on commonly used imbalanced benchmark data sets.
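
    To show what distinguishes an oblique decision tree node from an axis-parallel one, the sketch below evaluates a single node of each kind on a toy imbalanced data set and reports Precision and Recall for the minority class, the two objectives named in this abstract. It is a hand-built illustration under stated assumptions, not the ODT-Θ-NSGA-III algorithm: the data, the weights, the thresholds, and all function names are hypothetical, and no search or evolution is performed.

```python
from typing import List, Sequence, Tuple


def axis_parallel_test(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Axis-parallel node: compares a single feature with a threshold."""
    return x[feature] <= threshold


def oblique_test(x: Sequence[float], weights: Sequence[float], threshold: float) -> bool:
    """Oblique node: compares a linear combination of features with a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive (minority) class, labelled 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumption: an undefined ratio (zero denominator) counts as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical data: 6 majority points (0) and 2 minority points (1),
# separated by the diagonal boundary x0 + x1 = 1.
points = [(0.2, 0.3), (0.9, 0.05), (0.1, 0.8), (0.5, 0.4),
          (0.3, 0.3), (0.7, 0.2), (0.6, 0.7), (0.8, 0.5)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

# A point is predicted as minority when the node test is False.
oblique_pred: List[int] = [0 if oblique_test(x, (1.0, 1.0), 1.0) else 1 for x in points]
axis_pred: List[int] = [0 if axis_parallel_test(x, 0, 0.55) else 1 for x in points]

oblique_scores = precision_recall(labels, oblique_pred)
axis_scores = precision_recall(labels, axis_pred)
print(oblique_scores, axis_scores)
```

    With this data the single oblique node follows the diagonal boundary and separates the two classes, whereas the single axis-parallel node on the first feature recovers both minority points but also flags two majority points, which halves its precision.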