2022
Conference
In Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 284-287)
Data imbalance remains a challenging issue in data classification. In the literature, cost-sensitive approaches have been used to deal with this challenge. Despite their interesting results, the manual design of cost matrices remains the main shortcoming of these approaches: the data engineer faces great difficulty in defining the misclassification costs, especially in the absence of domain-specific knowledge. Recent works suggest using genetic programming as an effective tool to design classification trees with automatically learned costs. Although promising results were obtained, evaluating a classification tree with a single cost matrix is not a wise choice. Indeed, a precise and fair evaluation of tree quality requires trying several misclassification cost matrices. Motivated by this observation, we propose in this paper a bi-level modeling of the cost-sensitive classification tree induction problem, where the upper level evolves the classification trees while the cost matrix of each tree is optimized at the lower level. Our bi-level model is solved using an existing co-evolutionary algorithm, and the resulting method is named Bi-COS. Comparative experimental results on several imbalanced benchmark datasets show the merits of Bi-COS with respect to the state of the art.
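The nested structure described in the abstract can be illustrated with a toy sketch. This is not Bi-COS. As labeled assumptions, a one-split decision stump stands in for an evolved classification tree, exhaustive search over small grids stands in for the co-evolutionary algorithm, the cost matrix is reduced to a single false-negative cost (with the false-positive cost fixed at 1), and the G-mean is used as the fitness. All function names are hypothetical.

```python
import math

def leaf_labels(X, y, threshold, fn_cost):
    """Label each leaf of a one-split stump with the class of least total cost."""
    labels = {}
    for side in (False, True):
        pos = sum(1 for x, t in zip(X, y) if (x > threshold) == side and t == 1)
        neg = sum(1 for x, t in zip(X, y) if (x > threshold) == side and t == 0)
        # Predicting 0 costs fn_cost per positive; predicting 1 costs 1 per negative.
        labels[side] = 1 if pos * fn_cost > neg else 0
    return labels

def g_mean(X, y, threshold, labels):
    """Geometric mean of sensitivity and specificity (assumed fitness)."""
    tp = fn = tn = fp = 0
    for x, t in zip(X, y):
        p = labels[x > threshold]
        if t == 1:
            tp, fn = tp + (p == 1), fn + (p == 0)
        else:
            tn, fp = tn + (p == 0), fp + (p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sens * spec)

def fitness(X, y, threshold, fn_cost):
    return g_mean(X, y, threshold, leaf_labels(X, y, threshold, fn_cost))

def lower_level(X, y, threshold, cost_grid):
    """Lower level: find the best cost for one fixed tree."""
    best_cost = max(cost_grid, key=lambda c: fitness(X, y, threshold, c))
    return best_cost, fitness(X, y, threshold, best_cost)

def upper_level(X, y, thresholds, cost_grid):
    """Upper level: compare trees, each judged with its own optimized cost."""
    best = None
    for th in thresholds:
        cost, fit = lower_level(X, y, th, cost_grid)
        if best is None or fit > best[2]:
            best = (th, cost, fit)
    return best

# Toy imbalanced data: 2 positives among 12 samples (made up for illustration).
X = list(range(1, 13))
y = [1 if x in (8, 10) else 0 for x in X]
best = upper_level(X, y, thresholds=[2.5, 5.5, 7.5], cost_grid=[1, 2, 5, 10])
print(best)
```

On this toy data, every candidate stump scores a G-mean of 0 when the false-negative cost is fixed at 1, since each leaf is then labeled with the majority class. This is a small instance of why judging a tree under a single cost matrix can be misleading.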
@inproceedings{said2022cost,
  title={Cost-sensitive classification tree induction as a bi-level optimization problem},
  author={Said, Rihab and Elarbi, Maha and Bechikh, Slim and Coello Coello, Carlos A. and Ben Said, Lamjed},
  booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion},
  pages={284--287},
  year={2022}
}