Multilabel Classification. Problem Analysis, Metrics and Techniques
18 February, 2020
Francisco Herrera, Francisco Charte, Antonio J. Rivera, Maria J. del Jesús
Classifying data patterns into categories that are not mutually exclusive, using a model usually built with supervised learning techniques, is a problem that has produced a large volume of publications over the last decade.
The first three chapters of this book give a comprehensive introduction to the problem of multi-label classification and the most common techniques for addressing it, together with a detailed description of the metrics used to characterize multi-label datasets and to evaluate the results produced by classifiers. They also list the multi-label datasets most frequently used in scientific studies.
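To make the two kinds of metrics concrete, here is a minimal sketch in Python. It assumes label cardinality and label density as examples of dataset characterization metrics and Hamming loss as an example of an evaluation metric; the text above does not name specific metrics, and the toy labels and predictions are invented for illustration.

```python
# Toy multi-label data: each entry is the set of labels of one sample.
# Label names and values are hypothetical, chosen only for illustration.
LABELS = ["sports", "politics", "economy"]
Y_TRUE = [
    {"sports"},
    {"politics", "economy"},
    {"economy"},
    {"sports", "politics", "economy"},
]
Y_PRED = [
    {"sports"},
    {"politics"},
    {"economy", "sports"},
    {"sports", "politics", "economy"},
]


def label_cardinality(labelsets):
    """Average number of labels per sample (characterizes the dataset)."""
    return sum(len(s) for s in labelsets) / len(labelsets)


def label_density(labelsets, n_labels):
    """Label cardinality normalised by the total number of labels."""
    return label_cardinality(labelsets) / n_labels


def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (sample, label) pairs predicted wrongly (evaluates a classifier)."""
    # The symmetric difference holds the labels that are missed or wrongly added.
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * n_labels)


if __name__ == "__main__":
    print("cardinality:", label_cardinality(Y_TRUE))
    print("density:", label_density(Y_TRUE, len(LABELS)))
    print("Hamming loss:", hamming_loss(Y_TRUE, Y_PRED, len(LABELS)))
```

Cardinality and density depend only on the data, while Hamming loss compares predictions against the true labelsets, which mirrors the distinction drawn above between characterizing datasets and evaluating classifiers.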
After these introductory chapters, the book carries out an exhaustive review of the methods published in the literature. The methods are grouped into three chapters according to the approach they take to the task: algorithms based on data transformation techniques, algorithms that adapt classical methods, and algorithms based on ensembles.
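As an illustration of the data transformation approach, the sketch below shows two transformations that are widely used in the multi-label literature: binary relevance, which produces one binary problem per label, and label powerset, which treats each distinct labelset as a single class. The text above does not name these methods, so their choice here is an assumption, and the function names and toy data are hypothetical.

```python
def binary_relevance(X, labelsets, labels):
    """Build one binary dataset per label: a list of (features, 0/1 target) pairs."""
    return {
        label: [(x, int(label in s)) for x, s in zip(X, labelsets)]
        for label in labels
    }


def label_powerset(X, labelsets):
    """Build one multiclass dataset: each distinct labelset becomes a class id."""
    class_ids = {}
    dataset = []
    for x, s in zip(X, labelsets):
        key = frozenset(s)
        # Assign the next free id the first time a labelset is seen.
        class_id = class_ids.setdefault(key, len(class_ids))
        dataset.append((x, class_id))
    return dataset, class_ids


if __name__ == "__main__":
    # Hypothetical toy data: two features per sample, labels "a" and "b".
    X = [[0.1, 1.0], [0.9, 0.2], [0.5, 0.5]]
    labelsets = [{"a"}, {"a", "b"}, {"a"}]

    per_label = binary_relevance(X, labelsets, ["a", "b"])
    print([target for _, target in per_label["b"]])

    multiclass, mapping = label_powerset(X, labelsets)
    print([class_id for _, class_id in multiclass], len(mapping))
```

Either output can then be handed to any standard binary or multiclass classifier. Binary relevance ignores dependencies between labels, whereas label powerset preserves them at the cost of a class count that can grow quickly with the number of distinct labelsets.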
A third block contains monographic chapters on specific aspects: the use of correlation between labels; high dimensionality, which in multi-label data affects output variables as well as input ones; and the obstacle posed by imbalance between the labels assigned to data samples. Each of these topics is analyzed and accompanied by one or more solutions designed by the authors, including algorithms published in leading journals.
The final part of the book describes software for the well-known R environment, developed by the authors to make working with multi-label datasets easier. The package’s functionality, most of which is accessible both from the command line and from a graphical interface, supports the exploration of this type of data as well as the application of basic transformations.
The book, whose table of contents is given below, has an associated website, https://github.com/fcharte/SM-MLC, which offers various resources: links to data repositories, software tools, implementations of algorithms, etc.
- Multilabel Classification
- Case Studies and Metrics
- Transformation-Based Classifiers
- Adaptation-Based Classifiers
- Ensemble-Based Classifiers
- Dimensionality Reduction
- Imbalance in Multilabel Datasets
- Multilabel Software