Name: VICTOR FREITAS ROCHA
Type: MSc dissertation
Publication date: 25/08/2022
Advisor:
Name | Role |
---|---|
FLÁVIO MIGUEL VAREJÃO | Advisor * |
Examining board:
Name | Role |
---|---|
FLÁVIO MIGUEL VAREJÃO | Advisor * |
THIAGO OLIVEIRA DOS SANTOS | Internal Examiner * |
Summary: In the context of machine learning, classification is the task of identifying which class an
instance belongs to, according to the knowledge obtained from a training set, that is, a
set of instances whose classification is previously known. Single-label classification, one of
the most traditional versions of the classification problem, allows an instance to belong
to only one class, thus making the classes mutually exclusive. However, real-world problems
where intersections between classes often occur can be better modeled as multi-label
classification problems, in which multiple labels can be assigned to the same
instance. Multiple classifiers, both single-label and multi-label, can be trained on the
same classification problem and generate a combined result. This technique, known as
classifier ensembles, is commonly used to improve classification performance. Several
approaches have already been proposed for combining the results of the individual
classifiers. This work presents an approach for combining multi-label classifier ensembles,
based on the Decision Templates for Ensemble of Classifier Chains technique,
that incorporates the exploration of correlations between labels into the classifier fusion
process. In the Decision Templates technique, originally proposed for merging single-label
classifiers, a per-class decision model is estimated using the same training set that is
used for the set of classifiers. The classification for each unseen instance is obtained by
measuring the similarity between its decision profile and the decision templates. The
proposed method estimates two decision templates per class, one representing the presence
of the class and the other representing its absence. For each new instance, a new decision
profile is created and the similarity between the decision templates and the decision profile
determines the resulting set of labels. For each label analyzed, information about correlated
labels is incorporated. The proposed fusion method is applied to a traditional, well-established
multi-label classifier ensemble algorithm: Ensemble of Classifier Chains. Empirical
evidence indicates that the use of the proposed Decision Templates adaptation can improve
performance over traditionally used fusion schemes on most of the evaluated metrics.
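
The fusion step described in the summary can be illustrated with a minimal sketch. It is not the dissertation's implementation, and several details are assumptions made here because the summary does not specify them: each template is taken as the mean decision profile of the training instances that have (or lack) the label, similarity is measured by squared Euclidean distance, and the hypothetical `columns` argument stands in for the selection of correlated labels. The function names are illustrative, and every label is assumed to have at least one positive and one negative training instance.

```python
from typing import List, Optional, Sequence, Tuple

# A decision profile: one row per classifier, one column per label.
Profile = List[List[float]]


def mean_profile(profiles: Sequence[Profile]) -> Profile:
    """Element-wise mean of a non-empty collection of decision profiles."""
    n = len(profiles)
    rows, cols = len(profiles[0]), len(profiles[0][0])
    return [[sum(p[r][c] for p in profiles) / n for c in range(cols)]
            for r in range(rows)]


def fit_templates(profiles: Sequence[Profile],
                  Y: Sequence[Sequence[int]]) -> Tuple[List[Profile], List[Profile]]:
    """Estimate two templates per label: one for presence, one for absence.

    Assumption: a template is the mean decision profile of the matching
    training instances.
    """
    present, absent = [], []
    for j in range(len(Y[0])):
        present.append(mean_profile([p for p, y in zip(profiles, Y) if y[j] == 1]))
        absent.append(mean_profile([p for p, y in zip(profiles, Y) if y[j] == 0]))
    return present, absent


def sq_dist(a: Profile, b: Profile, cols: Sequence[int]) -> float:
    """Squared Euclidean distance restricted to the given label columns."""
    return sum((ra[c] - rb[c]) ** 2 for ra, rb in zip(a, b) for c in cols)


def predict(profile: Profile,
            present: List[Profile],
            absent: List[Profile],
            columns: Optional[Sequence[Sequence[int]]] = None) -> List[int]:
    """Assign label j when the profile is closer to its presence template.

    `columns[j]` (hypothetical) lists the label columns compared for label j,
    e.g. the label itself plus labels judged to be correlated with it.
    By default all columns are used.
    """
    n_labels = len(present)
    labels = []
    for j in range(n_labels):
        cols = range(n_labels) if columns is None else columns[j]
        closer_to_present = (sq_dist(profile, present[j], cols)
                             < sq_dist(profile, absent[j], cols))
        labels.append(1 if closer_to_present else 0)
    return labels
```

In this sketch, `fit_templates` would be called once with the decision profiles produced by the ensemble on the training set, and `predict` once per unseen instance. Restricting `columns[j]` to label `j` alone ignores the other labels, while widening it is one simple way to let related labels influence the decision.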