Name: Thiago Meireles Paixão
Type: PhD thesis
Publication date: 10/05/2022
Advisor:

Name Role
Thiago Oliveira dos Santos Advisor *

Examining board:

Name Role
ALCEU DE SOUZA BRITTO JR. External Examiner *
Francisco de Assis Boldt External Examiner *
Lucia Catabriga Internal Examiner *
Maria Claudia Silva Boeres Co-advisor *
ROBERTO HIRATA JR. External Examiner *
Thiago Oliveira dos Santos Advisor *

Summary: The reconstruction of shredded documents is a relevant task in various domains, such as forensic investigation and historical reconstruction. As an alternative to manual reconstruction, researchers have been investigating ways to perform (semi-)automatic digital reconstruction. Despite the many works on this topic, dealing with real-shredded data remains a very sensitive issue in the current literature. This thesis addresses two research directions to face this scenario: properly evaluating the fitting of shreds (the bulk of this work) and integrating the human into the reconstruction process.
Regarding the fitting (compatibility) evaluation, it was verified that traditional pixel-based approaches are not robust to real shredding, while more sophisticated techniques significantly compromise time performance. This thesis presents two self-supervised deep learning approaches that have achieved state-of-the-art accuracy in more realistic/complex scenarios involving several real-shredded documents where the shreds are mixed (multi-page reconstruction or multi-reconstruction). The first approach models the compatibility evaluation as a two-class (valid or invalid) pattern recognition problem. The second approach, based on deep metric learning, proposes decoupling feature extraction from compatibility evaluation to improve scalability (time performance) for large reconstruction instances.
Human interaction is explored to improve the accuracy of automatic methods. A critical issue regarding this topic is that previously proposed methods do not scale well for large instances (the real scenario), either because the user bears the entire responsibility for arranging the shreds, or because the user has to visualize the reconstruction and designate the shreds to be analyzed. To address this challenge, we propose a human-in-the-loop framework that automatically selects potential mistakes (wrong pairings) in the solution for user analysis.
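
The scalability argument behind the second approach (decoupling feature extraction from compatibility evaluation) can be illustrated with a rough sketch. This is not the thesis's implementation: the `embed` function is a hypothetical stand-in for a learned network (here a fixed linear projection), and every name, weight, and toy boundary below is an assumption made for illustration only. The point it shows is that each shred boundary is embedded once, so the expensive step runs 2n times, while the n² pairwise comparisons reduce to cheap distance computations.

```python
import math

def embed(boundary, weights):
    """Hypothetical stand-in for a learned feature extractor (linear projection)."""
    return [sum(w * x for w, x in zip(row, boundary)) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compatibility_matrix(right_edges, left_edges, weights):
    """cost[i][j]: distance between shred i's right edge and shred j's left edge.

    Lower cost means shred j is a more plausible right neighbor of shred i.
    """
    # Feature extraction happens once per boundary: 2n calls to embed.
    right_emb = [embed(r, weights) for r in right_edges]
    left_emb = [embed(l, weights) for l in left_edges]
    n = len(right_edges)
    cost = [[math.inf] * n for _ in range(n)]
    # Pairwise evaluation only compares precomputed embeddings.
    for i in range(n):
        for j in range(n):
            if i != j:
                cost[i][j] = distance(right_emb[i], left_emb[j])
    return cost

def best_right_neighbor(cost, i):
    """Index of the shred whose left edge is closest to shred i's right edge."""
    return min(range(len(cost)), key=lambda j: cost[i][j])

# Toy data (assumed): three shreds whose true left-to-right order is 0, 1, 2.
left_edges = [[0, 0, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0]]
right_edges = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 1]]
weights = [[1, 0, -1, 0], [0, 1, 0, -1]]  # assumed fixed projection to 2-D

cost = compatibility_matrix(right_edges, left_edges, weights)
```

In a pairwise-classification formulation, by contrast, a model would typically be evaluated once per candidate pair, so the number of model evaluations grows quadratically with the number of shreds.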

© 2013 Universidade Federal do Espírito Santo. All rights reserved.
Av. Fernando Ferrari, 514 - Goiabeiras, Vitória - ES | CEP 29075-910