Thesis details | Informática

Name: THIAGO MEIRELES PAIXÃO
Type: PhD thesis
Publication date: 10/05/2022
Advisor:

Name	Role
THIAGO OLIVEIRA DOS SANTOS	Advisor *

Examining board:

Name	Role
FRANCISCO DE ASSIS BOLDT	External Examiner *
LUCIA CATABRIGA	Internal Examiner *
MARIA CLAUDIA SILVA BOERES	Co-advisor *
THIAGO OLIVEIRA DOS SANTOS	Advisor *

Summary: The reconstruction of shredded documents is a relevant task in various domains, such as forensic investigation and historical reconstruction. As an alternative to manual reconstruction, researchers have been investigating ways to perform (semi-)automatic digital reconstruction. Despite the many works on this topic, dealing with real-shredded data remains a very sensitive issue in the current literature. This thesis addresses two research directions to tackle this scenario: properly evaluating the fitting of shreds (the bulk of this work) and integrating the human into the reconstruction process.
Regarding the fitting (compatibility) evaluation, it was verified that traditional pixel-based approaches are not robust to real shredding, while more sophisticated techniques significantly compromise time performance. This thesis presents two self-supervised deep learning approaches that have achieved state-of-the-art accuracy in more realistic and complex scenarios involving several real-shredded documents whose shreds are mixed (multi-page reconstruction or multi-reconstruction). The first approach models the compatibility evaluation as a two-class (valid or invalid) pattern recognition problem. The second approach, based on deep metric learning, decouples feature extraction from compatibility evaluation to improve scalability (time performance) for large reconstruction instances.
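As a rough illustration of why this decoupling helps scalability, the sketch below assumes each shred already has one embedding vector for its left boundary and one for its right boundary (in the thesis a trained network would produce these; here they are hand-made toy values). The function name `pairwise_costs`, the squared Euclidean distance, and the data are all assumptions for illustration, not the thesis implementation. Features are computed once per shred, and all pairwise compatibilities then reduce to a single array operation.

```python
import numpy as np


def pairwise_costs(right_emb: np.ndarray, left_emb: np.ndarray) -> np.ndarray:
    """Return cost[i, j]: squared Euclidean distance between the right-boundary
    embedding of shred i and the left-boundary embedding of shred j.

    Lower cost means shred j is a better candidate to sit to the right of i.
    (Hypothetical helper; the distance choice is an assumption.)
    """
    diff = right_emb[:, None, :] - left_emb[None, :, :]  # shape (n, n, d)
    cost = np.sum(diff ** 2, axis=2)                     # shape (n, n)
    np.fill_diagonal(cost, np.inf)  # a shred cannot neighbor itself
    return cost


# Toy embeddings for 3 shreds whose true order is 0 -> 1 -> 2.
left_emb = np.array([[9.0, 9.0], [1.0, 0.0], [0.0, 1.0]])
right_emb = np.array([[1.0, 0.1], [0.1, 1.0], [5.0, 5.0]])

cost = pairwise_costs(right_emb, left_emb)
best_right_neighbor = np.argmin(cost, axis=1)
print(best_right_neighbor)
```

With n shreds, the expensive feature extraction runs n times rather than once per each of the n² candidate pairs; only the cheap distance computation is quadratic.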
Human interaction is explored to improve the accuracy of automatic methods. A critical issue regarding this topic is that previously proposed methods do not scale well for large instances (the real scenario), either because the user bears the entire responsibility of arranging the shreds, or because they have to visualize the reconstruction and designate the shreds to be analyzed. To address this challenge, we propose a human-in-the-loop framework that automatically selects potential mistakes (wrong pairings) in the solution for user analysis.
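One simple way to picture the automatic selection of potential mistakes is to rank the adjacent pairings of a candidate solution by their compatibility cost and hand the worst ones to the user. The summary does not state the selection criterion actually used, so the ranking rule, the function name `select_suspect_pairings`, and the cost values below are assumptions for illustration only.

```python
import numpy as np


def select_suspect_pairings(order, cost, k):
    """Return the k adjacent pairings in `order` with the highest cost.

    `order` is a left-to-right sequence of shred indices and cost[i, j] is the
    (assumed) cost of placing shred j immediately to the right of shred i.
    Hypothetical criterion: the highest-cost pairings are the most suspicious.
    """
    pairs = list(zip(order[:-1], order[1:]))
    pairs.sort(key=lambda p: cost[p[0], p[1]], reverse=True)
    return pairs[:k]


inf = np.inf
# Toy cost matrix for 4 shreds (values are made up).
cost = np.array([
    [inf, 0.90, 0.10, 0.80],
    [0.70, inf, 0.60, 0.20],
    [0.50, 0.95, inf, 0.30],
    [0.40, 0.85, 0.75, inf],
])

solution = [0, 2, 1, 3]  # candidate reconstruction produced by some solver
suspects = select_suspect_pairings(solution, cost, k=1)
print(suspects)
```

The user then inspects only the returned pairings rather than the whole reconstruction, which is what keeps the interaction manageable for large instances.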
