Name: LEANDRO MUNIZ DE LIMA
Publication date: 26/02/2025
Examining board:
Name | Role
---|---
GIOVANNI VENTORIM COMARELA | Internal Examiner
MARCELO ZANCHETTA DO NASCIMENTO | External Examiner
RENATO ANTONIO KROHLING | Chair
RENATO TINÓS | External Examiner
VINICIUS FERNANDES SOARES MOTA | Internal Examiner
Summary: Cancer is one of the leading causes of death worldwide, and early diagnosis is one of the most important factors in reducing mortality and increasing survival. Computer-aided cancer diagnosis using artificial intelligence techniques has been researched for several years and has been greatly boosted by recent advances in computer vision driven by deep neural networks. Traditionally, healthcare experts combine several sources of information to reach a diagnosis, often including some form of imaging (e.g., X-rays, histopathology, dermoscopy) along with clinical and demographic data. In the context of artificial neural networks, analyzing data in different formats (e.g., images, text, and graphs) is known as multimodal data fusion, and recent studies indicate that this kind of analysis is also crucial for improving diagnosis with neural networks. In this work, we propose a new way to extract features from complementary patient information, evaluate the best method for image feature extraction, and assess the most effective way to fuse this information. The proposed pipeline includes an interaction mechanism for multi-field complementary data with data enhancement by Poincaré transformation. The evaluation is conducted on datasets for the diagnosis of skin cancer (PAD-UFES-20) and oral cavity cancer (NDB-UFES). The results demonstrate the feasibility and strong performance of transformer-based architectures for extracting features from medical images in the evaluated models. The proposed architecture achieves a statistically significant improvement of 3.37% in balanced accuracy over the state of the art on the PAD-UFES-20 dataset and an improvement that is not statistically significant on the NDB-UFES dataset. Additionally, a mixed fusion architecture was investigated, which favored the analysis of model interpretability using SHAP.
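
To make the fusion idea concrete, the sketch below shows a minimal PyTorch model that combines a transformer-based image encoder with tabular clinical/demographic data through an attention-style interaction before classification. This is an illustrative assumption, not the thesis's actual architecture: the `timm` ViT backbone, the `MultimodalFusionNet` name, the embedding sizes, the number of clinical features, and the cross-attention interaction are all placeholders, and the Poincaré-based data enhancement is not reproduced here.

```python
# Hedged sketch of multimodal fusion for lesion classification.
# Assumptions (not from the thesis): a ViT backbone from `timm`,
# cross-attention between image and clinical embeddings, and a
# linear classifier; dataset-specific preprocessing is omitted.
import torch
import torch.nn as nn
import timm


class MultimodalFusionNet(nn.Module):
    def __init__(self, num_classes: int, num_clinical_features: int, embed_dim: int = 256):
        super().__init__()
        # Transformer-based image encoder; num_classes=0 returns pooled features.
        self.image_encoder = timm.create_model(
            "vit_base_patch16_224", pretrained=False, num_classes=0
        )
        self.image_proj = nn.Linear(self.image_encoder.num_features, embed_dim)
        # Project tabular clinical/demographic fields into the same space.
        self.clinical_proj = nn.Sequential(
            nn.Linear(num_clinical_features, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Illustrative interaction: multi-head attention with the image
        # embedding as query and the clinical embedding as key/value.
        self.interaction = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * embed_dim, num_classes)

    def forward(self, images: torch.Tensor, clinical: torch.Tensor) -> torch.Tensor:
        img = self.image_proj(self.image_encoder(images))      # (B, D)
        cli = self.clinical_proj(clinical)                      # (B, D)
        attended, _ = self.interaction(
            img.unsqueeze(1), cli.unsqueeze(1), cli.unsqueeze(1)
        )                                                       # (B, 1, D)
        fused = torch.cat([attended.squeeze(1), cli], dim=-1)   # (B, 2D)
        return self.classifier(fused)                           # logits


if __name__ == "__main__":
    # Toy forward pass; 6 classes matches PAD-UFES-20, while the 21 clinical
    # features and batch contents are random placeholders.
    model = MultimodalFusionNet(num_classes=6, num_clinical_features=21)
    logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 21))
    print(logits.shape)  # torch.Size([2, 6])
```

In a mixed-fusion variant such as the one investigated in the thesis, the per-modality embeddings would also be exposed to the interpretability stage (e.g., SHAP) so that the contribution of image versus clinical inputs can be analyzed.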