
Towards Data-Centric Approaches to Lung Cancer Classification

EasyChair Preprint 10688

13 pages. Date: August 14, 2023

Abstract

There is an ever-growing need to review Artificial Intelligence and its corresponding implementation methodology in medical image analysis. The discussion of optimizing code versus improving data is of prime importance when maximizing model performance in medical image classification. Recently, a majority of studies have been model-centric. It is crucial to investigate data-centric methodologies and how medical image quality impacts a model's learning capabilities. This study opts for data-related modifications for model improvement in lung cancer classification, acting as a proof of concept for developing data-centric AI. The proposed data-centric approach (DCA) modifies CT-scan images of the lung through three stages: image preprocessing, image segmentation, and feature extraction. The modified images were used to train a simple Convolutional Neural Network (CNN) for the classification task. We evaluate the performance of the proposed method using a publicly available real-world dataset of lung CT scans. Our method achieves a classification score (F1 score) of up to 0.889. This performance is superior to that reported using a model-centric approach on the same dataset, which conducted automatic hyperparameter optimization using the random search algorithm.
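For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score is computed for a single class from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not state which averaging scheme (for example macro or weighted) was used across classes.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for one class.

    tp, fp, fn are true-positive, false-positive, and false-negative counts.
    Returns 0.0 when there are no true positives, since precision and recall
    are then zero or undefined.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
example_score = f1_from_counts(tp=90, fp=10, fn=30)
print(f"F1 = {example_score:.3f}")
```

Because F1 combines precision and recall, it penalizes a classifier that does well on one and poorly on the other, which is why it is often preferred over plain accuracy on imbalanced medical datasets.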

Keyphrases: Lung Cancer Classification, Medical Image Analysis, Data-Centric AI

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:10688,
  author    = {Mark Movh and Isah A. Lawal},
  title     = {Towards Data-Centric Approaches to Lung Cancer Classification},
  howpublished = {EasyChair Preprint 10688},
  year      = {EasyChair, 2023}}