Towards Data-Centric Approaches to Lung Cancer Classification

EasyChair Preprint 10688
13 pages • Date: August 14, 2023

Abstract

There is an ever-growing need to review Artificial Intelligence and its implementation methodology in medical image analysis. Whether to optimize code or improve data is a central question when maximizing model performance in medical image classification. Most recent studies have been model-centric, so it is crucial to investigate data-centric methodologies and how medical image quality affects a model's ability to learn. This study pursues data-related modifications for model improvement in lung cancer classification, acting as a proof of concept for developing data-centric AI. The proposed data-centric approach (DCA) modifies CT-scan images of the lung in three stages: image preprocessing, image segmentation, and feature extraction. The modified images are used to train a simple Convolutional Neural Network (CNN) for the classification task. We evaluate the proposed method on a publicly available real-world dataset of lung CT scans. Our method achieves a classification score (F1 score) of up to 0.889. This is superior to the performance reported for a model-centric approach on the same dataset, which conducted automatic hyperparameter optimization using the random search algorithm.

Keyphrases: Lung Cancer Classification, Medical Image Analysis, data-centric AI
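
For readers unfamiliar with the metric quoted in the abstract, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of the binary case only. The function name and the counts are hypothetical and are not taken from the paper, and the abstract does not state which averaging scheme (for example macro or weighted) the authors applied across classes.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Binary F1 score from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 when there are no true positives, since precision
    and recall are then both zero (or undefined).
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only for illustration (not results from the paper).
example = f1_score(tp=40, fp=10, fn=10)
print(f"F1 = {example:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 score by trading recall away for precision or the reverse, which makes it a common choice for medical classification tasks with imbalanced classes.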