Heart Disease Detection Using Extra Trees Classifier

Universiti Teknologi Petronas

2023

Background

This mini research project developed an early prediction model for heart disease. The motivation is that heart disease remains the leading cause of death globally, responsible for 1 in 6 deaths in 2019. Early diagnosis can significantly reduce fatal outcomes, and machine learning offers a data-driven way to build prediction systems. The dataset used in this study comes from Kaggle and combines records from Hungary, Switzerland, Cleveland, and Long Beach. It consists of 76 attributes, although only 14 were publicly available.

Responsibilities

I conducted all parts of the analysis independently, from data preprocessing and exploratory analysis through model selection, hyperparameter optimization, and evaluation. I handled the entire technical pipeline, keeping the workflow well documented and the results interpretable and aligned with the study objectives.

Key Features

The project began with data preprocessing: checks confirmed no missing values, no major outliers, and no class imbalance, and duplicate records were removed to preserve data integrity. Exploratory data analysis with boxplots and bar charts revealed moderate variance without significant anomalies. The dataset was then split into an 80% training set and a 20% testing set. For modeling, the Extra Trees Classifier was chosen for its efficiency in high-dimensional spaces and its strong performance on classification tasks. Hyperparameters were tuned with Optuna.
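The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in generated with scikit-learn, the column names, the stratified split, the random seed, and the Optuna search ranges are all assumptions, and the set of tuned hyperparameters is a guess at a plausible choice.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the 14-attribute dataset (13 features + 1 target).
# The real project used a Kaggle dataset; these values are NOT project data.
X_raw, y_raw = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)
df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(13)])
df["target"] = y_raw
# Inject a few duplicate rows so the de-duplication step has something to do.
df = pd.concat([df, df.iloc[:5]], ignore_index=True)

# Preprocessing checks: missing values, duplicates, class balance.
n_missing = int(df.isna().sum().sum())
df = df.drop_duplicates().reset_index(drop=True)
class_share = df["target"].value_counts(normalize=True)

# 80/20 split. Stratifying on the target is an assumption.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def objective(trial):
    """Optuna objective: mean 5-fold CV accuracy on the training set only."""
    params = {
        # Hypothetical search ranges, chosen only for illustration.
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 12),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 10),
    }
    model = ExtraTreesClassifier(random_state=42, **params)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    return scores.mean()


def tune(n_trials=20):
    """Run the Optuna search and return the best hyperparameters found."""
    import optuna  # imported here so the rest of the sketch runs without it

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params


# Untuned baseline, to compare against the tuned model later.
baseline = ExtraTreesClassifier(n_estimators=100, random_state=42)
baseline.fit(X_train, y_train)
print("missing values:", n_missing)
print("class share:", class_share.to_dict())
print("train accuracy:", baseline.score(X_train, y_train))
print("test accuracy:", baseline.score(X_test, y_test))
```

Calling `tune()` and refitting `ExtraTreesClassifier(**best_params)` on the training set would complete the workflow. Scoring trials by cross-validation on the training data keeps the 20% test set untouched until the final evaluation.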

Result

The optimized model achieved a training accuracy of 94.19% and a testing accuracy of 85.25%. The gap of about 9 percentage points suggests some overfitting, but the testing accuracy still indicates a reasonable level of generalization. The results show that Extra Trees, combined with systematic parameter tuning, can effectively support preliminary detection of heart disease.
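To make the comparison between the two reported figures concrete, here is a small sketch of how accuracy and the train-test gap can be computed. The helper names `accuracy` and `generalization_gap` are hypothetical, and the toy labels are made up for illustration; only the two percentages come from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus testing accuracy, in the units of the inputs."""
    return train_acc - test_acc


# Reported figures from the project, in percent.
reported_train, reported_test = 94.19, 85.25
gap = round(generalization_gap(reported_train, reported_test), 2)
print(f"gap: {gap} percentage points")

# Toy labels (hypothetical, not project data) to show accuracy() itself.
toy_true = [1, 0, 1, 1, 0]
toy_pred = [1, 0, 0, 1, 0]
print(accuracy(toy_true, toy_pred))
```

A smaller gap generally points to better generalization, so this difference is a useful number to track when comparing tuned configurations.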
