🔴 Customer Churn Prediction (Bank Customers) 🔴 In this project, I analyzed bank customer data to predict who might leave the bank. I cleaned and prepared the dataset by handling missing values and encoding categorical features. I trained machine learning models to classify customers based on churn risk.

Abdullah321Umar/DevelopersHub-DataScience-Analytics_Internship-TASK3

🏦 Data Science & Analytics Internship Task 3 | 🔍 Customer Churn Prediction: Decoding Why Bank Customers Leave

Welcome to my Customer Churn Prediction Project! 🚀📊 This project dives deep into the world of banking, customer behavior, and predictive intelligence, where data reveals the unseen patterns behind customer loyalty and attrition.


🌟 Prelude: The Pulse of Banking & the Science of Customer Retention

In today's hyper-competitive financial landscape, retaining an existing customer is often more valuable than acquiring a new one. Banks thrive when customers stay and suffer when they silently walk away. 👣🏦 This project transforms raw banking data into strategic intelligence. Through statistical analysis, machine learning, and visual storytelling, I uncover why customers leave, who is at risk, and how banks can act before churn happens. Like detective work, churn prediction reveals the hidden signals inside customer behavior, turning data into decisions and decisions into retention power. 💡📈


🎯 Project Synopsis

The Customer Churn Prediction Project is an end-to-end machine learning initiative where I explore, preprocess, analyze, model, and interpret customer data from a real bank, with the goal of predicting which customers are likely to exit and why. From encoding features to training ML models and visualizing patterns, this project showcases the full workflow of predictive analytics applied to financial data. 💻📊


🧩1️⃣ Data Origin: The Bank Churn Modelling Dataset

The dataset is the widely used Churn_Modelling dataset, which describes customer profiles from a global bank.

📊 Dataset Composition

  • Total Records: ~10,000
  • Total Columns: 14 (including the identifier columns and the target)
  • Target Variable: Exited (0 = Stayed, 1 = Left)

🔑 Key Features:

  • 🎯 Credit Score
  • 🌍 Geography
  • 👤 Gender
  • 🎂 Age
  • 📅 Tenure
  • 💰 Balance
  • 📦 Number of Products
  • 💳 Has Credit Card
  • 🔄 Is Active Member
  • 🧾 Estimated Salary

🌟 Insight:

This dataset is ideal for understanding how demographics, financial health, and product engagement influence whether a customer stays loyal or decides to churn.

🧹2️⃣ Data Refinement & Preprocessing

Before any model is trained, the raw data is carefully prepared so that the models learn from accurate, consistent inputs.

🔧 Operations Executed

  • Checked for missing values & duplicates
  • Dropped identifier columns that carry no predictive signal
  • Encoded categorical features (Geography, Gender)
  • Standardized numerical features
  • Created train-test split
  • Explored distributions & patterns through EDA

Insight:

Proper preprocessing helps the model learn from clean, well-structured data, which improves prediction quality and interpretability.
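
The operations listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the sample rows are made up, and the column names (`RowNumber`, `CustomerId`, `Surname`, `Exited`, and so on) are assumed to follow the public Churn_Modelling CSV. The helper `preprocess` is a hypothetical name introduced here.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up rows for illustration; column names are assumed to follow the
# public Churn_Modelling CSV and may differ from the project's notebook.
sample = pd.DataFrame({
    "RowNumber": range(1, 9),
    "CustomerId": range(1001, 1009),
    "Surname": list("ABCDEFGH"),
    "CreditScore": [619, 608, 502, 699, 850, 645, 822, 376],
    "Geography": ["France", "Spain", "France", "Germany",
                  "Spain", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Male", "Female",
               "Male", "Male", "Male", "Female"],
    "Age": [42, 41, 42, 39, 43, 44, 50, 29],
    "Balance": [0.0, 83807.9, 159660.8, 0.0, 125510.8, 113755.8, 0.0, 115046.7],
    "Exited": [1, 0, 1, 0, 0, 1, 0, 1],
})

ID_COLUMNS = ["RowNumber", "CustomerId", "Surname"]
CATEGORICAL = ["Geography", "Gender"]
NUMERIC = ["CreditScore", "Age", "Balance"]


def preprocess(df, test_size=0.25, seed=42):
    """Clean, encode, split, and scale a churn DataFrame (hypothetical helper)."""
    df = df.drop_duplicates().drop(columns=ID_COLUMNS)
    assert not df.isna().any().any(), "handle missing values before encoding"

    # One-hot encode the categorical columns, dropping one level per feature.
    X = pd.get_dummies(df.drop(columns="Exited"), columns=CATEGORICAL, drop_first=True)
    X[NUMERIC] = X[NUMERIC].astype(float)
    y = df["Exited"]

    # Stratify so both splits keep the same churn ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    X_train, X_test = X_train.copy(), X_test.copy()

    # Fit the scaler on the training split only, so no test statistics leak in.
    scaler = StandardScaler().fit(X_train[NUMERIC])
    X_train[NUMERIC] = scaler.transform(X_train[NUMERIC])
    X_test[NUMERIC] = scaler.transform(X_test[NUMERIC])
    return X_train, X_test, y_train, y_test


X_train, X_test, y_train, y_test = preprocess(sample)
print(X_train.columns.tolist())
```

Splitting before scaling is a deliberate choice in this sketch: the scaler only ever sees training rows, so the held-out set stays a fair estimate of performance on unseen customers.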

🎨3️⃣ Exploratory Data Visualization

Visualization breathes life into data, and in this project dark-themed charts illuminate hidden patterns behind customer churn. 🌑✨

🌈 Key Visual Insights (20+ Visuals Created)

  • 📊 Churn Distribution: Understanding the imbalance
  • 🎂 Age vs Churn: Which age groups leave most?
  • 🌍 Geography vs Churn: Regions with higher attrition
  • 👤 Gender Patterns: Comparative churn behavior
  • 💳 Active Member Status: Engagement vs loyalty
  • 💰 Balance Distribution: Does money influence churn?
  • 📦 Products Held vs Churn: The loyalty power of product bundles
  • 🎯 Credit Score Analysis: Risk profiles
  • 💱 Salary vs Churn: Income dynamics
  • 🧲 Correlation Heatmap: How numerical features relate
  • 🏦 Customer Tenure Trends: Experience vs loyalty
  • 🚦 Confusion Matrix Heatmap: Model performance
  • 📈 ROC Curve: Prediction strength
  • 📉 Feature Importance Bar Plot: What drives churn
  • 🔮 Probability Distribution of Predictions
  • 🔗 Pairwise Relationships (Pairplot)
  • 📊 Box Plots, Count Plots, Histograms & Violin Charts
  • 🎛 Model Comparison Charts

💡 Insight:

Visualization reveals behavioral clues, showing that age, geography, credit score, and activity level play major roles in churn behavior.
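
Most of the "feature vs churn" charts above boil down to a churn rate per group. The sketch below shows one way to compute that number with pandas before plotting it. The rows are invented for illustration and are not results from the dataset; `churn_rate_by` and the age-band edges are hypothetical choices made here.

```python
import pandas as pd

# Made-up rows for illustration only; real values come from the dataset.
customers = pd.DataFrame({
    "Geography": ["France", "Germany", "Spain", "Germany",
                  "France", "Spain", "Germany", "France"],
    "Age": [28, 45, 37, 52, 41, 33, 58, 24],
    "IsActiveMember": [1, 0, 1, 0, 1, 1, 0, 1],
    "Exited": [0, 1, 0, 1, 0, 0, 1, 0],
})


def churn_rate_by(df, column):
    """Share of customers with Exited == 1 in each group of `column`."""
    rates = df.groupby(column, observed=True)["Exited"].mean()
    return rates.sort_values(ascending=False)


# Bucket ages so the rate can be compared across age bands (assumed edges).
customers["AgeBand"] = pd.cut(
    customers["Age"],
    bins=[17, 30, 40, 50, 60, 100],
    labels=["18-30", "31-40", "41-50", "51-60", "61+"],
)

for column in ["Geography", "AgeBand", "IsActiveMember"]:
    print(churn_rate_by(customers, column), end="\n\n")
```

Each resulting Series can be passed straight to a bar plot, for example with Seaborn's `barplot` or the Series' own `.plot(kind="bar")`.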

🤖4️⃣ Machine Learning Models & Prediction

This project implements multiple classification models to predict churn:

🔍 Models Trained

  • Logistic Regression
  • Random Forest Classifier
  • XGBoost Classifier
  • Decision Tree
  • Support Vector Machine
  • K-Nearest Neighbors
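
A comparison across several of these model families might look like the sketch below. It is an illustration under stated assumptions, not the project's training code: the data is synthetic (`make_classification`), the hyperparameters are arbitrary, and XGBoost is left out so that the example depends only on scikit-learn. `xgboost.XGBClassifier` follows the same estimator interface and could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the preprocessed churn features.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)

# Scale-sensitive models get a StandardScaler; tree models do not need one.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}

# Mean 5-fold cross-validated ROC-AUC for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24} ROC-AUC = {auc:.3f}")
```

Wrapping the scaler and the model in one pipeline means the scaler is refit inside every cross-validation fold, which keeps the comparison free of leakage.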

📏 Evaluation Metrics

  • ✔ Accuracy
  • ✔ Precision
  • ✔ Recall
  • ✔ F1 Score
  • ✔ ROC-AUC
  • ✔ Confusion Matrix
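
Because churners are the minority class, accuracy alone can look good while many leavers are missed. The short sketch below derives accuracy, precision, recall, and F1 from the four confusion-matrix cells. The counts are hypothetical and are not results from this project; `classification_metrics` is a helper name introduced here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the churn (positive) class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical confusion-matrix counts, not results from this project.
metrics = classification_metrics(tp=80, fp=50, fn=120, tn=750)
for name, value in metrics.items():
    print(f"{name:<9} {value:.3f}")
# accuracy  0.830
# precision 0.615
# recall    0.400
# f1        0.485
```

In this made-up case the model is 83% accurate yet catches only 40% of the customers who actually left, which is why recall and F1 are reported alongside accuracy.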

💡 Insight:

Tree-based models (Random Forest & XGBoost) emerged as the top performers, combining strong predictive power with feature-importance scores that help explain their predictions.
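
As a rough illustration of how a tree ensemble exposes what drives its predictions, the sketch below reads the impurity-based importances from a scikit-learn Random Forest. The data is synthetic and the feature names are generic placeholders, so the ranking says nothing about the real churn drivers.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with placeholder feature names, for illustration only.
feature_names = [f"feature_{i}" for i in range(6)]
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances: non-negative and normalized to sum to 1.
importances = pd.Series(forest.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances.round(3))
```

Impurity-based scores can favor high-cardinality features, so `sklearn.inspection.permutation_importance` on a held-out set is a common cross-check.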

🔍5️⃣ Interpretative Insights

🧠 Core Findings:

  • 🔺 Customers aged 40–60 show significantly higher churn.
  • 🔺 Customers from Germany churn more than other regions.
  • 🔺 Inactive customers have a much higher probability of leaving.
  • 🔺 Lower credit score customers are more likely to churn.
  • 🔺 Users with only 1 product churn more, showing reduced loyalty.
  • 🔺 Higher balance does not necessarily mean higher retention.

💡 Inference:

Churn is influenced by a combination of financial, behavioral, and demographic attributes, making predictive modeling essential for proactive retention strategies.

🧰6️⃣ Tools, Technologies & Workflow

🐍 Programming Language

  • Python

📊 Libraries Used

  • Pandas, NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • XGBoost
  • Imbalanced-learn (if needed)

⚙️ Workflow Integration

From data cleaning to ML modeling and visualization, the project follows a structured, professional data science pipeline.

🌟7️⃣ Concluding Reflections

This Customer Churn Prediction Project highlights the transformative power of data analytics in the banking sector. By predicting churn before it happens, banks can:

  • Strengthen customer relationships
  • Build loyalty programs
  • Increase revenue retention
  • Offer personalized services

This project is not just about prediction. It is about understanding human behavior, financial patterns, and the strategies that help businesses connect better with customers. 🌍💙📊

✨8️⃣ Epilogue: Beyond Machine Learning

Every customer carries a unique story, and churn prediction helps banks listen to those stories before they lose valuable relationships. Data doesn't just inform decisions; it empowers businesses to evolve.

🌟 “Retention begins with understanding, and understanding begins with data.”

Author: Abdullah Umar, Data Science & Analytics Intern at DevelopersHub Corporation


🔗 Let's Connect:

📧 Email: umerabdullah048@gmail.com


Task 3 Statement:

[Image: task statement]


Task 3 Plots Preview:

[Images: plot previews]

