Prediction of Traders' Activity & Behavior on the Ronin Blockchain using Binary Classification (Supervised Machine Learning)

joshuatochinwachi/Ronin-Users-Classification

Ronin Blockchain User Classification

Predicts which Ronin blockchain users will remain active, based on their historical transaction behavior, using machine learning.


🎯 Problem Description

Business Problem

Identifying which blockchain users will remain active vs churn is critical for:

  • User retention strategies - Focus resources on at-risk users
  • Community growth - Understand what drives engagement
  • Resource allocation - Prioritize high-value, consistent users

Technical Problem

Binary Classification: Predict if a user will be active (5+ transactions) in the next 90 days based on their past 365 days of behavior.

Target Variable:

  • Good Trader (1): User with ≥5 transactions in next 90 days
  • Bad Trader (0): User with <5 transactions in next 90 days

📊 Dataset

Data Source

  • Blockchain: Ronin Network
  • Platform: Dune Analytics - Query ID 6221750
  • Time Period:
    • Training: 455 days ago → 90 days ago (365-day window)
    • Prediction: Last 90 days
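
The windows and the 5-transaction threshold described above can be sketched in a few lines of Python. This is only an illustration, not the Dune query the dataset comes from: the function names are hypothetical, and treating the 90-days-ago boundary as the first day of the prediction window is an assumption.

```python
from datetime import date, timedelta

TRAINING_START_DAYS_AGO = 455   # from the README: training starts 455 days ago
PREDICTION_WINDOW_DAYS = 90     # from the README: prediction covers the last 90 days
ACTIVE_THRESHOLD = 5            # from the README: 5+ transactions means "Good Trader"


def window_bounds(today: date) -> tuple[date, date]:
    """Return (training_start, prediction_start) relative to `today`."""
    training_start = today - timedelta(days=TRAINING_START_DAYS_AGO)
    prediction_start = today - timedelta(days=PREDICTION_WINDOW_DAYS)
    return training_start, prediction_start


def label_user(tx_dates: list[date], today: date) -> int:
    """Return 1 (Good Trader) or 0 (Bad Trader) from a user's transaction dates."""
    _, prediction_start = window_bounds(today)
    # Assumption: the boundary day itself belongs to the prediction window.
    future_tx = sum(1 for d in tx_dates if prediction_start <= d <= today)
    return 1 if future_tx >= ACTIVE_THRESHOLD else 0
```

Transactions older than the prediction window never affect the label; they only feed the features.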

Dataset Statistics

  • Total Records: 5,000 users
  • Class Distribution: Perfectly balanced (2,500 Good, 2,500 Bad)

Features (5 numerical features)

| Feature | Description | Type |
|---|---|---|
| tx_count_365d | Total transactions in past 365 days | Integer |
| total_volume | Total transaction volume in USD | Float |
| active_weeks | Number of weeks user was active | Integer |
| avg_tx_value | Average value per transaction | Float |
| tx_per_active_week | Transactions per active week | Float |
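
As a rough illustration of how these five features relate to raw transactions, the sketch below derives them for one user in plain Python. It is not the project's actual feature pipeline (that lives in the Dune query). The `compute_features` name, the `(date, usd_value)` input shape, and the use of ISO calendar weeks to define an "active week" are all assumptions.

```python
from datetime import date


def compute_features(transactions: list[tuple[date, float]]) -> dict:
    """Derive the five features from (date, usd_value) pairs.

    Assumes `transactions` is already restricted to the 365-day training window.
    """
    tx_count = len(transactions)
    total_volume = sum(value for _, value in transactions)
    # Assumption: a week is "active" if it contains at least one transaction,
    # with weeks identified by (ISO year, ISO week number).
    active_weeks = len({tuple(d.isocalendar())[:2] for d, _ in transactions})
    return {
        "tx_count_365d": tx_count,
        "total_volume": total_volume,
        "active_weeks": active_weeks,
        "avg_tx_value": total_volume / tx_count if tx_count else 0.0,
        "tx_per_active_week": tx_count / active_weeks if active_weeks else 0.0,
    }
```

Note that the last two features are ratios of the first three, so they add no new raw information; they only re-express it per transaction and per active week.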

📁 Project Structure

ronin-users-classification/
├── data/
│   └── ronin_traders_dataset.csv          # Training dataset
├── notebooks/
│   └── 01_eda_and_training.ipynb          # EDA & model training
├── models/
│   ├── best_model_random_forest.pkl       # Trained model
│   ├── feature_names.pkl                  # Feature names
│   └── model_comparison_results.csv       # Performance metrics
├── visualizations/
│   ├── eda_analysis.png                   # EDA visualizations
│   ├── model_comparison.png               # Model performance
│   ├── confusion_matrix.png               # Confusion matrix
│   └── feature_importance.png             # Feature importance
├── src/
│   ├── predict.py                         # Prediction script
│   ├── app.py                             # Flask API
│   └── test_api.py                        # API tests
├── Dockerfile                              # Docker configuration
├── requirements.txt                        # Python dependencies
└── README.md                               # This file

🔍 Features

Numerical Features

  1. tx_count_365d - Transaction frequency indicator
  2. total_volume - Economic activity measure
  3. active_weeks - Consistency/engagement metric
  4. avg_tx_value - Transaction size indicator
  5. tx_per_active_week - Activity intensity measure

Feature Importance (from Random Forest)

| Feature | Importance | Interpretation |
|---|---|---|
| total_volume | 30% | 💰 Most critical factor |
| active_weeks | 28% | 📅 Consistency matters |
| tx_count_365d | 22% | 🔢 Activity level important |
| avg_tx_value | 11% | 💵 Transaction size relevant |
| tx_per_active_week | 9% | ⚡ Frequency moderately important |

🏆 Model Performance

Models Trained

  1. Random Forest ⭐ (Best)
  2. XGBoost
  3. Decision Tree
  4. Logistic Regression

Best Model: Random Forest

| Metric | Score | Interpretation |
|---|---|---|
| ROC-AUC | 0.9646 | Outstanding discriminative ability |
| Accuracy | 91.4% | Correct 914/1000 predictions |
| Precision | 90.4% | 90.4% of "Good" predictions are correct |
| Recall | 92.6% | Catches 92.6% of actual good traders |
| F1-Score | 91.5% | Balanced precision/recall |

Confusion Matrix (Test Set)

| | Predicted Bad | Predicted Good |
|---|---|---|
| Actual Bad | 451 ✅ | 49 ❌ |
| Actual Good | 37 ❌ | 463 ✅ |

Key Insight: Only 86 misclassifications out of 1,000 test samples (8.6% error rate)
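
The headline metrics follow directly from the four confusion-matrix counts. The sketch below recomputes them in plain Python as a sanity check; the `classification_metrics` helper is hypothetical and not part of the repository.

```python
def classification_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute standard binary-classification metrics, with "Good Trader" as the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,                      # share of actual good traders caught
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),         # share of actual bad traders caught
    }


# Counts taken from the test-set confusion matrix.
metrics = classification_metrics(tn=451, fp=49, fn=37, tp=463)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the script reports accuracy 0.914, precision 0.904, recall 0.926, F1 0.915 and specificity 0.902. The last figure is the recall for the "Bad Trader" class, that is, the share of users who became inactive that the model flags.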

Model Comparison

| Model | Accuracy | ROC-AUC |
|---|---|---|
| Random Forest | 91.4% | 0.9646 |
| XGBoost | 90.4% | 0.9619 |
| Decision Tree | 87.8% | 0.9468 |
| Logistic Regression | 83.0% | 0.8889 |
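
For readers less familiar with the ROC-AUC column: it equals the probability that a randomly chosen Good Trader receives a higher predicted score than a randomly chosen Bad Trader, with ties counting as half. The sketch below computes it that way in plain Python. It is a didactic O(n²) version with a hypothetical function name, not what the notebook uses (the project lists scikit-learn, which provides its own implementation).

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this reading, 0.9646 means the Random Forest ranks a good trader above a bad one in roughly 96 of 100 such pairs.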

📈 Results (see RESULTS.md for full details)

Key Findings

  1. Transaction volume is the strongest predictor (30% importance)
    • Higher volume → Higher retention
  2. Consistency matters more than intensity
    • Active weeks (28%) > Transactions per week (9%)
  3. Model achieves a ROC-AUC of 0.9646
    • Production-ready performance
    • Balanced precision and recall

Business Impact

  • Churn Prediction: Flag 90.2% of users at risk of leaving (451 of 500 actual churners in the test set); the 92.6% recall figure refers to users who stay active
  • Resource Optimization: Focus retention efforts on predicted churners
  • Early Warning: 90-day advance notice for intervention
  • Accuracy: 91.4% correct predictions

🛠️ Installation

Prerequisites

  • Python 3.10+
  • pip or conda
  • Docker (optional, for containerization)

Local Setup

  1. Clone the repository
git clone https://github.com/joshuatochinwachi/Ronin-Users-Classification.git
cd Ronin-Users-Classification
  2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies
pip install -r requirements.txt
  4. Verify installation (if the script is not found, run the command from src/, where the project structure lists it)
python predict.py

💻 Usage

1. Standalone Prediction Script

from predict import RoninTraderPredictor

# Initialize predictor
predictor = RoninTraderPredictor()

# Single prediction
trader = {
    'tx_count_365d': 150,
    'total_volume': 25.5,
    'active_weeks': 20,
    'avg_tx_value': 0.17,
    'tx_per_active_week': 7.5
}

result = predictor.predict_single(trader)
print(f"Prediction: {result['prediction']}")
print(f"Confidence: {result['confidence']:.2%}")

Run the example:

python predict.py

2. Flask API Service

Start the API:

python app.py

API will be available at http://localhost:5000

3. Run Tests

python test_api.py

📡 API Documentation

Base URL

http://localhost:5000

Endpoints

1. Home - GET /

Get API information

Response:

{
  "service": "Ronin Trader Classification API",
  "version": "1.0",
  "model": "Random Forest",
  "model_performance": {
    "accuracy": "91.4%",
    "roc_auc": "0.9646"
  }
}

2. Health Check - GET /health

Check service status

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "features_loaded": true
}

3. Get Features - GET /features

Get required feature information

Response:

{
  "required_features": ["tx_count_365d", "total_volume", ...],
  "descriptions": {...},
  "example": {...}
}
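
Before sending a request, a client can check its payload against the five required features. The sketch below shows one way to do that in plain Python. It is an assumption about reasonable client-side checks, not the validation that app.py actually performs; the `validate_features` helper and its error messages are hypothetical.

```python
# Feature names as listed in this README.
REQUIRED_FEATURES = (
    "tx_count_365d",
    "total_volume",
    "active_weeks",
    "avg_tx_value",
    "tx_per_active_week",
)


def validate_features(payload: dict) -> list[str]:
    """Return a list of problems found in `payload`; an empty list means it looks valid."""
    errors = []
    for name in REQUIRED_FEATURES:
        if name not in payload:
            errors.append(f"missing feature: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name} must be a number")
        elif value < 0:
            errors.append(f"{name} must be non-negative")
    return errors
```

Returning every problem at once, rather than raising on the first one, lets the caller fix a payload in a single pass.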

4. Single Prediction - POST /predict

Request:

curl -X POST http://localhost:5000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "tx_count_365d": 150,
    "total_volume": 25.5,
    "active_weeks": 20,
    "avg_tx_value": 0.17,
    "tx_per_active_week": 7.5
  }'

Response:

{
  "prediction": "Good Trader",
  "will_remain_active": true,
  "confidence": 0.92,
  "probability_good_trader": 0.92,
  "probability_bad_trader": 0.08,
  "input_features": {...}
}
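
The same request can be made from Python using only the standard library. This is a minimal client sketch under the assumption that the API is running at the base URL documented above; the function names are hypothetical, and the response fields are those shown in the example response.

```python
import json
from urllib import request


def build_predict_request(features: dict, base_url: str = "http://localhost:5000") -> request.Request:
    """Build (but do not send) a POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, base_url: str = "http://localhost:5000") -> dict:
    """Send the request and return the decoded JSON response."""
    with request.urlopen(build_predict_request(features, base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `predict({...})["prediction"]` would return the label string, for example "Good Trader". Separating request construction from sending keeps the construction step testable without a live server.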

5. Batch Prediction - POST /predict_batch

Request:

curl -X POST http://localhost:5000/predict_batch \
  -H "Content-Type: application/json" \
  -d '{
    "traders": [
      {"tx_count_365d": 500, "total_volume": 100.0, ...},
      {"tx_count_365d": 10, "total_volume": 0.5, ...}
    ]
  }'

Response:

{
  "predictions": [
    {
      "index": 0,
      "prediction": "Good Trader",
      "confidence": 0.95,
      ...
    }
  ],
  "summary": {
    "total": 2,
    "good_traders": 1,
    "bad_traders": 1,
    "percentage_good": 50.0
  }
}
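
The summary block is a simple aggregate over the per-trader predictions. The sketch below reproduces it in plain Python. It assumes each prediction carries a "prediction" field holding either "Good Trader" or "Bad Trader", as in the examples above; the `summarize` helper is hypothetical and may differ from what app.py does.

```python
def summarize(predictions: list[dict]) -> dict:
    """Aggregate per-trader predictions into the summary shape shown above."""
    total = len(predictions)
    good = sum(1 for p in predictions if p["prediction"] == "Good Trader")
    return {
        "total": total,
        "good_traders": good,
        "bad_traders": total - good,
        # Guard against division by zero for an empty batch.
        "percentage_good": round(100.0 * good / total, 1) if total else 0.0,
    }


batch = [
    {"index": 0, "prediction": "Good Trader", "confidence": 0.95},
    {"index": 1, "prediction": "Bad Trader", "confidence": 0.88},  # illustrative values
]
print(summarize(batch))
```

For this two-trader batch the result matches the example summary: total 2, one good trader, one bad trader, 50.0 percent good.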

🐳 Docker Deployment

Build Docker Image

docker build -t ronin-trader-classifier .

Run Container

docker run -p 5000:5000 ronin-trader-classifier

Test Dockerized API

curl http://localhost:5000/health

Docker Compose (Optional)

Create docker-compose.yml:

version: '3.8'
services:
  api:
    build: .
    ports:
      - "5000:5000"
    environment:
      - PORT=5000
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5000/health"]
      interval: 30s
      timeout: 3s
      retries: 3

Run with:

docker-compose up
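
After starting the container, a script may want to wait until /health reports "healthy" before sending predictions. The sketch below polls the endpoint from Python with the standard library. The retry count mirrors the healthcheck settings above, but the function names, the one-second delay, and the injectable `fetch` and `sleep` parameters are illustrative assumptions.

```python
import json
import time
from urllib import request


def fetch_health(url: str) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with request.urlopen(url, timeout=3) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(url: str = "http://localhost:5000/health",
                       attempts: int = 3,
                       delay_s: float = 1.0,
                       fetch=fetch_health,
                       sleep=time.sleep) -> bool:
    """Poll `url` until it reports status "healthy" or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # OSError covers connection failures (urllib's URLError subclasses it);
            # ValueError covers a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Passing `fetch` and `sleep` as parameters makes the retry logic easy to exercise without a running container.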

📚 Technologies Used

  • Python 3.10 - Programming language
  • Pandas & NumPy - Data manipulation
  • Scikit-learn - Machine learning
  • XGBoost - Gradient boosting
  • Flask - Web framework
  • Docker - Containerization
  • Dune Analytics - Blockchain data
  • Jupyter - Interactive development

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request

📧 Contact


🙏 Acknowledgments

  • DataTalksClub - ML Zoomcamp course
  • Dune Analytics - Blockchain data platform
  • Ronin Network - Blockchain infrastructure
  • Sky Mavis - Ronin creators

Built with ❤️ for the Ronin Community
