ModelArena

Competitive modeling with uncertainty you can trust

By Lance Jepsen & ChatGPT

👤 Author

Lance Jepsen
Data Science · Machine Learning · AutoML Systems

🔗 LinkedIn: https://www.linkedin.com/in/lance-jepsen/
🎥 Project walkthrough: https://www.youtube.com/watch?v=vGuiYdUlMI8

ModelArena (modelarena-automl) is a competitive AutoML playground: machine learning models compete on your data, the winner is chosen on a metric you pick, and the results come with diagnostics that explain what drives the predictions.

It helps you compare models, understand which features matter, and make decisions with uncertainty-aware outputs (locally adaptive conformal prediction intervals for regression).

It’s built to be:

Beginner-friendly (learn ML by doing)
Professional-grade (tournament leaderboard + diagnostics + what-if)
Practical (works on your own CSVs in minutes)

🚀 Quick Start (Run with Streamlit)

1) Prerequisites

Python 3.10+ recommended
Windows / macOS / Linux

2) Install

Open a terminal in the project folder:

# (optional) create & activate a virtual environment
python -m venv .venv

# Windows (PowerShell)
.venv\Scripts\Activate.ps1

# Windows (cmd)
.venv\Scripts\activate.bat

# macOS/Linux
source .venv/bin/activate

# install dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt

3) Run the app

streamlit run app.py

Streamlit will print a local URL (usually http://localhost:8501). Open it in your browser.

📂 Included Sample Datasets (Start Here)

1️⃣ `sample_rent_regression.csv` (Regression)

Goal: Predict monthly rent (a number)

Target column

target_monthly_rent_usd

Features

unit_size_sqft
bedrooms
bathrooms
year_built
distance_to_downtown_miles
crime_index
school_rating

2️⃣ `sample_tenant_renewal_classification.csv` (Classification)

Goal: Predict tenant renewal (yes/no)

Target column

renewed_lease   (0 = No, 1 = Yes)

Features

monthly_rent
income_usd
tenure_months
late_payments
maintenance_requests
unit_size_sqft
satisfaction_score

🧠 Educational Walkthrough (Learn ML by Using ModelArena)

Step 1 — Load a CSV

Upload one of the sample CSVs (or your own).

Tip: The target is the column you want to predict.

Regression target example: target_monthly_rent_usd
Classification target example: renewed_lease
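
As a rough sketch of what "choosing a target" means in practice, the snippet below splits a tiny in-memory CSV into features and a target using Python's standard csv module. The two data rows are made up for illustration, only the column names come from the sample rent dataset, and this is not ModelArena's actual loading code.

```python
import csv
import io

# Made-up rows for illustration; column names follow sample_rent_regression.csv.
raw = io.StringIO(
    "unit_size_sqft,bedrooms,target_monthly_rent_usd\n"
    "850,2,2100\n"
    "600,1,1500\n"
)

target_column = "target_monthly_rent_usd"
rows = list(csv.DictReader(raw))

# Every column except the target is treated as a feature.
feature_names = [name for name in rows[0] if name != target_column]
X = [[float(row[name]) for name in feature_names] for row in rows]
y = [float(row[target_column]) for row in rows]

print(feature_names)
print(X)
print(y)
```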

Step 2 — Choose Task

ModelArena works for both:

| Task | Predicts | Examples |
| --- | --- | --- |
| Regression | a number | rent, price, time, cost |
| Classification | a category | renewal, churn, fraud |

Step 3 — Choose Metric (This controls the “winner”)

Regression

RMSE: penalizes large errors more
MAE: average absolute error, easy to interpret

Classification

Accuracy: % correct (simple baseline)
F1: better when classes are imbalanced
ROC-AUC: ranking quality (requires probabilities)

Important: If you choose ROC-AUC, ModelArena will only use models that can produce probabilities.
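
To make the metric choice concrete, here is a small dependency-free sketch of RMSE, MAE, accuracy, and F1. The numbers are invented for illustration and the helper functions are written out by hand rather than taken from ModelArena. The regression example shows two sets of predictions with the same MAE but different RMSE; the classification example shows a model that always predicts "No" on imbalanced labels.

```python
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Regression: same total absolute error, spread evenly vs. concentrated in one row.
y_true = [100, 100, 100, 100]
even_errors = [110, 90, 110, 90]
one_big_error = [100, 100, 100, 140]
print(mae(y_true, even_errors), rmse(y_true, even_errors))      # MAE 10, RMSE 10
print(mae(y_true, one_big_error), rmse(y_true, one_big_error))  # MAE 10, RMSE 20

# Classification: 9 "No" and 1 "Yes"; a model that always says "No".
labels = [0] * 9 + [1]
always_no = [0] * 10
print(accuracy(labels, always_no), f1(labels, always_no))       # 0.9 accuracy, 0.0 F1
```

The single large miss doubles RMSE while leaving MAE unchanged, and the always-"No" model looks strong on accuracy but scores zero on F1.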

Step 4 — Run Tournament

Click Run Tournament.

ModelArena will:

train multiple models
tune them (if tuning is enabled)
rank them on your chosen metric
crown a winner

You’ll see a leaderboard with the scores.
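
The tournament idea can be sketched in a few lines. The example below is a simplified illustration, not ModelArena's implementation: the three candidates are hypothetical toy predictors (a mean baseline, a median baseline, and a least-squares line on one feature), the data is made up, tuning is skipped, and the ranking metric is MAE on a held-out split.

```python
import statistics

def fit_mean(X, y):
    m = statistics.mean(y)
    return lambda row: m

def fit_median(X, y):
    m = statistics.median(y)
    return lambda row: m

def fit_line(X, y):
    # Ordinary least squares on the first feature only.
    xs = [row[0] for row in X]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(y)
    slope = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda row: intercept + slope * row[0]

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def run_tournament(candidates, X_train, y_train, X_test, y_test):
    """Fit every candidate, score it on held-out rows, and sort best-first."""
    leaderboard = []
    for name, fit in candidates.items():
        model = fit(X_train, y_train)
        score = mae(y_test, [model(row) for row in X_test])
        leaderboard.append((name, score))
    return sorted(leaderboard, key=lambda entry: entry[1])  # lower MAE is better

# Made-up data: rent rises roughly linearly with unit size.
X = [[500], [600], [700], [800], [900], [1000], [1100], [1200]]
y = [1250, 1480, 1710, 1990, 2240, 2500, 2730, 3010]
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

candidates = {"mean_baseline": fit_mean, "median_baseline": fit_median, "line_on_size": fit_line}
leaderboard = run_tournament(candidates, X_train, y_train, X_test, y_test)
for rank, (name, score) in enumerate(leaderboard, start=1):
    print(rank, name, round(score, 2))
```

For metrics where higher is better (accuracy, F1, ROC-AUC), the sort order would be reversed.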

🔍 Diagnostics (How to Interpret Results)

✅ Most Predictive Features

ModelArena ranks columns by permutation importance (model-agnostic):

Higher = more predictive of the outcome
Works for regression and classification

This is the “Which columns matter most?” chart.
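
For intuition, permutation importance can be written out by hand: shuffle one column, re-score the model, and record how much the error grows. The sketch below assumes MAE as the score and uses a hypothetical stand-in model with made-up coefficients; it is not ModelArena's code. (scikit-learn ships a function of the same name in sklearn.inspection; whether ModelArena calls it is not stated here.)

```python
import random

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MAE when each column is shuffled, one column at a time."""
    rng = random.Random(seed)
    baseline = mae(y, [model(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Copy each row with only this column replaced by a shuffled value.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            increases.append(mae(y, [model(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical fitted model: leans on column 0, slightly on column 1, ignores column 2.
def model(row):
    return 2.0 * row[0] + 0.1 * row[1]

data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
y = [model(row) for row in X]

importances = permutation_importance(model, X, y)
print([round(v, 3) for v in importances])
```

The column the model ignores gets an importance of exactly zero, because shuffling it cannot change any prediction.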

📈 Regression: Predicted vs Actual

Points: predictions vs true values
Diagonal line: perfect predictions
Uncertainty band: prediction interval summary (adaptive conformal PI)

🧩 Classification: Confusion Matrix

Shows:

True positives / negatives
False positives / negatives

This helps you see what kind of mistakes the model is making.
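
The four cells of a binary confusion matrix can be counted directly. The labels below are invented for illustration (1 = renewed, 0 = did not renew), and the helper name is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts

# Made-up labels: 1 = renewed, 0 = did not renew.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```

Here the one false negative is a tenant who renewed but was predicted to leave, and the one false positive is the reverse.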

📐 Prediction Intervals (Regression)

ModelArena provides locally adaptive conformal prediction intervals:

Distribution-free (doesn’t assume normality)
Works with any winning model
Interval width adjusts by row (heteroskedastic)

Instead of only:

Predicted rent = $2,100

You also get:

95% interval ≈ [$1,920, $2,280]
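
One common way to build locally adaptive intervals is split conformal prediction with normalized residuals: divide each calibration error by a per-row error-scale estimate, take a conformal quantile of those scores, and scale it back up for each new row. The sketch below assumes that variant. How ModelArena estimates the per-row scale is not described here, so the scale values, the calibration rows, and the 80% target (chosen so ten rows are enough) are all made up for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration rows for this coverage level
    return sorted(scores)[k - 1]

def adaptive_interval(y_hat, scale, q):
    """Interval whose half-width grows with the row's estimated error scale."""
    half_width = q * scale
    return (y_hat - half_width, y_hat + half_width)

# Made-up calibration rows: (actual rent, predicted rent, estimated error scale).
calibration = [
    (2000, 2050, 50), (1500, 1480, 40), (3100, 2950, 100), (2400, 2380, 50),
    (1800, 1850, 40), (2750, 2700, 100), (2200, 2260, 50), (1650, 1600, 40),
    (2900, 3000, 100), (2100, 2000, 50),
]

# Normalized nonconformity score: absolute error divided by the row's scale.
scores = [abs(actual - pred) / scale for actual, pred, scale in calibration]
q = conformal_quantile(scores, alpha=0.2)  # target 80% coverage

# Same point prediction, different estimated difficulty, different widths.
print(adaptive_interval(2100, 60, q))   # (2010.0, 2190.0)
print(adaptive_interval(2100, 120, q))  # (1920.0, 2280.0)
```

The coverage guarantee of this construction is marginal (on average over rows) and relies on calibration and new rows being exchangeable; the scaling only changes how the width is distributed across rows.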

🔮 Quick Prediction

After the tournament:

Enter feature values
Get an instant prediction
See uncertainty (regression) or class outcome (classification)

🔁 What-If / Counterfactual Simulator

Move sliders to answer:

“What if unit size increases?”
“What if income drops?”
“What if crime index improves?”

Predictions update live to make ML intuitive.
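
Under the hood, a what-if question amounts to copying a baseline row, changing one feature, and predicting again. The sketch below illustrates that idea with a hypothetical stand-in model whose coefficients are made up; it is not the app's slider code.

```python
def what_if(model, baseline_row, feature, values):
    """Predict for copies of baseline_row with a single feature changed."""
    results = []
    for value in values:
        scenario = {**baseline_row, feature: value}  # baseline_row is left untouched
        results.append((value, model(scenario)))
    return results

# Hypothetical stand-in for a fitted model; coefficients are invented.
def toy_rent_model(row):
    return 300 + 2 * row["unit_size_sqft"] - 4 * row["crime_index"]

baseline = {"unit_size_sqft": 800, "crime_index": 50}
sweep = what_if(toy_rent_model, baseline, "unit_size_sqft", [700, 800, 900])
for value, prediction in sweep:
    print(value, prediction)
```

This varies one input while holding the others fixed, so it describes how the model responds, not what would causally happen in the real world.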

🧩 Supported Models (current)

Linear Regression / Logistic Regression
Random Forest
ExtraTrees
HistGradientBoosting
XGBoost
LightGBM
CatBoost

🛠 Troubleshooting

Streamlit command not found

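If streamlit run app.py fails with a "command not found" error, make sure your virtual environment is activated, then reinstall the dependencies:

pip install -r requirements.txt

If the command is still not found, try launching Streamlit through Python instead:

python -m streamlit run app.py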

Switching between regression & classification

If you switch datasets and see odd UI behavior, refresh the page to clear Streamlit state (or use the app’s reset button if present).

👤 Authors

Lance Jepsen – product vision, architecture, ML direction
ChatGPT – co-developer, ML engineering, education & documentation

📜 License

MIT License — free to use, modify, and learn from.
