Full-Stack Sports Analytics Dashboard powered by R, Shiny, and Machine Learning
Author: Jack Motta
NFL Player Stat Predictor is an interactive R Shiny web application that predicts weekly NFL player performance for Quarterbacks (QBs), Running Backs (RBs), and Wide Receivers/Tight Ends (WR/TE). It leverages advanced data engineering, machine learning pipelines, and a custom-themed frontend to deliver real-time player stat forecasts in a production-ready format.
This project demonstrates full-stack R development — from data ingestion and modeling to UI/UX design — and showcases my ability to build end-to-end data products in a real-world sports analytics context.
- Weekly Player Predictions: Forecasts passing, rushing, and receiving stats using historical performance and contextual in-game features.
- Advanced Modeling Pipeline: Uses multi-response LASSO regression (glmnet) and rolling-origin cross-validation to handle time-series data and tune model hyperparameters.
- Custom Shiny Dashboard: Clean, mobile-friendly interface with player headshots, dynamic tables, dark mode styling (bslib), and fully interactive components.
- Metrics Tab: Explains RMSE and R-squared for non-technical users while displaying model diagnostics across each position group.
- Extensive Data Engineering: Aggregates, merges, and cleans player and game-level data from:
  - nflreadr (core player stats)
  - nflfastR (game context)
  - ESPN QBR
  - NextGen Stats
  - Pro Football Reference (advanced stats)
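
The rolling-origin cross-validation mentioned under the modeling pipeline can be illustrated with a small, language-agnostic sketch. The project itself is written in R, so the Python below is not the project's code; the function name `rolling_origin_splits` and its parameters (`initial`, `assess`, `cumulative`) are hypothetical names chosen for illustration. The idea is that every training window ends strictly before its assessment window begins, so the model is never tuned on data from the future.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_obs: int, initial: int, assess: int, cumulative: bool = True
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, assess_idx) pairs over time-ordered rows 0..n_obs-1.

    Assumption: rows are already sorted by time (e.g. by season and week).
    With cumulative=True the training window grows at each step; with
    cumulative=False it slides forward and keeps a fixed length.
    """
    if initial < 1 or assess < 1:
        raise ValueError("initial and assess must be positive")
    start, end = 0, initial
    while end + assess <= n_obs:
        train = list(range(start, end))
        test = list(range(end, end + assess))
        yield train, test
        end += assess  # move the forecast origin forward by one assessment block
        if not cumulative:
            start += assess


# Hypothetical example: 10 time-ordered weeks, 4 weeks of initial training data,
# 2 weeks assessed per split.
splits = list(rolling_origin_splits(n_obs=10, initial=4, assess=2))
for train, test in splits:
    print(f"train weeks {train[0]}-{train[-1]} -> assess weeks {test[0]}-{test[-1]}")
```

A candidate penalty value would be scored on each assessment window and the scores averaged, which is the usual way such splits feed hyperparameter tuning. How far the origin advances between splits is a design choice; this sketch advances by one full assessment block.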
Note: The link to the Shiny app and final deliverable can be found at the bottom of this file.
Each position group is trained on its own dataset with engineered features tailored to how that position contributes on the field.
Performance Metrics Used:
- RMSE (Root Mean Squared Error): Measures average prediction error in the unit of the stat. In this domain, an RMSE that’s ~10–20% of a stat’s range is often solid.
- R-squared (R²): Indicates how much variation is explained by the model. In real-world settings like NFL performance prediction, R² values between 0.3–0.5 (or lower) are common due to high game-to-game variance and external factors like game flow, weather, and injuries.
The app’s "Metrics" tab includes both the numbers and beginner-friendly explanations to help contextualize model performance.
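
For readers who want to see how the two metrics are computed, here is a minimal sketch. It is written in Python for illustration only and is not the project's R code; the yardage numbers are made up, and the R² shown uses the 1 − SS_res/SS_tot definition, which may differ from the definition used in the app (some libraries report the squared correlation instead).

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the same unit as the stat itself."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Share of variance explained: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when actual values are constant")
    return 1 - ss_res / ss_tot


# Made-up rushing yards for four games (illustrative only).
actual_yards = [60.0, 80.0, 100.0, 120.0]
predicted_yards = [70.0, 70.0, 110.0, 110.0]

error = rmse(actual_yards, predicted_yards)
explained = r_squared(actual_yards, predicted_yards)
range_share = error / (max(actual_yards) - min(actual_yards))

print(f"RMSE = {error:.2f} yards")               # RMSE = 10.00 yards
print(f"R-squared = {explained:.2f}")            # R-squared = 0.80
print(f"RMSE as share of range = {range_share:.1%}")  # 16.7%
```

In this toy example every prediction misses by 10 yards, so the RMSE is 10 yards, about 17% of the 60-yard range, which falls inside the ~10–20% band described above.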
| Layer | Tools Used |
|---|---|
| Frontend | shiny, bslib, reactable, shinyWidgets, custom HTML/CSS |
| Backend | tidymodels, glmnet, doParallel, foreach, gt, zoo |
| Data Ingestion | nflreadr, nflfastR, arrow |
| Modeling | LASSO (glmnet) with rolling-origin CV for robust time-aware tuning |
| Visualization | ggplot2 |
| File/Folder | Description |
|---|---|
| QB_Stat_Pred.qmd, RB_Stat_Pred.qmd, WR_TE_Stat_Pred.qmd | The primary Quarto notebooks, one per position group, containing end-to-end analysis, feature engineering, modeling, and evaluation for predicting NFL player stats. Designed for reproducibility and readability. |
| QB_Stat_Pred.html, RB_Stat_Pred.html, WR_TE_Stat_Pred.html | The rendered HTML output of the .qmd files, viewable in-browser. Use these for quick viewing of the project without needing to run code. |
| app.R | The Shiny application that allows users to interactively explore predictions, player trends, and model outputs. Launches a dynamic web app using the trained models and feature outputs. |
| qbr_missing_cleaned.csv | A manually cleaned dataset of missing QBR records not returned by nflreadr::load_espn_qbr(). This file supplements incomplete ESPN QBR data and ensures full model coverage. |
| www/ | Contains image assets used in the Shiny app, including visuals generated by OpenAI's DALL·E. Shiny automatically serves static content, such as logos or background images, from this folder. |
This project simulates a real-world workflow in data science and applied modeling:
- End-to-End Ownership: Covers data collection, preprocessing, modeling, evaluation, and interactive presentation.
- Real-World Constraints: Handles noisy and volatile data in a domain (sports) where perfect prediction is unrealistic, and addresses that openly with transparent metric explanations.
- Clear Communication: Bridges the gap between technical output and human interpretation with a polished user interface and intuitive insights.
- Production-Ready Workflow: Designed with maintainability in mind, with modularized data cleaning and modeling and deployment-ready .parquet outputs.
This project serves as both a practical tool and a demonstration of applied modeling, reproducibility, and front-end integration in R.
https://jmotta31.shinyapps.io/NFL_Performance_Final/
If you're interested in this project or want to discuss more:
Jack Motta
📧 jgmotta2000@gmail.com
🔗 LinkedIn
🐙 GitHub
"In predictive modeling, especially in sports analytics, insight often lies in improving over noise—not eliminating it entirely."