jgmotta731/NFL_Stat_Pred

NFL Player Stat Predictor 🏈

Full-Stack Sports Analytics Dashboard powered by R, Shiny, and Machine Learning
Author: Jack Motta

🔍 Overview

NFL Player Stat Predictor is an interactive R Shiny web application that predicts weekly NFL player performance for Quarterbacks (QBs), Running Backs (RBs), and Wide Receivers/Tight Ends (WR/TE). It combines data engineering, machine learning pipelines, and a custom-themed frontend to deliver weekly player stat forecasts in a production-ready format.

This project demonstrates full-stack R development — from data ingestion and modeling to UI/UX design — and showcases my ability to build end-to-end data products in a real-world sports analytics context.


🎯 Key Features

  • Weekly Player Predictions
    Forecasts passing, rushing, and receiving stats using historical performance and contextual in-game features.

  • Advanced Modeling Pipeline
    Uses multi-response LASSO regression (glmnet), tuned with rolling-origin cross-validation so that each validation window contains only games played after its training window.

  • Custom Shiny Dashboard
    Clean, mobile-friendly interface with player headshots, dynamic tables, dark mode styling (bslib), and fully interactive components.

  • Metrics Tab
    Explains RMSE and R-squared for non-technical users while displaying model diagnostics across each position group.

  • Extensive Data Engineering
    Aggregates, merges, and cleans player and game-level data from:

    • nflreadr (core player stats)
    • nflfastR (game context)
    • ESPN QBR
    • NextGen Stats
    • Pro Football Reference (advanced stats)

Note: The link to the Shiny app and final deliverable can be found at the bottom of this file.
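
As a rough illustration of the Custom Shiny Dashboard feature listed above, here is a minimal sketch of a dark-themed page with a filterable reactable table. It is not the code in app.R: the `preds` data frame, its column names, the "darkly" Bootswatch theme, and the input/output IDs are all assumptions chosen for the example.

```r
library(shiny)
library(bslib)
library(reactable)

# Hypothetical prediction table; players, columns, and values are placeholders.
preds <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  position = c("QB", "RB", "WR"),
  predicted_yards = c(265.4, 78.1, 64.9)
)

ui <- fluidPage(
  # Assumed dark theme; the real app may configure bslib differently.
  theme = bs_theme(version = 5, bootswatch = "darkly"),
  titlePanel("Weekly Predictions (sketch)"),
  selectInput("position", "Position",
              choices = c("All", unique(preds$position))),
  reactableOutput("pred_table")
)

server <- function(input, output, session) {
  # Filter the table reactively by the selected position group.
  filtered <- reactive({
    if (input$position == "All") preds
    else preds[preds$position == input$position, ]
  })
  output$pred_table <- renderReactable({
    reactable(filtered(), searchable = TRUE, striped = TRUE)
  })
}

# Build the app object without launching it; start it with shiny::runApp(app).
app <- shinyApp(ui, server)
```

Assigning the result of `shinyApp()` to a variable keeps the script from launching a server when it is sourced, which makes the sketch easier to inspect.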


📊 Model Metrics

Each position group is trained on its own dataset with engineered features tailored to how that position contributes on the field.

Performance Metrics Used:

  • RMSE (Root Mean Squared Error):
    Measures typical prediction error in the same units as the stat (for example, yards), with large misses weighted more heavily than small ones. In this domain, an RMSE that’s ~10–20% of a stat’s range is often solid.

  • R-squared (R²):
    Indicates the share of variance in the actual stat that the model explains. In real-world settings like NFL performance prediction, R² values between 0.3 and 0.5 (or lower) are common due to high game-to-game variance and external factors like game flow, weather, and injuries.

The app’s "Metrics" tab includes both the numbers and beginner-friendly explanations to help contextualize model performance.
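
To make the two metrics concrete, the sketch below computes them in base R on a small invented example. The `actual` and `predicted` vectors are made-up passing-yard values, not project results, and the helper names `rmse` and `r_squared` are hypothetical.

```r
# Root mean squared error, expressed in the same units as the stat.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# R-squared as 1 - (residual sum of squares / total sum of squares).
r_squared <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

# Invented example values (passing yards); not taken from the project.
actual    <- c(250, 310, 180, 275, 220)
predicted <- c(240, 290, 210, 260, 235)

model_rmse <- rmse(actual, predicted)
model_r2   <- r_squared(actual, predicted)

# RMSE as a fraction of the observed range, for the "10-20% of range" rule of thumb.
rmse_share_of_range <- model_rmse / diff(range(actual))
```

Because the toy predictions sit close to the actual values, the R² here comes out much higher than the 0.3 to 0.5 typical of real weekly forecasts; the example only shows how the numbers are derived.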


🧰 Tech Stack

| Layer | Tools Used |
| --- | --- |
| Frontend | shiny, bslib, reactable, shinyWidgets, custom HTML/CSS |
| Backend | tidymodels, glmnet, doParallel, foreach, gt, zoo |
| Data Ingestion | nflreadr, nflfastR, arrow |
| Modeling | LASSO (glmnet) with rolling-origin CV for robust time-aware tuning |
| Visualization | ggplot2 |
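
The modeling approach named above (multi-response LASSO tuned with rolling-origin CV) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the Quarto notebooks: the data are simulated, `rolling_origin_splits` is a hypothetical base-R stand-in for whatever resampling helper the project uses, and the lambda grid and window sizes are arbitrary.

```r
library(glmnet)

# Hypothetical helper: expanding-window splits over time-ordered rows.
# Each split trains on rows 1..end_train and tests on the next `assess` rows.
rolling_origin_splits <- function(n, initial, assess) {
  ends <- seq(initial, n - assess, by = assess)
  lapply(ends, function(end_train) {
    list(train = seq_len(end_train),
         test  = (end_train + 1):(end_train + assess))
  })
}

# Simulated data: 120 time-ordered rows, 8 features, 2 responses.
set.seed(42)
n <- 120; p <- 8
x <- matrix(rnorm(n * p), n, p)
beta <- matrix(0, p, 2)
beta[1:3, 1] <- c(2, -1, 0.5)
beta[1:3, 2] <- c(1, 1, -0.5)
y <- x %*% beta + matrix(rnorm(n * 2), n, 2)
colnames(y) <- c("stat_a", "stat_b")

lambdas <- 10^seq(0, -3, length.out = 20)  # assumed penalty grid, decreasing
splits  <- rolling_origin_splits(n, initial = 60, assess = 12)

# For each split, fit on the past and score every lambda on the future window.
cv_rmse <- sapply(splits, function(s) {
  fit <- glmnet(x[s$train, ], y[s$train, ],
                family = "mgaussian", lambda = lambdas)
  # pred is an array: test rows x responses x lambdas
  pred <- predict(fit, newx = x[s$test, , drop = FALSE], s = lambdas)
  apply(pred, 3, function(p_l) sqrt(mean((p_l - y[s$test, ])^2)))
})

# Average out-of-sample RMSE per lambda, then pick the best penalty.
mean_rmse   <- rowMeans(cv_rmse)
best_lambda <- lambdas[which.min(mean_rmse)]
```

The key property is that every test window lies strictly after its training window, so the penalty is never tuned on information from future games.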

📂 File Descriptions

| File/Folder | Description |
| --- | --- |
| QB_Stat_Pred.qmd, RB_Stat_Pred.qmd, WR_TE_Stat_Pred.qmd | The primary Quarto notebooks, one per position group, containing end-to-end analysis, feature engineering, modeling, and evaluation for predicting NFL player stats. Designed for reproducibility and readability. |
| QB_Stat_Pred.html, RB_Stat_Pred.html, WR_TE_Stat_Pred.html | The rendered HTML output of the qmd files, viewable in-browser. Use these for quick viewing of the project without needing to run code. |
| app.R | The Shiny application that allows users to interactively explore predictions, player trends, and model outputs. Launches a dynamic web app using the trained models and feature outputs. |
| qbr_missing_cleaned.csv | A manually cleaned dataset of missing QBR records not returned by nflreadr::load_espn_qbr(). This file supplements incomplete ESPN QBR data and ensures full model coverage. |
| www/ | Contains image assets used in the Shiny app, including visuals generated by OpenAI's DALL·E. Shiny automatically serves static content such as logos or background images from this folder. |
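
One plausible way to combine qbr_missing_cleaned.csv with the ESPN QBR data is to append only the records that the API result lacks. The sketch below shows that idea on toy data frames in base R. The column names (`season`, `week`, `player_id`, `qbr`) and all values are assumptions for illustration; the real output of nflreadr::load_espn_qbr() and the real CSV may be shaped differently.

```r
# Toy stand-in for the data returned by nflreadr::load_espn_qbr().
api_qbr <- data.frame(
  season = c(2023, 2023),
  week = c(1, 2),
  player_id = c("A", "A"),
  qbr = c(61.2, 48.9)
)

# Toy stand-in for the contents of qbr_missing_cleaned.csv.
manual_qbr <- data.frame(
  season = c(2023, 2023),
  week = c(2, 3),
  player_id = c("A", "A"),
  qbr = c(50.0, 72.4)
)

# Assumed key that identifies one player-game record.
keys <- c("season", "week", "player_id")
key_of <- function(df) do.call(paste, c(df[keys], sep = "|"))

# Keep only manual rows whose key is absent from the API data,
# so values from the API take precedence where both exist.
fill_rows <- manual_qbr[!key_of(manual_qbr) %in% key_of(api_qbr), ]

qbr_full <- rbind(api_qbr, fill_rows)
qbr_full <- qbr_full[order(qbr_full$season, qbr_full$week), ]
```

Letting the API rows win on conflicts is a design choice for this sketch; a project could equally prefer the manually cleaned values.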

💼 Why This Project Stands Out

This project simulates a real-world workflow in data science and applied modeling:

  • End-to-End Ownership
    Covers data collection, preprocessing, modeling, evaluation, and interactive presentation.

  • Real-World Constraints
    Handles noisy and volatile data in a domain (sports) where perfect prediction is unrealistic — and embraces it with transparent metric explanations.

  • Clear Communication
    Bridges the gap between technical output and human interpretation with a polished user interface and intuitive insights.

  • Production-Ready Workflow
    Designed with maintainability in mind — modularized data cleaning, modeling, and deployment-ready .parquet outputs.

This project serves as both a practical tool and a demonstration of applied modeling, reproducibility, and front-end integration in R.


Shiny App Link

https://jmotta31.shinyapps.io/NFL_Performance_Final/

📬 Contact

If you're interested in this project or want to discuss more:

Jack Motta
📧 jgmotta2000@gmail.com
🔗 LinkedIn
🐙 GitHub


"In predictive modeling, especially in sports analytics, insight often lies in improving over noise—not eliminating it entirely."
