Category: Data Analysis
Client: N/A
NBA Analytics & Visualization Project
Vision
The project aimed to uncover insightful trends and performance patterns in NBA player and team statistics from 2012 to 2018. By leveraging machine learning, statistical modeling, and dynamic visualizations, the goal was to better understand player dynamics, performance forecasting, and team strategies over time.
Approach
Data Preparation
Collected and processed NBA box score data (2012–2018) using custom Python scripts.
Aggregated player performance monthly and yearly with sum_player.py, outputting clean CSVs for modeling and visualization.
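The monthly and yearly roll-up can be illustrated with a minimal sketch. This is not the contents of sum_player.py; the column names (player_id, game_date, points, assists, rebounds), the sample rows, and the aggregate helper are all assumptions made up for illustration, and only the standard library is used so the example stays self-contained.

```python
import csv
import io
from collections import defaultdict

# Hypothetical box score rows; the real column names and values are not
# given in this write-up, so everything here is illustrative only.
SAMPLE_CSV = """player_id,game_date,points,assists,rebounds
player_a,2016-01-02,30,6,5
player_a,2016-01-25,25,8,4
player_a,2016-02-03,51,2,7
player_b,2016-01-10,20,12,3
"""

STAT_COLUMNS = ("points", "assists", "rebounds")


def aggregate(rows, period="month"):
    """Sum each stat per (player_id, period); period is 'month' or 'year'."""
    if period not in ("month", "year"):
        raise ValueError("period must be 'month' or 'year'")
    # ISO dates sort and slice cleanly: 'YYYY-MM' for months, 'YYYY' for years.
    width = 7 if period == "month" else 4
    totals = defaultdict(lambda: dict.fromkeys(STAT_COLUMNS, 0))
    for row in rows:
        key = (row["player_id"], row["game_date"][:width])
        for stat in STAT_COLUMNS:
            totals[key][stat] += int(row[stat])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
monthly = aggregate(rows, "month")
yearly = aggregate(rows, "year")
```

Keying on (player_id, period) rather than on the player alone keeps the monthly and yearly outputs in the same shape, so either can be written to CSV with the same code.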
Player & Team Comparisons
Compared Steph Curry with other point guards, players of similar build (within 1 standard deviation in height and weight), and All-Stars.
Used curculator.py to drive these analyses, integrating regression and feature engineering pipelines.
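One possible reading of the "similar build" filter is sketched below. It assumes the standard deviations are computed over the whole player pool and that a player qualifies when both height and weight fall within that many deviations of the reference player's values. The roster, field names, and the similar_build helper are hypothetical and are not taken from curculator.py.

```python
from statistics import stdev

# Hypothetical roster with made-up measurements (inches, pounds).
players = [
    {"name": "reference", "height_in": 75, "weight_lb": 185},
    {"name": "guard_a", "height_in": 73, "weight_lb": 180},
    {"name": "guard_b", "height_in": 76, "weight_lb": 200},
    {"name": "forward_c", "height_in": 81, "weight_lb": 240},
    {"name": "center_d", "height_in": 84, "weight_lb": 265},
]


def similar_build(pool, reference_name, n_std=1.0):
    """Return names of players within n_std sample standard deviations
    of the reference player's height AND weight."""
    ref = next(p for p in pool if p["name"] == reference_name)
    height_sd = stdev([p["height_in"] for p in pool])
    weight_sd = stdev([p["weight_lb"] for p in pool])
    return [
        p["name"]
        for p in pool
        if p["name"] != reference_name
        and abs(p["height_in"] - ref["height_in"]) <= n_std * height_sd
        and abs(p["weight_lb"] - ref["weight_lb"]) <= n_std * weight_sd
    ]


comparable = similar_build(players, "reference")
```

Requiring both conditions keeps the comparison group tight; relaxing n_std widens it when too few players qualify.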
Machine Learning & Simulation
Applied classification, clustering, and regression models (via team_change.py) to identify key patterns in player/team performance.
Built time-series models to evaluate momentum and simulate team dynamics under hypothetical scenarios.
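The write-up does not say which time-series model was used for momentum. As a simple stand-in, the sketch below smooths a team's per-game point differential with an exponentially weighted moving average and compares the latest smoothed value with the plain average. The ewma and momentum functions and the alpha default are assumptions for illustration, not the logic in team_change.py.

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average of a sequence.

    A higher alpha gives recent games more weight. The first smoothed
    value is seeded with the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for value in values:
        previous = smoothed[-1] if smoothed else value
        smoothed.append(alpha * value + (1 - alpha) * previous)
    return smoothed


def momentum(point_diffs, alpha=0.3):
    """Latest smoothed point differential minus the plain average.

    Positive means recent form is above the average over the window;
    negative means it is below.
    """
    if not point_diffs:
        raise ValueError("point_diffs must not be empty")
    average = sum(point_diffs) / len(point_diffs)
    return ewma(point_diffs, alpha)[-1] - average


# Hypothetical per-game point differentials for one team.
improving = momentum([0, 4, 8], alpha=0.5)
declining = momentum([8, 4, 0], alpha=0.5)
```

A score like this is easy to interpret and to plot per team, which makes it a reasonable baseline before moving to heavier models.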
Visualization
Delivered a polished Power BI dashboard to present insights on player comparisons, clustering groupings, and regression outcomes in an interactive format.
Challenges
Managing large datasets with inconsistent player identifiers and missing values.
Ensuring accurate aggregation across years and teams.
Balancing interpretability with model complexity in time series and simulation outputs.
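For the inconsistent-identifier problem, one common mitigation is to reduce every spelling of a player's name to a canonical join key before aggregating. The sketch below is one way to do that under stated assumptions: the normalize_name helper and the suffix list are hypothetical, and the project may instead have relied on stable numeric player IDs.

```python
import re
import unicodedata

# Generational suffixes dropped from the key (an illustrative list).
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_name(raw):
    """Reduce a player name to a lowercase ASCII key for joining tables."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Keep only letters and whitespace, so "J.J." and "JJ" collapse together.
    text = re.sub(r"[^a-z\s]", "", text.lower())
    tokens = [tok for tok in text.split() if tok not in SUFFIXES]
    return " ".join(tokens)


key = normalize_name("  Tim Hardaway Jr. ")
```

Name keys can still collide for two different players with the same name, so pairing the key with a season or team column is a sensible safeguard.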
Conclusion
This project highlighted the power of data science in sports analytics, combining Python-based machine learning with interactive data exploration through Power BI. It demonstrated advanced use of statistical models, clustering, and simulation techniques to deliver actionable insights into NBA performance trends.