Hi all,

I’ve seen a lot of people dismiss the preseason as irrelevant to how a team actually performs during the regular season, so I wanted to see whether a correlation actually exists. This project was pretty simple, and the hardest part was just getting the data (there is very little preseason data, and most of it requires copying and pasting from website tables).

Methodology

I was looking at correlation purely from a “Win %” perspective, so I gathered data on the last ~10 regular seasons and preseasons and kept them in separate tables. I then merged the tables on both the year of the season and the team. With the final data frame, I created a scatterplot of preseason winning percentage against regular season winning percentage. I also fit a simple linear regression model and computed the correlation between the two.
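For anyone who wants to try something similar, here is a minimal sketch of the merge-then-fit steps described above. It is not my actual code: the rows are made-up placeholders (not real results), the team codes and function names are hypothetical, and it uses plain Python rather than a data frame library so it runs without any installs.

```python
from math import sqrt

# Made-up placeholder rows, NOT real results: (season, team, win_pct as a fraction).
preseason = [
    (2016, "AAA", 0.50),
    (2016, "BBB", 0.75),
    (2017, "AAA", 0.25),
    (2017, "BBB", 1.00),
    (2017, "CCC", 0.60),  # no matching regular season row, so the merge drops it
]
regular = [
    (2016, "AAA", 0.45),
    (2016, "BBB", 0.60),
    (2017, "AAA", 0.40),
    (2017, "BBB", 0.55),
]


def merge_on_season_and_team(pre_rows, reg_rows):
    """Inner join on (season, team); returns (preseason_pct, regular_pct) pairs."""
    reg_lookup = {(season, team): pct for season, team, pct in reg_rows}
    return [
        (pct, reg_lookup[(season, team)])
        for season, team, pct in pre_rows
        if (season, team) in reg_lookup
    ]


def fit_simple_regression(pairs):
    """Least-squares fit of y = intercept + slope * x, plus the Pearson correlation r."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


pairs = merge_on_season_and_team(preseason, regular)
intercept, slope, r = fit_simple_regression(pairs)
print(f"n={len(pairs)} intercept={intercept:.3f} slope={slope:.3f} r={r:.3f} R^2={r * r:.3f}")
```

Joining on both keys matters because the same team shows up once per season; joining on team alone would pair a team's preseason with the wrong year's regular season.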

Conclusions

In terms of the linear regression model, the line of best fit was (Predicted Regular Season Win %) = .405 + .178(Preseason Win %), which indicates that a 1 percentage point increase in preseason winning percentage is associated with a 0.178 percentage point increase in regular season winning percentage. The coefficient on preseason winning percentage was statistically significant, which indicates that, at least to some degree, preseason performance CAN be used to predict regular season performance. The R^2, however, was only .088, meaning that less than 9% of the variability in regular season winning percentage is explained by preseason winning percentage.
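To make those numbers concrete, here is a small sketch that plugs values into the fitted equation. The intercept, slope, and R^2 are the ones reported above; the function name is just for illustration. It also backs out the correlation implied by the R^2, using the fact that in a one-predictor regression |r| is the square root of R^2 and takes the sign of the slope.

```python
from math import sqrt

INTERCEPT = 0.405  # from the fitted equation above
SLOPE = 0.178      # from the fitted equation above
R_SQUARED = 0.088  # reported R^2


def predict_regular_season(preseason_win_pct):
    """Predicted regular season win % (as a fraction) for a preseason win % (as a fraction)."""
    return INTERCEPT + SLOPE * preseason_win_pct


winless = predict_regular_season(0.0)
undefeated = predict_regular_season(1.0)

# With a single predictor, |r| = sqrt(R^2); the slope is positive, so r is positive.
implied_r = sqrt(R_SQUARED)

print(f"winless preseason    -> {winless:.3f}")
print(f"undefeated preseason -> {undefeated:.3f}")
print(f"implied correlation  -> {implied_r:.3f}")
```

So even the gap between a winless and an undefeated preseason only moves the prediction from .405 to .583, and the implied correlation is roughly 0.30.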

The graph tells the same story as the model, with the data scattered all over the place rather than clustered tightly around the line of best fit. The graph can be accessed through this link.

Next Steps

This project was really simple, but I think there are some other applications. For one, you could look at whether preseason statistics (e.g., FG%, 3P%) are indicative of regular season statistics for both teams and players. You could also look at the correlation between preseason and regular season separately for extremely good and extremely poor preseason performances, as there may be stronger correlations there. I think a lot of the weak relationship boils down to the preseason being a place for teams to test what they’ve worked on in the offseason rather than treating the games like real regular season games.
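If anyone wants to try the extremes idea, a rough sketch might look like the following. I haven't run this analysis, so treat everything here as an assumption: the rows are made up (not real results), the 0.25 and 0.75 cutoffs are arbitrary placeholders, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    if len(pairs) < 2:
        raise ValueError("need at least two points to compute a correlation")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sqrt(sxx * syy)


def split_by_preseason(pairs, low=0.25, high=0.75):
    """Split (preseason_pct, regular_pct) pairs into poor and strong preseasons.

    The cutoffs are arbitrary placeholders, not values from the post.
    """
    poor = [p for p in pairs if p[0] <= low]
    strong = [p for p in pairs if p[0] >= high]
    return poor, strong


# Made-up placeholder rows, NOT real results: (preseason_pct, regular_pct).
pairs = [
    (0.00, 0.30), (0.20, 0.45), (0.25, 0.35), (0.50, 0.50),
    (0.60, 0.40), (0.75, 0.55), (0.80, 0.70), (1.00, 0.60),
]

poor, strong = split_by_preseason(pairs)
for label, subset in (("poor", poor), ("strong", strong)):
    print(label, len(subset), round(pearson_r(subset), 3))
```

With only ~10 seasons of data, each extreme group will be small, so the cutoffs would need to be loose enough to leave a usable number of team-seasons in each.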