Soccer is the most popular sport in the world today. Its enormous fan base creates significant economic opportunities and risks, making extensive data analysis necessary to reduce risk and uncertainty when closing deals. With this project, we hope to learn whether there is a correlation between one player's overperformance of xG (expected goals) and a teammate's overperformance of xA (expected assists). Our study compares xG to actual goals scored and xA to actual assists in order to identify players who performed better or worse than expected, as well as their overall efficiency. The additional comparisons we will use are G to xG, A to xA, xG to xG per 90, and xA to xA per 90.
The dataset covers every male football player who played in the 2021-2022 Premier League season, along with his statistics for that season. Key statistics in the dataset include Minutes Played, Goals, Assists, Expected Goals (xG), Expected Assists (xA), Expected Goals per 90 (xG90), and Expected Assists per 90 (xA90). Expected Goals is a metric that estimates how likely a player is to score based on the quality of the chances he receives. Expected Assists, on the other hand, measures the quality of the chances he creates for his teammates. xG90 and xA90 are these statistics normalized to a per-90-minute basis: because a football match lasts 90 minutes, they show what a player produces per full match played. The purpose of the dataset is to compare these key metrics in order to rate players on their efficiency. Furthermore, we aim to demonstrate the value of statistics such as xA, which we consider a more reliable metric for rating a player's playmaking. The data is sourced from Understat, a football data site that tracks and records statistics across different leagues and combines them with predictive values, allowing further analysis of actual versus expected values. The data was collected from match and player records over the past few years, and the contributors of this dataset then built upon it to allow further exploration.
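As a minimal sketch of how the per-90 statistics and the actual-versus-expected comparisons relate, the Python snippet below computes both for a single player. The class, field, and function names are our own illustrative choices and may not match the dataset's column names, and the numbers are made up rather than taken from the data.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    # Field names are illustrative; the dataset's actual columns may differ.
    name: str
    minutes: int
    goals: int
    assists: int
    xg: float
    xa: float


def per_90(total: float, minutes: int) -> float:
    """Scale a season total to a per-90-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90 / minutes


def overperformance(actual: float, expected: float) -> float:
    """Actual minus expected; positive means the player beat expectation."""
    return actual - expected


# Made-up numbers for illustration only, not taken from the dataset.
player = PlayerSeason(
    "Example Player", minutes=1800, goals=12, assists=4, xg=9.5, xa=5.0
)

xg90 = per_90(player.xg, player.minutes)
xa90 = per_90(player.xa, player.minutes)
goal_diff = overperformance(player.goals, player.xg)
assist_diff = overperformance(player.assists, player.xa)

print(f"xG90={xg90:.3f}, xA90={xa90:.3f}")
print(f"G - xG={goal_diff:+.2f}, A - xA={assist_diff:+.2f}")
```

In this sketch a positive G - xG marks a player who scored more than his chances suggested, and a negative A - xA marks one whose created chances were converted less often than expected. These are the differences we would compare between teammates.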
Person 1 (Ananya Singh): Coming from a football-loving country, I am very interested in football. However, the statistical analysis of football players' performance interests me even more, because it lets me understand the numbers behind their contributions.
Person 2 (Ifaz Chowdhury): I want to examine the efficiency of top players, and I am personally fascinated by player performance in the sense of how well they should have performed versus how well they actually performed.
Person 3 (Imtiaz Nasif): As a computer science major, I've always been interested in data analysis, data manipulation, and handling big data. Since I'm also a football fan, this course seems like the perfect opportunity to dive deeper into that interest.