This repository contains an Exploratory Data Analysis (EDA) of the Indian Premier League (IPL) dataset. The EDA helps in understanding the distribution, trends, and hidden patterns within the IPL matches and player performances.
The goal of this project is to uncover insights into team performance, individual player statistics, venue advantages, and the factors that influence match outcomes.
The dataset contains information on IPL matches, teams, and individual performances. The main features of the dataset include:
- Match data: Date, venue, teams, results
- Player data: Runs scored, wickets taken, and strike rates
- Venue data: City, stadium
- Other: Toss details, match result (win by runs/wickets)
Initial steps in the analysis include:
- Handling missing values
- Removing or fixing duplicates
- Converting data types
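The cleaning steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`id`, `date`, `city`, `winner`, `win_by_runs`), the sample rows, and the helper `clean_matches` are all assumptions made for the example, and the choice to fill or drop a missing value will depend on the real dataset.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions, not the real schema.
raw = pd.DataFrame(
    {
        "id": [1, 2, 2, 3],
        "date": ["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"],
        "city": ["City X", None, None, "City Y"],
        "winner": ["Team A", "Team B", "Team B", None],
        "win_by_runs": ["35", "0", "0", "0"],
    }
)


def clean_matches(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper applying the three cleaning steps in order."""
    # Removing duplicates: drop rows that repeat an earlier row exactly.
    out = df.drop_duplicates().copy()
    # Handling missing values: fill a descriptive column, and drop rows
    # that lack the value the analysis depends on (here, the winner).
    out["city"] = out["city"].fillna("Unknown")
    out = out.dropna(subset=["winner"])
    # Converting data types: parse dates and cast numeric strings.
    out["date"] = pd.to_datetime(out["date"])
    out["win_by_runs"] = out["win_by_runs"].astype(int)
    return out.reset_index(drop=True)


matches = clean_matches(raw)
print(matches.dtypes)
```

Keeping the steps in one function makes it easy to rerun the same cleaning whenever the raw file is refreshed.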
Visualizations are built with Matplotlib and Seaborn. Patterns observed in the plots include:
- Teams winning the toss tend to choose batting or bowling based on the venue's conditions.
- Scoring is uneven across venues: the highest run totals are concentrated at a few grounds.
- Certain players consistently perform well across different IPL seasons.
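As one possible way to visualize the toss observation, the sketch below counts toss decisions per venue and draws a grouped bar chart with Matplotlib through pandas. The columns `venue` and `toss_decision` and the rows themselves are invented for illustration; they are not taken from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only; column names are assumed.
toss = pd.DataFrame(
    {
        "venue": ["Stadium A", "Stadium A", "Stadium A", "Stadium B", "Stadium B"],
        "toss_decision": ["field", "field", "bat", "bat", "bat"],
    }
)

# Rows are venues, columns are decisions, cells are match counts.
counts = pd.crosstab(toss["venue"], toss["toss_decision"])

# One group of bars per venue, one bar per toss decision.
ax = counts.plot(kind="bar", figsize=(6, 4))
ax.set_xlabel("Venue")
ax.set_ylabel("Matches")
ax.set_title("Toss decision by venue (illustrative data)")
plt.tight_layout()
```

Seaborn's `countplot` with `x="venue"` and `hue="toss_decision"` would produce a similar chart directly from the uncounted rows.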
The EDA highlighted several aspects of IPL matches:
- Winning strategies: Factors influencing a team’s win, like toss decision and home ground advantage.
- Player performance: Identifying top performers by runs, wickets, and other critical metrics.
- Venue analysis: Key grounds where teams tend to perform better.
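To make the toss factor concrete, a rough sketch of how often the toss winner also wins the match is shown below. The columns `toss_winner`, `toss_decision`, and `winner` are assumed names, and the four rows are invented, so the resulting rates say nothing about real IPL results.

```python
import pandas as pd

# Invented rows for illustration; column names are assumptions.
results = pd.DataFrame(
    {
        "toss_winner": ["Team A", "Team B", "Team A", "Team C"],
        "toss_decision": ["bat", "field", "field", "field"],
        "winner": ["Team A", "Team A", "Team A", "Team B"],
    }
)

# True where the team that won the toss also won the match.
results["toss_winner_won"] = results["toss_winner"] == results["winner"]

# Share of matches won by the toss winner, overall and per toss decision.
overall_rate = results["toss_winner_won"].mean()
rate_by_decision = results.groupby("toss_decision")["toss_winner_won"].mean()

print(f"Toss winner won {overall_rate:.0%} of matches")
print(rate_by_decision)
```

The same pattern, grouping by venue instead of toss decision, could be used to look at home ground advantage, provided the dataset records which team is at home.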
Contributions are welcome! Feel free to open an issue or submit a pull request with feature suggestions, bug reports, or data improvements.