Welcome to my portfolio of data projects. This repository includes a selection of hands-on projects using SQL and Python for real-world data analysis tasks. These projects cover exploratory analysis, data cleaning, and correlation analysis—fundamental skills in the data analytics and engineering domains.
- **Description:**
  This project demonstrates exploratory data analysis using SQL on a structured dataset. It includes operations such as:
  - Handling and replacing `NULL` values
  - Standardizing formats (e.g., dates and text case)
  - Removing duplicates
  - Creating temporary clean views and applying transformation logic
  - Renaming columns for clarity and readability
  - Identifying trends through grouping and filtering
  - Calculating statistical aggregates such as averages, counts, and percentages
  - Using `CASE` statements to classify data
  - Applying conditional filtering to uncover insights

  The goal was to simulate a typical data exploration workflow within a relational database before building dashboards or models.
- **Skills Practiced:**
  Data cleaning, SQL scripting, formatting, `NULL` handling, view creation, aggregations, conditional logic, pattern recognition, CTEs, and window functions
- **STACK:** MySQL (PostgreSQL-compatible)
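The cleaning steps above work against any SQL engine; a minimal sketch using Python's built-in `sqlite3` (the table and column names here are hypothetical, not from the project):

```python
import sqlite3

# Hypothetical staging table with messy values
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 120.0, "SHIPPED"), (2, None, "pending"),
     (2, None, "pending"), (3, 40.0, None)],
)

# NULL handling, text-case standardization, de-duplication, CASE classification
rows = conn.execute("""
    SELECT DISTINCT
        id,
        COALESCE(amount, 0)                 AS amount,  -- replace NULLs
        LOWER(COALESCE(status, 'unknown'))  AS status,  -- standardize case
        CASE WHEN COALESCE(amount, 0) >= 100
             THEN 'large' ELSE 'small' END  AS size     -- classify with CASE
    FROM orders
""").fetchall()
print(rows)
```

`DISTINCT` collapses the duplicated order 2 into a single cleaned row.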
- **Description:**
  This Jupyter Notebook investigates how different variables in a movie dataset (budget, gross revenue, runtime, etc.) relate to each other. Steps included:
  - Data cleaning and preprocessing using `pandas`
  - Visualizing distributions and relationships with `plotly.express`
  - Calculating correlations and plotting a heatmap
  - Identifying which variables most strongly affect a movie's box-office success

  The notebook helps answer business questions such as "Do bigger budgets lead to higher gross?" and "Which factors most influence revenue?"
- **Skills Practiced:**
  Exploratory data analysis, feature correlation, data visualization, Python scripting
- **STACK:** Python, Pandas, Plotly Express, Jupyter Notebook
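The correlation step can be sketched as follows, with synthetic numbers standing in for the movie dataset (column names are assumptions):

```python
import pandas as pd

# Synthetic stand-in for the movie dataset
df = pd.DataFrame({
    "budget":  [10, 20, 30, 40, 50],
    "gross":   [12, 25, 28, 45, 60],
    "runtime": [90, 110, 95, 120, 100],
})

# Pearson correlation between all numeric columns
corr = df.corr(numeric_only=True)
print(corr.round(2))

# In the notebook, the matrix is then rendered as a heatmap, e.g.:
# import plotly.express as px
# px.imshow(corr, text_auto=True).show()
```

A strong positive `budget`/`gross` correlation in the matrix is exactly the kind of signal the notebook looks for.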
- **Description:**
  This project evaluates the effectiveness of a new landing page through A/B testing. The analysis includes:
  - Comparing conversion rates between the control (Group A) and treatment (Group B) groups
  - Conducting a z-test to assess statistical significance
  - Visualizing results and drawing business conclusions

  The goal was to simulate a real-world decision-making process using hypothesis testing.
- **Skills Practiced:**
  A/B testing, hypothesis testing, statistical analysis, data visualization
- **STACK:** Python, Pandas, SciPy, Seaborn, Matplotlib
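The two-proportion z-test at the core of the analysis can be sketched by hand (the conversion counts below are made-up illustration numbers, not the project's data):

```python
import math
from scipy.stats import norm

# Hypothetical A/B results: (conversions, visitors)
conv_a, n_a = 200, 4000   # control (Group A)
conv_b, n_b = 260, 4000   # treatment (Group B)

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)               # pooled conversion rate
se = math.sqrt(p_pool * (1 - p_pool) * (1/n_a + 1/n_b))

z = (p_b - p_a) / se                                   # z-statistic
p_value = 2 * norm.sf(abs(z))                          # two-sided p-value
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would support rolling out the new landing page.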
- **Description:**
  This project analyzes customer retention and purchase behavior for an e-commerce company using cohort analysis. The notebook walks through:
  - Cleaning and transforming user transaction data with `pandas`
  - Assigning cohort groups based on user acquisition month
  - Calculating retention rates across cohorts
  - Visualizing cohort retention heatmaps with `seaborn` and `matplotlib`
  - Deriving insights about user engagement and drop-off patterns

  This type of analysis is commonly used by product teams and marketing analysts to understand the customer lifecycle, evaluate loyalty, and improve retention strategies.
- **Skills Practiced:**
  Cohort analysis, retention analysis, data cleaning, data visualization, pandas transformations
- **STACK:** Python, Pandas, Matplotlib, Seaborn, Jupyter Notebook
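The cohort assignment and retention computation can be sketched like this (a toy transaction log stands in for the real data; column names are assumptions):

```python
import pandas as pd

# Toy transaction log
df = pd.DataFrame({
    "user_id": ["a", "b", "a", "c"],
    "order_date": pd.to_datetime(
        ["2023-01-05", "2023-01-20", "2023-02-10", "2023-02-01"]),
})

# Cohort = month of each user's first order; period = months since that cohort
df["order_month"] = df["order_date"].dt.to_period("M")
df["cohort"] = df.groupby("user_id")["order_date"].transform("min").dt.to_period("M")
df["period"] = (df["order_month"] - df["cohort"]).apply(lambda d: d.n)

# Retention = active users per (cohort, period) / cohort size
active = df.groupby(["cohort", "period"])["user_id"].nunique().unstack(fill_value=0)
retention = active.div(active[0], axis=0)
print(retention)
```

The resulting `retention` table is exactly the matrix that gets rendered as a seaborn heatmap in the notebook.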
- **Description:**
  This project automates the processing of raw CSV files to prepare them for analysis. It includes:
  - Reading input CSVs and handling encoding issues
  - Cleaning data by removing empty rows and standardizing column formats
  - Filtering and transforming specific fields (e.g., renaming columns, converting values)
  - Saving the cleaned output to a new CSV file

  The goal was to streamline repetitive preprocessing steps in data workflows.
- **Skills Practiced:**
  Data preprocessing, automation, file handling, basic ETL logic
- **STACK:** Python, Pandas, CSV module
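A minimal version of such a cleaning pass, using only the standard-library `csv` module (the sample data and the `amount` → `amount_usd` rename are made up; the real script reads and writes files on disk):

```python
import csv
import io

# Simulated raw input; in practice: open("input.csv", encoding="utf-8")
raw = io.StringIO("Name ,AMOUNT\nAlice, 10\n,\nBob, 20\n")

reader = csv.reader(raw)
header = [h.strip().lower() for h in next(reader)]   # standardize column names
header[header.index("amount")] = "amount_usd"        # hypothetical rename

rows = []
for row in reader:
    cells = [c.strip() for c in row]
    if not any(cells):                               # drop fully empty rows
        continue
    rows.append(cells)

out = io.StringIO()                                  # in practice: open("clean.csv", "w")
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(rows)
print(out.getvalue())
```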
- **Description:**
  An asynchronous API that recursively scrapes Wikipedia articles (up to 5 levels deep) and generates AI-powered summaries using the DeepSeek API. Parsed articles, their relationships, and summaries are stored in a PostgreSQL database.
- **Features:**
  - Recursive Wikipedia parsing with parent-child relations
  - AI-generated summaries for articles
  - Async FastAPI backend with PostgreSQL
  - Fully Dockerized for easy deployment
- **Skills Practiced:**
  Async Python, API development, web scraping, AI API integration, PostgreSQL, Docker
- **STACK:** Python, FastAPI, SQLAlchemy (async), PostgreSQL, BeautifulSoup, DeepSeek API, Docker
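The depth-limited recursion over article links can be sketched like this; fetching and HTML parsing are stubbed out with a hypothetical in-memory link graph (the real project uses async HTTP plus BeautifulSoup and persists relations to PostgreSQL):

```python
# Hypothetical link graph standing in for Wikipedia pages -> outgoing links
LINKS = {
    "Python": ["Guido_van_Rossum", "CPython"],
    "Guido_van_Rossum": ["BDFL"],
    "CPython": [],
    "BDFL": [],
}

MAX_DEPTH = 5  # matches the project's 5-level limit

def crawl(title, depth=0, parent=None, seen=None):
    """Recursively visit articles, recording (parent, child) relations."""
    seen = seen if seen is not None else set()
    if depth > MAX_DEPTH or title in seen:
        return []
    seen.add(title)
    relations = [(parent, title)]
    for link in LINKS.get(title, []):   # real code: fetch and parse the page
        relations += crawl(link, depth + 1, title, seen)
    return relations

relations = crawl("Python")
print(relations)
```

The `seen` set prevents revisiting articles that are linked from multiple parents.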
- **Description:**
  An end-to-end object detection pipeline for recognizing 6 classes of dishes and tableware in restaurant settings, built with YOLOv11 and including custom training, evaluation, and video visualization. Data preparation includes manual annotation, augmentation, and class balancing.
- **Features:**
  - Frame extraction from video (1 frame every 5 seconds)
  - Manual annotation with LabelImg (YOLO format)
  - 6 dish categories: tea, salads, kebab, chicken steak, soup, empty dishes
  - Data augmentation with Albumentations
  - Two-stage training with YOLOv11n and evaluation via mAP, Precision, Recall, and F1
  - Output visualization in a labeled video
  - Result analysis with per-class metrics and a confusion matrix
- **Skills Practiced:**
  Object detection, dataset creation, model training & tuning, metrics analysis, OpenCV, Albumentations, YOLOv11
- **STACK:** Python, OpenCV, LabelImg, Albumentations, YOLOv11, PyTorch
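Detection metrics such as mAP, Precision, and Recall all rest on the IoU between predicted and ground-truth boxes; a minimal IoU helper for `(x1, y1, x2, y2)` boxes (a sketch of the underlying computation, not the project's actual evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    # Intersection rectangle (clamped to zero if the boxes don't overlap)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # partially overlapping boxes
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.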
⭐ If you find these projects helpful, please give the repository a star and feel free to connect!