Skip to content

zanyarsalami/ecommerce-analytic

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

E-Commerce Analytics Case Study (Brazilian Marketplace Dataset)

This project is a comprehensive SQL + Python case study using a real-world e-commerce dataset to showcase business intelligence, data analysis, and storytelling skills. It is designed to demonstrate end-to-end capabilities for potential clients on UpWork or hiring managers.

📦 Dataset

🎯 Goals

  • Analyze growth trends: revenue, order volume, and average order value
  • Segment customers using RFM analysis
  • Assess retention and churn behavior over time
  • Deliver visual insights and business-ready metrics

🧰 Tech Stack

  • PostgreSQL
  • Python (Pandas, Matplotlib, Seaborn)
  • Jupyter Notebook
  • VSCode (SQLTools Extension)

🔍 Key Analyses

1. Growth Trends

  • Tracked total revenue and order volume by month
  • Calculated and visualized average order value over time
  • Output saved as monthly_revenue.csv, monthly_order_count.csv, and monthly_avg_order_value.csv
SELECT
    TO_CHAR(o.order_purchase_timestamp, 'YYYY-MM') AS order_year_month,
    ROUND(SUM(oi.price + oi.freight_value), 2) AS total_revenue
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE o.order_status = 'delivered'
GROUP BY order_year_month
ORDER BY order_year_month;

Monthly RevenueFigure 1: This line chart displays the total revenue generated from delivered orders over time. It highlights seasonal fluctuations, growth patterns, and potential dips in sales performance.

2. Customer Segmentation (RFM)

  • Recency, Frequency, and Monetary scores calculated using PostgreSQL
  • Customers classified into business segments:
RFM Profile Segment
R ≥ 3, F ≥ 3, M ≥ 3 Champions
R ≥ 3, F ≥ 2 Loyal
R = 4, F = 1 New Customers
R ≤ 2, F ≥ 3 At Risk
M ≤ 2, F ≤ 2 Low Value
else Other
  • Segment-level metrics saved and visualized

  • Outputs: customer_segments.csv, segment_metrics.csv, and segment_analysis.png

Customer SegmentationFigure 2: The chart illustrates how different customer groups contribute to business value — helping identify high-volume vs. high-value targets like Champions and Loyal Customers, while spotlighting low-revenue or at-risk groups for strategic re-engagement.

This visual combines:

  • 🔷 Blue bars: total number of customers in each segment
  • 🟧 Orange bars: total revenue (in thousands of BRL) generated by that segment
  • 🟢 Green line: average spend per customer in the segment

3. Churn & Retention

  • Built cohort-based retention analysis by month
WITH customer_orders AS (
    SELECT
        c.customer_unique_id,
        o.order_id,
        TO_CHAR(o.order_purchase_timestamp, 'YYYY-MM') AS order_month
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    WHERE o.order_status = 'delivered'
),

cohorts AS (
    SELECT
        customer_unique_id,
        MIN(order_month) AS cohort_month
    FROM customer_orders
    GROUP BY customer_unique_id
)

SELECT
    c.customer_unique_id,
    ch.cohort_month,
    cu.order_month
FROM customer_orders cu
JOIN cohorts ch ON cu.customer_unique_id = ch.customer_unique_id
JOIN customers c ON c.customer_unique_id = cu.customer_unique_id;
  • Created a retention heatmap showing repeat behavior over time

HeatmapFigure 3: This heatmap visualizes customer retention by monthly cohorts. The low-intensity values across most cohorts confirm that a majority of customers made only a single purchase and did not return in subsequent months. This insight highlights a key business challenge: limited repeat engagement, suggesting potential for targeted retention strategies or loyalty initiatives.

📈 Visualizations

All key trends and patterns are visualized using matplotlib and seaborn for easy interpretation and presentation.

📂 Folder Structure

project-root/
│
├── data/                     # Raw CSV files from Kaggle
├── sql/
│   ├── create_tables.sql     # SQL schema creation
│   └── query_tables/         # Output CSVs and visual assets
├── notebooks/
│   └── ecommerce_analysis.ipynb  # Full end-to-end notebook
└── README.md
│
└── figs/                     # Saved plotted figures

✅ Key Skills Demonstrated

  • SQL-based data modeling and analytics
  • Python data wrangling and visualization
  • RFM segmentation and cohort analysis
  • Business insights communication
  • Reproducible project structure for clients or employers

🧠 Use This Project To:

  • Showcase data storytelling on UpWork
  • Submit a polished example in data analyst interviews
  • Customize for future marketplace or CRM analytics projects

Created with 💼 job-readiness in mind.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published