Supermarket Store Branches Sales Analysis

Figures:

  • Correlation Heatmap of Supermarket Data
  • Top 5 Stores by Sales: Sales & Customer Count Comparison
  • Bottom 5 Stores by Sales: Sales & Customer Count Comparison
  • Top 5 Stores by Customer Count: Sales & Customer Count Comparison
  • Bottom 5 Stores by Customer Count: Sales & Customer Count Comparison
  • Scatter Plot: Store Sales vs. Daily Customer Count
  • Distribution of Key Supermarket Metrics (Histograms)

Project Overview & Problem Statement

This project analyzes sales data from supermarket store branches to understand what drives their performance. The primary objective is to investigate how store sales depend on store area, number of items available, and daily customer count. The ultimate goal is to identify areas for improvement and successful strategies that can be applied more widely to raise the company's overall profitability.

Dataset

The analysis utilized the "Stores.csv" dataset, comprising 896 entries and 5 columns. Each entry represents a unique supermarket store branch, providing information on:

  • Store_ID: Unique identifier for each store.
  • Store_Area: Physical area of the store in square yards.
  • Items_Available: Number of different items stocked in the store.
  • Daily_Customer_Count: Average number of customers visiting the store daily over a month.
  • Store_Sales: Total sales generated by the store in US dollars.

Initial data inspection found no missing values in any column, so the analysis could proceed without imputation or row removal.
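
The loading and missing-value check described above could look roughly like the sketch below. It assumes pandas and the column names listed in this section. The helper names (load_stores, missing_value_report) and the three sample rows are hypothetical and only stand in for the real Stores.csv.

```python
import io

import pandas as pd

# Hypothetical sample rows in the same shape as Stores.csv. The values are
# made up for illustration and are not taken from the real dataset.
SAMPLE_CSV = """Store_ID,Store_Area,Items_Available,Daily_Customer_Count,Store_Sales
1,1500,1800,700,60000
2,1400,1700,900,40000
3,1300,1550,500,80000
"""

EXPECTED_COLUMNS = [
    "Store_ID",
    "Store_Area",
    "Items_Available",
    "Daily_Customer_Count",
    "Store_Sales",
]


def load_stores(source):
    """Read the store data and fail early if a documented column is absent."""
    df = pd.read_csv(source)
    missing_cols = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing_cols:
        raise ValueError(f"missing columns: {missing_cols}")
    return df


def missing_value_report(df):
    """Return the number of missing values per column."""
    return df.isna().sum()


# Swap the in-memory sample for load_stores("Stores.csv") to use the real file.
stores = load_stores(io.StringIO(SAMPLE_CSV))
report = missing_value_report(stores)

print(stores.describe())  # statistical summary, as in the inspection step
print(report)             # one count per column; all zeros means no gaps
```

Checking the column names up front is a design choice of this sketch, not something the project is known to do. It makes a renamed or misspelled header fail loudly instead of surfacing later as a KeyError.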

Methodology & Approach

The analysis followed a structured approach:

  1. Data Loading & Initial Inspection: Loaded the dataset and performed basic checks (.head(), .info(), .describe()) to understand data types, structure, and statistical summaries.
  2. Correlation Analysis: Calculated and visualized the correlation matrix to understand linear relationships between key numerical variables, particularly focusing on sales drivers.
  3. Relationship Visualizations: Generated scatter plots with regression lines to visually inspect the relationships between Store_Sales and Store_Area, Items_Available, and Daily_Customer_Count.
  4. Distribution Analysis: Created histograms for all key numerical features to understand their spread and central tendency and to identify outliers.
  5. Performance Extremes Identification: Identified and analyzed the top and bottom 10 stores based on Store_Sales and Daily_Customer_Count to pinpoint specific successes and challenges.
  6. Comparative Visualizations: Created dual-axis bar charts to compare sales and customer counts for these top- and bottom-performing stores.
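
As a rough illustration of steps 2 and 5, the sketch below computes each variable's Pearson correlation with Store_Sales and pulls the top and bottom stores by a chosen column. It assumes pandas. The function names and the six demo rows are hypothetical, and the demo values are invented, so the numbers it prints do not reproduce the project's results.

```python
import pandas as pd

NUMERIC_COLUMNS = [
    "Store_Area",
    "Items_Available",
    "Daily_Customer_Count",
    "Store_Sales",
]


def sales_correlations(df):
    """Pearson correlation of each numeric column with Store_Sales."""
    corr = df[NUMERIC_COLUMNS].corr()  # DataFrame.corr uses Pearson by default
    return corr["Store_Sales"].drop("Store_Sales")


def extremes(df, column, n=5):
    """Return (top n rows, bottom n rows) ranked by the given column."""
    return df.nlargest(n, column), df.nsmallest(n, column)


# Made-up demo rows, NOT values from Stores.csv.
demo = pd.DataFrame(
    {
        "Store_ID": [1, 2, 3, 4, 5, 6],
        "Store_Area": [1200, 1500, 1800, 1300, 1600, 1400],
        "Items_Available": [1440, 1800, 2160, 1560, 1920, 1680],
        "Daily_Customer_Count": [800, 300, 650, 900, 200, 500],
        "Store_Sales": [52000, 91000, 60000, 38000, 87000, 61000],
    }
)

sales_corr = sales_correlations(demo)
top, bottom = extremes(demo, "Store_Sales", n=2)

print(sales_corr)
print(top[["Store_ID", "Store_Sales", "Daily_Customer_Count"]])
print(bottom[["Store_ID", "Store_Sales", "Daily_Customer_Count"]])
```

Calling extremes(df, "Daily_Customer_Count") on the same frame gives the customer-count rankings used for the second pair of comparisons.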

Key Findings

The analysis yielded several crucial insights, some of which challenge common assumptions:

  • Weak Link Between Customer Count and Sales: The analysis consistently showed a near-zero linear correlation (correlation coefficient of ~0.009) between Daily_Customer_Count and Store_Sales. This was evident in the correlation matrix, the scatter plots, and the comparison of top- and bottom-performing stores. It is a critical finding because it challenges the intuitive assumption that more customers automatically mean higher sales.
  • Limited Impact of Area/Items on Sales: Both Store_Area and Items_Available showed only a very weak positive correlation with Store_Sales (approx. 0.097 and 0.099 respectively). This suggests that simply increasing store size or the number of items available is not a primary driver of higher sales. Notably, Store_Area and Items_Available are almost perfectly correlated with each other (~0.999), so a larger area corresponds almost exactly to more items and the two variables carry nearly the same information.
  • The Prominence of Average Transaction Value: The most significant finding is that the average transaction value per customer (how much each customer spends per visit) appears to be a far more important determinant of sales performance than the volume of daily visitors. The dataset has no transaction-level column, so this is inferred from the store-level figures: stores with low customer counts achieved high sales, while stores with high customer counts showed only average or even low sales.
  • Outlier Identification: Store 40 was identified as a significant outlier with a remarkably low daily customer count (10 customers), warranting specific investigation. Conversely, some stores achieved high sales with similarly low customer numbers, indicating a unique success model.
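
The transaction-value and outlier findings above can be approximated with the sketch below. It assumes pandas. Sales_Per_Customer is a hypothetical proxy column (Store_Sales divided by Daily_Customer_Count), not a true per-visit basket value, since the two columns may cover different time spans. The 1.5 × IQR rule is a common convention and is an assumption here, not necessarily the method used in the project. Only the Store 40 customer count of 10 comes from the text; every other demo value is invented.

```python
import pandas as pd


def add_sales_per_customer(df):
    """Add a proxy for transaction value: sales divided by daily customers."""
    out = df.copy()
    # Treat a zero customer count as missing to avoid dividing by zero.
    customers = out["Daily_Customer_Count"].astype("float64")
    customers = customers.mask(customers == 0)
    out["Sales_Per_Customer"] = out["Store_Sales"] / customers
    return out


def low_outliers_iqr(series, k=1.5):
    """Return the values lying below Q1 - k * IQR."""
    q1, q3 = series.quantile([0.25, 0.75])
    lower_bound = q1 - k * (q3 - q1)
    return series[series < lower_bound]


# Demo rows: only Store 40's customer count (10) is taken from the text;
# all other numbers are made up for illustration.
demo = pd.DataFrame(
    {
        "Store_ID": [40, 1, 2, 3, 4, 5, 6],
        "Daily_Customer_Count": [10, 700, 750, 800, 820, 780, 760],
        "Store_Sales": [45000, 56000, 60000, 48000, 41000, 78000, 76000],
    }
)

scored = add_sales_per_customer(demo)
flagged = low_outliers_iqr(demo["Daily_Customer_Count"])
flagged_ids = demo.loc[flagged.index, "Store_ID"].tolist()

print(scored.sort_values("Sales_Per_Customer", ascending=False))
print("Low customer-count outliers:", flagged_ids)
```

Sorting by the proxy column puts stores that earn a lot from few visitors at the top, which is the pattern the finding describes. A store with an extremely low count also lands there, so the outlier flag is worth reading alongside the ranking.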

Actionable Recommendations

Based on these findings, here are strategic recommendations for the supermarket company to increase profits and improve store performance:

  1. Shift Focus from Volume to Value (for low-sales, high-customer stores):

    For stores with a high daily customer count but low sales (e.g., Store 32, Store 373), implement strategies to increase average transaction value. These could include staff training on upselling and cross-selling, product placement optimized for impulse buys, bundle deals, or loyalty programs that reward larger purchases.

  2. Investigate and Replicate High-Sales, Low-Customer Models:

    For stores with high sales despite a low daily customer count (e.g., Store 877, Store 888), conduct case studies to understand what makes them successful (e.g., product mix, catering to high-value segments, operational efficiencies). Identify transferable best practices to replicate elsewhere.

  3. Address Extreme Underperformance (Store 40):

    A direct, on-site investigation into Store 40 is crucial. Determine whether its exceptionally low customer count is due to a data anomaly, a newly opened store, severe operational issues, or location challenges. Then develop a specific rectification plan or consider strategic alternatives.

  4. Optimize Store Area and Item Availability (Strategic, Not Primary):

    Recognize that increasing store area or items available alone is not a primary sales driver. Focus on maximizing sales efficiency within existing spaces and inventory, avoiding unnecessary expansion, and ensuring optimal stock to meet demand without overstocking.

  5. Focus on Marketing for Value, Not Just Volume:

    Tailor marketing efforts to attract customers who are likely to spend more, or to incentivize existing customers to increase their average basket size, rather than just focusing on increasing general footfall.
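
One way to shortlist stores for recommendations 1 and 2 is to split the branches by quantiles of sales and customer count, as sketched below. It assumes pandas. The quartile cut-offs, the function name, the segment labels, and the demo rows are all hypothetical choices for illustration. The source does not state how its example stores were selected.

```python
import pandas as pd


def segment_stores(df, low_q=0.25, high_q=0.75):
    """Label each store by where it sits on sales and customer-count quantiles."""
    sales_low, sales_high = df["Store_Sales"].quantile([low_q, high_q])
    cust_low, cust_high = df["Daily_Customer_Count"].quantile([low_q, high_q])

    # Many visitors, little revenue: candidates for raising basket value.
    high_traffic_low_sales = (df["Daily_Customer_Count"] >= cust_high) & (
        df["Store_Sales"] <= sales_low
    )
    # Few visitors, high revenue: candidates for case studies.
    low_traffic_high_sales = (df["Daily_Customer_Count"] <= cust_low) & (
        df["Store_Sales"] >= sales_high
    )

    segment = pd.Series("typical", index=df.index, name="Segment")
    segment[high_traffic_low_sales] = "raise_basket_value"
    segment[low_traffic_high_sales] = "study_and_replicate"
    return segment


# Made-up demo rows, NOT values from Stores.csv.
demo = pd.DataFrame(
    {
        "Store_ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "Daily_Customer_Count": [1200, 300, 800, 750, 1150, 350, 700, 820],
        "Store_Sales": [35000, 95000, 60000, 62000, 38000, 90000, 58000, 61000],
    }
)

segments = segment_stores(demo)
print(demo.assign(Segment=segments))
```

Tightening low_q and high_q (for example to 0.10 and 0.90) narrows the shortlist to the most extreme stores, which may suit a first round of on-site reviews.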

By shifting the focus from merely counting customers to understanding and enhancing the value of each customer interaction, the supermarket company can develop more targeted and effective strategies to increase its overall profitability.

For the full code and to explore the project structure, please visit the GitHub repository.

Project Information

Tools & Technologies

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Jupyter Notebook

Contact

Location

Lagos, Nigeria

Call me

+(234) 916 709 1342

+(234) 802 554 5280

Email me

Onoriose1@outlook.com