Customer, Product and Sales Analysis with T-SQL (Part 1)

Quang Nguyen
8 min readAug 1, 2023

--

Source: https://www.winacc.com/sales-analysis-report-benefits/

Introduction:

With the aim of reinforcing and enhancing my SQL expertise, I embarked on this project with a primary goal of discovering insights within raw sales data of an Ecommerce company and delivering results that drive informed business decisions. By assuming the role of a data analyst, I harnessed the expressive power of SQL queries to conduct thorough analyses on transactional records, marketing channels, product and and customer details. As a result, I provided upper meaningful values and actionable findings to upper management, empowering them to propel the business to new heights of success.

For this project, all datasets utilized are sourced from AdventureWorks, curated and provided by the Maven Analytics team. The overall action items that need to be completed are outlined as follows:

> Build a relation model from different data files.

> Formulate queries to track KPIs (sales, revenue, profit, return).

> Compare regional performance, identify product-level trend.

> Identify top customers based on relevant metrics.

> Create procedures for managers to quickly retrieve desiredresults.

Data Description:

The seven tables above were the foundation of my work. To create a well-structured database, I employed the Star Schema method, with AW_Sales and AW_Returns serving as the central tables. The other tables acted as lookup tables, facilitating connections between diverse data entities. This approach provided me with the flexibility to work with all the fields effortlessly, allowing for precise calculations and accurate results.

Another important aspect to highlight is that AW_Sales was composed of three smaller sales tables, each representing sales activities in the years 2020, 2021, and 2022. The decision to merge them into a single table significantly improved data management and query performance. However, it is crucial to exercise caution as the consolidated table may become excessively large and challenging to handle. Below is the query I executed to create the AW_Sales table:

SQL Queries/ Analysis/ Results and Findings:

[Task 1]: show the top 10 products by number of orders and order quantity. For each product, provide its full name and indicate the specific category to which it belongs.

[Solution]:

[Findings/Insights]: the products above have demonstrated remarkable performance for the company consistently over a span of 3 years. Notably, the majority of these products fell under the “Accessories” category, and their order quantity was twice the number of individual orders placed. This indicated that these products werefrequently sold in bundles or packs containing more than one item. This valuable insight provided the team with a comprehensive understanding of customer preferences and opened up possibilities for identifying potential revenue streams through cross-selling or upselling strategies.

[SQl Tips]: for the query above, I utilized sub-query and multiple join statements to link three tables together. Establishing a join between the Sales and Product tables was relatively straightforward, given their shared key identifier. However, the process became more difficult when I needed to incorporate the category field from the Product Category table, which lacked direct communication with either the Sales or Product table. To resolve this, I had to leverage the connection that the Product Category table shares with the Product Subcategory table to establish the necessary links with Sales and Product.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

[Task 2]: in comparison to top products, identify products that have the highest return rate over the years. Calculate return rate by dividing the number of units returned by the number of units sold, multiplying by 100.

[Solution]:

[Findings/Insights]: the featured products exhibited the highest return rate, indicating that their quality fell short of meeting customers’ expectations. As expected, their total order quantities were comparatively low. With this information, the company could implement essential measures to mitigate these rates. These steps may involved assessing the overall product quality and design, engaging in customer communication to understand their experiences, and promoting alternative product options.

[SQL Tips]: once again, I implemented a sub-query method to generate a distinct view containing details about each product and their respective return quantities. After completing this step, I linked it to the main Sales table to extract the order quantity values. When computing the return rate, I appropriately converted the data type of the calculated fields using the CONVERT()and CAST() functions to ensure the final return rate was displayed in a readable format. This conversion was necessary to avoid a scenario where integer values are divided by each other, resulting in a return rate of 0.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

[Task 3]: The 30oz Water-Bottle has been the company’s top-selling product in both 2020 and 2021. Currently in 3rd quarter of 2022, the team aims to identify the months when this product has generated the highest revenue and predict the sales trend for the remaining of the year. The information would enable the development of a viable sales strategy to maximize sales opportunities .

[Solution]:

[Findings/Insights]: the revenue of the 30oz Water-Bottle has been consistently growing throughout 2022. The most significant increase in revenue occurred when transitioning from the first quarter (January-March) to the second quarter (April-June), with a nearly 6% rise in revenue. Notably, June contributed the most to the overall revenue, accounting for 17.95%. With this positive trend, it was anticipated that revenue would continue to expand in the upcoming third and fourth quarters of 2022.

[SQL Tips]: In this query, I utilized the functions OVER()and PARTITION BY() to add a new column called SumTotal Revenue. This column accumulated the revenue generated by the 30oz Water-Bottle product for each month and displayed the value in every row. This approach allowed me to examine the individual monthly contributions to the total revenue, helping me identify peak seasonality and make sales trend forecasts for the rest of the year.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

[Task 4]: Present the monthly revenue trend and illustrate the cumulative total for each year. The finance team seeks to observe the revenue distribution to make suitable asset allocations.

[Solution]:

[Findings/Insights]: the Running Total column represented the cumulative sum of Total Revenue up to each corresponding month, demonstrating a consistent upward trend in sales revenue over the three-year period. Notably, the revenue growth rate was more pronounced in 2021 and 2022 compared to 2020, suggesting an accelerated pace of revenue generation in these subsequent years. Additionally, the data revealed a recurring seasonal pattern, with revenue peaking around mid-year (June and July) across all three years. Such trends could be attributed to factors like higher consumer spending during the summer months or specific marketing campaigns during that time.

[SQL Tips]: Using OVER() with PARTITION BY() and ORDER BY() is an effective approach for calculating running totals. In this specific query, my goal was to compute the running total of revenue up to the corresponding month and reset it at the beginning of a new year. To achieve this, I partitioned by by Year and then ordered by both Year and Month

However, if the objective is to display the total revenue for each month across different years (e.g., 01/2020, 01/2021, 01/2022) and its corresponding running total, I needed to make a slight adjustment. The partitioning then would be changed to PARTITION BY Month to group the data for each specific month while, the order by clause can remain the same to maintain the correct ordering within each month.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

[Task 5]: the marketing team leaders are looking for an efficient way to swiftly retrieve customer information using their CustomerKey. The desired information includes details such as first name, last name, gender, email address, occupation, and the total number of orders they have made within the last six months.

[Solution]:

[Finding/Insights]: Thanks to the implemented procedure, the company could now effortlessly access customer data without the need for complicated SELECT statements. By simply running the EXECUTE function, the company could retrieve essential customer information. This standardized approach offered both convenience and flexibility, enabling the integration of valuable data in various operations and applications.

The procedure proved particularly valuable for customer segmentation, categorizing customers into groups based on their order frequency. This, in turn, facilitated targeted marketing campaigns and strategies to maximize profits. Customers with lower order numbers could receive specific promotions to encourage increased sales, while those with higher order numbers may be engaged differently to maintain their loyalty and continued business.

[SQL Tips]: PROCEDURE() serves as an optimal means of implementing a precompiled and reusable SQL block with logic. This facilitates easy access for diverse users and applications, allowing them to obtain desired results with a single line of code. Personally, I found it to be one of the most valuable tools in T-SQL, and I have consistently utilized it. However, it is still important to be mindful when creating a procedure, ensuring the declaration of appropriate variables with accurate data types. Failure to declare variables correctly may result in the procedure not running at all or yielding inaccurate results.

CONCLUSION:

With part 1 of the project now completed, the 2nd article will come out soon addressing even more business inquiries with T_SQL. I must say that I am thoroughly enjoying this project. It has provided me with a unique experience, allowing me to immerse myself in the business’s journey as if I were a genuine member of the team. The analyses I conducted have provided valuable insights into the company’s evolution and operations. Moreover, this project has afforded me the opportunity to apply and refine my SQL skills in a simulated real-world context.

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

Quang Nguyen
Quang Nguyen

Written by Quang Nguyen

Data enthusiast who loves telling stories with data. Eager to acquire new knowledge and share expertise with others!

No responses yet

Write a response