Sunday, November 2, 2008

Adventure works 2005- faulty report samples - fixes/solution

Solution for faulty Product line sales:

The Product line sales report has currently for "top employees" the following incorrect query in the dataset TopEmployees:

SELECT TOP 5 C.LastName, C.FirstName, E.EmployeeID, SUM(SOH.SubTotal) AS SaleAmount
FROM Sales.SalesPerson SP INNER JOIN HumanResources.Employee E ON SP.SalesPersonID = E.EmployeeID INNER JOIN Person.Contact C ON E.ContactID = C.ContactID INNER JOIN Sales.SalesOrderHeader SOH ON SP.SalesPersonID = SOH.SalesPersonID INNER JOIN Sales.SalesOrderDetail SOD ON SOH.SalesOrderID = SOD.SalesOrderID INNER JOIN Production.Product P ON SOD.ProductID = P.ProductID INNER JOIN Production.ProductSubcategory PS ON P.ProductSubcategoryID = PS.ProductSubcategoryID INNER JOIN Production.ProductCategory PC ON PS.ProductCategoryID = PC.ProductCategoryID
WHERE (PC.ProductCategoryID = @ProductCategory) AND (PS.ProductSubcategoryID IN (@ProductSubcategory)) AND (SOH.OrderDate > @StartDate) AND (SOH.OrderDate < @EndDate)
GROUP BY C.LastName, C.FirstName, E.EmployeeID, PC.ProductCategoryID, PS.ProductSubcategoryID
ORDER BY SUM(SOH.SubTotal) DESC

The problem is that it is adding up the Sales Order header - for every line on the order - so an order with 10 lines has the Order total multiplied 10 times. The "SUM(SOH.SubTotal)" bits need to be substituted with SUM(SOD.LineTotal).

So the top 5 report should use the following query:
SELECT TOP (100) PERCENT C.LastName, C.FirstName, E.EmployeeID, SUM(SOD.LineTotal) AS SaleAmount
FROM Sales.SalesPerson AS SP INNER JOIN HumanResources.Employee AS E ON SP.SalesPersonID = E.EmployeeID INNER JOIN Person.Contact AS C ON E.ContactID = C.ContactID INNER JOIN Sales.SalesOrderHeader AS SOH ON SP.SalesPersonID = SOH.SalesPersonID INNER JOIN Sales.SalesOrderDetail AS SOD ON SOH.SalesOrderID = SOD.SalesOrderID INNER JOIN Production.Product AS P ON SOD.ProductID = P.ProductID INNER JOIN Production.ProductSubcategory AS PS ON P.ProductSubcategoryID = PS.ProductSubcategoryID INNER JOIN Production.ProductCategory AS PC ON PS.ProductCategoryID = PC.ProductCategoryID
WHERE (PC.ProductCategoryID = @ProductCategory) AND (PS.ProductSubcategoryID IN (@ProductSubcategory)) AND (SOH.OrderDate > @StartDate) AND (SOH.OrderDate < @EndDate)
GROUP BY C.LastName, C.FirstName, E.EmployeeID, PC.ProductCategoryID, PS.ProductSubcategoryID
ORDER BY SUM(SOD.LineTotal) DESC

For the dataset TopCustomers there is an identical mistake
SELECT TOP 5 S.Name AS StoreName, SUM(SOH.SubTotal) AS SaleAmount, PS.ProductSubcategoryID,nPS.ProductCategoryID
FROM Production.Product P INNER JOIN Production.ProductSubcategory PS ON P.ProductSubcategoryID = PS.ProductSubcategoryID INNER JOIN Sales.SalesOrderDetail SOD ON P.ProductID = SOD.ProductID INNER JOIN Sales.Customer CU INNER JOIN Sales.SalesOrderHeader SOH ON CU.CustomerID = SOH.CustomerID INNER JOIN Sales.Store S ON CU.CustomerID = S.CustomerID ON SOD.SalesOrderID = SOH.SalesOrderID
WHERE (SOH.OrderDate > @StartDate) AND (SOH.OrderDate < @EndDate) AND (PS.ProductCategoryID = @ProductCategory) AND (PS.ProductSubcategoryID IN (@ProductSubcategory))
GROUP BY S.Name, PS.ProductSubcategoryID, PS.ProductCategoryID
ORDER BY SUM(SOH.SubTotal) DESC

The correct query is:
SELECT TOP 5 S.Name AS StoreName, SUM(SOD.LineTotal) AS SaleAmount, PS.ProductSubcategoryID,nPS.ProductCategoryID
FROM Production.Product P INNER JOIN Production.ProductSubcategory PS ON P.ProductSubcategoryID = PS.ProductSubcategoryID INNER JOIN Sales.SalesOrderDetail SOD ON P.ProductID = SOD.ProductID INNER JOIN Sales.Customer CU INNER JOIN Sales.SalesOrderHeader SOH ON CU.CustomerID = SOH.CustomerID INNER JOIN Sales.Store S ON CU.CustomerID = S.CustomerID ON SOD.SalesOrderID = SOH.SalesOrderID

WHERE (SOH.OrderDate > @StartDate) AND (SOH.OrderDate < @EndDate) AND (PS.ProductCategoryID = @ProductCategory) AND (PS.ProductSubcategoryID IN (@ProductSubcategory)) GROUP BY S.Name, PS.ProductSubcategoryID, PS.ProductCategoryIDORDER BY SUM(SOD.LineTotal) DESC

I think these work :)

As for the sales territory drilldown the only fix I can see is to do an update query of the Sales Order Header file of the correct line totals from the Sales Order Detail files.

Friday, October 31, 2008

Problems we are having with Adventure works

Product Line Sales:

Top 5 Customers - sales are inflated significantly. Adventure works sales in total (from the Sales order header) are approx $97M

Do a report (Bikes, road bikes, Calendar year 2003) and the Top 5 Sales people sold over $158 millions - does anyone notice this as a problem ? Just a slight reconciliation difference I would think

Or is this report deliberately broken so people can fix it?


Now my next favourite report:

Sales Territory Drill down

The Sales by Order by Header does agree to the sales order total from the 'Drill down' - hmm housten we have a problem

Australia , Lynn Tsoflias Sales order SO51150 $40,274.01 =/= the drilldown $26,328 (after discount of $1,996)

In fact the Adventure works database is broken - Sales order Header subtotal =/= sum(Sales Order Detail . line total) - total discrepancy a cool $17 mill or so. A redememing fact is that the dbo.FactResellerSales table does indeed have the 'correct' sales number.(It reconciles to the Sales order Detail)

ALSO this report is missing sales made by managers - or dont managers count? (they are also missing from the view Sales.vSalesPerson in the

=======================================
So is there something I am completely missing?

Seriously though I am using Adventure works for SQL server 2005 - are these problems that are fixed in the Adventure works 2008? (havent installed yet as I dont have SQL 2008 yet)

Please feel free to contribute to the discussion

Tuesday, September 2, 2008

Our aim of this Blog

Once we have published our work we aim to provide support and answers to all your queries regarding the use of our model on Microsoft SQL Server and reporting in Microsoft Excel and we will also provide more technical details behind our methods for the curious.

In the meantime we are throwing up some topics for discussion that we are passionate about in our journey of building and designing a data warehouse

Computing Efficiency vs Economic Efficiency

Is there still any point to building “Efficient” computer models. Today’s computers are infinitely more powerful than when many of the design guidelines were set.

Which is better – an inefficient computer model that is easy to understand, implement and use or an efficient computer model that is cumbersome, complicated and takes an inordinate amount of time to implement?

I.E. Computing Efficiency vs Economic Efficiency

This is Jane's ask the Experts Question for Tech'ed this year