Good example of MDX vs SQL for analytical queries

MDX and SQL are in no way the same, and often not even comparable, as they are querying multidimensional and relational databases respectively. You cannot query your existing relational database with MDX.

The main advantage of using a multidimensional model and using MDX to query it is that you are querying pre-aggregated data and that MDX is optimized to query in a statistical way rather than a relational way. You no longer query rows and tables to produce a flat result set but you are using tuples and sets to slice and aggregate a multidimensional cube.

Think of it like this: if you use a SQL query to get the total sales amount for a particular item group you would need to write a query that sums up all invoice lines for all items in the item group. If you are using a cube and have aggregations on the item group level the result is calculated during processing and the aggregations are stored for each item group, making queries instantaneous.

Multidimensional and MDX is an entirely different concept from relational set-based SQL.

Your example might become a lot simpler because you would be doing the transformations such as the date parsing during your data load process and your last month comparison could be a calculated measure. Your seoul average and today could be calculated members

If your cubes are well designed for your requirements I believe you could be slicing and dicing your example's data set without even needing to write queries but do it in a pivottable or another analysis tool.

Then again there is no "just rewriting SQL in MDX". It requires a fair bit of knowledge to do it right and a different mindset. Think venn-diagrams instead of result sets.

To provide you with an example using the adventureworks database, imagine the requirement to list the number of sales orders by customer in the category bikes.

If you did that using SQL you would need to write a query that counts the number of sales orders containing a line with a product that happens to be of the category bikes and join that to the customers table, so that would become a fairly complex query.

-- need distinct count, we're counting orders, not order lines
SELECT count(DISTINCT soh.salesorderid)
    ,pers.FirstName + ' ' + pers.LastName
FROM sales.SalesOrderDetail sod
-- we need product details to get to the category
INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
-- but we need to pass via subcategories
INNER JOIN Production.ProductSubcategory psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID
-- we finally get to the category
INNER JOIN Production.ProductCategory pc ON psc.ProductCategoryID = pc.ProductCategoryID
-- we also need the headers because that's where the customer is stored
INNER JOIN sales.SalesOrderHeader soh ON sod.SalesOrderID = soh.SalesOrderID
-- finally the customer, but we don't have his name here
INNER JOIN sales.Customer c ON soh.CustomerID = c.CustomerID
-- customers
INNER JOIN Person.Person pers ON c.PersonID = pers.BusinessEntityID
-- filter on bikes
WHERE pc.Name = 'bikes'
-- but the customers table doesn't contain the concatenated name
GROUP BY pers.FirstName + ' ' + pers.LastName;

In MDX (provided your cube is well designed for this requirement) you could just write because the logic and complexity has moved elsewhere:

SELECT [Measures].[Internet Order Count] ON COLUMNS,
[Customer].[Customer].Members ON ROWS
FROM [Adventure Works]
WHERE [Product].[Product Categories].[Category].[Bikes]

OLAP Cubes/databases have the following characteristics:

  • Obtain already aggregated information according to the needs of the user.
  • Easy and fast access
  • Ability to manipulate the aggregate data in different dimensions
  • A cube uses classical aggregation functions min, max, count, sum, avg, but can also use specific aggregation functions.

MDX versus SQL :

MDX is made to navigate the multidimensional databases and to define queries on all their objects (dimensions, hierarchies, levels, members, and cells) to obtain (simply) a representation of pivot tables.

MDX uses many identical as SQL keywords, like SELECT, FROM, WHERE. The difference is that SQL produces relational views while MDX produces multidimensional views of data.

The difference is also seen in the general structure of the two languages:

SQL query: SELECT column1, column2, ..., column FROM table
MDX query: SELECT axis1 ON COLUMNS, axis2 ON ROWS FROM cube

FROM specifies the data source:
In SQL: one or more tables
In MDX: a cube

SELECT indicates the results desired to recover by the query:

In SQL:

  • A view data in two dimensions (rows and columns)
  • Rows have the same structure defined by columns

In MDX:

  • Any number of dimensions to form the query results.
  • The term axis used to avoid confusion with the cube dimensions.
  • No special meaning to the rows and columns, but you have to define each axis: axe1 defines the horizontal axis and axis 2 defines the vertical axis.

MDX query example : enter image description here

Measures : Unit Price, Quantity, Discount, SalesAmount, Freight
Dimension : Time
hierarchy : Year > Quarter > Month > with members :

  • Year: 2010, 2011, 2012, 2013, 2014

  • Quarter: Q1, Q2, Q3, Q4

  • Month: January, February, March, …

Dimension : Customer
hierarchy : Continent > Country > State > City with members :

  • City: Paris, Lyon, Berlin, Köln, Marseille, Nantes …

  • State: Loire atlantique, Bouches du Rhône, Bas Rhin, Torino…

  • Country: Austria, Belgium, Danmark, France, ...

  • Continent level: Europe, North America, Sud America, Asia

Dimension : Product
hierarchy : Category > Subcategory > product with members :

  • Category : Food, Drink …
  • Food category: Baked_food …

Tags:

Olap

Mdx