ADC - Innovative Data Solutions
The strange nature of the XMLA Drop Command
During the past few weeks I have spent a great deal of time writing XMLA. I have come across some unexpected issues and behaviours, some which can be explained and some which I currently have no explanation for.
Generally I have to admit that I am fairly impressed with XMLA, although I do find some aspects of it frustrating, perhaps some this is due to the fact that XMLA has been around for so long although some of the design seems illogical.
My first encounter with Analysis Services (SSAS) was with the 2000 version. I recall encountering so many issues... we were running on 32 bit machines and found that it was impossible to process our account dimension, which held around 3 million members. When we first hit this issue we were able to use the 2 gb switch to give a temporary reprieve but ultimately it became apparent that the product did not scale in terms of it’s MOLAP offering. The issue in this case being the fact that incremental dimensional process was really just a marketing term. In reality it pulled all of the records from the relational dimension table and then compared these with the current olap dimension and inserted / updated and deleting the data where necessary. In effect it said “select * from sourceTable”.
Since we realised that the product couldn’t scale the decision was made that we would only hold a fixed amount of data and create stored procedures which would purge older facts and no longer required dimension members.
Things have improved a great deal in the 2005 and 2008 offerings and 64 bit servers and now also much more ubiquitous. On the current project we will be deploying our analysis services cubes on a server with 256Gb of RAM and 32 cores. With this kind of computing power you could hold a vast amount of data in memory, unfortunatley there is now easy way to pin SSAS partitions in memory, indeed Microsoft chose to remove the pin table feature from recent version of SQL Server.
So generally things have improved a great deal, however when you have a very demanding user base and tight SLAs to adhere to there is always the need for compromise. One of the unfortunate issues with using MOLAP cubes is that any structural change to a dimension requires that dimension needs to be fully processed, which has the knock on effect of invalidating (deleting) any cube partitions associated with that dimension.
The fact that all of your historical data is wiped out until you can reprocess really is not ideal. Due to this it is fairly common to build a system which is a compromise. Recent records are held in MOLAP and historical data is held in ROLAP.
One other annoyance is the fact that page level dimension members are treated as separate entities in relation to pivot tables. This is obviously done for performance reasons, however it does mean that many members are displayed which have absolutely no relation to other filtering attributes.
Therefore if you are going to have a mixed architecture of MOLAP and ROLAP then it might be worth splitting these into separate cubes, so that when you display dimension members in the MOLAP pivots you are displaying the more relevant members, yes they still will not be inter-related but the very fact that the dimension is smaller means less combinations are shown.
Given this requirement how can you delete dimension members? Well you could do this act the relational level, and process the dimension but this would also affect your relational ROLAP model and in any case would be a compromise, since the set of dimension members required to support the last month of data will be less than that required to support the historic data.
Within XMLA there is a delete command, so the command below can be used to delete the elements from an Analysis Services Dimension.
<Drop xsi:type="Drop" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Object>
<Database>Warehouse</Database>
<Cube>Sales</Cube>
<Dimension>Client</Dimension>
</Object>
<Where>
<Attribute>
<AttributeName>Account Number</AttributeName>
<Keys>
<Key xsi:type="xsd:integer">5000</Key>
</Keys>
</Attribute>
</Where>
<DeleteWithDescendants>true</DeleteWithDescendants>
</Drop>
So what is so weird about this command ? Well firstly the fact that is was necessary to specify a cube. A dimension can be shared across multiple cubes so it seems kind of illogical that it is necessary to specify a dimension. However this is the only way I could get the command to work, without a cube being specified the command refused to execute. Perhaps this is some kind of check to validate if data is held in the cube which could be orphaned by deleting the dimension key, that would seem like a reasonable explanation.
The other thing is that the command actually deleted the element from the underlying dimension table in an Oracle database. This is exactly the kind of behaviour which we were trying to avoid. This is indeed surprising since the dimension is configured to be held as MOLAP and is not a write-back enabled dimension, so why was the data removed from the warehouse ?
At some point in the next week or two I will come back and investigate the subject a little further, perhaps then I will be able to shed some light on the subject.
Friday, 23 October 2009
XMLA Drop Command - Deleting Dimension Members