Chris Webb is an independent consultant specializing in SQL Server Analysis Services, in particular the MDX query language. He’s coauthor of MDX Solutions with Microsoft SQL Server Analysis Services 2005. Chris blogs regularly on BI topics at cwebbi.space.live.com.
Greg Low: Introducing Show 37 with guest Chris Webb.
Our guest today is Chris Webb. Chris is an independent consultant specializing in SQL Server Analysis Services, in particular the MDX query language. He’s coauthor of MDX Solutions with Microsoft SQL Server Analysis Services 2005. Chris blogs regularly on BI topics at cwebbi.space.live.com. Welcome Chris.
Chris Webb: Hi Greg.
Greg Low: First, how did you come to get involved with SQL Server?
Chris Webb: Happened about ten years ago. I’m in my early 30s now. About ten years ago I was in my first IT job. I messed around with DB6 like others did then. I got put on a project. I was working for a large market research company in London. Market research always interested in BI type topics. This company has long history using OLAP databases. They had many proprietary in house systems. As all IT departments do, they were wondering about getting rid of those solutions, moving to off the shelf. Project I was put on was to evaluate every OLAP database on the market. Cognos, MicroStrategy, all the usual names. We’d been working with Nigel Pendse, a well known guy in OLAP world. Guy behind the OLAP reports and BI survey. He said that Microsoft was coming out with OLAP database bundled with SQL Server 7. We extended the project and got onto the beta program for SQL Server 7 and started looking at OLAP services. This company had long OLAP history and had stringent requirements. We had 60 page document of things we wanted OLAP tools to do that all preceding tools hadn’t been great at. We spent six months learning this to produce queries we wanted. It passed more or less. Good enough. We carried on with it and when SQL Server 2000 had come out, it was the tool we wanted to work w.
Greg Low: In Version 7, what were the main things missing?
Chris Webb: Version 7 was a long time ago. Hard to remember what was missing in OLAP services. Things missing like parent child hierarchies. That was key. Ability to load data in its higher and least level dimensions. Non aggregatable data only loaded with parent child hierarchy. We had a lot of that. That was a key feature of why we hung on for Analysis Services. I’m sure other things were missing. As soon as we came on scene, we were in in a big way and developed our own client tools. I used my VB6 skills. Wrote client tool for Excel. Built production system that when I left was spitting out 3,000 keys a month. I was lucky to get a lot of good Analysis Services and MDX experience from the days when very few people knew MDX.
Greg Low: Adam Cogan, a regional director locally jokes that how few people have learned MDX. He lists two or three on his hand as the ones that really know. Small number.
Chris Webb: One thing I say is it’s impossible to know MDX and SQL at same time. Mutually exclusive. I’m fortunate I learned MDX at beginning of my career. I don’t know that much SQL now. I’ve grown up with MDX brain rather than SQL.
Greg Low: When I first looked at Analysis Services, 2000 version, first take I had was tooling didn’t look like a Microsoft tool. Not Microsoft look and feel. In 2005 had rewritten tooling and has much more Microsoft feel.
Chris Webb: Yes. I guess that’s true. You might be able to say that was bought in late 1990s. Substantially developed before released as OLAP. The core team from Panorama and already worked on until Analysis Services 2005 were ex-Panorama people. Tight knit team. Clever people.
Greg Low: I get the feeling that when products were bought in, might be so many years of using Microsoft products, always have look and feel to them. Analysis services stood out when I first used tooling. Didn’t feel like Microsoft thing. There are others. I thought same on network protocol analyzer. Bought in that had non-Microsoft feel. Today whenever I go to Visio, I think it’s the interaction of user interface, I get non-Microsoft feel. In 2005 they did good job on tooling. Certainly in 2008 tooling has had another facelift, particularly best practice design tools.
Chris Webb: Certainly true. Coming from SQL Server world, might be frustrating about Analysis Services, you live in parallel universe to SQL Server. Always looking at what the SQL Server guys have. Couple years more advanced in terms of product maturity run. Example. I was looking at RC Zero Analysis Services today, SQL Management Studio. I right clicked and saw policy based management for Analysis Services. I’ve read about in SQL Server. I thought great, I didn’t know was there. Checked it out. SQL Server has list of things you can create policies about. Properties like tables, servers, databases. I was looking at facets for Analysis Services. One. This was a list of five options you could change in old surface area configured for SQL Server 2005. Maybe next release we’ll get something good.
Greg Low: What are your thoughts on that? In 2008, removal of the surface area configuration tool. Discussion seems to be that people will be able to do via policy based management instead. I think people won’t look at management via policy who would have found surface area configuration useful.
Chris Webb: I’ve never come across anyone that realized surface area configuration worked for Analysis Services. I’m not sure will make much difference. From BI point of view, having everything in SQL Management Studio, more will work there.
Greg Low: Topic for today is MDX. Given that is still common and so few people have spent time with MDX, what’s best starting point for someone trying to get their head around it?
Chris Webb: I do training on MDX. I give a lot of thought to this. Best way to learn MDX is on a project. Learning MDX through theory or books is waste of time. You need to get your hands dirty with it, do over long period of time. Way of thinking more than anything else. Most people who have been in IT a couple years think anything new they need to learn is just new info, facts. Nothing difficult to get head around. When first learning programming, had to learn loops, variables. Most people in IT know the concepts. Then they get to MDX and have problem. It’s a whole new way of thinking. People struggle with that. Not something you can learn in class environment. Need to learn through practice, doing things, getting into way of thinking. If you’re interested in Analysis Services, get data that you think would look good in cube. Build cube then build queries and calculations on it and try to get it to do what you want.
Greg Low: Ensure what you get out is what you’re aiming to get out.
Chris Webb: Yes. Most cases, people can write to SQL to get data out. Want to write to MDX and cross check SQL to MDX. I am believer in people needing to get hands dirty with MDX. Practical examples.
Greg Low: Do you think syntax SQL-like was wise?
Chris Webb: I think personally, they might have been better off changing some of the keywords to avoid confusing people with similarities. Select statement isn’t bad. What is confusing is MDX where clause. SQL background people see as way of filtering something. Common question is people trying to use MDX where clause to filter. Catch same structure of query that changed keywords to use slice rather than where, might have made easier to learn MDX.
Greg Low: OK. Let’s look at the best references for someone attempting to learn? George Spofford’s book?
Chris Webb: George is lead author on MDX Solutions which I coauthored with other people. That’s the only book that deals with Analysis Services 2005 MDX in any depth. Most Analysis Services have a chapter or two on MDX. If you want to get into any depth with MDX, MDX Solutions is one place to go.
Greg Low: One that everyone refers to.
Chris Webb: Also a book, Fast Track to MDX. More of a tutorial. Co written by Mosha Pasumansky and Mark Whitehorn. Unfortunately, hasn’t been updated to for Analysis Services 2005. In a couple key areas it is difficult to follow if trying to use Analysis Services 2005. Basic concepts the same, but dimension changes between Analysis Services 2000 and 2005 might confuse people. Also great series of articles by Bill Pearson. He has put out hundreds of MDX articles and its uses. Another great place to start.
Greg Low: Apart from attending class with someone like you.
Chris Webb: Obviously.
Greg Low: Indeed. If we start with basic concepts of MDX, can you describe purpose of language and what it attempts to do?
Chris Webb: Why it exists in first place? Why not use SQL? Deserves discussion. OLAP world, open question about need for separate query language at all. Oracle and OLAP option, they’ve avoided using MDX, likely political reasons. They’ve tried to extend SQL. I argue that there is justification in separate query language. If you’ve built cube, nothing more than layer of extra metadata over data, you want language that allows you to access the metadata. A common BI query or calculation. Previous period growth calculation. Right now, June 2008. Do you want to write calculation that takes current month sales and previous month sales and find difference? Very basic calculation for BI reporting environment. In itself, difficult to do in SQL. Some people make good living showing how to do in SQL. SQL has problem understanding current month, previous month. No layer of metadata in relational database that says this is current month, this is previous month. You have rows in a table. In a cube, when you look at month June, assuming you’ve built cube and dimension correctly, you can ask what the month before June is and always get the answer April. Hierarchy’s, levels, built in language. Reflects metadata you’ve built in cube. In MDX, language allows you to access the metadata, members in order on a level, that they have neighbors, go up and down hierarchy. June 2008 is in Quarter 2 2008, Year 2008. You can go down hierarchy. Dates from first of June 2008. Concepts. Hierarchy built in language that you can access. Big justification for MDX, that you can get at the metadata from in the language without jumping through hoops.
Greg Low: Yes. Other thing is simple concepts like a day, there’s always whose day are we talking about? Might seem like a simple problem. Aligning with physical calendar. Most orgs have concept of point in day when it becomes the next day. Selecting dimensions appropriately or configuring time dimensions might not directly relate to physical calendar as you expect. Access to metadata with concept of what business calls quarter, clearly important.
Chris Webb: Yes. Using concept of dates is off putting. You can do in SQL. If you’re looking at product dimension, category food contains subcategories canned, fruit, vegetables. That you can build product dimension in structure and access in MDX, you could never do in SQL without jumping through many hoops.
Greg Low: That’s the point. Number of hoops you have to jump to do.
Chris Webb: Exactly. Nothing inherently you can do in MDX you can’t do in SQL. Just the right tool for the job. If you’re competent in both SQL and MDX in BI environment, majority of queries and calculations you write, you probably would write in MDX.
Greg Low: That’s the argument as to why MDX exists. Some basic MDX concepts?
Chris Webb: Basic MDX concepts relate to structure of cube. We know about dimensions, that they’re made up of hierarchies, that they’re made of levels, that they’re made of members. If we’re thinking types of dimensions in our cube, might be time, product, customer. Vague ways to group data. Inside each dimension are specific hierarchies. These are specific ways to group. Time dimension might have years, months, dates. On customer dimension, might have address, customer name. Product dimension might have factories, product codes. Of course, multi-level hierarchies where we can group different attributes of dimension into multi-level drill pass. Within hierarchies there are levels. Grouping levels. Time dimension might have hierarchy with year, quarter, month, date. On each level, important MDX concept, concept of member. Examples are individual dates, like 20 June 2008 or year 2008 or customer Chris Webb or country U.K. These are distinct entities you’d find in attributes from dimension table. If you did SQL group query, individual distinct values you’ve grouped by. All of these concepts come from the concepts received from structure. Other two important entities are couples and sets. A couple is like coordinate to a cell in cube. Cube is a big multi-dimensional array. Dimensions are in the array. If you want to get value in cube, you need way to work out address of that cell in array. That’s what a couple is. An address for a cell in the cube. List of member references from different hierarchies. You put together in common delimited list giving you cell within cube with value you can access. Other main concept in MDX is that of sets. A set is an ordered list of members or couples. Could be list of years 2001, 2002, 2003, or could be set of customers Chris Webb, Greg Low. Those are the three main concepts. Members, sets, couples. Have to understand before you can get anywhere with MDX. Everything follows from there. I find a lot of people can read MDX and understand what’s going on, but unless they understand the concepts, they can hack stuff together that works, but don’t really understand what’s going on until you are happy with ideas of members, sets, couples. Then you can progress with MDX.
Greg Low: One question that comes up is do you think any of the standard design tools that output MDX, are you happy with the MDX they generate? If I look at standard relational SQL, many query tools I cringe at. Do you have that reaction with any of the tools in the Microsoft suite that output MDX?
Chris Webb: Common thing. Everyone, when they look at MDX, client tools, rubbish. They think could write better. Being on other end, having written MDX generation tool, I know it’s hard. Now days, current tools often have to write MDX that supports Analysis Services 2005 and also Analysis Services 2000. Tools might try to support other flavors of MDX as well. SAP BW MDX. You end up with design compromises in how things done. Also the human factor. People not really understanding the language. Many cases, MDX generated you think why did they do like that? Reporting services is one common culprit. You look at MDX more often. Look at convoluted MDX generated and wonder why they did.
Greg Low: I’ve seen people learning relational SQL and use a tool to generate query to see what they’re trying to get. They’ll look at SQL and try to pick up how they should have written query. I’m wondering if any tools do good enough job for generated MDX so people can use tool to generate query, make changes, look at code to get feel for it.
Chris Webb: People do that all the time. Problem is with any client tool, MDX generated will always be more complicated than MDX you create. They have to cover eventualities, different possible key designs. Harder to learn MDX like that. MDX generated is much more complicated than needs to be if tool really knew your cube. I tell people not to bother doing. Many tools spit out readable MDX. ProClarity Desktop Professional did good job. Excel 2007 does good job now. But tries to throw you in deep end with type of MDX produced.
Greg Low: Queries. Basic types of queries and basic structure of them.
Chris Webb: MDX you need to know, ones you need to learn. Interesting about MDX is you’d think coming from SQL background select statements most important. In fact, most people write select statements. Most of the time they’re using tool that generates for them. To get data out of cube, must have select statement generated. Similarities between MDX and SQL select statements confusing. They look similar. Similar clauses but all do different things. Basic MDX statement has select and from clause. From is key to query. Only one cube at time. Select specifies data you want to get out of cube. Big difference with SQL and MDX select statements, MDX every axis in query is treated same. In SQL, big difference in specifying columns and rows. MDX, you treat every axis the same. First big difference is you can have multi-dimensional results returned. Results with no axis in it, just a single cell, or up to 128 axis results sets. Most of the time, only 2D sets, but option there for more or less axis in results.
Greg Low: Not necessarily returning table, but potentially cube?
Chris Webb: Not quite. Returning multi-dimensional result set. Not a cube. When you specify axis, need to be specific. Common columns in SQL statement. Specify set. You can either write set out manually, list members or couples to appear, or you can use any of many MDX functions. Often you’ll find you have simple list of members or maybe one or many nested MDX set functions. Filters, sorts. Pass results of that set function to another set function into another set function until you have sets with members you want to return on your axis. Grabs me that people talk about need to write set-based SQL. In MDX can’t avoid writing set-based MDX. Important to write any other way. First place you encounter that with your MDX select statements, what you want in rows or columns. Axis in MDX select statements are treated the same. Anything you can do in columns you can do in rows. MDX select is also output oriented. Getting layouts of results you want is flexible. Problems you have pivoting data, cross-tabbing of data. In MDX, happens naturally. That’s the basic MDX statements. Then the next problem people have is the where clause. Expect it to filter what’s appearing in rows. It doesn’t do. Can filter indirectly, but is extra invisible slice of subset result. Third dimension or third axis in results that you can’t see but slices results. Example, if you have countries in results, products appearing in rows, showing sales by country by product, you might put year 2008 in your where clause. Will slice data by 2008. What sales show are 2008. Doesn’t directly filter what appears, but slices the data. Affects the numbers but not the rows and columns. You can have with clause in select statements. Allow you to define calculated members, names sets you can use in query. Leads to area people have to learn MDX for, writing calculations for cube. Most people avoid writing MDX select statements, but most every cube in production for Analysis Services has to have calculated member on it. Big, powerful thing about Analysis Services is types of calculations you can do. Most common type in MDX is calculated member. Concepts of a member. A member is a real member in a cube, something of a hierarchy thing, customer, product. Can also create calculated members that extend space of cube by one member. Most times, you create members on measures dimension, special dimension that contains numerical values you’ll aggregate data up by. Things like sales or number of units in warehouse, number of customers who bought. If you want to create calculated member like profit. One measure is revenue in; other is costs of creating product. Obviously, profit is revenue minus cost. You can do easily in SQL. Calculated column. Straight forward. Simplest measure you can do in MDX. But as I mentioned earlier, you have time theory calculations, market share calculations where you want to not only see sales by date and product and customer, but also see the year to date sales. Sales from current date back to beginning of year. Might want to know sales today compared to same day in previous year. All the kind of calculations you want to do inside cube. These are the kind you can’t do easily in SQL. Happen relatively easy in MDX. If you’re creating calculated member to show growth based on same period, previous year, just write express that says sales today, sales for same day previous year. Need to construct so always returns current day relative to previous year, then take one from other, and have growth calculation. Accounts for 90 percent of different types of calculations you can do in cube. Simple like that. Time series types of calculations. Can be more complicated as well.
Greg Low: Good point for break. When we come back, we’ll discuss that and look at name sets and things.
Welcome back from break. So, before we go into the second part of the material, Chris, is there life beyond MDX and SQL?
Chris Webb: Yes, the remains of one. I’ve got two fairly young kids. Anyone with children under five know the lists of things you can do is limited if like me you work as a consultant. By the time I’ve been away working, traveling more than I ought to, gone for the week, by the time I get home, I spend time with my family rather than anything else. Putting life on hold for a couple years. My youngest is almost two. My spare time is slowly returning to me.
Greg Low: Last time I saw you Dusseldorf, wasn’t it? The PASS Summits. Another trip. Funny thing I tell people, Heathrow Terminal 5, I was through five times in three weeks. Amazing. Right after the significant meltdown there at the new terminal. Five times in three weeks, but still had luggage. The trip through there was frustrating at times. You know your gate friends. I saw the biggest Windows blue screen in my entire life. Big flight display system and the whole thing was big blue screen of death.
Chris Webb: Sums up Heathrow. Terminal scaled up beyond reasonable demands of any airport terminal. Massive things, massive scale, everything falls down.
Greg Low: That day, got impression they had under control, but had cancelled 90 flights and were handling all baggage manually with about 19,000 bag backlog. Under control?
Chris Webb: I didn’t have specific problems when I went through there. Coming back into Heathrow Terminal 5, I got off plane. We went down four flights of escalators, went on train, went up four flights of escalators, walked half mile to luggage. Distances too big in Terminal 5.
Greg Low: Does take long time to walk anywhere. Big distances, would expect travelators.
Chris Webb: Other problem is I came out and expected to look for taxi. Guy with your name on his card. But there were hundreds of them. How to meet your taxi driver didn’t scale up to size of Terminal 5. Need solution for finding your driver. I walked up and down for ten minutes before realizing he just wasn’t there.
Greg Low: I saw heaps of people standing around a screen where you’d see five or six people all sort of puzzled, pointing at screen, twisting faces. It was remarkable. Struck me as good example of good example of big project just turned on one day and collapsed. Amazes me that seems to be big projects like that that simply don’t work. You’d think enough experience in industry now to build terminal, yet even that, having it all work as one big waterfall approach. Simply didn’t work.
Chris Webb: Yes. Same in IT, world over. Everyone sees government IT projects that try bullying and fail. Problem with government IT. Then you look at Heathrow Terminal 5. It’s nothing to do with government, just projects on grand scale. Potential for disaster massive.
Greg Low: So are there any sports or hobbies you’re involved in?
Chris Webb: Kids have precluded that. I’m just coming to end of building work on house, not that I’ve done the building itself. I’ve taken the garden in hands. At stage in my life where I think gardening fun for a Sunday afternoon. Getting hands dirty, making bon fires, that kind of thing. Learning to grow things.
Greg Low: That’s good. So we were talking calculations and so on. In 2008, included are name sets.
Chris Webb: No. Name sets have been around since beginning of end year. Touted big change is dynamic name sets. Big trap people fall into with MDX is where you have set expression you can use name set. You can create on your cube or clause, name set, set of whatever members or couples, giving name. Wherever you would put that set expression you can use name instead. Problem with name sets is aesthetic. Whenever you use name set, will return same content. Classic problem is people wanting to see top ten sales people. You can get the top ten of anything easily in MDX using top count set function. You can get by sales measure. Write query that brings that back. If you put that top count function in set of what you expect to see on where clause, works nicely. If you change where clause, top ten will change in conjunction with how you slice query. If you put 2008 in where clause, you’ll see top ten for 2008. If you put 2007, you’ll hopefully see different top ten, 2007. Would be cool to let users choose that top ten expression as name set. They created name set on cube and put in same expression. Now rather then set up complex expression, grab name set and will hopefully work. Problem is, doesn’t work like that. When you create name set on cube, you get top ten evaluated in context of cube as whole, not in context of query. Typically you get top ten sales people for all time periods. Whenever you change slice, would still get same top ten. Sales vary by year, but same ten sales people.
Greg Low: The actual people wouldn’t change, just the sales value.
Chris Webb: Yes. Makes no sense. Useless. Name sets not as wide spread as you might think. Great if you could have dynamic name sets, which is what you get in Analysis Services 2008. You can declare name set dynamically. Always get return in context. Able to take top ten sales people from name set and put on where axis and when you change year in where clause, different top ten every year. Kind of does what you want. Problem then is end users see, have, use, think great. Then they would want not the year in the where clause, but in columns. Rather than 2007 in where clause, will drag year hierarchy to columns and have 2007 cross tabbed. Then they will see dynamic name set no longer works. Other times when doesn’t work, breaks. Personally, I don’t see uses of dynamics name sets as being what they’ve been hyped up for. Dynamic name sets will be useful in certain limited scenarios for this kind of query. Also will be useful for more obscure areas of MDX such as if you’re trying to detect where you’ve selected more than one year in the where clause. You’ve multi-selected or sliced. One time dynamic name sets useful. I’m sure will be useful for other different types of optimization and complex calculations. Dynamic name set will be less useful than people expect. Unfortunately.
Greg Low: Other things in 2008 that affect this?
Chris Webb: New in Analysis Services 2008. Big change for MDX is performance. Can’t see, doesn’t demo well. As you move to Analysis Services 2008 you’ll notice common calculation performance is increased greatly. Development team spent much time on this. Frustrating in MDX is not writing calculations they want, but MDX efficiently, getting the calculations to perform well. Complex calculations in cube has potential for writing complex calculations that perform badly now being increased as well. If you have some kind of calculation that does year to date, potentially if you’re looking at one date, might sum up several dates to get one value. If you’re putting that year to date calculation in your where clause and show 100 products and 100 customers, you suddenly have a lot of calculations to do before query rendered. Problem for Analysis Services team is determining common calculations and optimizing. They’ve done good job in 2008. Will be rare people will migrate cube from 2005 to 2008 and not notice some kind of performance benefit.
Greg Low: I’ve seen demos. Sub space calculations seem to be key area.
Chris Webb: Yes. Whole area refers to is mode that Analysis Services can work in to do calculations called block computations. Hard to explain. If you can imagine, for example, calculation that divides one measure by another. Might have sales revenue and number of customers and want average revenue by customer, revenue divided by customers. Imagine big query and you’re looking at this for many cells, would have to go through every cell and look at revenue, customers, divide, and get results. Optimization, basic that Analysis Services can do, and do quicker, is see the revenue divide by customers calculation. I know calculation will be null when number of customers null. Big result set to calculation. If a cell for a combination, I could sell anything to any customers. I don’t need to do that calculation. The way written, results of calculation always null.
Greg Low: Hard to imagine, how sparse the data often is.
Chris Webb: Yes. Exactly. Within cube, there is scarcity of data you have compared to data that could be in cube. You might have individual customers and individual products, but more often each customer has only bought one or two out of millions of product. Scarcity great when you consider that customers only there for one or two days as opposed to hundreds of days you’ve stored data for. Amt of data in cube is less than space of cube would allow you to hold. That’s the big change for MDX. There haven’t been many other changes in MDX in 2008. Partly because development team didn’t want to rock boat. People still coming to grips with all the new stuff in 2005. A lot of concepts in 2005, development team has just come to grips with and worked out how things should behave. Obscure areas of MDX that behave one way with Analysis Services, then differently with SP2. Significant, just settled down in accepted way. Last thing anyone wants to do is have new functions that confuse even more.
Greg Low: Relational side. Many people still on 2000 and will migrate to 2008 directly. Do you get same sense with Analysis Services?
Chris Webb: No. the changes between 2000 and 2005 were so great, potential improvements great. Many migrated from 2000 to 2005 quickly. I saw when 2005 came out, big explosion of interest in Analysis Services. Many people using now started with 2005 and never used 2000. Rare that I see any 2000 installs in my consulting work. Still out there. Left alone, forgotten about. People on 2005, question is whether they’ll migrate to 2008. My suspicion is they won’t.
Greg Low: That’s what I wondered. I see half the sites on 2000. Compelling argument for them to be on 2008. A lot see as best Service Pack for 2005 is 2008. Analysis Services, I can’t remember last time I saw 2000. Wonder if impetuous to upgrade to 2008 will be same. Sounds like maybe not.
Chris Webb: Enough people that did migrate to 2005 are still remembering pain involved.
Greg Low: It was a different product moving to 2005. Do you find those on 2005 now use it like they used 2000, or do they take advantageof the changes in 2005?
Chris Webb: Differences so great, they had to. When you talk 200 to 2005, realistically couldn’t’ migrate but had to start and rebuild cubes. Had to take advantageof new things. Wouldn’t say no one running 2005 in 2000 way.
Greg Low: Relational side, direct contrast, many people running 2005 like they ran 2000. Almost used none of the new features at all in 2005.
Chris Webb: SQL people migrate because they must. System works well, but Microsoft withdrawing support. Migrate because must migrate. No pressing business need. Just they have to.
Greg Low: With 2008, Analysis Services, if performance is sig improved in many areas, on its own, given kind of things you do with Analysis Services, compelling argument?
Chris Webb: Easy to overstate performance problems people have with 2005. Many cases, 2005 just works. Works well. Performs incredibly well. Not unless people have specific performance problems that they’ll migrate to 2008.
Greg Low: I was thinking people always want more, no matter.
Chris Webb: True. In BI, everyone wants more in query performance is real thing. Everyone wants fast queries. Many cases, you find that if you work cube properly in 2005, performance problems would go away. Just a matter of making it easier for cube to run faster, knowing tips and tweaks to make it perform well. 2008 engine just more clever at running same queries faster.
Greg Low: Interesting point. In relational side, often discussion on how affectively you write queries to give engine best chance of running. Converse, you should express what you’re after and it’s the optimizers job to work out. Surprising you can write queries very different and end up with same execution plan as optimizer really good. Same in Analysis Services?
Chris Webb: In terms of writing queries, not so interesting. Question is calculations. In general, when you write MDX select, usually straightforward. Complex logic when you write calculations, definitions to calculated members. Will find with 2005, and less so in 2008, is you can write same calculation in different ways and get very different performance. If you look at 50 percent of entries on Mosha’s blog or my blog, all about writing calculations in most efficient way possible. Some is how you implement algorithm you use for calculation. Algorithm used is big factor in performance of query. More often, knowing how the analysis Services engine treats certain functions or constructs, whether optimize or not. Great example is in use of VIS function compared to case statements in MDX. Analysis Services optimizer can do clever things if you’re using IIS statements, but can’t with other statements. Might write calculation, thinking you’re writing clearly, using case statement. You’d get better performance with IIS.
Greg Low: Limitation of optimizer?
Chris Webb: Yes. Development team hadn’t gotten around to testing it in 2008. One of the things now thought of in 2008. Just a matter of looking at the functions, ensuring Analysis Services can use optimal query for all different ways to express same calculation.
Greg Low: In end, seems disappointing. If clearer to write one way and both ways produce same result, disappointing if one runs significantly faster than other.
Chris Webb: Yes. Frustrated people about MDX. Too easy to write calculation that does what you want but not in most optimal way possible. For me, that’s great. That’s how I make my living. But from point of view of MDX language, bad thing. That’s why it has priority in 2008, to stop these from happening.
Greg Low: Apart from case and IIS, other common ones?
Chris Webb: Nothing that sticks out. Alost of cases, gray area. Need to think algorithm, integration. Many cases, not clear cut, bray area where algorithm seems equivalent, just that good gut feel of what’s acceptable, what Analysis Services likes and can optimize. It’s good game, looking at calculations and seeing how to optimize or write. Recent blog entry, Mosha wrote he was looking at a set expression to find the product dimension in a database. Something like 27,000 products. Set expression that found any number of distinct first letters in names of products. Went through product dimension to get the distinct letter in the names. There were just 17 distinct letters in names. He blogged that he looks at different ways to approach problem. His first attempt takes 30 seconds to run. By end of his blog, down to one-fifth of a second to get same results. Partly unique to think algorithm, but also partly unique to think implementation and partly a bit of both.
Greg Low: That’s pretty much up to time. What have you got coming up? Where will people see you?
Chris Webb: I would like to plug, if you’re in U.K., I’m involved organizing the SQL Bits Conference.
Greg Low: Happened not long ago. I saw glowing responses.
Chris Webb: We’ve had two. Last one was in March. Another one September 13 in Hatfield, just north of London. Hope to see U.K. people there. But also anyone in Europe who wants to fly over. I’ll speak at PASS U.S. later this year as well. First time. November in Seattle. I’ll be there.
Greg Low: Excellent. Have you been doing any more writing?
Chris Webb: I do kind of want to do more writing. Turns out that I’m under contract and can’t write MDX books for anyone else. Branched out to doing software development as well. A pet area of frustration is support of Analysis Services within Reporting Services. You think as they’re two big BI products from Microsoft in SQL Server group, they’d work well together. In fact, integration between them is poor. End of last year, I had idea of way to work around some of the problems. In partnership with IT Workplace in U.K., I’ve started putting out products that are custom data extensions for Reporting Services plus better MDX query builder that takes away pain of working with Analysis Services and Reporting Services. It’s called Intelligential Query.
Greg Low: Excellent. Well thank you for your time Chris. I will see you in Seattle in November.
Chris Webb: Great! Thanks for taking the time for me.