Jimmy May is a Principal Database Architect at Microsoft and a SQL Server Master.
Jimmy (also known as Aspiring Geek) has his blog here: http://blogs.msdn.com/b/jimmymay/
Greg Low: Introducing Show 63 with guest Jimmy May.
Welcome, our guest today is Jimmy May, an old friend from the SQL team. Jimmy is a Principal Architect working with SQL and databases, so welcome Jimmy!
Jimmy May: Thanks for inviting me, I really appreciate it. It’s a pleasant surprise.
Greg Low: It’s all lovely. I wanted to have a show where we talked about the clustered column store index technology in SQL Server 2014, and I thought Jimmy is the man. What I’ll get you to do first, which I get everyone to do, is tell us: how on earth did you come to be involved with SQL Server in the first place?
Jimmy May: Let’s see whether I can make this brief and maybe a little bit entertaining. I had taken a respite on the left coast, let’s refer to that period as my misspent youth, came back with my tail between my legs and started working in a hospital doing miscellaneous work. The department ran on paperwork. This was the time of Windows 3.1, and I thought we could do better. I would enter stuff into a spreadsheet, but the computer with Excel on it was busy, so I got another computer, and it had something called Access on it. I saw these grids and said, that looks like a spreadsheet, I could make this work. Next thing you know, I’m taking the Access 2.0 hardcopy user’s guide to bed with me every night; it ended up dog-eared, bookmarked and highlighted, and I became an Access geek. So the transition to databases was serendipitous, as was my falling into the SQL world.
I was after full-time work, and I applied for a new job and walked into this financial institution, a small start-up. They showed me this thing called SQL Server and I said, where is the UI? I don’t get it. But I could spell SQL and pronounce it, and they hired me.
Greg Low: Awesome!
Jimmy May: And before you know it, I was leading the helpdesk team, and the next thing you know I got a job offer, with one certification under my belt, to become a full-time DBA and senior analyst at another start-up. These were the times when, again, if you could spell SQL and pronounce it you were hired. It was a great leap, and the rest is history.
Greg Low: Awesome! So where I first got to see you in person was of course as part of the Masters program, and you are also one of the MCMs, the bunch that will not grow any more in the future, and that is awesome as well. Listen Jimmy, one of the things I wanted to talk about was the clustered column store details. Whenever I look at the marketing materials for SQL Server 2014, it almost seems to barely rate a mention, yet I have this feeling it is one of the best things in the box, so I’m interested in your take on it. Maybe we can start with: where has this come from, and where are we heading?
Jimmy May: Greg, I can talk about this for quite a while. You and I are very aligned with regard to this disparity between the marketing team’s approach, bless their hearts, versus reality. It is in fact, in my opinion, one of the greatest features in 2014, vastly improved over 2012, which by the way, although there was room for improvement, really was a game changer. Column store was a game changer for many, many applications. Then comes 2014, with the improvements to the optimizer and the fact that it is writable. Although I will expand upon this later for those who want to hear, the fact that the column store is writable in 2014 is not the big deal. By far the most important improvements to the column store in 2014 are the enhancements to the optimizer; things just work better.
Greg Low: Yes, certainly one of the biggest challenges I found with 2012 was that it was a hassle getting queries into batch mode in particular.
Jimmy May: Yes, correct.
Greg Low: Maybe we should start with just column stores.
Jimmy May: OK.
Greg Low: So if we start, for those that haven’t really looked at them, because a lot of people looked at it and went, that is not updatable, that is not for me, let’s start there. What is a column store? Why does it matter?
Jimmy May: Column store is a relatively new term. We have had the row store for quite a while in SQL Server, and historically, especially since the days of SQL 7 and 2005, it has done a great job. If you need to access just a few rows, or a single row, and you are properly indexed, SQL Server does a superlative job. But we have never had a good answer, especially as databases have gotten bigger, millions and millions of rows, for returning result sets that require access to large numbers of rows: hundreds of millions, much less billions. Column store was the answer to that. We weren’t first out of the blocks though; we weren’t first to the party. Our competitors have had column store, a feature called column store, for quite a while.
Would you like for me to talk a little bit more about the competition?
Greg Low: I suppose so, yes, partially. I must admit the first I saw of it in documentation actually goes back probably even into the 1960s; this sort of concept has been around. However, the earlier implementations in database products are interesting.
Jimmy May: The one with which I am most familiar is Sybase. I had a customer. I used to be a member of the customer advisory team, SQL CAT, under Mark Souza; I ran the lab, and boy, what a ride that was for 2½ years. A customer came in; they had one Sybase implementation, which was running their flagship product, and they wanted to be just a SQL Server shop, not only for the homogeneity of the product but also for licensing costs. Apparently Sybase costs even more than SQL Server licences. But we didn’t have an answer, so we brought these people in under 2012, pre-RTM. You may remember Denali.
Greg Low: Indeed!
Jimmy May: We got column store to work for them. It was a great thing; they went into production with AlwaysOn, on pre-RTM bits of column store and AlwaysOn.
Greg Low: That is awesome! I suppose the thing I am getting at is, I look at the trends in the industry, and in the past I saw systems that had a bit of CPU, relatively small amounts of memory, and everything lived out on disk. The problem was you were always trying to work out how to get things on and off that disk as quickly as possible. What I see now is almost a new way of doing things: compress the data as far as you possibly can, and then all of a sudden gigantic amounts of it fit in memory. I think the thing people often don’t get is just how compressed that data is.
Jimmy May: Well, we haven’t talked about column store internals yet, but by its very definition it doesn’t take a rocket scientist to figure out how it contrasts with a row store. Let me talk about that for just a second, for those who may not know, and then we can talk about compression for another moment to answer your question.
A SQL Server data page, the classic data page for a dimension table, a row store, contains a variety of values. All the columns in the table are represented on the data page, and as many rows of the table as will fit on that page. You are familiar, of course, with row compression and page compression. Page compression, depending on the data, can be fairly robust, up to a factor of two against the non-compressed amount of data.
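For readers following along, row and page compression are applied per table or index with a rebuild; the table name below is hypothetical. `sp_estimate_data_compression_savings` lets you preview the likely effect before committing:

```sql
-- Preview the likely savings first (hypothetical table dbo.FactSales)
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'FactSales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Apply page compression to the table's clustered structure
ALTER TABLE dbo.FactSales REBUILD
WITH (DATA_COMPRESSION = PAGE);
```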
Greg Low: Yes, in fact at a site I was at this week, page compression got down literally to about a quarter of the size of the original data. That is a sizeable database.
Jimmy May: That’s not bad. Column store, on the other hand, because of the way it stores data: instead of entire rows of a table being stored on the data page, column store data pages contain values for only a single given column. So the data is far more homogeneous; you are going to have nothing but dates, for example, or nothing but names, or nothing but addresses, or nothing but sales amounts. That data tends to be far more homogeneous and far more compressible.
Greg Low: If you look down a column in a typical table, you often see a high degree of commonality, so storing the data for a particular column together is great.
Jimmy May: That lends itself to incredible compression; ten times or more is quite common. Column store in 2014 also has an archival option which adds roughly another 30% compression, and that is for the kind of data you don’t access frequently, historical data etc.
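As a sketch, assuming a partitioned table that already carries a clustered columnstore (the name is hypothetical), the 2014 archival option is applied per partition via `DATA_COMPRESSION = COLUMNSTORE_ARCHIVE`:

```sql
-- Cold, rarely-touched partitions get archival compression
ALTER TABLE dbo.FactSales
REBUILD PARTITION = 1
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

-- Hot partitions stay on the standard columnstore compression
ALTER TABLE dbo.FactSales
REBUILD PARTITION = 12
WITH (DATA_COMPRESSION = COLUMNSTORE);
```

The archival format trades a little extra CPU on access for the extra space savings, which is why it suits historical partitions.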
Greg Low: Indeed, from what I can see, basically just after it forms the row groups in memory it seems to apply a zip-type algorithm or something, so that really affects the storage more than anything else, doesn’t it? So it is still the same thing when it is actually in memory?
Jimmy May: With fewer bits to read from disk, which is historically the bottleneck even with SSDs, we can put more data in the buffer pool far more quickly. Combine that with SQL Server’s historically aggressive read-ahead algorithm, which has always been good, especially with warehousing data where you are likely to read contiguous data. With column store you are accessing a whole bunch of rows, theoretically contiguously, so by definition the read-ahead algorithm can be very efficient: anticipating what it needs to respond to a query and getting it into the buffer cache well before it is actually needed.
Greg Low: Indeed, and I see a large part of it being that the more rows you can compress and feed into memory, the more you can do via brute force with CPU operations, rather than having to do all sorts of tricky indexing.
Jimmy May: Yes, but you know, the casual listener may suspect you have set me up with this question; in fact we have not discussed this in detail. Are you familiar with the vector processing model column store leverages?
Greg Low: Yes.
Jimmy May: So, first of all, you know about batch mode, which you mentioned. By definition, column store likes to grab a thousand rows at a time, and those are manipulated in the L1 cache.
Greg Low: One million rows at a time, or a thousand?
Jimmy May: One thousand rows at a time in batch mode; if I said a million, excuse me. The column store itself is created a million rows at a time, per row group.
Greg Low: OK.
Jimmy May: Batch mode is a thousand rows at a time, and those rows are manipulated at the CPU not by value but by pointers to the thousand rows. I’m sure we have all taken algebra and calculus; remember arithmetic substitution? Well, that is what column store does: it takes a very small pointer that represents the values for those thousand rows and manipulates them in the L1 cache. It lends itself to a fantastic performance improvement over classic row store manipulation.
That is just one piece. We have the compression we just talked about, which is enhanced relative to conventional SQL Server, roughly a factor of ten. And we have batch mode: grabbing a thousand rows at a time, by definition, and manipulating those rows via pointer in the L1 cache.
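The million-row unit Jimmy describes is visible in the catalog: each compressed segment of a columnstore holds up to roughly a million rows, and `sys.column_store_segments` (available since 2012) reports the per-segment row counts. A sketch against a hypothetical table:

```sql
-- Row counts per columnstore segment; expect values up to ~1,048,576
SELECT p.object_id,
       s.column_id,
       s.segment_id,
       s.row_count,
       s.on_disk_size
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
  ON p.hobt_id = s.hobt_id
WHERE p.object_id = OBJECT_ID('dbo.FactSales');
```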
Greg Low: Any idea how compressed it would be?
Jimmy May: Based on studies I have seen, ten times or more, and I mentioned the archival option a little while ago, which adds another 30%. Actually, the deck I present, I have been doing it for a couple of years now. As you know, let me digress a second: at the time I was in MCM training, with you as one of the instructors, the disk partition alignment paper was coming out. Then those clever folks on the Windows team remedied the problem for which disk partition alignment was necessary, so I needed to find another trick; that pony was getting old, and I chose column store. So I’ve been working on this for a couple of years, and one of the slides in the deck I typically present has a graph where we look at a compressed table with indexes, which is not realistic because everybody uses compression now, right?
We have a warehouse table with compression and indexes, and then we compare that to the same table with a non-clustered column store index, and then with a clustered column store index. The difference is huge; again we are talking about a factor of ten in terms of savings for a table with non-clustered indexes, even compressed, versus a clustered column store.
Greg Low: In fact, I would love to say everybody is using compression, but that isn’t really what I’m seeing at the moment; even the large financial institution I was at this week hadn’t been using it. I think there is this perception in the industry, just because of previous history with disk compression techniques, that compression means smaller and slower. What people are not getting is that most of the time it is smaller and faster.
Jimmy May: Yes, that is part of it, plus there is institutional resistance to change. People think they are going to upgrade to the latest and greatest version some day, and the maintenance windows that some of these things require, for a variety of reasons, contribute to the non-implementation of compression and other features for that matter. By the way, when I said that everybody uses compression, just for the record, that was half tongue-in-cheek.
Greg Low: Yes indeed, and the other one of course is that compression is in Enterprise Edition and not in Standard. Again, there is a large number of people that run on Standard as well.
Jimmy May: Good point. That is a challenge with many of the features that people would love to start implementing; it is really too bad. This is a marketing decision, and I am not going to quibble with it, because I’m not the person that crunches the numbers, and they make a lot of sense. I will quibble with some decisions, but not necessarily with that one. But wouldn’t it be great if start-up databases, departmental-size databases and the like had access to the enterprise features? Perhaps hobbled, perhaps limited in some way, but get those features into the application from the ground up, so that when they grow, their enterprise is ready, as opposed to needing a whole new development effort.
Greg Low: I completely agree. One of the things that is maybe different about what I do is that I spend a lot of time in software houses, and because they build their applications in a way that doesn’t require their customers to have Enterprise Edition, they want to work with either edition. In fact, any feature that is Enterprise-only they ignore; they treat it as though it is not even in the product.
Jimmy May: Tragic!
Greg Low: That is a tragedy, because SQL Server then doesn’t compare all that well with a lot of other products if you exclude all the enterprise features. The other noise I hear from people is that they wish it had the same T-SQL surface area. Rather than a statement going bang when you run it on Standard, what would be better is if you had a statement that said, hey, I would like this level of compression, and maybe on Standard it just doesn’t actually do much under the covers, but the actual command would not explode. Then you could actually have a single code base.
Jimmy May: Once again, Greg, we are aligned. Again, for the casual listener, this was not planned.
Greg Low: Yes indeed. One of the challenges certainly has been those Standard Edition limits, but as we were saying, the compression thing has been a good story all the way along. I suppose one of the downsides of the 2012 iteration was that when we built a non-clustered column store index and attached it to a table, (a) the table became read-only, so it became messy to do updates, but (b) you still had to have the original copy of the data as well. In terms of percentages, if the index was 10% of the size, you still had 110% of the original size.
Jimmy May: Right, that is a function of the fact that in 2012 column store indexes were not only read-only but also non-clustered indexes. They relied on the underlying conventional b-tree clustered index or heap in order to exist. That changed in 2014, as you know.
Greg Low: Yes indeed. With 2012, for the people who are still on it, what has been your experience of approaches for achieving updates? I have seen a few different ways people try it, but what have you seen work or not work?
Jimmy May: It works great if you have implemented classic partitioning with b-tree indexes and you don’t need real-time data. You switch things in during your maintenance period, where every hour or every night, for example, you are able to take that schema modification lock and switch the partition in using classic table partitioning. Easy peasy, and that works great for a lot of applications; people are comfortable with partitioning. It has been around for almost a decade now, and we are typically talking about warehouses, where partitioning is classically used and you typically don’t need real-time updates. For those applications which do require real-time updates, the workaround in 2012 was called trickle loading. Trickle loading is basically just a very simple variation on classic table partitioning. You have your historical table with the non-clustered column store on it, which is read-only of course, by definition. Then you have a table that matches that schema almost exactly, except it doesn’t have the non-clustered column store, and you write your real-time data into that.
Let’s call it the equivalent of a staging table; it holds the really contemporary data, and you just put a view over the historical data, which has the column store, and the real-time data. When it is time to switch in during your maintenance window, you apply the non-clustered column store to the real-time data, switch it in, boom, and you create a new staging table.
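A minimal sketch of the trickle-loading pattern Jimmy describes, with hypothetical table and view names:

```sql
-- Historical table carries the 2012-style non-clustered columnstore,
-- which makes it read-only
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_SalesHistory
ON dbo.SalesHistory (SaleDate, StoreKey, ProductKey, Amount);

-- dbo.SalesCurrent has an identical schema but no columnstore,
-- so real-time rows trickle into it normally

-- Queries see one logical table through a view over both
CREATE VIEW dbo.Sales
AS
SELECT SaleDate, StoreKey, ProductKey, Amount FROM dbo.SalesHistory
UNION ALL
SELECT SaleDate, StoreKey, ProductKey, Amount FROM dbo.SalesCurrent;
```

At the maintenance window, the current table gets its own columnstore, is switched into the historical table, and a fresh, empty staging table takes its place.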
Greg Low: Yes, and the beautiful thing is that for the reports, which are typically on the historical stuff, you actually had the column store in place.
Jimmy May: Absolutely, or you can target the table directly rather than the view if you choose.
Greg Low: Yes, I found that seems to work pretty well. The other one, as you were saying, is partitioning, and there are a few things with that; it is more admin. You take the current partition, switch it out, apply the updates, put the column store index back on it and then switch it back in, and I found that works pretty well. The thing that I think was also good, in 2012, and I think there was also a hotfix for 2008 R2, was the increase in the number of partitions, so the size of each partition could be much smaller as well.
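The switch-out/switch-in cycle Greg describes might look like this, again with hypothetical names, assuming a staging table whose schema and indexes match the target partition (and whose check constraint matches that partition’s range):

```sql
-- 1. Switch the partition to be updated out to a staging table
ALTER TABLE dbo.FactSales SWITCH PARTITION 12 TO dbo.FactSales_Stage;

-- 2. Drop the (read-only) columnstore and apply the changes
DROP INDEX ncci_FactSales_Stage ON dbo.FactSales_Stage;
UPDATE dbo.FactSales_Stage
SET    Amount = Amount * 1.1        -- illustrative change only
WHERE  StoreKey = 42;

-- 3. Recreate the columnstore so the indexes match, then switch back in
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_FactSales_Stage
ON dbo.FactSales_Stage (SaleDate, StoreKey, ProductKey, Amount);

ALTER TABLE dbo.FactSales_Stage SWITCH TO dbo.FactSales PARTITION 12;
```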
Jimmy May: Absolutely, absolutely. I have collected a variety of success stories, and because 2014 is so new, many of them are based on 2012. There are some incredible performance enhancements. I am actually doing a blog series, I try to get one out at least once a week and I’ve done my second one already, where I am documenting these successes. It is just phenomenal, phenomenal, the way column store changes the way we interact with our data. When it works, it works right; it is incredible!
You have queries which take a couple of minutes, which are supposedly real-time, and now they take seconds or sub-seconds on a consistent basis. It is truly remarkable! And now I am sounding like a marketer; believe me, I’m just a fan. I am passionate about the product, the feature works, it is incredible, in my opinion.
Greg Low: So, the most common scenarios for using this are star schema structures?
Jimmy May: Absolutely, in fact column store is written to take advantage of star schema queries, classic warehouse queries. That shouldn’t be much of a surprise.
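As an illustration, the classic shape that benefits is a fact-to-dimension join with an aggregate; with a columnstore on the (hypothetical) fact table, the scan and hash join typically run in batch mode, which you can confirm via the Actual Execution Mode property in the query plan:

```sql
-- Classic star-join aggregate over a columnstore fact table
SELECT d.CalendarYear,
       SUM(f.Amount) AS TotalSales
FROM dbo.FactSales AS f
JOIN dbo.DimDate   AS d
  ON d.DateKey = f.SaleDateKey     -- integer surrogate-key join
GROUP BY d.CalendarYear;
```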
Greg Low: Yes, and the other thing is the join performance on those. Again, my presumption has always been that integer-based keys are going to provide the best join performance as well.
Jimmy May: That is absolutely correct; that is a gotcha. It was a big deal in 2012, and I don’t think it has been remedied in 2014: the optimizer just does a much better job when you are joining on integer keys. In fact, I mentioned earlier the application that was brought into the lab and really implemented column store successfully along with AlwaysOn; some of you listening may know the lead developer for that implementation.
They had to go through their database, because all their joins were on character data, and to get things to work they had to refactor and add integer keys. Once we did that, we were off to the races.
Greg Low: Yes, I can imagine the performance of those. The thing is, I suppose maybe people don’t stop and think about it, but comparing two numbers is really easy for a system. I think people underestimate the complexity involved in comparing two character strings: unless the collation is straight binary, which it usually isn’t, there are sophisticated rules that have to be applied to every single comparison. It is amazing!
Jimmy May: Just for the record, let me also add that keying warehouses on integer surrogate keys, when you don’t have a natural key, has been best practice for a long time; that is nothing new.
Greg Low: Yes.
Jimmy May: If you design your databases quote “correctly”, architect them quote “properly”, they already have the integers they need on which to join.
Greg Low: One thing that did surprise me in 2012, given that it was an add-on non-clustered index, was that we couldn’t have filtered indexes. Again, I have lots of scenarios where I actually love filtered indexes. The implementation doesn’t seem complete, but in the case of the non-clustered column stores, my recollection is that we couldn’t do filtered indexes anyway.
Jimmy May: That is correct.
Greg Low: have you been using filtered indexes much in general?
Jimmy May: Not a lot. Even in MS IT, where one might think we are doing cutting-edge stuff, we are not necessarily ahead of the curve, only on selected projects. So it is not something I get to work on nearly as much as I would like.
Greg Low: Yes, the ones where I find they have the biggest impact are things like where there is a character status column with one specific value in a small number of rows, these sorts of highly selective scenarios. One for me is where people have bit columns. There is lots and lots of old material where people used to say it is pointless indexing bits.
Jimmy May: No, no.
Greg Low: And I always thought that was tragic advice. The beautiful thing is, if I have a column that says IsFinalised or something, and I have a handful of rows that aren’t finalised and hundreds of millions of rows that are, I would never want to use an index to look up the finalised ones, but I would sure want an index to find the few that weren’t.
Jimmy May: You bet ya.
Greg Low: And yes, the idea of being able to build an index with a WHERE clause that just had those rows in it is wickedly cool. That was just another limitation that struck me as being a little odd. One thing with filtered indexes themselves that surprised me is that you can’t build them on a persisted computed column either. For the life of me I can’t think why they disallowed that; I suspect that was a mistake. A computed column I kind of get, but a persisted one? It seems really, really peculiar that you can’t build a filtered index on it.
Jimmy May: Greg, I am with you on that one, on both counts. We get why it might not work on a non-persisted column, but why not a persisted column?
Greg Low: Yes, exactly, you have already got the data.
Jimmy May: Yes, it is stored, as far as I know, quite identically to a conventional column.
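The bit-column scenario Greg described maps directly onto a filtered index; a minimal sketch with hypothetical names:

```sql
-- Tiny index covering only the handful of not-yet-finalised rows
CREATE NONCLUSTERED INDEX IX_Orders_NotFinalised
ON dbo.Orders (OrderID)
WHERE IsFinalised = 0;

-- A query with a matching predicate can seek this small index
SELECT OrderID
FROM dbo.Orders
WHERE IsFinalised = 0;
```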
Greg Low: While we’re on those things, what about data type support? That was another thing that was an issue.
Jimmy May: Yes, that is a good point; I had that on my list to bring up if we didn’t get to it. The data type support in 2012 was fairly complete, fairly robust. In 2014, it is everything but the BLOB types; of course you are not going to be adding a column store to an XML or CLR type or nvarchar(max).
Greg Low: So when you say a BLOB, you are including SQL CLR types and things like that?
Jimmy May: Yes, what we used to call text data, nvarchar(max), XML etc. All conventional data types are compatible with column store. The engine team, the dev team, have done a great job in 2014 in general, and with regard to column store specifically they have done an extraordinary job of enhancing the feature. I am very, very impressed. You may know we are all into the cloud these days, and a lot of our dev resources are devoted to the cloud, yet there are still pieces of the engine that we are spending time and resources on, and column store is one of those features, I am happy to say.
Greg Low: Yes, it has actually been an amazingly good story. Now I am going to get you to head into 2014 territory, but before we do, I just have to let people know that Jimmy is one of our heroes in real life. One of the big things is that he has become a super fit person over the last few years; I encourage people to go and look for pictures. I would love you to share for a minute the thinking, the mindset, that is required to do that, because your transformation has been astonishing.
Jimmy May: Oh Greg, thank you very much, that is really great. It actually has been pretty exciting. Historically I have been fit off and on throughout my youth and adult life, but in the last few years, especially since getting my job at Microsoft, I was well over an eighth of a ton for most of my tenure. It was pretty darn frustrating, and you know the expression: you get sick and tired of being sick and tired. There is an incredible program available to us locally in Redmond at a health club called the Pro Club. It is a medical program combining physical therapy, a dietician and medical supervision, as well as intense training, and I lost 85 pounds and haven’t looked back. It has been two years now.
Greg Low: So repeat that number?
Jimmy May: 85.
Greg Low: 85, wow.
Jimmy May: So that’s great; I can run again. I recently did a half marathon, my first half marathon in about 20 years, and I’ve got a passion for skiing, and when I lose my skis I can actually bend over and put them back on myself. I’m learning to do handstands. It is really exciting; everything in my life is different.
Greg Low: As I have said, all of us have been in the background cheering, because it has been a stunning transformation and something you should be very proud of.
Jimmy May: Well, thank you. I have a yoga practice now; as you know, since you’re a Facebook friend, I am posting the latest and greatest photos. It is a lot of fun, and the fact that you are sharing this with me, Greg, well, thank you; it is great to know that I have inspired at least one individual.
Greg Low: Oh yes, more than one I would say. It is great!
Jimmy May: Nice, thank you.
Greg Low: Listen, on to 2014. I suppose one thing before we do: are there any other hobbies that you have outside work?
Jimmy May: Yes, I mentioned skiing, and I have a yoga practice. Those are my big things, along with mountain biking and things related to physical fitness; I am very excited, very passionate about those. In fact, we have a house back in Indianapolis that we haven’t sold in three years, but after taking my skis back to Indiana and trying the quote “slopes” there, there is just no way. I have got the ski bug and I’m stuck out here on the beautiful left coast.
Greg Low: Yes, we have had unseasonably warm weather again locally, and I noticed on the news this morning that our local ski resorts are all running snow machines at the moment, trying to top things up. So it actually has been a bit warmer than normal, but anyway.
Jimmy May: Yes, when I retire young and wealthy, or at least wealthy, I would love to do some summertime skiing, because our summer is the equivalent of December down there.
Greg Low: Yes, awesome, come and visit.
Jimmy May: Yes, it is a deal.
Greg Low: On to 2014. I suppose the big thing first up is clustered: we no longer need to have another copy of the table, so this can be the primary copy of the table.
Jimmy May: Correct.
Greg Low: So that in itself is a huge thing, because the disk savings and so on are staggering. Really, in any number of places I go into, when people put in their SANs in the first place, it is going to be the saviour of everything, and yet what they end up doing from that point on is endlessly arguing about what fits in the space.
Jimmy May: It is amazing to me. I remember the mantra going around a few years ago that disk space is cheap, and it is not. Disk space is not cheap, and what worse way for a DBA to spend their time than slinging disks around just to accommodate scarcity of space? It is a little bit crazy, not a very good use of time.
Greg Low: Indeed, so that is a huge one just on its own, the fact that we don’t need that separate copy. Are there any real restrictions on the design of the table, if it is going to be a clustered column store?
Jimmy May: Well, it cannot contain a non-compatible data type, which in 2014 would be the BLOB types: XML, CLR, spatial etc. Exclude those, and if you need them in your data set, just vertically partition them into a separate table. Also something to be aware of, and I may be jumping the gun here for you: because the clustered column store is not a b-tree structure on which constraints can depend (constraints are of course enforced under the covers with an index), you cannot have a unique constraint or a foreign key constraint defined on a clustered column store.
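A minimal sketch of a 2014 clustered columnstore table, with hypothetical names; note the absence of PRIMARY KEY, UNIQUE and FOREIGN KEY constraints, per the restriction Jimmy mentions:

```sql
CREATE TABLE dbo.FactSales
(
    SaleDateKey int   NOT NULL,   -- integer surrogate keys join best
    StoreKey    int   NOT NULL,
    ProductKey  int   NOT NULL,
    Amount      money NOT NULL
    -- no PRIMARY KEY / UNIQUE / FOREIGN KEY: not supported here in 2014
);

-- The clustered columnstore IS the table; no separate row-store copy remains
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales
ON dbo.FactSales;
```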
Greg Low: So that is the trade, basically: performance and space against some consistency enforcement?
Jimmy May: Correct, or at least an automatic assurance of consistency; we have other ways to enforce it, right?
Greg Low: Yes indeed. That is good, and so that is the first one, and of course as you said, the second one that is as big a deal is the updatable nature of it all. Although I would argue, even though the performance of that doesn’t seem stunning, for a lot of people the ability to do anything in that regard is a big deal.
Jimmy May: I won’t say it isn’t a big deal; I just want to put it into perspective that the enhancements to the optimizer are the really big deal.
Greg Low: Yes.
Jimmy May: The updatable piece is cool and allows us to avoid having to jump through hoops for those applications that need real-time updates. And goodness forbid you have an ad hoc update or something down in your historical data: in 2012 you had to switch out the partition, drop the column store index, change the data, recreate the column store and switch back in. You no longer have to do that. The challenge with a writable column store is that people who don’t know about the potential performance impacts are treading with fire; sorry, that is a mixed metaphor.
Greg Low: Certainly if you expect to go into your gigantic table and just UPDATE some column, you are not going to be a happy soul.
Jimmy May: Yes, exactly, but what happens more and more routinely is that people are inserting 1,000 rows, or 100,000 rows, or maybe some big batch, and not using BULK INSERT, which by the way is fully supported by column store. They insert a thousand rows at a time. In fact, let me just cut to the chase and characterize a case study that you will be hearing about soon on my blog; it is also in my deck. A customer at MS IT was implementing Hekaton and column store, and by the way, column store is orthogonal to most other features: it just works, it doesn’t matter what else you are doing. Hekaton of course is for OLTP kinds of data, which is a different piece of the application, but the database also has a column store table, and the two happen to join to each other.
Query performance initially was wonderful on this query. They initiated the column store and query performance went to some wonderful time, I forget, a couple of minutes or so, something like that. Then they started updating the column store as part of a routine process. They were throwing in a thousand rows at a time, and over the course of a few days, what used to be this fantastic, great-running query degraded until it was taking over an hour.
The problem was that these thousand rows at a time were going into what is called the delta store, which is the feature of the writable column store that hosts rows not yet eligible for compression into full-blown column store format. What do I mean by not eligible? Well, the threshold is 1 million rows: you have got to have 1 million rows in the delta store before the engine flips the bit and says, I have enough rows, let’s compress these million rows into a column store object. Between the time you insert an arbitrary number of rows and when you hit that trigger, you are not going to get column store performance when you are hitting that data.
For those people administering or developing applications who aren’t aware of that, it is a landmine. You are going to hit it, and people are going to scream at SQL Server, and really it is a matter of education. I think it is up to us at Microsoft, and to people such as you, experts in the field, to be keenly aware of this potential pitfall. It is not a bug; it is just how the product works, and you need to know how it works in order to leverage it effectively.
Greg Low: I quite agree. From what I have seen they have a bitmap associated with all the existing rows, where they can just turn one off to say this row effectively is no longer here, and they have a delta store added on the end. So if you update a row they just turn off the original row and then add the new row at the end.
Jimmy May: Correct. An update is the equivalent of a delete and then an insert, which probably won’t surprise you: you get the bit flipped to mark the row as deleted, and you get the new data in the delta store.
Greg Low: Cool, so we get the update. The other big challenge we talked about in 2012 was getting into batch mode, where it was doing the vector-based operations. That was very, very tricky to do; in fact I remember there was a white paper, I think from the SQL CAT team, that talked about the things you might need to do to your query to try and encourage it into that mode.
Jimmy May: Yes, there were things like the character joins we talked about; you had to avoid certain kinds of unions, LEFT JOINs, anything that spilled to disk was a huge problem.
Greg Low: Yes, and I noticed when they listed the things that are supported in 2014, it was things like all the join types, so OUTER JOINs, NOT INs, INs, UNION, UNION ALL were amongst them. There are various types of aggregates and so on, so all of these things are much easier: you write the query and they just work.
Jimmy May: Yes exactly, it is the power of the platform, and I am sorry, another marketing term. That is how they say it, and in this case I absolutely agree with it. You create your clustered column store, you write your queries, and usually it just works, and when it does work it is like magic. Oh, by the way, something that didn’t work in 2012 was not just complex joins and the other structures that you suggested, but also having too many tables. Sometimes with as few as seven or eight tables the optimiser would say nah, I am going to row mode.
Greg Low: Indeed. I suppose one of the other little challenges, just as a by-the-way thing: I have always been intrigued by the optimiser, because it all sits above the storage engine, and I have often wondered whether it’s a problem that it doesn’t have more visibility into what is going on in the storage engine. I was sort of wondering, if it had to pick between two indexes and one is compressed and the other is not; from what I’ve seen that sort of information is not available to the optimiser.
Jimmy May: Not being a C++ developer myself, I can’t speak to that specifically.
Greg Low: I think they have just gone with a fairly clean separation of those things in the initial design, and again it is the sort of thing that might change over time.
Jimmy May: Yes, it is easy to defend that in terms of rapid development and regression testing, etc.
Greg Low: So the idea with the column store is that we break all the columns apart into separate things that get stored. We basically break those into 1 million rows at a time, so we have 1 million values from a column as a segment. We then take all the segments that go together for the same rows and that makes a row group. Then the archival bit is the other thing that was different in 2014: this applies what looks like another layer of compression over the top of it again when the row group is written to disk. That doesn’t affect the in-memory structures.
Jimmy May: No, not at all. Can I go back over the process of creating a column store? I think it bears repeating; it is so fundamentally simple, at least at a high level. You issue the CREATE COLUMNSTORE INDEX statement, say against a warehouse table. The first thing the engine does is grab the first million rows, and that forms your row group. It is just a fundamentally simple horizontal partition, arbitrarily done at 1 million rows.
Then the second step is that it vertically partitions the table into its constituent columns, and each of those columns within a row group, as you said, corresponds to what is called a segment. By the way, for any given column and any given row group, that is the unit of work: data from that segment is loaded into memory. The third thing that is done, after your horizontal partitioning into row groups and your vertical partitioning by column into segments, is compression and encoding. And boom, you have your column store index. At a high level it is fundamentally simple.
Greg Low: One thing I’ve often wondered about, and I don’t know if it is in your territory or not: when they look at that table in the first place and say I am going to grab this and break it into thousand-row chunks, sorry, million-row chunks, I sort of wonder if they have any smarts in deciding the order of the rows that it is done in?
Jimmy May: This is an area that is ripe for investigation on my part. I am keen to be the primary author of the column store white paper that we are going to start writing next quarter. I am very excited about it; there is some secret sauce that we need to get out there. For example, in one of the case studies I talk about, one of my colleagues, Roy Tuttle, is on the Watson team. Have you ever had a blue screen, Greg, ever?
Greg Low: Yes indeed, and the name is familiar.
Jimmy May: Well, Dr Watson knows you, and that table has 61 billion rows at last count. 61 billion. They had a lot of queries that they used to target at that table that just never finished; they couldn’t get the information they wanted out of this data. They applied column store, this is in 2012, and it is classically partitioned of course. Now they get answers to problems that they could never get before. Queries that used to take an hour now take five minutes; queries that used to take a couple of minutes now take a second.
Greg Low: that is awesome!
Jimmy May: yes that is awesome, it sure is.
Greg Low: What I was getting at, though, is the order the rows go in.
Jimmy May: oh yes I’m sorry.
Greg Low: Yes, whether that has a big effect.
Jimmy May: That was a segue there, thank you for reminding me. Yes, the partitioning key is critical. They experimented a little bit with the 61 billion rows, with a subset of course, and the partitioning key was critical. Fortunately it turns out that since column stores are by design created for warehouse queries, and warehouse queries are characteristically partitioned on date, and typically date is part of your queries, if you use that for your partitioning key you are off to the races. For most implementations it is no problem; however, I have seen where it can be. A focus over the next 3 to 6 months will be to try to figure out what that secret sauce is.
Greg Low: Yes, it was something I was planning on experimenting with. I thought what would be interesting is to have some way, I suppose, of looking at the statistics that are held for any indexes already on the table; I could look at density values and things like that to try and decide, you know, hey, if I ordered them this way I would end up with something smaller again.
Jimmy May: Yes, possibly. Again, I don’t know the internals to that degree yet. That is something very interesting that I will learn as a function of writing the white paper.
Greg Low: One of the things we should talk about, I suppose, is the T-SQL involved, and that is all pretty much normal indexing stuff. We have the word COLUMNSTORE in amongst it, and I do remember there is an option which was useful for testing, IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX.
Jimmy May: Yes. As I mentioned earlier, a couple of years ago I needed a new trick and I chose column store, not least because it was so fundamentally simple and straightforward to get up to speed on. You can look at my deck; there is a TechNet wiki out there you can read which is 15 pages, very consumable, maintained by the product group. Read that and read my deck and you are off to the races in terms of starting your experiments. And sometimes you can change the world overnight; experiments can be done on your laptop on a subset of your production data. It is very straightforward. Getting up to speed in terms of the DDL, in terms of implementation, can be very simple. When you create the non-clustered column store you just add the COLUMNSTORE keyword and a column list; when you create a clustered column store you add the COLUMNSTORE keyword and there is no column list.
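The contrast Jimmy draws shows up directly in the DDL. A minimal sketch, with hypothetical table and index names:

```sql
-- Nonclustered columnstore (SQL Server 2012+): a column list is required
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_FactSales
    ON dbo.FactSales (SaleDate, StoreId, ProductId, Amount);

-- Clustered columnstore (SQL Server 2014+): no column list,
-- because it becomes the storage for every column of the table
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales;
```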
Greg Low: Yes, that is classic. The weird thing is you just say create it, and it can or can’t work depending on the structure of the table, based on which data types are there.
Jimmy May: Yes exactly, and disabling the index is exactly the same as for an existing classic index. The option you refer to is: OK, I have got my column store index and I want to check out performance with column store versus without column store, and I simply add a hint to ignore the index. It is a fun demo; actually it is the first demo I do in my presentation.
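The demo Jimmy describes is a simple before-and-after comparison; the query and table here are invented for illustration, but the hint name is the real 2012-era option for nonclustered column stores:

```sql
-- Run once with the columnstore in play...
SELECT StoreId, SUM(Amount) AS TotalAmount
FROM   dbo.FactSales
GROUP  BY StoreId;

-- ...then again telling the optimiser to pretend it isn't there
SELECT StoreId, SUM(Amount) AS TotalAmount
FROM   dbo.FactSales
GROUP  BY StoreId
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);
```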
Greg Low: Another link I will put in: Niko has a whole series of blog posts where he has gone off and done a lot of investigation into the internal structure of these as well, which I found very interesting. One of the things I noticed somewhere amongst the reading was the idea of the tuple mover, the thing that watches the delta store and, when enough rows accumulate, picks them up and compresses them into row groups.
From memory there was also a case where it wasn’t 1 million rows, it was more like 100,000 rows or something like that?
Jimmy May: that is for BULK INSERT.
Greg Low: Ahh that’s where it came from.
Jimmy May: The value is different when using BULK INSERT, and when I say BULK INSERT I am of course using it as a generic term, I’m talking about the keyword. That has an interesting ramification in terms of performance. I mentioned the delta store and its 1 million row threshold, and by the way if I’m incorrect about that I’ll let you know and you can add an addendum. For BULK INSERT the threshold is 100,000 rows.
Greg Low: So it was with BULK INSERT; I remember there was something around that.
Jimmy May: There is a performance implication, and that is that row groups become smaller. If you have a BULK INSERT of 90,000 rows, for example, you are subject to that same 1 million row threshold. So you have got to have 11, sorry, make that 12, 90,000-row bulk inserts before the tuple mover is ready to roll into action. Whereas if you do a 100,000-row BULK INSERT it is compressed upon insert.
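The two paths Jimmy contrasts can be sketched like this; the file path and table are hypothetical, and note that the documented bulk-load cutoff in 2014 is 102,400 rows, slightly above the round 100,000 used in conversation:

```sql
-- 90,000 rows per committed batch: each batch lands in a delta
-- store, and nothing compresses until ~1 million rows accumulate
BULK INSERT dbo.FactSales FROM 'C:\loads\sales.dat'
    WITH (BATCHSIZE = 90000);

-- Batches at or above the bulk threshold (102,400 rows per the
-- 2014 documentation) bypass the delta store and are compressed
-- into columnstore format on insert
BULK INSERT dbo.FactSales FROM 'C:\loads\sales.dat'
    WITH (BATCHSIZE = 102400);
```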
Greg Low: so what you end up with is a whole lot of smaller segments in that case.
Jimmy May: Yes, a lot of smaller row groups and a lot of smaller segments, exactly. Look, it is magic the way the engine coalesces the data that you ask for.
Greg Low: Yes.
Jimmy May: But the magic has its limitations, and if you have, for example, 10 times as many row groups, even for the same number of rows represented, you’re going to get less performance.
Greg Low: Yes indeed. So one of the things you could look out for in that case, I suppose, is how many row groups you have got in there that are smaller than that size. There are system views and things that show you all of that stuff.
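The system view Greg is alluding to is most likely `sys.column_store_row_groups`, new in SQL Server 2014; a quick inspection query along these lines surfaces both undersized row groups and rows still sitting in delta stores:

```sql
-- One row per row group: state_description distinguishes OPEN/CLOSED
-- delta stores from COMPRESSED groups; small total_rows values are
-- the "fragmentation" being discussed
SELECT OBJECT_NAME(object_id) AS table_name,
       partition_number,
       row_group_id,
       state_description,
       total_rows,
       deleted_rows,
       size_in_bytes
FROM   sys.column_store_row_groups
ORDER  BY table_name, partition_number, row_group_id;
```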
Jimmy May: Another thing, besides looking at the system views, is to see whether it is fragmented, and I want to provide guidelines for what I’m calling fragmentation. Of course this isn’t classic fragmentation.
Greg Low: Exactly.
Jimmy May: Metaphorically it is the same; by analogy it is the same. We want to put our guidance out there: when does it make sense, when do you look at your metadata and say this threshold is such that we should consider rebuilding. And speaking of rebuilding, I mentioned bulk inserts of 100,000 rows; of course you are welcome to use 500,000 rows in the batches you commit.
If you end up, because your application demands sub-100,000-row inserts, with a lot of row groups, more than you necessarily need, just rebuild the affected partition. Not necessarily the whole table, just the partitions that require it.
Greg Low: Actually that is another thing that is quite good with partitioned tables in SQL Server 2014. They are subtle little things, but the ability to rebuild an individual partition online is such a good thing.
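The single-partition rebuild looks like this; names are hypothetical, and note that in 2014 the ONLINE option applies to regular row store indexes (column store rebuilds themselves were still offline operations at that point):

```sql
-- SQL Server 2014: rebuild just partition 3, online, and control
-- how the rebuild waits for its blocking lock
ALTER INDEX ix_FactSales_SaleDate ON dbo.FactSales
REBUILD PARTITION = 3
WITH (ONLINE = ON (WAIT_AT_LOW_PRIORITY
        (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = BLOCKERS)));
```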
Jimmy May: It is. Can I say, for 2014 there are a lot of great new features that just aren’t being touted. One of them is the online single-partition rebuild. It is a DBA’s dream.
Greg Low: That is a game changer for a lot of people as well. The other one, even amongst those, is the handling of lock priority when you are doing a switch and so on.
Jimmy May: yes.
Greg Low: Yes, these are all subtle little things that don’t make it onto the brochures so much.
Jimmy May: Subtle, subtle my formerly fat ass. Partitioning was billed as an online, always-on feature, and you know it was helpful, but for those of us with 24-hour apps, for a hotel or an airline, a 24-hour online business, forget it, we were not going to do a partition switch outside our maintenance window. But now we can, because of the new lock prioritization available in 2014, and it is easy to implement.
No big changes, you don’t need to rebuild anything, you just decide what parameters you want, and the thread
Greg Low: Stopping you, or blocking you.
Jimmy May: Or rather, the thread that you are trying to switch the partition with just waits; it lurks in the corner waiting for an opportunity to grab a schema lock, then it cuts into the queue and grabs it without blocking and queueing everything behind it. It is a wonderful new feature.
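The behaviour being described maps onto the `WAIT_AT_LOW_PRIORITY` clause introduced in 2014; a minimal sketch with hypothetical tables:

```sql
-- The switching session waits at low priority for its schema-mod
-- lock instead of queueing all later requests behind it; if it is
-- still waiting after a minute, it gives up (kills itself, not the
-- blockers)
ALTER TABLE dbo.FactSales_Staging
SWITCH TO dbo.FactSales PARTITION 3
WITH (WAIT_AT_LOW_PRIORITY
        (MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF));
```

`ABORT_AFTER_WAIT` can instead be set to `BLOCKERS` to kill the blocking sessions, or `NONE` to keep waiting.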
Greg Low: I thought it was a wickedly nice new thing. Actually, one more question: you mentioned rebuild to clean things up; what is the effect of REORGANIZE on the clustered column stores? Am I right in my recollection that it does something like run the tuple mover?
Jimmy May: Yes, though again that is an area I have not done much investigation in. Most of my work unfortunately is in 2012, where I have pretty extensive experience; these subtle new or improved things I have not yet investigated thoroughly.
Greg Low: Yes, it is on my little list of things to check out further: exactly what happens in a reorganize versus a rebuild.
Jimmy May: yes let me know what you find out, if you beat me to it.
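For reference, the two maintenance statements being compared look like this on a clustered column store (index and table names hypothetical). The reading of the documentation that matches Greg’s recollection, untested here as both speakers note, is that REORGANIZE nudges the tuple mover to compress CLOSED delta row groups, while REBUILD recompresses everything including the delta store:

```sql
-- Lightweight: ask the tuple mover to compress any CLOSED
-- delta row groups without rebuilding the whole index
ALTER INDEX cci_FactSales ON dbo.FactSales REORGANIZE;

-- Heavyweight: recompress the entire index, delta store included,
-- coalescing undersized row groups along the way
ALTER INDEX cci_FactSales ON dbo.FactSales REBUILD;
```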
Greg Low: Yes indeed. So listen, I suppose we should summarise then in terms of best practices for people when they are using this. What are the main things you think are the key takeaways?
Jimmy May: just use it, don’t be afraid to experiment, just give it a shot. Experiment with clustered column store at your earliest opportunity. If you are experimenting with non-clustered column store in 2012, consider upgrading to 2014 just to see what it can do.
Greg Low: Yes the best candidates are the large fact tables.
Jimmy May: Yes, the large fact tables of course, and also consider looking at any large dimension tables that are getting scanned. You can check the usage stats to determine that; your dimension tables will be getting scanned if they are not indexed properly, and they may be eligible for a column store implementation. But yes, classically it is the large fact tables that are the primary candidates for column store.
Greg Low: and of course this loves memory and it loves CPU.
Jimmy May: Yes it does, especially while you build these things. Like any index build, when you tell SQL Server to build a column store it is going to grab all the CPU that is available and all of the memory that it can. Gosh, do we have time to go into Resource Governor? Probably not. I guess check out the deck, that is all I can say.
Greg Low: indeed.
Jimmy May: Partition switching is a best practice; again, typically we are talking about warehouse fact tables, so that shouldn’t be a big problem. If you use BULK INSERT be careful, and be very careful about using ad hoc updates and inserts: you can outrun the tuple mover’s row group threshold and end up with a bunch of rows in the delta store, which impacts query performance significantly.
Greg Low: Cool, so listen, that is great Jimmy. Is there anywhere people can see you, or things you have coming up, or shall we just watch your blog?
Jimmy May: Sure, watch the blog. Column store is a keyword I will be adding to with some discipline over the next few months, and there is already some good stuff out there, including my deck. Just use Bing, search for Jimmy May blog and you’ll find me; search for column store, or just type Jimmy May column store, and you will find something called the big deck, which is relatively up to date. You will find a lot of great information; it is fairly well organized and there are sections in there suitable for a wide spectrum of audiences.
Greg Low: will you be at the summit this year?
Jimmy May: yes I will be at the summit this year.
Greg Low: the PASS summit.
Jimmy May: I hope to speak. I mentioned handstands a moment ago; a big surprise, a sneak peek: I did a dry run of comparing row store to column store using myself and a friend. I used his body as a row store, lying on a table, and I did a handstand to represent column store, and I will be practising for PASS.
Greg Low: I love it. So listen, thank you very much for your time today Jimmy, that is awesome.
Jimmy May: Greg, thanks for the invitation, I really appreciated it. Take care.
Greg Low: all good.
Jimmy May: bye.