Greg Low: Introducing show number eight with guest, Adam Machanic.
Adam Machanic is a database-focused software engineer, writer, and speaker based in Boston, Massachusetts. He has implemented SQL Server for a variety of high-availability online transaction processing and large-scale data warehouse applications, and also specializes in .NET data access layer performance optimization. He is a Microsoft Most Valuable Professional for SQL Server and a Microsoft Certified Professional. Adam is the co-author of Pro SQL Server 2005, published by Apress. Welcome, Adam.
Adam Machanic: Thank you.
Greg Low: I was fortunate enough, while I was away at the PASS conference in Dallas, to meet up with Adam. One of the things I got to see was the session he presented, so I thought I would bring him along today to talk about the things that were covered in it. Maybe first up, fill us in on how you came to be involved with SQL Server in the first place.
Adam Machanic: Well, when I graduated from college I had studied philosophy and computer science, focused on logic in the philosophy area. I went into the real world searching for a job. The first thing I landed was a web-development job, a one-man project. I had to do everything: all the web development, database work, everything. I soon discovered I hated the web development and fell in love with the database. The logic was there; the database is really logic as applied to data management.
Greg Low: What sort of tools were you using for web development?
Adam Machanic: ASP development. I don't remember what the version was at this point. Notepad. Pretty stripped down, you could say. I found the database to be something I could really grasp and fall in love with. That's what I did. From there I moved on to jobs as a database engineer. Here I am today.
Greg Low: Where did you first try SQL Server?
Adam Machanic: My first job. Originally doing Access and Access wasn’t scaling to meet the needs of the product. Since it was a Microsoft shop, the natural choice was to switch to SQL Server. That’s what I did from there.
Greg Low: Excellent. I’m interested when you saying you’re finding Access wasn’t scaling. There’s always a constant discussion with people saying what is it about Access that makes it not suitable for larger applications? What are your feelings as to the real issues there?
Adam Machanic: I don't think it's so much larger applications. I've had some more experience since that first job. The real problem as I see it is that Access is not designed with concurrency in mind. It's really designed as a single-user, or maybe a couple of users, small-scale database system; a single application or thread. I've seen an Access application that someone tried unsuccessfully to put on SQL Server. It actually performed better in Access than it did when they imported it over to SQL Server. It was a specialized single-user application, which is really what Access is meant for. SQL Server is meant for concurrency, for multiple users hitting it at once. In some cases Access is better.
Greg Low: I admit, while I've seen a few situations where people had very much single-user applications, they were tending to read a whole lot of data sequentially, with no real concurrency involved. In raw speed Access was very good for that sort of thing. As soon as you start throwing more than one user into the mix, it's a very different story. The thing that I look at that I think makes a big difference is that with products like SQL Server you have a server process which is saying "I'll be responsible for the integrity of the database as well," rather than with something like Access, where every individual client is responsible for the integrity of the database.
Adam Machanic: Right. The other thing with Access is when people try to upsize from an Access installation to a SQL Server installation, they don't realize that Access is half database and half user interface. A lot of people don't really understand, when they're trying to upsize, that SQL Server has no user interface at all. They treat Enterprise Manager as a user interface, using its table editing features, then go in and ask questions on the forums like "why can't I see all my data in Enterprise Manager?" This caused a lot of problems; there's a misconception about what SQL Server is compared to Access.
Greg Low: What version of SQL Server were you first involved with?
Adam Machanic: 6.5
Greg Low: 6.5 yeah. What were your thoughts on the product at the time? I thought it needed a lot of management compared to now.
Adam Machanic: I think so. I wouldn’t say I knew anything about management at the time. I was really trying to develop… Didn’t know anything about databases at all; just thrown in. “Write this application,” so I did a lot of reading; did as well as I could. Didn’t do anything right. Wasn’t until sometime later when I was on 7.0 doing some real development on a real team with people who knew what they were doing that I figured out what was going on. At that point, I was about as naïve as possible.
Greg Low: That's alright. What I'm interested in, the first of the topics you covered in the session at the PASS conference, was stored procedures as APIs. Maybe start there?
Adam Machanic: Yeah, I think this is an interesting topic. It's kind of a religious debate in the database community. You often see this debate: "should I use stored procedures or ad-hoc SQL to access my data from my application?" Oftentimes when you see these discussions and these debates, it always goes to performance. Everyone's debating "stored procedures can perform better," "I can make ad-hoc SQL perform better," blah, blah, blah. I don't even want to get performance involved in this conversation. For me this is actually a very central software development question: one of coupling. When we talk about coupling, we refer to two systems. Two systems or two components are said to be tightly coupled if one component which uses the other relies on it too much. For me, ad-hoc SQL couples the application tightly to the database. It gives the application knowledge of tables, column names, and other schema in the database. This means that if you change something in the database, you have to go back and change the application. If you use stored procedures, you create a layer of indirection between the application and the database. All the application knows about is the interface exposed by the stored procedure. If you change something in the database — you change a table name, you change the schema in some way, or you want to change the query for performance purposes, for instance — as long as that stored procedure still takes the same parameters as inputs and still outputs the same columns in the same formats, the application doesn't need to change. You have decoupled them; you can change one without changing the other. That's freedom and flexibility to work with. In my opinion, there's no question: stored procedures for everything. Maybe views in some cases.
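A minimal sketch of that indirection might look like this (the table, procedure, and column names here are hypothetical, not from the show): the application calls only the procedure, so the underlying table and query can change freely behind it.

```sql
-- Hypothetical example: the application knows only this interface.
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON;

    -- The table name, join strategy, or indexes can change freely;
    -- as long as the parameters and output columns stay the same,
    -- no application code needs to be touched.
    SELECT OrderId, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
END;
```

Renaming dbo.Orders or tuning the query later only requires altering this one procedure; the application's contract — one INT parameter in, three columns out — is untouched.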
Greg Low: What are your thoughts on the fact that we can now expose stored procedures as Web Services in SQL Server 2005? From the performance point of view people would say "exposing any of this tends to be a slower way of doing things." One of the things that intrigues me about the idea is that at least it's now a contract-based interface. One of the questions that people ask all the time with stored procs is "is there any way I can find out programmatically what the values are or what data it might return?" One of the problems with a stored proc is that it's up to the logic of the proc as to what sort of values are actually going to get returned. Take different code paths in the proc and you get different things coming back. When you start down the Web Services path, you have a programmatically discoverable contract as well. Any thoughts there?
Adam Machanic: That's pretty interesting. Something I talk about in that session when I'm discussing stored procedures is that developers should always have a single set of inputs and outputs. Anytime you discuss design-by-contract and those kinds of software development questions, that's always what you hear: "you need to have consistency in your interfaces." I think that being able to enforce that is very interesting. This is something that developers should be doing anyway with stored procedures. You often see shops where they're passing column names in, or some other list of things. They have three different applications that all need slightly different result sets in slightly different formats. Huge complex stored procedures, a million different parameters; it's impossible to test. The only person who knows how to fix the thing is in Colombia on vacation. It's a mess.
Greg Low: Or not coming back.
Adam Machanic: Right. They're in Colombia. So what I'm trying to push DBAs toward is writing testable code — actually considering database code to be code. That's a major problem in a lot of shops I've seen. They have this attitude of "well, as long as we're just changing the stored procedures, we're not changing anything." But you're changing the code that drives the data. If it's a data-driven application, that's very important code. That's the code that runs the application. I think it might be changing in some shops, but I still see a lot of shops that aren't using source control for database code and stored procedures. That scares me.
Greg Low: Which source control system have you found useful for stored-procs yourself?
Adam Machanic: The free one from Microsoft… actually it's not free. Visual SourceSafe is popular and easy to use; no big deal. I really like a source control system that's not well known, AccuRev. That's the most amazing source control system I've ever seen. It's stream-based. They have a totally different idea of how to do source control. They have this model where every change you make automatically creates, under the covers, a new branch. You can have lots of people working simultaneously on the same code; it branches everything automatically. When they're ready to deploy, it re-merges everything almost automatically. Really cool source control system. I wish they had more people using it so they could keep making it a little better. They're doing pretty well at this point.
Greg Low: At least with SQL Server 2005 in the new Management Studio, there is a degree of the ability to integrate with source code control systems. I think it’s still been one of the main missing areas in the product is some sort of deep integration with source code control.
Adam Machanic: I agree. I also think a lot of the problem is that attitude about how to do source control. There's a product called DB Ghost that seemed popular. DB Ghost will detect changes in the database and check them in for you automatically. A lot of people think this is a great idea: make changes using Enterprise Manager or whatever, then let DB Ghost figure out what's there and check it into source control. That's a horrible idea, personally — not to say anything bad about the DB Ghost company. In my opinion, the correct order of operations is that you check a known piece of code out from a source control repository, make your change, verify your change, and then you knowingly check it back in and promote it to testable code. If you have a piece of software automating this process, you don't know what you're getting. Someone could have changed the database for testing purposes. One of the problems when working with databases from a development standpoint is that anyone can go in and alter anything they have access to, and you won't know about it. It's not like compiling an application, where you find out about it because a dependency has been broken. People need to control it themselves, not rely on SQL Server or a third-party product to handle the source control for them.
Greg Low: Actually, the dependency one is interesting. One of the areas that I wish there was an option to get around in SQL Server is things like the deferred name resolution. The fact that I can come up with an incorrectly-spelled name of something and it’s not going to pick that up at the point I create the proc. It’s going to assume that that’s some name it just doesn’t know about yet. I wish there was some option to turn off the deferred name resolution and say “look, please…” compile isn’t the right word for the proc. “Go through and not just check the syntax. I want you to do object name resolution as well.” Do you ever run into problems with that?
Adam Machanic: All the time. I don't understand it, and I've asked the SQL Server team to implement this as well. I recommend that everyone listening email sqlwish@microsoft.com. They do it for views — you have to have it for views. They do it for user-defined functions. But they can't do it for stored procedures. I realize that dynamic SQL and other things like that might play a role in making it more difficult, but for anything non-dynamic I don't understand why it can't be done. I don't know.
Greg Low: That's great. Well certainly, it's another layer of things that could cause problems at run time. That leads us into another topic that you discussed in your session: exception handling in SQL Server. Obviously, you must be pleased with the options that are now available in 2005 compared to what we had before.
Adam Machanic: The TRY/CATCH syntax is amazing compared to what we had before. In SQL Server 2000, you can't catch an error, can't handle it. If one occurs, the application is going to hear about it. It's going to bubble up; there's nothing you can do about it. All you can do is check to see if you've gotten an error, and maybe roll back the transaction. In 2005, you can catch the error; you can do something about it, if you have something intelligent you can do — it depends on the situation. If you do, you can keep the application from knowing about it. This leads to the encapsulation that I was talking about before. Let's say you have a deadlock. Great example.
Greg Low: It is; superb example.
Adam Machanic: Sometimes I've seen code in an application where they know that there's some SQL prone to deadlocking. They'll put retry logic in the application. Once again this couples the application to the stored procedure. The application has to watch for this exception; it knows "if I get this exception I need to retry a few times to get past the deadlock." Now you can put that in the stored procedure. It can loop with a TRY/CATCH block a few times. The application will never know that anything went wrong. All the application will know is "I passed in some parameters, got back data." Whatever. I think it's great.
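A sketch of what that server-side retry loop might look like in SQL Server 2005 (the table and procedure names are hypothetical; error 1205 is the deadlock-victim error):

```sql
-- Hypothetical sketch: retry a deadlock-prone update inside the procedure
-- so the application never sees error 1205.
CREATE PROCEDURE dbo.UpdateOrderStatus
    @OrderId INT,
    @Status  TINYINT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Retries INT;
    SET @Retries = 3;

    WHILE @Retries > 0
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;
            UPDATE dbo.Orders SET Status = @Status WHERE OrderId = @OrderId;
            COMMIT TRANSACTION;
            BREAK;  -- success: stop retrying
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;

            IF ERROR_NUMBER() = 1205 AND @Retries > 1
            BEGIN
                SET @Retries = @Retries - 1;  -- deadlock victim: loop and retry
            END
            ELSE
            BEGIN
                -- Not a deadlock, or out of retries: surface the error
                DECLARE @Msg NVARCHAR(2048);
                SET @Msg = ERROR_MESSAGE();
                RAISERROR(@Msg, 16, 1);
                RETURN;
            END
        END CATCH;
    END;
END;
```

If the transient deadlock clears on a retry, the caller simply sees a successful call; only a persistent or non-deadlock error ever reaches the application.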
Greg Low: Deadlock's a good one; it was one of the non-trappable errors up to SQL Server 2000. I find invariably there are a number of errors that people need to write code to handle, that can be dealt with in code, and I'm always disappointed to see them bubble up to the user instead of being caught. I think deadlocks are one example of that. It's also a good example of why you've got to trap all the error messages. The error message from the deadlock, telling the user they've been chosen as the deadlock victim, doesn't go down all that well if they manage to end up seeing it.
Adam Machanic: That’s very true. It’s also important from a security standpoint to never show end users the errors that are generated in the database-layer. In order to protect against SQL-injection attacks, things of that nature. If you start showing errors, experienced hackers will be able to exploit them, attack your system. Anyone showing the user the raw errors is just asking for trouble.
Greg Low: Plus, a lot of them look silly. Things like "unexpected error" or "catastrophic error" — sort of hard technical material for an end user. All of that has to be caught and translated into something more pleasant for the user. That's great. The other thing that looked ugly before is that all you had was the @@ERROR variable. If you make extensive use of that, it tends to make stored procs look like a VB script thing: On Error Resume Next. After everything you did, you checked the value of the variable to see if something went wrong. A lot of stored procs end up looking like that today. Structured exception handling allows you to write cleaner-looking code.
Adam Machanic: I think a lot of people have advocated over the years checking the value of @@ERROR. I really like the XACT_ABORT setting instead.
Greg Low: That's a good one to mention. That's not well understood, I think. This is SET XACT_ABORT ON…
Adam Machanic: Right. That option changes the default behavior. Let me back up. In SQL Server there are three primary types of exceptions you can get. There are connection-aborting exceptions; those abort your whole connection to the server. Those are fatal. One step below those are batch-aborting exceptions. When you send requests to SQL Server, you send them a batch at a time — two SELECT statements at once in the same batch, or two UPDATEs, whatever. A stored procedure is one batch, for the most part; dynamic SQL changes that a little bit, but assume that a stored procedure is a single batch. If you get certain exception types, they will abort the whole stored procedure. So if you're running an UPDATE and there's another UPDATE after it, and the first UPDATE hits a batch-aborting exception, the other one won't get hit and your transaction will get rolled back. The third type is statement-aborting exceptions. If you're running an UPDATE and hit a statement-aborting exception, the next UPDATE is still going to run, and your data can end up in an inconsistent state. What SET XACT_ABORT does is turn statement-aborting exceptions into batch-aborting exceptions. It upgrades them, and rolls back your transaction if you hit any type of exception. Instead of having to check @@ERROR after every statement, you just SET XACT_ABORT ON as you're starting your stored procedure, begin the transaction, and run everything; if you hit any exception, it will automatically be rolled back. It makes the code cleaner; that's how I do most of my work.
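The pattern described above can be sketched like this (the table and column names are hypothetical):

```sql
-- Without XACT_ABORT, a statement-aborting error (say, a constraint
-- violation) on the first UPDATE would not stop the second one, leaving
-- the data inconsistent unless every statement is followed by an
-- @@ERROR check.
SET XACT_ABORT ON;

BEGIN TRANSACTION;

UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

COMMIT TRANSACTION;
-- With XACT_ABORT ON, an error in either UPDATE aborts the batch and
-- rolls back the whole transaction automatically, so the two updates
-- succeed or fail together.
```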
Greg Low: That’s great. That’s probably a good point; we’ll take a break for a few minutes. Come back and talk about testing.
Greg Low: Adam, tell us anything about where you live, family, sports, hobbies, anything you want to share so people get to know you.
Adam Machanic: I live outside of Boston, Massachusetts, in a town called Somerville. What can I say about Somerville? It's a town, nothing too special. Per square mile, there are almost more people here than anywhere else in the country. It's extremely packed; lots of people living here. It's interesting because it's a suburb — you'd expect New York City to be the most densely populated place in the country, and Somerville is rivaling that right now. I'm not sure how that's possible. It's a good place. My wife, Kate — we just got married in July. That was nice.
Greg Low: Excellent.
Adam Machanic: Hobbies, I like to brew beer, that’s my hobby.
Greg Low: So you’d be a good mate for Kent Tegels. He’s keen on beer.
Adam Machanic: Yes, I am too. I like to brew beer, I like to drink beer. Those two go hand-in-hand. I have a cider actually, right next to me. Going to be ready next year. I make ten gallons of cider a year, of hard cider.
Greg Low: How long does cider take to make?
Adam Machanic: It takes about 20 minutes to make and a year to let it age until it's ready. I like to wait at least a year, sometimes two. I put everything in a keg system. I make it, leave it, and let it ferment for about six months. Then I transfer it to a keg and let it sit under pressure for six more months. Then it's ready to drink. Good stuff.
Greg Low: At least you don’t have to watch it the whole time. That works away all by itself for a year, that’s great.
Adam Machanic: Ferments over in a corner of my office, gives me a nice bubbling sound once every 30 seconds or so.
Greg Low: The next thing we were going to talk about was testing. I must admit, in the application development area people seem to be getting their heads around unit testing and things like that more and more by the day. It's an area in SQL Server where I see very little of this happening.
Adam Machanic: That's kind of a sad state of affairs. A lot of DBAs I talk to don't know anything about unit testing — what it is, what its goals are. They have a vague idea of testing in general. The goal of unit testing is to do repeatable tests. To get back to this one again: interfaces. If you have a stored procedure, you want to guarantee that the stored procedure's interface — its inputs and outputs — doesn't change. If you have an application that's depending on those inputs and outputs, the contract we were talking about before, you want to make sure that the application will always be able to use those inputs and outputs the right way, and that any other applications you write against that same interface will also be able to abide by that same contract. A lot of stored-procedure testing by DBAs is really just one-off testing. When you write the stored procedure, you pass in a few parameters, make sure you get data back, it looks good, ship it. Unit testing is really a lot more structured, more aimed at writing tests that can be repeated lots of times. When I say lots of times, I mean preferably multiple times a day. There's an idea called continuous integration that's really popular right now. This is where people have servers running 24 hours a day just to run unit tests. Anytime the application changes, it'll almost immediately be unit-tested automatically. You'll find out if you broke anything. The goal of all this is to find out before we go into production whether we've broken something. All too often, you ship something to production and ten minutes later get the phone call that the system is down. You get into this big firefighting mode: you try to fix everything really fast, you don't have time, you're stressed. It's a bad situation, something I would rather not repeat again in my life if possible.
Greg Low: One of the challenges with testing in databases compared to applications. With applications, most procedures in applications don’t tend to have a permanent effect. They’ll have something where you can pass a series of parameters; it ends up returning some value. One of the things in database applications is if we’re running some procedure, what it’s going to do is modify something in the database. To realistically re-run that test you really need to get the database back in the same state as it was to be able to run the proc again to do the same test. That’s one of the things people tend to find a bit of a challenge with database testing. I’ve even seen where people start the test by restoring the database from a backup; something like that. That can make it a fairly long testing procedure compared to what you can do in many applications.
Adam Machanic: I don't think that's necessary all the time. Testing against known data is definitely valuable; if you're running very specific tests, it's necessary, actually. But in a lot of cases, when we write unit tests, we're concerned with interfaces and just being sure that those contracts didn't break in some way. We don't care as much about what's in the database, or about the data. We know that in production the data is going to change. What we want to verify isn't necessarily that we get the results we're expecting, but that we're getting the results in the format we're expecting, so the application can deal with them. We know the application is set up to deal with lots of types of results, but it's expecting them in the same format. If that breaks, that's when we know we've caused a major problem. That's when you ship something to production and start getting conversion errors in the application, because the stored procedure is suddenly returning different data — someone had to change it for some reason and forgot to test it. Those are the kinds of problems that unit testing is useful at finding.
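One lightweight way to check just the output contract from T-SQL (the procedure and column names here are hypothetical, not from the show): insert the procedure's result set into a table variable declared with the expected shape. If the column count or types drift, the test fails regardless of what data happens to be in the database.

```sql
-- Contract check: we care about the shape of the output, not the rows.
-- dbo.GetCustomerOrders is a hypothetical procedure expected to return
-- exactly these three columns in these types.
DECLARE @Result TABLE
(
    OrderId   INT,
    OrderDate DATETIME,
    TotalDue  MONEY
);

-- This fails with a column-count or conversion error if the procedure's
-- output format has drifted from the expected contract.
INSERT @Result (OrderId, OrderDate, TotalDue)
EXEC dbo.GetCustomerOrders @CustomerId = 1;
```

A test runner can wrap this in TRY/CATCH and report pass or fail, which makes it easy to repeat many times a day without caring what rows come back.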
Greg Low: In the session you did at the PASS conference you also had discussion on white box testing. Macros and things that you had produced.
Adam Machanic: Yes. I think not all problems can be tracked down with unit testing, as I said. If they could be, we probably wouldn't have quality assurance departments in our companies. A lot of exceptions are internal, and are not of the nature of interface exceptions. The interface exceptions that we look at with unit testing — that's a black box test. A unit test won't go inside the stored procedure; it doesn't care what's going on in there. All it knows is that it passes something in in one format and gets it back in a known format. The formats are known to us; that's all we know. A white box test, on the other hand, is a test where we know what's going on inside; we're actually testing the internals. That's really a form of debugging. I worked for a while as a C++ developer, and I found that in the C++ world a lot of people were using these things called assertions. An assertion is a special statement that you can put into your code, declaring a certain condition to be true at that point in the code. I can say "I assert that at this point in the code this variable must be equal to this value." If that variable is not equal to that value, it will throw an assertion failure. What the assertion failure does is raise a very high-level exception and stop the code right there. If you're debugging the code, you immediately find out when something's wrong. This is an internal test. You put these assertions inside the code at various points where you have logical assumptions you're making. It's a way of documenting your assumptions in your code and automatically testing them as you're debugging and running the code, as you're going through quality assurance. This is something you turn on during debugging and turn off when you roll to production, for two reasons. One, the application code might be set up to deal with some of these problems in the real world; you don't want end users getting a bunch of assertion failures. Two, they can cause performance problems if you continually test lots of different conditions. I realized that nothing like this existed in the SQL Server world, so I wrote my own. I ended up with a library called T-SQL Assert. In order to create that, I had to write another library first, called T-SQL Macro — a macro framework. It lets you define code-expansion macros in your SQL code. You can write complex testing logic in these macros and programmatically turn them on and off. I'm pretty happy with the results. I'm trying to convince others to start using this stuff and get some testing done.
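The flavor of an assertion can be sketched in plain T-SQL without any framework (this is only an illustration of the concept, not the actual T-SQL Assert API, which works via macro expansion; the table and names are hypothetical):

```sql
-- Hand-rolled illustration of an assertion, not the T-SQL Assert library.
-- During debugging @Debug is set to 1; for a production build the
-- assertion blocks are compiled out (or @Debug left at 0).
DECLARE @Debug BIT;
SET @Debug = 1;

DECLARE @RowsAffected INT;

UPDATE dbo.Inventory
SET Quantity = Quantity - 1
WHERE ProductId = 42;

SET @RowsAffected = @@ROWCOUNT;

IF @Debug = 1 AND @RowsAffected <> 1
    -- The assumption is documented and tested in one place:
    -- exactly one row should ever match this ProductId.
    RAISERROR('Assertion failed: expected exactly one row updated', 16, 1);
```

A macro framework improves on this by expanding the checks into the code at build time, so the assertions cost nothing at all when they are switched off.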
Greg Low: You have those available for download from your site?
Adam Machanic: Yes. Datamanipulation.net/tsqlmacro. There’s a bunch of documentation and descriptions about how to use the thing; you can download it. Give me a bug report, all that stuff. Only gotten one bug report so far; disappointed in the user community. Either I’m disappointed or I’ve done a great job. I’m not sure which. It’s yet to be seen.
Greg Low: That’s great. The other one we didn’t mention is T-SQL unit.
Adam Machanic: Yes, that's the T-SQL unit testing framework. I don't remember off the top of my head the name of the gentleman who wrote it. It's been around since 2002. I don't think it's gotten a whole lot of use, but it's a great toolkit. Basically, in .NET and Java programming you have these frameworks called NUnit for .NET and JUnit for Java. What this guy did was create a very similar framework for T-SQL, called TSQLUnit. It lets you run tests in a very easy, automated way — you could write the tests anyway, but it handles all the reporting for you. It logs all your tests; you can tell it "I want to log these things for the test and run these tests," and it gives you a framework for controlling all that. A good thing to use. When I was writing T-SQL Assert I was thinking about TSQLUnit at the same time. The Java and .NET frameworks have assertion facilities built into them to help developers more easily write their unit tests; TSQLUnit doesn't. It all goes hand-in-hand.
Greg Low: That's great, most interesting. The other thing we should find out is what else is coming up in your world at the moment, in terms of travel, presentations, books? It was great to get to meet up with you at the PASS conference. For the listeners, one of the things that was different this year is that the MVP summit ended up clashing time-wise with the PASS summit, so the SQL MVP summit ended up being co-located with the PASS conference down in Dallas. An interesting experiment this year. Normally at the MVP summit you only have people there for a couple of days. What happened this time is that an awful lot of people arrived on Sunday and were there until almost the next Sunday. A really interesting mixture of people: most of the SQL MVPs, plus a lot of the product group folk, plus a whole lot of folk who were there for the PASS summit as well. What were your thoughts there?
Adam Machanic: This is my first time at both PASS and MVP summit. I went in with wide eyes, took it all in. Had a good time. Found that the Europeans like to drink a lot. I’m not so sure about you Australians. I didn’t see you guys partying quite as much; I was a little disappointed. Maybe next time. I think you guys need to have a drinking contest against the Europeans and prove who are the better drinkers. That would be an important experiment; lead to better understanding of our interpersonal relationships.
Greg Low: So what’s coming up in your world?
Adam Machanic: The book I co-authored just came out: Pro SQL Server 2005. I'd like everyone to go buy that right now. It'll teach you.
Greg Low: That's on Apress?
Adam Machanic: Yes, it is. I wrote the sections on SQLCLR programming and T-SQL for DBAs. I'm pretty happy with what I wrote; I hope you will be too. We had some other great people on the project. Tom Rizzo, product manager for SQL, wrote the bulk of the book. Louis Davidson, another SQL Server MVP, wrote…
Greg Low: Louis was there with us at the conference; we must get him on the show sometime soon.
Adam Machanic: He wrote the section on T-SQL. Excellent as well. Other than that, I'm just doing a lot of writing. I have an article on the macro and assertion frameworks coming up in the December issue of SQL Server Professional magazine, with more information on those.
Greg Low: Great. Thank you again Adam, thank you for your time.
Adam Machanic: Thank you very much, this was fun, appreciate it.
Greg Low: Great, thank you.