Jim Gray
SQL Down Under Show 9 - Guest: Jim Gray - Published: 22 Nov 2005
A.C.M. Turing Award winner and Microsoft Distinguished engineer Dr Jim Gray discusses the future of SQL Server. Jim (a database industry legend) gives his thoughts on the future directions for databas
Details About Our Guest
Jim Gray is part of Microsoft’s research group. His work focuses on databases and transaction processing. Jim is an active member of the research community in ACM, NAE, NAS, and AAS. He received the ACM Turing Award for his work on transaction processing. He edits a series of books on data management, and has been active in building online databases like TerraServer and SkyServer (skyserver.sdss.org).
Show Notes And Links
We are very sad to hear that Jim Gray appears to have passed away. He was lost while solo sailing his yacht Tenacious to the Farallon Islands to spread his mother's ashes. His loss to the industry is great.
For more information, see:
Jim Gray (Wikipedia) - http://en.wikipedia.org/wiki/Jim_Gray_(computer_scientist)
Jim Gray (Microsoft Research) - http://research.microsoft.com/en-us/um/people/gray/
On a personal note, I wanted to add that in my early years learning about database management systems, Jim was one of my heroes and I was so pleased to get to spend time with him and for him to then agree to participate on the podcast. I was deeply saddened to hear of his loss.
Greg
Show Transcript
Greg Low: Introducing show number nine with guest, Jim Gray.
Welcome. Our guest this evening is a very special guest, Dr. Jim Gray. Jim is part of Microsoft’s research group. His work focuses on databases and transaction processing. Jim is an active member of the research community in ACM, NAE, NAS, and AAS. He received the ACM Turing Award for his work on transaction processing. He edits a series of books on data management, and has been active in building online databases like TerraServer and SkyServer (skyserver.sdss.org). So welcome, Jim.
Jim Gray: Great to be here.
Greg Low: That’s great. Thank you so much. When I was at the PASS conference in Dallas, you spent a little bit of time talking to me; I appreciated that. What I would really love to start with would be if you could give us some details on how you came to be involved in the database and SQL Server community in the first place?
Jim Gray: Well I guess it goes back to variable in a way. I’m fundamentally interested in how we know anything and how you represent information. That goes back to curiosity I had as a kid. I went through college; I majored in computer science, in graduate school. It was hard to do that at the time; there wasn’t such a thing. One thing led to another, and I found myself working at IBM Research in San Jose. That was the database Mecca for IBM. In particular, we were much influenced by them and Ted Codd. Relational was a better way of programming information systems than the circles-and-arrows, diagram-navigation model that Charlie Bachman was advocating. We gave it a try and built something called SQL. The project evolved into the IBM DB2 products and gave rise to Oracle and many other companies. My career went along for a while; I worked at IBM for a decade, then at Tandem Computers for a decade. I built a second SQL system there, or maybe a third, since I built two at IBM. After that I went to DEC and worked on their database product. Life at DEC was difficult; we hit a rock. There had been thirty percent a year growth for 25 years, and all of a sudden it stopped. I guess I arrived just about the time it stopped. It was a very unpleasant scenario where we had to sell off our assets. One of the things we did was sell our database to Oracle. I was not eager to work for Larry, so I was not part of the sale. I took a year off and licked my wounds afterwards. I was a visitor at UC Berkeley. I looked around for where I’d like to work. There wasn’t really much of a choice; it was Microsoft and everyone else. Microsoft was my first, second and third choice. I’m a software guy; it’s a software company. It’s a very attractive place in many ways. I came to Microsoft, and once I was here it was pretty obvious that SQL Server was my adopted home. I’ve watched the product evolve over the last decade; it’s come a very long way, as you know. I think the current release, the 2005 release, is something to be proud of.
Greg Low: It’s a huge release.
Jim Gray: Short chronology.
Greg Low: It’s a huge release. It would be interesting to know your thoughts about the whole idea of taking five or so years on a release as opposed to a series of shorter releases?
Jim Gray: Huge is a good word for it. It’s fair to say that none of us fully understands what’s in SQL 2005, any more than we understand what’s in Windows. One of the things that’s happened is that database systems have become an ecosystem. You have the traditional tabular data store. There’s an XML store now, and text. You have data mining, you have cubes. There’s an extract, transform, load service; there’s an English query in there somewhere. There’s a security model hiding in there; management and self-tuning. It’s just awesome. Each of the things I mentioned is… I already said XML? XML and XQuery. Each of those things represents the efforts of dozens if not hundreds of people. It is a special thing. We have this ecosystem; getting the whole thing to fit together is a challenge. Hiding in there in addition is all of .NET. I think the unification of databases and programming languages is required; inevitable. Essential, important, something we’ve all wanted. Wonderful. So on. It’s also been a painful process for Microsoft and for the SQL Server team. If one had to point to one thing that caused the enormous delays… We were expecting SQL 2001 and 2003. You may have attended a launch event for that; airlift, some such thing. I attended one for 2003. What on earth could have caused us to miss the schedule by so much? Underestimating the complexity of unifying something as enormous as .NET and SQL Server.
Greg Low: I remembered thinking and telling people a few years ago, I thought there was a time in the seventies when you could know most things about computing. There was a time a few years ago when you could know most things about SQL Server.
Jim Gray: I believe that’s the case; you could read the entire code base in ’95 to ’97. About ’97, that ended. Fundamentally we went from a world of ten to 20 developers to a world of 1,000. Now there’s an ecosystem. One of the things that holds it together is in fact the COM, web-services structuring paradigm. That is the glue that makes it possible to have all these components co-exist and makes it possible to debug them. Getting that substructure inside of SQL was painful. This was, among other things, a restructuring of the product internally. The components have fairly clean interfaces.
Greg Low: That’s an area that intrigues me in terms of… I do get fascinated with the idea of how you also give a consistency when you have so many different groups of people working on different parts of the product.
Jim Gray: Well, the fundamental thing is that the currency inside of SQL Server is the data set, or TDS. Tabular data stream. We’re gradually moving away from that and toward the web-services model and the data set. Every one of the components takes in a command set. It’s the old ADO model: commands in and data sets out. The thing that allows these components to interact and have fairly clean interfaces is this command-in, data-set-out model. That’s how we can get data mining and cubes and XML and tables and text all to co-habit the same space, and in fact work together fairly beautifully; from the outside it looks like it’s one product. In fact, underneath the covers it’s all an exchange of data sets. I haven’t come to this in a while; I’ve done some extensions to SQL. The fundamental thing that allows me to extend it is that I’m producing a data set. I’m producing a table as an answer, a simple data set. You asked about “well, what about releases every five years?” I had a guy at PASS come up and say to Dave Campbell after his talk “Thank you for not giving us releases every year! It’s great you’re coming out with releases every five years!” Annual releases are very destabilizing. It’s much more convenient for our customers if the rate of change is a little bit slower. The problem is that the rate of change actually is not really that much slower. Now, when you get a release every five years, the release is a huge transformation, as opposed to lots of little ones. The amount of stuff we’ve added in this release and the changes to the whole development environment, Visual Studio, new workbenches and so on, have required all of us to learn a lot of new skills. I’m in the throes of learning these new skills. Many of your listeners are as well.
Greg Low: Yeah. I think one of the things a lot of people ask about why I got involved with the beta programs and so on. I think for myself, one of the reasons I’m so keen to be involved on a daily basis as the product is evolving; it reduces the learning curve. The same with things like the VB insiders group. Even though it produces traffic for me in my email every day, the difference is I learn a little bit every day rather than the day the product finally ships sitting there trying to make sense of it. With the number of things that have been enhanced or changed. The other thing I like is it also means you tend to have a background idea as to why things were done that way as well.
Jim Gray: That’s an organic process; you get to see the evolution of the ideas as they go. It’s quite astonishing how many ideas get thrown out. They were in an earlier release; now they’re gone. There are many things like… On the other hand, now we are blessed with LINQ, which I’m enthusiastic about. We’re now watching the organic process LINQ is going through.
Greg Low: That would be a good point to get onto that. Interesting to see your thoughts for that. Every time something appears, like CLR integration for example, there’s been the discussion about “is T-SQL dead?” LINQ is the next in the line that says “is T-SQL dead?” Interested in your thoughts.
Jim Gray: It is not dead. I’d cite as evidence that there are enough lines of code written in T-SQL that it’s not going to die. There’s enough skill; I’m comfortable writing T-SQL, and I bet you are too. It just is so easy to write. Scripting languages are that way. Frankly, database and C# and C++ are more demanding languages in terms of the amount of programming involved. The programs are a lot bigger than T-SQL programs. I think one of the things that’s kind of interesting is that any CLR program actually has T-SQL at its root. The T-SQL execution engine is what’s invoking that CLR program. One could argue that T-SQL will… Even if you’re writing C# you’re running a T-SQL program. The related thing is that in the development organization there is a T-SQL group, and they’re not brain dead. They’re working hard to make T-SQL better. They are constrained by compatibility issues. They’ve put exception handling and lots of other features into T-SQL, while still keeping it a scripting language, light and late-bound. I think that T-SQL will always be attractive to you and to me and to many of the listeners. It’s so easy to write. It’s so well-integrated with the SQL language. I also think there are professional programmers, people who like to have strong typing and a good object model, who actually want to build fairly large libraries of code. T-SQL helps with lack of skill. With its loose language definition, it is probably not going to be the language of choice for those people. In Visual Studio it’s clear for everyone else. Source code control, project management; it’s come a long way since 2000. Many of the arguments against it, I think, have been reduced by having it integrated with Visual Studio.
Greg Low: I must admit one of the… If I look at any area that frustrates me at times, I must admit it’s probably to do with the looseness of it. That’s also a real strength of it as well. One of the things that frustrates me is there’s not currently a way that I could compile a stored procedure and have it resolve all the object names at the time I build it. The whole idea of deferred name resolution and so on; I completely understand why it’s there. By the same token, I do also wish I had a mode where I could completely resolve them, because I want to find those errors at compile time rather than run time.
Jim Gray: My favorite thing is to create a table in one statement and use it in the next statement. That’s going to be hard to do in a strongly typed language. We talked briefly about LINQ. It has IntelliSense, wonderful features. It pivots on the fact that it’s a compiled model. The data definitions are static; you compile against these data definitions. You don’t just make up a string with a create table in it, execute it so that it happens, and then make up another string and use that table you just made. That’s not going to work.
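Illustration (not from the recording): the "create a table in one statement and use it in the next" habit can be sketched as below. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the function and table names (`create_then_use`, `scratch_results`) are invented for the example. The point is that the table name exists only as a run-time string, so nothing could have checked the second statement at compile time.

```python
import sqlite3


def create_then_use(table_name: str) -> list:
    """Create a table from one string, then query it from a second string.

    Both statements are assembled at run time, so no compiler ever sees
    the table definition before the query that depends on it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Statement 1: the table comes into existence only when this runs.
        # (Naive identifier quoting; fine for a demo, not for untrusted input.)
        conn.execute(f'CREATE TABLE "{table_name}" (id INTEGER, label TEXT)')
        conn.execute(f'INSERT INTO "{table_name}" VALUES (1, ?)', ("first",))
        # Statement 2: uses the table that statement 1 just made.
        return conn.execute(f'SELECT id, label FROM "{table_name}"').fetchall()
    finally:
        conn.close()


rows = create_then_use("scratch_results")
```

A statically compiled query model would need the definition of `scratch_results` before the program runs, which is exactly the trade-off being discussed.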
Greg Low: And that’s a very similar thing when I look at implicit type conversions as well, the same sort of thing. The typing is very dynamic in the language, trying to be helpful. It concerns me in the reverse, that there’s not strong checking. If I say something like “customer dot name equals three,” the T-SQL compiler assumes I meant to write three as a string. A human compiler would say “no, there’s actually something wrong there.” It’s the reverse problem, where that can bury bugs deeply.
Jim Gray: The database Q&R people are much more strict about their interpretation. They don’t do coercion, as it’s called. The people who did T-SQL were very loose. They said that if they could imagine a way of coercing this value to make it work, they would do that. Frankly, one could argue both ways. When I’m in a hurry I like the implicit coercion. When I’m debugging I hate it. Usually this is at the same time.
Greg Low: I think I almost wish there was a way to turn on a switch that says “right now I’m interested in being kind of precise.”
Jim Gray: I think there is actually; that’s what the ANSI flags are about. There’s a thing that says “be very strict, make me quote everything, don’t do implicit conversions,” and so on. There’s a flag. I haven’t turned it on; I think that’s what it does. It says “be very strict, don’t do any of that conversion stuff for me.” There’s a reason I’ve never turned it on. It would break all my programs.
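Illustration (not from the recording): the trade-off between "coerce if you can imagine a way" and "be very strict" can be shown with two toy comparison functions. This is a hypothetical sketch in Python; it does not model T-SQL's actual conversion rules or any real SQL Server setting, and `loose_equals` and `strict_equals` are names made up for the example.

```python
def loose_equals(left, right) -> bool:
    """Compare two values, coercing to a common numeric type if possible.

    Mimics the "helpful" style: a mismatch never raises, it just quietly
    evaluates to False, which is how bugs get buried.
    """
    if type(left) is type(right):
        return left == right
    try:
        return float(left) == float(right)
    except (TypeError, ValueError):
        # No coercion worked; silently treat the values as unequal.
        return False


def strict_equals(left, right) -> bool:
    """Compare two values, refusing to guess when the types differ."""
    if type(left) is not type(right):
        raise TypeError(
            f"cannot compare {type(left).__name__} with {type(right).__name__}"
        )
    return left == right


# Convenient when in a hurry: the string "3" matches the number 3.
hurried = loose_equals("3", 3)

# Dangerous when debugging: comparing a name to 3 is almost certainly a
# mistake, yet the loose version just answers False and moves on.
buried = loose_equals("Smith", 3)
```

Calling `strict_equals("Smith", 3)` raises a `TypeError` instead, which surfaces the mistake immediately but would also break every program that relied on the loose behaviour.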
Greg Low: Mentioning LINQ before; I’m interested in your thoughts as to the whole development of that. I was in a software design review in May where they were showing us different versions of that. I must admit I thought the VB syntax looked more natural to me than the proposed C# syntax. The whole idea of integrating that into the language, your thoughts there?
Jim Gray: Wildly enthusiastic. That’s my first reaction. Microsoft has been not very good at supporting embedded SQL, which is, in fairness, a very static way of doing things. On the other hand, it allows you to get this compile-time checking. Embedded SQL always had this mismatch between the variables of the language, be it C++ or Smalltalk or Python, between their type system and SQL’s type system. One of the things that we’ve done is to make sure that the types of SQL are the same as the types of .NET. That’s something we want. Step two is we wanted to do an object-relational mapping. ObjectSpaces was a kind of object-relational mapping. If you look in Access you see object-relational mapping. LINQ takes a minimal approach to this. It says “a row of a table is an object, and a table is a class.” You can do the same thing for views and so on. It’s a minimalist approach to object-relational mapping. You immediately get, for every table, a class. Then you get a very natural programming model, which is the database and .NET in general. C# has this notion of enumerable. Our tables are enumerable. So is anything that is a list or hash table or dictionary. We can now do foreach on the table. We can also do foreach on the answer to a query. The idea of LINQ is that you can, in your program, treat tables as just natural collections that are enumerable and mess with them. Cursors go away. Wow, what a wonderful thing. That went on forever; I was forever getting confused as to what the syntax is, whether it’s open or closed. LINQ is going off and doing all that junk for me. The syntax it’s chosen is a little screwy to somebody who’s been writing SQL for the past 30 years. The “from” comes first, and the “select” and the “where” come later. That’s so IntelliSense works.
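Illustration (not from the recording): the minimalist mapping Jim describes, where a row is an object, a table is a class, and a query result is something you simply loop over, can be approximated in a few lines. This is an analogy only, not LINQ itself: it assumes Python's built-in `sqlite3` module in place of SQL Server, and the `Customer` class, the `customer` table and the sample rows are all invented for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Customer:
    """One class per table; one instance per row."""
    id: int
    name: str
    city: str


def customers_in(conn: sqlite3.Connection, city: str) -> Iterator[Customer]:
    """Yield typed Customer objects; the cursor handling stays hidden in here."""
    cursor = conn.execute(
        "SELECT id, name, city FROM customer WHERE city = ? ORDER BY id",
        (city,),
    )
    for row in cursor:
        yield Customer(*row)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ada", "Sydney"), (2, "Bob", "Perth"), (3, "Cy", "Sydney")],
)

# The caller iterates over the query result like any other collection.
sydney_names = [c.name for c in customers_in(conn, "Sydney")]
```

The caller never opens, fetches from, or closes a cursor, and gets named, typed fields (`c.name`) rather than positional results copied one by one into variables.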
Greg Low: Going to say we had a session with a guy who was trying to do T-SQL IntelliSense. Was in the first beta of 2005, but was removed. I must admit after spending an hour with him I have a new found respect for how complicated that is.
Jim Gray: Indeed. So, at any rate, the question you asked was “what’s the story on DLinq and XLinq?” Both of them are going to become extremely popular with the crowd who likes to program in database. In C#. It’s one of the things that might attract you away from T-SQL. It really is early binding. The people in T-SQL have an easy way of accessing data. If you write ADO.NET programs, you have to somehow tease the outputs of a SQL statement into variables by saying “give me the first result and put it in this variable. Give me the second result…” The amount of gunk that you need to write to get the null program to work is disgusting. That’s the big selling point for LINQ; it’s so easy to get started. So, I’m a big enthusiast of it, although in fairness I haven’t written very much code in LINQ. I’ve written a couple of demo programs, but nothing in anger. I’ve been so busy learning about SQL 2005 and data mining and the unified dimensional model and XML and XQuery and so on. I haven’t had time to pay much attention to LINQ.
Greg Low: What are your thoughts on the type system, where we’ve got the CLR types as opposed to the SQL types? How do we get to the point of resolving that, given the fact we’ve also got the ANSI committees? It’s a mismatch.
Jim Gray: The way it’s going to be. It may seem like a mismatch, but I just declare everything to be a SQL type and it works out great.
Greg Low: Yeah.
Jim Gray: I take the strings and I cast them. I cast the SQL to a regular string. It is a bit of a pain, I agree. The problem is that… I was there, we argued. I’m a programmer, I argued against it. “They make my life miserable!” A friend of mine writes the null-memo, an impassioned plea that we get rid of nulls. We had lunatic theory guys who thought this was really cool stuff; they loved nulls. We’re stuck with them. I’ve given up, resigned myself to living in a world in which there are nulls and I think it’s something of us. Give me the strength to change the things I can and accept the things I can’t. Nulls I’ve given up.
Greg Low: I’ve done quite a lot of programming in various languages over the years. Some of the simplest projects have been ones where the people have just accepted that nothing in the database is nullable. The flow-on effects from that. The simplicity of the code was just amazing.
Jim Gray: When I create a table, everything is not null. Just to be clear. I wish it were default to make that the way things go. I say not null, not null. I’m blessed with databases that other people designed. Whenever I use those it’s not universal. Pretend nulls don’t exist.
Greg Low: One interesting thing I’d like to get your thoughts on is the object purists who just see the database as a sort of repository for an object. One I was asked about; someone passed me a thing to have a look at. What had me fascinated was how they mapped the objects. They had a hierarchy of objects. You might have an estate, a type of asset and so on. In the end what they would do is take what is effectively the base class and build a table that had all of the properties of that class as columns. What they then did, for all the derived classes, was add all of those columns in as well. Rather than modeling estates as one table and assets as another, they found the base class and then modeled the whole thing as… In the end everything was an asset at the bottom level. They added column after column. It just seemed like the ugliest design when I looked at it. Is that where we’re headed when the object guys say “I need to put my objects in the database”?
Jim Gray: Not necessarily. That’s called the universal relation. You just add columns to one table. The problem is that the table you end up with is very sparse. Call that the fat table approach. There’s a different approach, the skinny table. You have one table that is the pivot of that. That table has essentially four columns. It has the object identifier, column identifier, values. Three columns. You can pivot table number one to get table number two, and unpivot to get table number one. Those two representations have one thing in common. They are terrible performers. How do you do indexing? For certain operations, like updating an individual field, they’re great. For other operations, they more or less suck. As we said when we talked about LINQ, they take a minimalist approach to object-relational mapping. If you have a class, that’s a table; if you’ve got a table, that’s a class. That’s pretty minimalist. It requires that the classes don’t have very much inheritance in them. It’s mute about how inheritance works in that model. If you can imagine, one possibility for the way inheritance works is the way you described: the universal relation model. That doesn’t explain to me, if we have two children of a parent, what the table looks like then. We’re in two different directions. Each leaf class has a separate table. What does the parent table look like? The different strategy is that every subclass gets the key of the parent, and then gets the members that are unique to that subclass.
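Illustration (not from the recording): the fat (universal relation) and skinny (object identifier, column identifier, value) layouts are two views of the same data, and converting between them is the pivot and unpivot being described. The sketch below uses plain Python dictionaries and tuples rather than real tables; the asset columns and values are invented for the example.

```python
def unpivot(wide_rows: dict) -> list:
    """Fat table -> skinny table.

    wide_rows maps object_id -> {column: value}, with None for cells that
    do not apply. The result holds one (object_id, column, value) triple
    per non-empty cell, so the sparseness disappears.
    """
    return [
        (object_id, column, value)
        for object_id, cells in wide_rows.items()
        for column, value in cells.items()
        if value is not None
    ]


def pivot(triples: list, columns: list) -> dict:
    """Skinny table -> fat table, restoring None for every missing cell."""
    wide_rows = {}
    for object_id, column, value in triples:
        row = wide_rows.setdefault(object_id, dict.fromkeys(columns))
        row[column] = value
    return wide_rows


COLUMNS = ["address", "bedrooms", "ticker"]

# One fat table for every kind of asset: most cells are empty.
fat = {
    1: {"address": "1 Main St", "bedrooms": 3, "ticker": None},
    2: {"address": None, "bedrooms": None, "ticker": "MSFT"},
}

skinny = unpivot(fat)
round_trip = pivot(skinny, COLUMNS)
```

Updating one field in the skinny form touches a single triple, but reassembling a whole object means collecting every triple for its identifier, which hints at why both layouts can perform poorly for general queries. Note that an object whose cells are all empty would vanish in the skinny form.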
Greg Low: Otherwise you’d eventually say “in the end everything’s an object.” You’d end up with one table with every single column in it.
Jim Gray: Indeed, that’s the universal relation. It’s easy to have religious debates about what’s the right way of doing this. I think most of your listeners have their opinions; I’m not going to persuade any of them. By and large I think the LINQ guys have a minimalist approach. That’s approximately the approach I’ve been taking. Time will tell; we’ll see more and more of this over the years. Eventually something will emerge as being a good style. Or two or three styles.
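Illustration (not from the recording): the alternative strategy mentioned above, where every subclass table carries the key of the parent plus only the members unique to that subclass, might look like the schema below. It is a sketch assuming Python's built-in `sqlite3` module in place of SQL Server; the `asset`, `estate` and `share_holding` tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Parent class: members shared by every kind of asset.
    CREATE TABLE asset (
        asset_id    INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    -- Each subclass table reuses the parent key and adds only its own members.
    CREATE TABLE estate (
        asset_id INTEGER PRIMARY KEY REFERENCES asset (asset_id),
        address  TEXT NOT NULL
    );
    CREATE TABLE share_holding (
        asset_id INTEGER PRIMARY KEY REFERENCES asset (asset_id),
        ticker   TEXT NOT NULL
    );
    INSERT INTO asset VALUES (1, 'Family home'), (2, 'Tech shares');
    INSERT INTO estate VALUES (1, '1 Main St');
    INSERT INTO share_holding VALUES (2, 'MSFT');
    """
)

# A full "estate object" is the parent row joined to its subclass row.
estates = conn.execute(
    "SELECT a.asset_id, a.description, e.address "
    "FROM asset AS a JOIN estate AS e ON e.asset_id = a.asset_id"
).fetchall()

# Code that only cares about the base class reads the parent table alone.
all_assets = conn.execute(
    "SELECT asset_id, description FROM asset ORDER BY asset_id"
).fetchall()
```

No table is sparse and every column can be declared NOT NULL, at the cost of a join whenever a complete subclass object is needed.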
Greg Low: Do you think we’ll see inheritance appear in T-SQL anywhere?
Jim Gray: That’s a good question. Since it doesn’t have a class concept at all, I’m not sure how inheritance would come in. It would require… T-SQL is so loosely typed, it doesn’t actually have… Its only classes are tables.
Greg Low: I think what I’m thinking is if I, for example I build a stored-procedure that takes “animal” and I can’t pass “cat” to that. Wonder how that works.
Jim Gray: Stored procedures do have signatures, that’s true. The signatures are pretty loose. Not just that, but they return either integers or strings or tables. Tables have signatures. If you think of a table as a class, you might say that we actually have… Can’t pass certain tables as parameters. It’s pretty well-functioning with stored procedures. Something that drives me crazy is all the funny rules about what you are and aren’t allowed to put in a function. The T-SQL guys are at a sweet spot where it’s easy to write T-SQL; the language is very loose. To the extent that they start tightening it up, with more type definition, making it more and more like C#, they’ll be on a slippery slope where they give up a lot of the benefits they have. It’s not going to be as clean as C#, ever. My advice would be to say accept it; it’s a great scripting language. There might be competitors; it might go in the direction of Perl and Python.
Greg Low: That’s a good point to take a break. We’ll come back after the break and talk about some futures for it.
Greg Low: What I’ll ask you is to share anything you want about yourself, life, where you live? I gather you’re based in San Francisco as opposed to everybody else in Redmond?
Jim Gray: We have a small research lab here in San Francisco. There are about five of us here. Our focus is on scalable computing and also on personal media management. It’s important to people to record everything about their lives; we build an information management system that allows them to organize their information, find things, and make presentations from it.
Greg Low: There was a video on the Channel 9 site about that.
Jim Gray: Very exciting. The best poster child I could think of with both of us. Better infrastructure than the QT stuff we built as a sub-structure. The other thing we’re doing is working on scalable servers. Probably the most famous thing we’ve done is the TerraServer. We’ve been working with the SQL Server guys. We did a prototype of something for the database mirroring and SQL tool device. We’ve done some interesting work called RAGS, which amounts to running the SQL parser backwards to generate SQL queries, then running the SQL queries against Oracle, DB2 and SQL Server and seeing if you got the same answer. If you got the same answer you concluded all the systems were doing the same thing; that’s good. If you got a different answer, “this is probably a bug.” If four of the five gave the same answer and the fifth gave a different answer, it was likely the fifth was wrong. We concluded once that the fifth was right and the other four were wrong. This was at a time (mid-90s) when we felt that SQL Server was bad and that Oracle and DB2 and others were better. We concluded that they all have their shortcomings. The SQL guys have been using RAGS since then to do regression tests. As the test for SQL Server, it has brought a big improvement in the correctness of the optimizer. Personally, I’ve been working with the astronomers to build the SkyServer, trying to get all the world’s astronomy data online. We’re experimenting in using databases to manage scientific data. One of the things that came out of that went into 2005: we put a special index that the astronomers find useful into the samples of 2005. It’s actually pretty good at finding points near points and points in common. People have picked up on it and used it in SQL Server for their applications. Several people have used it without SQL Server at all, as their way of doing proximity in their applications.
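Illustration (not from the recording): the comparison step Jim describes, where the same query runs on several systems and the odd one out is flagged, can be sketched as a majority vote. This is a hypothetical sketch, not the RAGS tool itself: it skips query generation entirely, the engine names and result sets are invented, and `find_suspects` is a name made up for the example.

```python
from collections import Counter


def find_suspects(results_by_engine: dict) -> list:
    """Return the engines whose answer differs from the most common answer.

    results_by_engine maps an engine name to the rows it returned for the
    same query. Rows are sorted first, because a query with no ORDER BY
    may legitimately come back in a different order on each system.
    """
    normalised = {
        engine: tuple(sorted(rows))
        for engine, rows in results_by_engine.items()
    }
    majority_answer, _count = Counter(normalised.values()).most_common(1)[0]
    return sorted(
        engine
        for engine, answer in normalised.items()
        if answer != majority_answer
    )


# Four engines agree (ignoring row order); the fifth dropped a row.
results = {
    "engine_a": [(1, "x"), (2, "y")],
    "engine_b": [(2, "y"), (1, "x")],
    "engine_c": [(1, "x"), (2, "y")],
    "engine_d": [(1, "x"), (2, "y")],
    "engine_e": [(1, "x")],
}
suspects = find_suspects(results)
```

A disagreement is only a lead to investigate, not a verdict: as noted in the conversation, on one occasion the lone dissenter was right and the four that agreed were wrong.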
Greg Low: How did you come to have an interest in the astronomy side of things?
Jim Gray: It was happenstance really. We were doing TerraServer, and an astronomy group we had been working with had a telescope; they had data coming. They had been building a data management system, and it wasn’t working very well. They came to us and asked for help. We said we were busy; they went away, then started asking really good questions. The next thing I knew I was helping them because they asked such good questions. I was learning more than they were. It’s been like that ever since. One of the things the TerraServer brought up is the fact we didn’t do spatial data very well. Working with them has advanced my understanding and Microsoft’s understanding of how to manage spatial data. The TerraServer was a spatial data warehouse. It’s pixel space; people mousing around the pixels, “give me those pixels and those.” There’s very little database in there in the sense of complicated SQL. The astronomers have these substantial catalogues, multi-terabyte, that they want to run ad-hoc queries against. The queries are over very high-dimensional data. There are hundreds of columns in the tables. Queries could go on for a page. Building the system that can take a random query off the Internet, that’s what we do, and then run it and get reasonable answers in reasonable time stands as a real challenge. We’re working on that.
Greg Low: The whole idea of spatial areas is kind of interesting at the moment. The public has a big interest in that since things like Google Earth and TerraServer, all those types of areas.
Jim Gray: They do.
Greg Low: Just started to realize what you might be able to do.
Jim Gray: We’ve been doing TerraServer since 1997; it’s been on the Internet. It was popular, still is. In market today. That’s a big deal. It was especially still small potatoes. Google Earth came along, and people started doing mash-ups of Google Earth with other things. People had been doing mash-ups with TerraServer. It has had the web services since 2001, when it first came out. It’s been written up in various things. I’m not really quite clear why Google Earth caught people’s imagination so much more. The Craigslist merger with Google Earth was… When I saw it I said “wow, that is really cool!” Some people ascribe it to Ajax, but that’s been around for…
Greg Low: For a long time.
Jim Gray: Yes, a long time. Maybe the fact that the Netscape, Firefox finally supports Ajax made the difference for this community. Supporting it for a good long time.
Greg Low: Where do you see applications developing in those areas?
Jim Gray: I think about the fact that we are at about one billion cell phones, and we’re heading for six billion, 60 billion cell phones. Every piece of smart dust is a cell phone. Location for those is going to be absolutely central. There are four easy dimensions that we have in organizing information; three of them are space and one of them is time. Beyond that it’s brutally hard. Those four work in China, Russia, Israel. The minute you get into words, people have different concepts. I think red means stop and danger, and in China it means happiness. The color dimension is not one that has universal meaning. Languages are all different across the planet. Space is a big deal, time is a big deal. Certainly there’s a lot of information in text. Inside a culture, text is probably the main dimension that you use to organize information. Across cultures, space and time are it. That means that I think cell phones are really going to thrive in this. They are the ubiquitous computer of the future; the PC of the future. One of the fundamental things they do is they give you spatial locality. There’s a new generation of GPS coming that works inside of buildings. It is very inexpensive, good to within a meter.
Greg Low: The other thing these cell phones have generated, with the popularity of SMS, is almost another version of written language.
Jim Gray: Yeah.
Greg Low: It intrigues the English scholars as to the overall effect of that eventually.
One of the things I’ve found fascinating, if you’re building a system to try and correct or interpret text messages, is that now, with predictive typing on the phones, there are words that end up with the same key sequence, and people commonly send the wrong word instead of the word they actually mean. That’s an interesting deciphering exercise as well.
Jim Gray: I have spelling correction on my email; it’s often fairly hilarious, embarrassing. At any rate, the theme here is that very large databases are in our future. Many of them are going to be spatially organized. That’s one of the dimensions you get – time and space; the dimensions you get for free. Much of the source of information is going to be cell phones. You’ll have metadata about who the owner of the cell phone is; other metadata. One of the things you lack is location. Frequently, people will ask “tell me about the things nearby.” That will become a very common query. People talk a lot about search. Usually people are doing search because they’re trying to accomplish some task. The task is the thing the people are routinely looking for to do something. Oftentimes what they want to do depends on where they are. Gas station, restaurant, a place to park their car. All these things have to do with location-constraints. Spatial search and spatial databases are a big deal. Already, going to become even more central in the next decade.
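Illustration (not from the recording): the "tell me about the things nearby" query can be sketched as a distance filter around the caller's position. This is a minimal sketch under stated assumptions: it uses the standard haversine great-circle formula and a linear scan, whereas a real spatial database would use a spatial index; the place names and coordinates are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(places: list, lat: float, lon: float, radius_km: float) -> list:
    """Names of the places within radius_km of (lat, lon), nearest first."""
    hits = [
        (haversine_km(lat, lon, place_lat, place_lon), name)
        for name, place_lat, place_lon in places
    ]
    return [name for distance, name in sorted(hits) if distance <= radius_km]


# Hypothetical points of interest as (name, latitude, longitude).
PLACES = [
    ("car park", 37.8044, -122.2712),
    ("restaurant", 37.7800, -122.4100),
    ("gas station", 37.7750, -122.4195),
]

# "What is within two kilometres of where my phone says I am?"
close_by = nearby(PLACES, 37.7749, -122.4194, radius_km=2.0)
```

The scan visits every place on every query, which is fine for three rows and hopeless for billions of phones; that gap is why spatial indexing matters for the workloads being discussed.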
Greg Low: Yeah. That’s fascinating. The other thing I suppose, for SQL Server itself, have you got feelings as to directions it might evolve?
Jim Gray: Well, I came to Microsoft to do scale-out. We spent the last decade doing scale-up and scale-down. With SQL Server running on Windows CE, and on 64-processor, one-terabyte-RAM, 6,000-disk systems, we’ve done a reasonable job of scale-up and scale-down. SQL Server grinding out a million transactions per minute is awesome. Stunning. One of the things we haven’t done a good job of is scale-out: allowing people to build arrays of hundreds of machines, each running an instance of SQL Server, and managing the array as a single system. Dropping data into the array and having it be self-organizing. Dropping queries into it and having the queries run automatically. I think that over the next five years we will finally deliver on our scale-out strength. When I talk to our customers, talk to our sales people, talk to everyone, the one feature that we haven’t delivered on is what Oracle calls RAC. We’re getting beat up pretty badly about that in the field. It’s the one thing we don’t do. That fact hasn’t escaped the people. They realize there’s this fairly big hole in their story. They need to fix the hole. I’d say we made a decision; you’ve probably heard Peter give this talk. We made a decision not to chase the DB2 and Oracle tail pipes, and to make SQL Server the next-generation problem-solver instead of the one for the last generation. We added data mining, auto-management, XML support; forward-looking things. We had limited resources, so we let a few things slide. We let scale-out slide. What you’re likely to see in the next five years is first filling out the framework that you have seen in 2005. Many things were thrown out of the lifeboat just before 2005. Many things are slipping into the actual release. The one that got away is there was a step of what’s going to happen absolutely next. The next SQL has a bunch of things you can already see when the beta is out. Both of the features that are coming are more or less in the can. The thing after that is things like LINQ. Better integration with Visual Studio, more data-mining rules. Deeper XML support. More web services; SQL Server is now a web server.
Greg Low: That’s an interesting question; I’m wondering when you start to solve the scale-out problem, where do you find that that then leaves multi-tier architectures? Does everything collapse back inside SQL Server as an application server at that point?
Jim Gray: I think that’s already happened with 2005. That’s a fairly radical view, but it’s a very radical view. With Service Broker built into SQL Server, web services built into SQL Server, SQL.net built into the server, you don’t need IIS at the front end anymore. The only reason to have IIS is to act as the firewall; that is part of the demilitarized zone. SQL Server has a large attack surface, so you might not want to put it out on the Internet. On the intranet, perhaps even on the Internet, you could put SQL Server out and have requests come directly to it. Our scale-out story will have a very large web services component too. We talked earlier about how you get the ecosystem to work and how you get things to federate together: you need clean interfaces. Web services are clean interfaces.
Greg Low: That’s great, that brings us out of time. The other thing I would ask about is what’s coming up next in your world? What’s happening?
Jim Gray: I’m doing an enormous number of things. I talked about the astronomy work, the TerraServer work, engineer spatial. One of the things I did not mention is that we are working very hard to get scientific literature online as well as…. There is something called PubMed Central. It’s your own international library. It’s a SQL Server database that has the abstracts, mostly in XML-ish format. All the medical literature. The U.S. Congress’s mandate is that any research it sponsors, that the National Institutes of Health sponsors, should be deposited and be public six months after it’s published in a journal. Taxpayer access. If you get an exotic disease you can see the research that your tax dollars paid for rather than having to pay $50.00 to get an electronic copy of it.
Greg Low: Becomes interesting as to how that works across country boundaries. Which countries?
Jim Gray: That’s where we’re headed. The U.K. has a similar thing. The Australian government and South African government have mandated the same thing. They don’t want their libraries to be first. We’ve made a portable version of PubMed Central. It’s installed in the U.K., Italy, and South Africa. There are going to be versions of it in Japan and various other places. The copies federate with one another using web services. When a document is deposited in one place it goes to all the other places. Interesting database problem, as you can imagine. It’s the poster child for XML and web services, and a great SQL Server story. I’ve been working with them on that. The related project is the conference management tool that Microsoft has, another SQL Server database that makes it easy to create program committees and run conferences. We’re looking at making that an easy way of making all the access journals. We’re trying to get information at your fingertips for information that is public, and make it really available to people. That’s a very exciting project; it’s going to have a big impact on many people’s lives. Someday I expect to need some medical information and be able to go to the Internet. Medical literature online; fully accessible.
Greg Low: Thank you so very much, we’re honored to have you.
Jim Gray: Thank you; appreciate your organizing it, great to talk to you. I look forward to talking to you again soon.
Greg Low: Indeed, thanks.