Microsoft SQL Server In-Memory OLTP Project

[SQL Server "Hekaton": Developer Deep Dive - Sunil Agarwal ([email protected])] Thanks for coming to this session. I am really excited that you guys are here to listen to "Hekaton." This is a technology we have been working for 4 years, and this is going to be in SQL 14, so this is really exciting times for us. My name is Sunil Agarwal. I am a program manager with SQL Server team. I've been with Microsoft for 10 years. And as I said, I'm really excited to be here. Before I proceed, can you just show me by raising your hand how many of you have been to overview session that we had yesterday. Okay. Approximately 50% to 60% of you have been to overview session. So what I'll do is I have one slide which will just recap what Hekaton value proposition is, and then what we will do, we will dive into how do you create tables which are memory optimized, how do you access those, what is the concurrency, and how do you migrate your application. Those are key things. There's a session that I have tomorrow which will go into the storage because not only it's in-memory tables, but we provide full durability. It's not like you shut down the server, your data is gone. It has fully ACID properties. So a lot of people when they think of in-memory, they say, "What happens if the system crashes, if there's a corruption?" So key thing to remember is with in-memory technology you have full control on data durability. In fact, we will provide integration with always on so you can have full failover, and that kind of functionality is not available at this time in CTP1 but will be available in CTP2. So that's the thing. Please feel free to ask questions because my goal is to understand where the issues are, and I'll try to answer those questions. And if the question becomes too detailed, we can take it offline. Sounds good? Okay. So Hekaton recap. What we have is we know based on the hardware trends that we are seeing you have machines available with lots of physical memory. Agree? I mean, you are seeing machines, right? And what we are seeing is in many, many cases the OLTP database can in fact fit into physical memory. Now, some of you will say, "No. My database is terabyte. It cannot fit in physical memory." And understood, but the point is you may have parts of an application that are performance hogs, they are not performing well. So what we are saying to you is you do not have to move all your database into memory. What you can do, you can say, "These 3 tables are the performance-critical tables for my database." "Let me just move those tables to in-memory and rest of thousand tables can stay the way they are." So the key point is we just have to move the tables that are performance-critical. And the way we achieve performance is we have optimized the access to those tables, the storage to those tables in a way that we can get extreme performance. And you will see that. And what we have seen is in many cases we are able to get 10X. In some cases, like bwin is a customer that is in production with Hekaton technology, they are getting 25X. So that kind of numbers we are getting. So it is optimized for in-memory. The second thing that we have done is to make things run much faster, what we have done, if you take a SQL stored procedure— everybody has SQL stored procedures, right?— and you compile them so they create stored procedure, it optimizes, it compiles, but the key difference with today's technology is when you optimize a SQL Server stored procedure and you execute that, it doesn't interpret it. 
What that means is that today we compile the stored procedure into a plan of very high-level operators. For example, you have a SELECT * FROM a join of 2 tables, so the query plan that you see will have a join operator, and it has 2 children feeding it: one is a scan of table t1 and the other is a scan of table t2. And then what happens at execution is: open table t1, get the next row, and then join, right? So this is how it is executed today, and it is quite fast. Many, many applications run extremely fast in this mode. But what we did was we said instead of interpreting the stored procedure tree, why don't we convert that? This is actually pretty amazing. We convert that to a C program and then use a C compiler to create a DLL. So now when you execute the stored procedure, it is not interpreting; it is running machine code. And that's what we call native stored procedures. Okay? And what we have seen is that by doing that we are able to reduce the number of instructions needed to execute the stored procedure, and that makes it run even faster. This is actually a pretty huge thing for us.
The third thing we have done—actually, let me ask you this: How many of you have locking or latching issues in your application? Right? This is a common thing. And it is not a bad thing, in a way, because databases by definition are concurrent. The locking and latching happen because concurrent data access has to be managed. What we have done with in-memory technology is we have moved away from locking and latching. And the way we were able to do it is that, because we know the data lives in physical memory, we do not need pages. It's a longer discussion, but the point is that the storage format of in-memory data is very different from what it is in traditional SQL Server. So we have moved away from latching. It's all optimistic concurrency control. With that, you can scale well, and there are many application patterns this helps. My favorite example is last page insert. If you have an increasing clustering key and a lot of people are inserting rows in increasing clustering key order, they all hit the last page, and what happens is you get latching contention. And we have seen it does not scale beyond 12 or 16 cores. Those kinds of problems just go away. What we are providing with in-memory technology is friction-free scale-up. And this is actually very important in today's terms, because now you are getting machines with 256 cores. The question is: I have 256 cores available in my machine, how do I take advantage of that? If I do not scale beyond 16 for certain patterns, that's not good. But with in-memory technology you can scale.
The fourth thing, which is very important again, is that in-memory technology is part of SQL Server. It is not a new product. It is an engine that sits inside SQL Server. And we have integrated it in such a way that all the tools, all the manageability, are all integrated. So it's not a new product. All you have is a very seamless experience. You have a database that you use today. You can say, "I'm going to migrate these 2 tables to in-memory." You just move those 2 tables to in-memory. Now what you have is the same database. It's not a new database. Same database, 2 tables in-memory, the rest are regular database tables, and you are working, and you can just do the backups and the log backups. Everything just works without changes. So this is the power of integration. Diagnostics, the DMVs, everything is all integrated. So that's the solution that we are providing with in-memory technology.
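For readers who have not hit it, here is a minimal T-SQL sketch of the "last page insert" pattern described above, the classic latch-contention case that memory optimized tables eliminate. The table and column names are illustrative assumptions, not from the session.

    -- Illustrative disk-based table with an ever-increasing clustered key.
    -- Concurrent inserts all land on the last B-tree page and contend on its latch.
    CREATE TABLE dbo.EventLog
    (
        EventID   BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        EventTime DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
        Payload   NVARCHAR(400) NULL
    );

    -- Many sessions running this at the same time all insert at the end of the index:
    INSERT INTO dbo.EventLog (Payload) VALUES (N'order received');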
Any questions so far? I'll be happy to take questions. Or I'll go into the next slide. Okay, good.
[Memory optimized tables and indexes]
The question is, I mean, we all know how data is managed and stored in SQL Server today. You create a table and then you insert rows, and what happens underneath is SQL Server packages those rows into pages. Pages are 8K in size. And it can pack rows, and on and on, right? But with in-memory technology, we know the data is in memory, so we do not have to package it into pages. So let's see how you create an in-memory table. Here what you have is the DDL for the CREATE TABLE, and a few things stand out. First you see that I have a new kind of index called a hash index. Remember, in SQL Server today you have clustered indexes and nonclustered indexes. All are B-trees, right? We are introducing a third kind of index. It's a nonclustered index; it's just that it is a hash index. So what we are saying here is it's a hash index, and all of us have used hash tables in computer science, and there's a size, a number of buckets, for the rows. So this is saying my customer ID has a hash index, it's a primary key, and it has a bucket count of 1 million. Bucket count is an interesting number. For example, if you have 10 million rows and your bucket count is 1 million, if you had a perfect distribution, each bucket will have 10 rows, right? And that means if I want to search for a row, I get to the bucket and then I search. In the worst case, I have to hop through 10 times to get my row. So what we are saying is the number of buckets you define should be twice the number of rows that you expect. That way you can minimize the traversal. But that's a detail. So there's a hash index, and I can have multiple hash indexes in my table. So here on the next column, Name, I have another hash index. A few things to note here: this is a new option that we have. It is memory optimized. That's what it says. That's what SQL Server has to know: this is memory optimized; I do not have to use pages. When you say that, the data can only live in physical memory. You have to make sure you have enough physical memory for that table. The second thing we have done is—actually, this is a very interesting point—durability. It has schema and data. What it means is if you shut down and restart, you will get the table back, you will get the data back. For SQL Server users, this is not news, of course. The data has to come back and of course the table has to come back. We have another option for in-memory tables, schema only. What it means is that when you restart SQL Server, the table will get created but with no data. This is a very important scenario. Think about an ETL kind of scenario, where you want to load data into a table but you do not want to do logging, you do not want to do any storage. Why? Because in case something happens, you have the data available to you in a file outside or in a different system. This is what we are providing: a nondurable in-memory table. And this has been an amazing feature. A lot of customers who are doing ETL kinds of scenarios, instead of using temp tables today, can leverage a nondurable table, which is part of in-memory OLTP, to load the data and to massage the data. We already talked about the hash index and the bucket count. And as I said, there are secondary indexes. So this is pretty much all you need to do to say this table is memory optimized. There's a question in the back.
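For reference, here is a minimal T-SQL sketch of the kind of DDL being described. Table and index names are illustrative; the syntax follows the released SQL Server 2014 feature and may differ in small ways from the CTP build shown in the session.

    CREATE TABLE dbo.Customer
    (
        CustomerID INT NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        -- In SQL Server 2014, string index key columns on memory optimized tables
        -- require a BIN2 collation.
        Name NVARCHAR(100) COLLATE Latin1_General_100_BIN2 NOT NULL
            INDEX ix_Customer_Name NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        City NVARCHAR(60) NULL
    )
    WITH (MEMORY_OPTIMIZED = ON,
          DURABILITY = SCHEMA_AND_DATA);  -- or SCHEMA_ONLY for the nondurable, ETL-style table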
[audience member] You said it has to fit in memory. What happens if I create my table and it now grows in size so it exceeds memory?
[Sunil A.] That's right. This is a common point. The question really is what happens if my table grows beyond the physical memory we have. In that case, we will start failing the inserts. So for example, you had 64 gigabytes of physical memory and, in a simplistic case, assume that your table grew beyond 64 gigabytes. It will start failing. It is not any different from when you run out of log space or run out of disk space. So what we are saying is you have to provision your system appropriately so that you can guarantee that you have enough physical memory. Now, we are providing enough monitoring capabilities to see where memory usage is, so if you're running out of physical memory, one thing you can do is add more physical memory to the box and increase the SQL Server max memory parameter. But that's a great point. It can hit a roadblock. So I'll take a question here and then, yeah, carry on.
[audience member] To follow up his question, can you limit how much memory those tables use?
[Sunil A.] Great question. What we provide is we can limit that. You can say this Hekaton database cannot take more than this much physical memory, and we will just control that. And we do that through Resource Governor. One of the things I should say is that this is a product in development, so that memory control is not part of CTP1, but it is coming in CTP2. A question here, then I'll move on. Yeah.
[inaudible audience member question]
[Sunil A.] The question really is: I have 128 gigabytes of physical memory that I have allocated to my SQL Server, right? You said max server memory 128. What you can do is—and you have 5 databases, right?—you can say, "This database has memory optimized tables and I only want to give 16 gigabytes to those memory optimized tables." So you can create Resource Governor kind of functionality and say, "This database can only take 16 gigabytes." So the rest will not be impacted. So we provide those kinds of controls. But still, you have to make sure that you have enough physical memory for your tables.
[inaudible audience member question]
[Sunil A.] That's correct. So this is actually interesting. The question really is: that 128 gigabytes that I have for SQL Server, does the Hekaton memory come from that? The answer is yes, because this is an integrated thing. So it's the same SQL Server. It's just giving out memory in a different way. I'm going to proceed.
With in-memory technology you have the CREATE TABLE DDL that we just saw, and what we do is we compile that into a C program. Even for the CREATE TABLE we create a structure in C, and I'm going to show you the structure, how it looks. It looks pretty amazing. This is what the table is. And we get a DLL, and when CREATE TABLE is successful, the DLL is in physical memory. So that's how we have a table, unlike a regular SQL Server table, which is not a DLL. Both memory optimized tables and regular SQL Server tables are SQL Server tables, so I'm going to use the term memory optimized to refer to these new tables, and I'm going to use disk-based tables to refer to traditional SQL Server tables. When I use the word disk-based, it is misleading in the sense that even memory optimized tables need to be stored on the disk to recover from crashes and corruptions.
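As a reference for the Resource Governor control mentioned above (which the speaker notes is not in CTP1), here is a hedged sketch of the mechanism that shipped with SQL Server 2014. Pool and database names are illustrative, and details may differ from the builds discussed in the session.

    -- Create a pool that caps memory, and bind the database containing the
    -- memory optimized tables to it.
    CREATE RESOURCE POOL HekatonPool WITH (MAX_MEMORY_PERCENT = 25);
    ALTER RESOURCE GOVERNOR RECONFIGURE;

    EXEC sp_xtp_bind_db_resource_pool 'HekatonDemoDB', 'HekatonPool';

    -- The binding takes effect the next time the database is brought online.
    ALTER DATABASE HekatonDemoDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE HekatonDemoDB SET ONLINE;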
So just for simplification, memory optimized means this, and disk-based means what we know and love, the page-based tables. Okay. So now the question is how do we store the data in physical memory? Remember, memory optimized tables are not page-based. They are just in memory. So we are storing them as rows, not as rows within a page. So what I have here is a table. This is a very interesting slide, and we should spend some time on it. This row has some headers—this is the header—and it has 2 columns, Name and City. Simple table. I have created a hash index on it, on Name. So here the first bucket, if you will, is pointing to a row with the timestamps 200, infinity. What it says to me is that this row was created by a transaction at time 200, number 1. And infinity means that this row is a current row. That means this row exists today. It was created at time 200 and it exists today. So if I do a SELECT * from this table, I will see this row. The yellow row that you see says it was created at time 90 and it was deleted at time 150. That means if the current time is 220, for example, and I do a SELECT * from this table, I will only see the green row, because the yellow row does not exist at time 220. We achieve that through row versioning, and I'm going to go into that, but the key point is that the timestamps you are seeing, the 200, infinity or the 90, 150, tell you the time slot, if you will, in which this row is visible. If I had a transaction that starts at time, say, 100, then I will only see the yellow row, not the green row, because the green row came into existence at time 200. This is how the data is arranged. Let's go beyond this example. Here what it shows is that before time 200, the row in fact was there. What changed in the row—so think of this—I created the row at time 100 and the values were John, Paris. That was the value of this row. At 200 a transaction came and changed Paris to Beijing. The row got updated. So that means between 100 and 200 the row was John, Paris. After 200 it became John, Beijing. This is an update. Now in this example, if I start a transaction, let's say, at time 110, what I will see is the row John, Paris, and in fact I will also see the row Susan, Bogota. You see that, right? Think of this: The rows are all in memory, organized through the hash index. And when a transaction comes, we know the transaction starting point. The transaction starting point in this example is 110. You will find all the rows that qualify for my predicate and are valid for my time. Any questions on that? I know this is a tough concept. Yeah.
>>[audience member] How do you manage the transaction timestamps? What if the transaction goes down? Will it roll back or what?
[Sunil A.] First of all, we have a monotonically increasing transaction ID, so everything is increasing. If a transaction rolls back, we can just remove the rows it created from the chain.
[audience member] But what if it's the update that goes wrong? Does it remove the chain point?
[Sunil A.] I think the example you are giving is that the transaction at 200 that wanted to update the row rolls back, right? So at that time it won't be infinity; it will say it is rolled back, and when a transaction comes there, it will realize this was not a row that is valid. It will go back to the row 100, 200. So let me give a different example. If the transaction that was modifying the row at 200 rolls back, the timestamp for John, Paris will be 100, infinity, because the transaction rolled back and that row is still valid.
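To make the visibility rule concrete, here is a small worked illustration (in T-SQL comment form) using the versions just described; the rule shown is a simplification of the actual visibility check.

    -- A version with timestamps [begin, end) is visible to a transaction whose
    -- start time T satisfies begin <= T < end (simplified; ignores in-flight
    -- transactions and commit dependencies).
    --
    --   Version             begin   end        visible at T = 110?   visible at T = 220?
    --   (John,  Paris)      100     200        yes                   no
    --   (John,  Beijing)    200     infinity   no                    yes
    --   (Susan, Bogota)      90     150        yes                   no (deleted at 150)
    --
    -- So a transaction starting at 110 sees John|Paris and Susan|Bogota, while one
    -- starting at 220 sees only John|Beijing.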
[inaudible audience member question]
>>[Sunil A.] Sorry?
[inaudible audience member question.]
>>[Sunil A.] Correct. So let me move on. Here is another thing. I have another row attached. I have a bucket here, the first bucket. It points to John, Paris, which is the previous version of the row, and it also points to another row which is not John—it is Jane, and Prague—and that row was created at timestamp 50 and it is still valid. So what I am saying is that here there is a collision. You have 2 rows, Jane and John, as part of the same bucket chain. This is how the hash index is organized. I can have another index, on City, and here there are 2 pointers. One is pointing to Beijing, another one is pointing to Paris. And then there are further pointers. So essentially you can think of all the rows that are in memory as being arranged through pointers. And when I am searching for a row, I go to the right bucket and I traverse the chain and I find it. I just find the rows that are valid for my transaction. Once a row becomes old—in this case, if you look at Susan's row, 90, 150—let's say the time is 300 right now and I know in my system there is no transaction active that will need a row from 90 to 150. If I know that there is no transaction active that will need this row, I can garbage collect it. So what we do is, periodically, there are threads running in the background that are garbage collecting, and it gets garbage collected. So at time 300 the only rows that I have are John and Jane.
[inaudible audience member question]
[Sunil A.] Correct. Yeah. Think of this: As rows become old, we keep garbage collecting. We in fact have this functionality—snapshot isolation, or read committed snapshot—in SQL Server 2005 onwards, and there also, when the row versions get created, they get garbage collected periodically. All SQL Server has to do is to know: are there transactions that are interested in this older row version? If there is no transaction that needs that older version, we just remove it. So this is exactly what's going on here. So this is how the data is arranged. And at this time we only have the hash index, and we are working on a range index as well. So you will be able to query, "Give me all employees," or any range kind of query. Yeah.
[inaudible audience member question]
>>[Sunil A.] Correct. The question really is: John and Jane are essentially 2 different names; why are they linked to the same bucket? This is a typical hashing thing. When you hash a value, like hashing a name in this case, it can map to the same hash value. John and Jane in this example map to the same hash value, so they get mapped to the same bucket. But in the most common case, they will go to different buckets.
There are some limitations, and one of the things to remember is that the SQL Server T-SQL surface area is huge; it has been developed since '94 onwards. For in-memory tables we wanted to not implement everything that we have. We said, "Let's focus on OLTP." So there are some restrictions; let me just walk through those. There are some things we do not provide for memory optimized tables. Number 1, there are no DML triggers available. Number 2, we do not support the XML and CLR data types. Now, if you think about OLTP, workloads typically do not have the XML data type. And we do not have CLR, so those are the 2 limitations there. The next thing is the row size: if you look at SQL Server, the row size is 8060 bytes.
We support the same size for in-memory tables. The difference is that in SQL Server you can have off-row columns. So if I have a varchar 800, and the row is not able to fit in the space on the page, it can be stored off row. We do not have off-row storage for in-memory tables. It seems like a big restriction, but if you think of an OLTP workload, the typical row size is 200, 300 bytes. Let me ask, just to validate: how many of you have row sizes greater than 8060 bytes? I see 1 or 2 hands. If you have a situation like that, it is possible that not all columns are accessed in your performance-critical path. You can in fact break the table into 2 parts, 1 for in-memory and 1 for— Right? So there are ways to do it, but that's a limitation. And the third thing, which I think is a scoping limitation, and which some of you will say is critical for you: we do not provide PRIMARY KEY–FOREIGN KEY constraints (referential integrity) on memory optimized tables. If it is an issue, then you can implement that through your own logic, because what the PRIMARY KEY–FOREIGN KEY constraint that is built inside SQL Server does is prevent you from inserting into the FOREIGN KEY table unless there's a matching PRIMARY KEY in the referenced table. So you can make some changes to your logic to do that. These are the kinds of restrictions that you have, and there are workarounds for some of these. We have been working with many, many customers, and they are fine changing their application within these limitations. Another key point to remember is that even though these are limitations, not all of your application is moving to in-memory, right? Only your performance-critical tables are moving to in-memory.
[Accessing memory optimized tables]
Yeah, a question.
[audience member] When you say additional release there, are you talking about 2014?
>>[Sunil A.] I'm talking about 2014. Yeah. So what we have done is we wanted to come out with a very targeted OLTP scenario and wanted to provide the surface area that can satisfy most customers, and we will continue to build on that.
So now I have created memory optimized tables. How do I access them? The good thing is you can access them using the same tools that you have today, namely T-SQL. So, no changes. There are 2 ways we provide to access a memory optimized table. One, we provide access using the normal T-SQL that you have. So if you were using a stored procedure, SELECT * FROM table t1, and you decide to move that table t1 to in-memory, the same SELECT * will continue to work. This is what we call the InterOP mechanism. The InterOP mechanism means that you can continue to use the same SQL to access your memory optimized table. So all you have to do is migrate the table to in-memory and your application will continue to work without changes. So that's InterOP mode. Very useful. The second mode that we provide—like I was telling you, we can compile the stored procedure into machine code for faster execution—is a way to create a stored procedure which is native. So here what you see is a simple create stored procedure like you have in SQL, except you are saying it is native compilation. What it means is: I'm going to convert this code into C and create a DLL out of that. So that's what it is saying. When we are compiling these things, we want to have all the session level settings available at compile time so that we can optimize for them. So the second thing that we have done is you have to say SCHEMABINDING.
What SCHEMABINDING is saying is that if the stored procedure is referencing, say, 5 tables, you cannot drop those tables until you drop the stored procedure. In a production environment you don't drop tables that are being accessed by these stored procedures, but this removes run-time checks that the stored procedure would otherwise have to do. So, SCHEMABINDING. And we need to know the context of the caller, so you have to specify EXECUTE AS OWNER. And then the code that you have inside the stored procedure is atomic. Think of it like a savepoint: when I call the stored procedure, the whole thing either happens or does not happen. So it is atomic as a whole. And then there are the session level settings: what isolation level I'll be executing at, and the language. So this is what you need to specify for your native stored procedure. And the native stored procedure goes through the same steps. You create it with DDL, it optimizes—this is the same optimizer that you have for SQL Server—and then it creates a DLL. It converts it into C code and the DLL is generated, and that's what it is. So essentially, when you have an in-memory table, for both the table and the native stored procedure you have DLLs, and when I execute that, it just executes C code and you get high performance query response times.
As I said, we have 2 modes to access in-memory tables. The interpreted T-SQL, which is the blue box, provides that you can have a scenario where you have 2 tables—I'm going to use the names t1 and t2. You can say t1 is my memory optimized table and t2 is my disk-based table. In the InterOP way you can say SELECT * FROM t1, t2 and you have a join predicate. So what it means is that with InterOP you can access memory optimized tables and disk-based tables together. There is no restriction, which is important because you don't want to change your application. The second thing is that it is less performant in the sense that it is not optimized into machine code; it is not as optimized as a native stored procedure is. And the good point about InterOP is that almost the full T-SQL surface area is available to you. So our customers are actually using the InterOP way as a first step to migrate applications from disk-based tables to memory optimized tables. Native, on the other side, has a limitation in the sense that if you access a table inside a native stored procedure, it has to be a memory optimized table. So if I go back to the example of t1, t2: if t2 was a disk-based table, I could not use that table in my native stored procedure. The only way I could use that table would be to make t2 also memory optimized. It gives you maximum performance because we compile the code, and there are some limitations in terms of T-SQL surface area. The T-SQL surface area that we have defined—what we have done is we have found the common operations people do for their OLTP workloads, and those are the T-SQL constructs that we support. And for some that we do not support, we have workarounds available for many of those. When to use these things: InterOP is a very good way for ad hoc queries, because there is no stored procedure. So that's one common case. The second case is reporting queries because remember, with the native stored procedures the surface area is limited, but with InterOP there is no limitation, so you can use any complex query that you want, like data warehouse kinds of queries. So there is no limitation. Why would you use InterOP?
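A minimal T-SQL sketch of the native stored procedure shape just described (NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER, an atomic block with the session-level settings). Procedure, table, and parameter names are illustrative; the syntax follows SQL Server 2014 and may differ slightly from the CTP build demoed here.

    CREATE PROCEDURE dbo.usp_InsertSalesOrder
        @order_id INT,
        @order_date DATETIME2(7),
        @quantity INT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
    (
        TRANSACTION ISOLATION LEVEL = SNAPSHOT,  -- session-level settings fixed at compile time
        LANGUAGE = N'us_english'
    )
        -- Only memory optimized tables can be referenced inside a native procedure.
        INSERT INTO dbo.SalesOrder (order_id, order_date, quantity)
        VALUES (@order_id, @order_date, @quantity);
    END;

In InterOP mode, by contrast, an ordinary query such as SELECT * FROM t1 JOIN t2 ON ... can mix a memory optimized table t1 with a disk-based table t2, at the cost of interpreted execution.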
In many cases—and this is a demo that we did in a foundational session on Monday—if you have latch contention, all you need to do is move your table to memory optimized; your application runs as is and you remove the latch contention. So you can see significant performance gains. And we have seen many cases where, just by moving the table to memory optimized, not changing any code, we took care of the latching and suddenly the throughput has gone much, much higher. In many cases we have seen a 5X gain just by moving the table. But native is very useful too. Let's say you do not have any contention in your application—no contention, so no latching issues—but you're running at 100% CPU because you are using the maximum capacity of the box. What you can do now is say, "I'm going to migrate my logic, which is in stored procedures, into native." Now I can execute the same stored procedure, let's say, in 1/3 the instructions. If I can execute my logic in 1/3 the instructions, that means my throughput will go up 3 times, because instead of taking 100%, it will take only 33%, and now I have 66% of my CPU available so I can push more logic. So this has actually been a very important scenario for us for read scale-out. If you have read scale-out, where you have 6 SQL Servers which are pushing the read workload—like happens if you have a read-only database available for your web farm and everything—instead of scaling out 6 ways, you can even do 1 server or 2 servers, because we can provide much more throughput with natively compiled stored procedures.
[Demo - Accessing Memory Optimized Tables]
Let me now switch to a demo and show you some of the things about native stored procedures. Everybody can see it? In the back row? Okay, good. What I will show you is, in this case I have already created the database, and one thing that you should see in the syntax is that this CREATE DATABASE statement is just like what we do today. There are no changes to the CREATE DATABASE statement. What we have done here is, if I want to add memory optimized tables to this database, we add a memory optimized filegroup. Notice here I am saying add a filegroup to this database which contains memory optimized data. That's what we do: we have a separate storage container for memory optimized tables. So it contains memory optimized data, and then I add a file to it. So this is what I have done. If I go back to my Object Explorer—and this is something I want to show you in terms of integration with SQL Server—if I look at the properties of this database, I'll go to the filegroups here. It says this database has a filegroup which is memory optimized. And if I go here, it says how much memory is allocated to memory optimized objects. So we are tracking the memory that is being used in this database by memory optimized objects. And this is available not only through Management Studio, it is available through DMVs and so on. So that's what we have done. So I'm going to close this guy. And then, let me go to my database, F5, and I have created a table here. What this table is showing me is that it is a sales order table, and I have one index with a bucket count of 2 million. That's a memory optimized table. And for this demo I have already created that table. So if I want to know through the catalog view which tables are memory optimized, it says the sales order table is memory optimized. Now let's do this. Let's insert some rows into this table. Let me delete all the rows in this table.
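A hedged sketch of the database setup being demoed, with illustrative database, filegroup, and path names (the memory optimized "file" is a container folder rather than a data file):

    ALTER DATABASE HekatonDemoDB
        ADD FILEGROUP HekatonDemoDB_mod CONTAINS MEMORY_OPTIMIZED_DATA;

    ALTER DATABASE HekatonDemoDB
        ADD FILE (NAME = 'HekatonDemoDB_mod_container',
                  FILENAME = 'C:\Data\HekatonDemoDB_mod')
        TO FILEGROUP HekatonDemoDB_mod;

    -- Which tables in this database are memory optimized?
    SELECT name FROM sys.tables WHERE is_memory_optimized = 1;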
There is nothing. And I'm going to insert 500,000 rows into this table. Notice I am using InterOP. This is not a native stored procedure; it's InterOP. All I'm doing is using the constructs that you are used to: run the loop, do the insert. So I'm going to insert the rows. And while it is inserting—when I created this table, durability was on. So I was able to load 500,000 rows in 4 seconds. You see here? In 4 seconds I was able to load that many rows. I'm going to do the same experiment with the same schema on a disk-based table, and you'll see the difference. This is good. And now I'm going to create a disk-based table. I have a nonclustered index, exactly the same. I create the table. Now I'm going to run the same insert loop, but I am inserting the rows into the disk-based table. I go here. Previously it took 4 seconds. This will take around 12 to 14 seconds. Let's see how long it takes. So right there you see, on my laptop, which is not a production quality machine, we are able to insert data that much faster. It took in this case 14 seconds. So I'm not doing any magic. I just had the same schema, the same table in-memory, and I'm using InterOP. So this is actually an interesting point: without even using a native stored procedure I am able to get this kind of performance for my load. And in this case the data is fully logged. The next thing I want to show you is that I can do queries. So I am going to do a query. I say SELECT TOP 100 FROM this table. And notice I am getting the data in random order. You know why that is? Because it is in a hash table. So it is starting from the first hash bucket and coming down. If I had done this with a B-tree, it would have started from the leftmost page forward, so you would see the data in an ordered fashion. You see that? If I want to see it in an ordered way, all I have to do is order by order_id. Now I see the order IDs. Imagine this: I am scanning the top 100 rows and I am sorting them and everything else. That is all that is going on. Now let me delete the rows from my memory optimized table, the sales order memory optimized table. It got done in 0.0—I mean, it got done in under 1 second. I'm going to delete the same number of rows in my SQL table. I go back here, and note it is going to take time. One of the things to come away from this demo with is, number 1, memory optimized tables from a user perspective are the same as regular disk-based tables, same kind of syntax. But even on my laptop you see such a huge difference in terms of insert performance and delete performance. So that's the starting point, and I'm not even using the native stored procedures yet. Okay, so I deleted everything. Now what I'm going to do is—here is a stored procedure. This is the native stored procedure. In this case I have already created the stored procedure. So before I do anything, I want to show you how the C code looks. I will go to my directory here and go here. I'm going to show you this. This is t_5. This is the table as it looks in C code. So I'm going to open this. I just want to show you—not for you to follow it—but once I created the table, this is the code that was generated for that table. And you will notice here that this is how the table looks. I'm going to go here. This is how the table looks: it has 3 columns, which is like a structure in a C program. So as I said, it looks very different, but that's how we do the DLLs. Let me do this, minus, and I'm getting out of here. Don't save. And I also want to show you the stored procedure.
This is the stored procedure, just a very quick thing. Open with— Let's see. Sorry. I want to go to the C code. Notepad. And here, this is my stored procedure. So the key takeaway is we are not interpreting your T-SQL code; we are converting it to C and compiling it. So this actually looks like pretty hairy code, but that's what gets done underneath the covers. I don't know about you, but when I first saw it, I was saying, "Man, this is what the C code looks like underneath? This is actually pretty amazing." So that much on that. Now I'm going to go back to— Remember how much time it took to insert 500,000 rows through InterOP? It was around 4 seconds, right? I'm going to do the same thing through native. And you would expect it should go faster, right? Okay. The stored procedure is already created, and notice it is inserting 500,000 rows. I assume I dropped all the rows from this table. I think that is true. Let me just make sure it is. Yeah. There are no rows here. I go back and I'm going to execute the stored procedure. Now it is, for whatever reason— This is the funny thing in my demo. Sometimes there are some kinks. It should not be taking 7 or 8 seconds. It should go much faster. So I'm going to— It's a bit embarrassing right now. In my case it took 2 seconds, and I tried it a few times this morning, and it's not going to listen to me here now. It's done. Okay. Let me do one thing just for the heck of it so that you at least trust me on that. I'm going to delete all the rows and run it again. Okay. It did it. And if you think I'm lying, let me just do count(*). So this is actually the build that I have. It is not the build that is coming out for CTP1, because CTP1 is going to be at the end of this month. So there were a few kinks. And as I said, the behavior sometimes becomes—I can't explain it. And I'm going to do count(*) here. It looks insane in the sense that I execute the procedure and before I finish my statement, the rows are inserted. Not the first time—the first time it took 27 seconds. I don't know why. Okay. Sir, question.
[inaudible audience member question]
[Sunil A.] That's right, that's right. I'll come to your question, sir. The question really is: what is this magic? Why is it going so much faster? I mean, I understand the data is in physical memory, I understand you native compile, but this does not look real. Right? Do you agree on that point? It does not look real. So I have 30 minutes, but let me walk through that. Think of your current SQL database. Let's say it is 24 gigabytes. I'm just picking a number. A 24 gigabyte SQL Server database. And let's say your physical memory in the box is 64 gigabytes, and for now assume the whole database is in physical memory. Assume that. And now if I do a query which says, "Give me the employee with employee ID 10," any DBA or any application developer would have created an index on employee ID. Agree? Okay. So the employee ID index is there. So when I search for employee ID 10, I need to first go to the root of the index tree. At the root of the index tree there's a page and it has a bunch of rows. You open the page, then you say, "Go to the next page, the next level." So it is going down the index tree. In a typical large system, the index tree could be 5 levels deep. So what you are doing is, even though everything is in physical memory, you are walking down 5 levels, 5 pages. Now you get to the leaf page. That's the leaf row, right? If the columns that you're looking for in your query are not covering columns, what do you need to do?
You need to traverse the clustered index. So essentially, in a traditional disk-based system, even though everything is in physical memory, you are walking through all these pages and searching, and then latching and everything. It takes time. What we have done with in-memory is a hash table. You hash on employee ID, and you go directly to the row. So you can see I am not spending as many instructions to get to it. And we did an experiment. Have you heard of TPC-C, which is a kind of OLTP workload? There's another version of it, TPC-E. We took a workload, we took one transaction, and we said, "Let's run the same transaction on disk-based tables and run the same transaction on memory optimized tables." We found that for the disk-based tables the number of instructions was around 1 million; it was like 980,000 instructions. For the same query, the same transaction, on memory optimized tables it took 40,000 instructions. So you can see we are taking so many fewer instructions. That makes it so much faster. So I think with this, let me move on to another interesting thing that I want to talk about. I will actually skip this one in favor of the presentation because we are a bit constrained on time.
[Transactions on memory optimized tables]
So far—memory optimized tables, native stored procedures and the performance you have seen—does that look exciting to you? I mean, you guys— I think it is actually pretty exciting. We have seen, as I said, many customer workloads running faster. Think of a latching problem. As a program manager I have worked with many customers. A latching problem typically means you have to rewrite your application or you have to change the logic of your application to get rid of that problem. Imagine this: All you had to do was take the table and move it into in-memory. The only requirement that we have for you is to make sure that you have enough memory for your table. If your table is a terabyte, that's a different deal, but most of the time tables are not that big, and they can fit in physical memory and you can guarantee it. It's such a simple way to get higher performance for your application. Yes, question.
[audience member] The whole table or just part of the table?
[Sunil A.] Right. So the question really is: can I just move part of my table, or the whole table? This is actually a very common question. At this time we do not support partitioned tables that have one partition as memory optimized and the rest as regular SQL tables, because that would be the ideal solution. I have a partitioned table in SQL Server today. I know my latest partition is the most active partition, and I want to optimize that for physical memory. Could I just not say that this partition is memory optimized and the rest is not? That we don't support for SQL 14, but you can get around that problem by creating a view. I am not saying it is the most elegant solution, but you can get around it that way. I'll take one more question, then I'm going to go forward because I have tons of things to do. Carry on, sir.
[audience member] How much memory do you need for a table? When you showed the transaction [inaudible] in the memory while you do the transaction. Is that correct? Let's say you have 100,000 rows. How much memory [inaudible]
[Sunil A.] The question really is how much memory do I need. The first thing is, if I have 10 columns in my disk-based table, I have 10 columns in my memory optimized table. So as a ballpark, whatever column storage you need for the disk-based table, you need the same storage for your memory optimized table.
So if I have a 16 gigabyte table in SQL Server, I need comparable storage for the memory optimized table. There are 2 differences. Remember, in a disk-based table the data is fragmented: pages are never fully full; I may have a 50% fill factor. In memory optimized tables there are no pages, so you can pack the rows much, much better. You agree? The second thing is our indexes. The way we manage indexes in memory optimized tables is very different. For example, I'm going to make a slightly extreme example, but you get the idea. Let's say I have a table with 16 columns and I have created 5 indexes on that table, and each index has 2 columns as keys. So I have 5 indexes, each index has 2 key columns, so basically I have 10 columns that are part of my index keys, right? Two for each index. Essentially, you have a 16-column table and you are keeping a second copy of 10 of its columns. With memory optimized indexes, since they are hash-based, we don't store keys as part of the index. So the index storage is much, much lower. So the point is, in general what we have seen is that the storage taken by an in-memory table is much lower than what you have for disk-based tables. So if I am migrating a table which is— if my disk-based table is 16 gigabytes, we are saying pretty much you can just take that as a ballpark and say that's what memory you need. Because we have row versions (in SQL Server snapshot isolation the row versions are kept in tempdb, but here we are keeping row versions inside the memory optimized database), we are saying in the worst case have twice the— So we have some guidelines, and there will be a white paper on that. So you get the idea on that, right? Okay.
Let me move on. We have only 20 minutes. I think this is a really important topic, the transactions. Let me walk through this one. Actually, it is interesting that you talked about multi-versioning, and I think it is coming up here. If you have a non-blocking protocol, you do not want to block. In traditional SQL Server, what happens if I am modifying a row? What do I have? I have an exclusive lock on that row. If you want to select the row, you have to wait, because I am modifying it. The only way to have non-blocking access is to have a row version. This is pretty much what we provided with snapshot isolation or RCSI. So we have multi-versioning even with memory optimized tables. However, we are not using tempdb for multi-versioning. We keep the row versions— Remember the first picture that I showed where you had John, Paris and Beijing? All those versions are sitting in the hash table. Everything. So if I look at a hash index at a given time, let's say I have 10 rows which are the real rows of my table. There may be some rows which are older row versions, but they are part of the table. So let's say I have 5 older versions and 10 latest rows. How many rows do I have in my hash table? Fifteen rows. So when I scan for my transaction, all SQL Server is doing is saying, "Okay, for this transaction I know what the timestamp is. I'm going to go through everything and I'm going to only pick up the rows that meet the timestamp criteria." You see that? So I will only get those rows, but everything is sitting inside the same hash table. The transaction isolation that we support on memory optimized tables is snapshot isolation, but it is not based on tempdb; as I said, we keep our own versions. No locks, no latches, and because there are no locks and no latches, there is very little context switching.
This is a bit of a geeky point, but what happens in SQL Server is I am running a transaction and I need a lock. If I have to wait, that transaction, that thread, gets off the CPU and another thread gets scheduled. So there are a lot of context switches. But if I do not have to wait for locks, I just keep going, so it minimizes the context switches. That goes back to the point that you were mentioning, why it is faster. What we have done is, it's like the term well-oiled machine. What we are trying to do is: any waits we have in the system, can we remove them? If we remove the waits, we can go much faster, because the CPU is not changing, the CPU is still the same, but we are just doing things more efficiently. The question now is: if I don't take locks, I don't even get deadlocks. So I am coming, I am updating the row; you are coming, you are updating the row. You are not aware that Sunil is updating the row. You just see the row as it exists. So what we do is we allow you to update. So both of us updated the row, but only one person will succeed. What we do in the end is we validate. So the first person who commits is successful; the second one has to roll back. It's almost like deadlock detection. Remember, when a deadlock happens, one guy is chosen as a victim? So we have conflict detection, and we figure out: okay, you stepped on another transaction. It can be a problem. That means my transaction is getting rolled back. But in a typical OLTP environment, you do not get that kind of conflict. You do not step on each other. Where we do step on each other today is when my row and your row go to the same page. Because it is going to the same page, I need to get a latch. So I am inserting row 10, you're inserting row 11 on page P1. I latch page P1, I'm inserting my row 10, and you are coming. You say, "Hey, I need to latch page P1, but Sunil has a latch on P1." It waits. So that kind of wait we see. It's not that Sunil and you are updating the same row. It happens, I'm not saying it does not happen, but it's not as common. So we have conflict detection. Since we have a non-blocking protocol, we are supporting 3 isolation levels: snapshot, repeatable read, and serializable. One common question people ask is, "Hey, what about read committed, because that's my isolation level?" Think of this: Read committed is the default isolation level for SQL Server. We have data (because we do a lot of monitoring of customer applications) showing that ninety-nine percent of customer applications use the read committed isolation level. Snapshot is a superset of read committed. So if you are using read committed, your application will continue to work fine with snapshot isolation. There are certain applications that depend on blocking. There are applications that do not want row versioning. They want to block. Those kinds of applications cannot use memory optimized tables, but that's a rarity. And the reason I'm saying this is because when we implemented— How many of you are using RCSI, read committed snapshot? Only very few. What RCSI does is it provides a non-blocking implementation of the read committed isolation level. There are applications that depend on blocking. They do not work with RCSI. We were concerned about that point, so we actually had a blog, we had articles, and we found only a couple of customers complained that their application did not work because of that. So it's not as common as we at least had started off thinking. So these are the isolation levels we support.
Here is the scenario that I was describing to you, the write-write conflict. Here I have 2 transactions, T1 and T2, and remember the green guy is updating the row with key c2=123, and the red guy is also going to do the same thing. So what happens is, once this happens, both are proceeding forward. The green guy updated the row. Now the red guy comes and he wants to update the row. When it goes to the row, it knows it was already updated by the green guy. So it fails right there. You see that? So that transaction will have failed. That means you have to put logic in your code to say, "If I fail because of a write-write conflict, I need to retry." So that's the change you have to make, and I have an example that shows you how you can make that kind of a change. There is that kind of a complication. The point is, those 2 transactions could fail because of a deadlock as well in traditional SQL Server, with disk-based tables. So most customers do have logic to say, "If I fail due to deadlock, I have to retry my transaction." If you had that kind of logic, this should just work as it is. So why do people use locking hints? They use NOLOCK, you know what I'm saying, because they want to get around the blocking. NOLOCK, that kind of thing. But now with memory optimized tables there is no locking, so no locking hints. Make sense? Okay. You have to account for failures because of write-write conflicts or sometimes validation. That's a change, and I agree it needs to be done, and, as I said, it is going to cause you some application change if it is in fact a factor. Third, avoid long-running transactions. This is in any developer's guide. Even with disk-based tables, if you have long-running transactions you are in trouble in the sense that you are holding locks on disk-based tables for a long time, you are holding transaction logs, and so on and so forth. So those are the guidelines that we have.
So this is the last topic for me and then we go from here. Remember, we support snapshot isolation, repeatable read, and serializable for memory optimized tables. And for disk-based tables we support read committed, snapshot, repeatable read, and serializable. So there is an incompatibility. What do we do? It is not an issue when you are using a native stored procedure, because a native stored procedure cannot access tables which are not memory optimized. So how do we manage it? Say you have a database which has disk-based tables and memory optimized tables. If I am running a transaction that is only touching disk-based tables, nothing has changed for you. Agree? Right? Because that's how it is. If you have a transaction that is only accessing memory optimized tables, the only thing is that it supports snapshot, repeatable read, and serializable. But what if you have a transaction that is not only touching disk-based tables, it is also touching memory optimized tables? How do I deal with that, because I have a disk-based transaction at read committed? What do I do now? We have this table. It may look a little bit complicated, but let me walk through it and simplify it for you. You see the first row: if the disk-based transaction is read committed, you can use snapshot for memory optimized tables. As I said, this is the most common case. And snapshot is a superset of read committed. So for 99% of applications this is just fine. Read committed for the SQL transaction— when I say SQL it's the wrong term.
Your session setting is read committed, which is the default, and you need to access the Hekaton tables, the memory optimized tables, as snapshot. If you use read committed for your disk-based tables, you can use repeatable read and serializable for your memory optimized tables. But if you use repeatable read or serializable for your disk-based tables, you can only use snapshot. This looks like a rule that makes no sense—why, and things like that—and I don't think I can go into a lot of detail on that, but the common case that you should think about is: my application is using the default isolation level. I'm just going to use that, except for my in-memory table I have to say snapshot explicitly. So when I need to access my memory optimized table I have to say SELECT * from the memory optimized table with the snapshot hint, and then I can get it, because your session level setting is read committed. If you have a disk-based table under snapshot, we don't support that transaction with memory optimized tables at this time. So there are some limitations like that. But the thing that I want you to walk away with is: your common scenario is read committed, and that should work without any problem.
[Demo - Transaction Isolation]
You have a question? Okay.
[inaudible audience member question]
[Sunil A.] The question really is how it works in an AlwaysOn scenario, because in AlwaysOn, the readable secondary by default maps everything to snapshot isolation. So the question really is: if I have a readable secondary on a database which has memory optimized tables and everything is snapshot isolation—I think in the previous slide I was telling you that if you have snapshot, we don't support access to the memory optimized table. So that's the question. AlwaysOn integration is not available in CTP1, and this is something that we are working on. What we have done is a customer survey. We said, "If I give you rows from the memory optimized table without snapshot consistency, is this acceptable?" And all customers said that is acceptable. They want the capability to read from the secondary. So we will be able to get that, but that is something that we are working through right now.
I have a demo, and I know there are 5 minutes left, so I'm going to skip the demo. Essentially what it is trying to show is— and I'm going to post the demo scripts on the TechNet site— what it is trying to show is how you code against write-write conflicts and things like that. Before I proceed with more questions, let me just show you one quick slide.
[Migration Examples]
Migration examples. This is actually very important. For example, the retry logic. As I said, we have to code for conflicts and validation failures. So here is an example. I have a stored procedure. This is a T-SQL stored procedure. Let me go through the full thing. I'm going to stop here. What it is doing is it is calling a— In the center you see EXEC usp_my_native_proc. It is calling the native procedure. See that here? So what we have done in the logic for this load procedure is I have put it under the TRY block and I'm catching the exception. And if it fails because of a validation failure or a write-write conflict, it retries. So this is pretty much what you have to do, versus just executing it directly. So I'm not saying it is that simple, but it is not that complicated. It's a very simple template that you have to follow.
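As a reference for that template, here is a hedged T-SQL sketch of a wrapper that retries the native procedure call on conflict or validation failures. The wrapper name is illustrative; the error numbers listed are the ones documented for SQL Server 2014 in-memory OLTP retry logic and should be verified against your build.

    CREATE PROCEDURE dbo.usp_my_native_proc_wrapper
    AS
    BEGIN
        DECLARE @retry INT = 10;
        WHILE @retry > 0
        BEGIN
            BEGIN TRY
                EXEC dbo.usp_my_native_proc;
                SET @retry = 0;   -- success, stop retrying
            END TRY
            BEGIN CATCH
                SET @retry -= 1;
                -- 41302: write-write conflict; 41305/41325: validation failures;
                -- 41301: commit dependency failure.
                IF @retry > 0 AND ERROR_NUMBER() IN (41302, 41305, 41325, 41301)
                BEGIN
                    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
                    WAITFOR DELAY '00:00:00.010';   -- brief back-off before retrying
                END
                ELSE
                    THROW;   -- not retryable, or out of retries: surface the error
            END CATCH
        END
    END;

    -- Under a read committed session, InterOP access to a memory optimized table can
    -- use an explicit snapshot hint, e.g.: SELECT * FROM dbo.SalesOrder WITH (SNAPSHOT);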
Any time you expect a write-write conflict or a validation failure, you put that native stored procedure inside another procedure, handle the exception, and start the procedure again. So that's how you can do it. That is one example. I think I should stop here. I know we have a few more minutes left. I'll be happy to take more questions. I will start from you, sir, and then I'll come to you. Yeah.
[audience member] How often does it checkpoint?
>>[Sunil A.] Okay. Checkpoint is a very important point because— All of you know what a checkpoint is in SQL Server, right? You have to checkpoint so often because otherwise your log cannot get truncated, and you have to flush the dirty pages to reduce recovery time. So just like your disk-based tables, we also do checkpoints for memory optimized tables. Currently we do a checkpoint every time the log exceeds 1 gigabyte. Right now the policy is not tuned, but our expectation is that it is going to be integrated with the SQL Server checkpoint and the checkpoint will happen then. Currently, if you want to know, we do a checkpoint every 6 hours or when the log exceeds 1 gigabyte. So in today's terms, a 1 gigabyte log is no big deal. You have a question, sir.
[audience member] How are you going to integrate with [inaudible]
[Sunil A.] That's right. So the question is that for native access to your memory optimized table you need a stored procedure. But we also allow the InterOP way, so those things will just continue to work. This is a very common case with ISVs. ISVs do not have stored procedures because they have to work not only with SQL Server; they have to work with Oracle, they have to work with everything else. So we are saying just move your table to memory optimized. You still get the benefit of no latching, you still get the fast access because of hash indexes and so on. Yeah. Okay. Question? Yes, sir.
[inaudible audience member question]
[Sunil A.] Right. Let me come to that. Actually, let me show one thing here. I have a session tomorrow on the storage, and it will cover the checkpoints and everything else. Okay. This is the session I have tomorrow. I hope you can find time to come. The question really is what the limitation is on— If I have 2 terabytes of physical memory available, can I have 2 terabytes of memory optimized tables? Right now, memory optimized tables that are durable can be up to 512 gigabytes. For nondurable tables there is no limitation. So we can have that. What we've found is there could be some customer cases where they need more than that, but our expectation is that the hot part of your database will be much lower than 512 gigabytes. If it weren't, it would not have worked very well even with SQL Server today unless you had that much physical memory. So we have a 512 gigabyte limitation for the durable tables and no limit on the nondurable tables. Any other question? I saw some hands here. Yes.
[inaudible audience member question]
[Sunil A.] So the question is: columnstore is a very useful functionality that we have; will it be supported for in-memory tables? It is not today, because in-memory tables are focused on OLTP and columnstore is focused on data warehousing. However, you can have a database with in-memory tables and disk-based tables with columnstore. You see that? Okay. Yes, sir.
[inaudible audience member question]
[Sunil A.] You touched upon something very, very important.
The question really is—remember, we compile the stored procedure—when is it compiled? It is compiled when you first create it. Let's say I create the stored procedure and now I shut down my SQL Server. I start again; the stored procedure already exists. When you first invoke it, at that time the DLL is loaded. But the important point is: what happens if the statistics changed over time? If they changed after I compiled my stored procedure, the plan may not be right. What SQL Server does today for disk-based tables is it recompiles the stored procedure automatically. You don't even know about that, right? The stats changed, it recognizes that and says, "I'm going to recompile the stored procedure." With memory optimized native stored procedures, recompilation does not happen automatically. What you have to do is update the statistics. This is one other limitation that is important. You have to update the statistics and drop the stored procedure and recreate it. This is not the best user experience, I would agree, but this is what we have for SQL 14, and this is something we will obviously be improving over time. I'll take one question there, then I'll come here, sir. Yeah.
[audience member] With SQL Server as it stands now, when you start SQL Server it goes through undo and redo of the transaction log [inaudible], but then how does that data get from disk back into—
>>[Sunil A.] That's right. So the question is— I know the time is up, so thanks for coming, and I will answer these questions. Please stay. The question being asked is about undo and redo, those transaction log things, and when I shut down SQL Server, how do I load the data? What we have is the memory optimized table data persisted on disk. So when I restart SQL Server, we have to load the data from the disk into physical memory. We do that in a parallel way. And what we have found in our experiments is that we can load it at the speed of the disk. There is one question here. Yeah.
[inaudible audience member question]
>>[Sunil A.] That's right. So when we say commit the transaction, you can be assured the data that has committed is durable. Now SQL Server goes down, anything happens, you get the data back.
[inaudible audience member question]
>>[Sunil A.] Correct. I think the question you're asking is that, at the end of the day, your bottleneck will be the commit of the log record, right? That's correct. So I think if my bottleneck is that I'm not writing to the log fast enough, that's a good problem to have, because I can use an SSD or a higher throughput log device. But your point is an excellent point. I'll be here. I think I should wrap up because there will be a new speaker. But thanks really for coming, and I hope to see you guys tomorrow. We'll go deeper into checkpointing and how the storage works. Thank you so much.
[applause]
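For completeness, a hedged sketch of the statistics-and-recompile workaround the speaker describes for SQL 14. Object names are illustrative, and the FULLSCAN, NORECOMPUTE options reflect the documented requirement for memory optimized tables in that release; verify against your build.

    -- Refresh statistics on the memory optimized table manually...
    UPDATE STATISTICS dbo.SalesOrder WITH FULLSCAN, NORECOMPUTE;

    -- ...then drop and recreate the natively compiled procedure so that a new DLL
    -- is generated against the updated statistics (same definition as shown earlier).
    DROP PROCEDURE dbo.usp_InsertSalesOrder;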

Video Details

Duration: 1 hour, 18 minutes and 44 seconds
Country: United States
Language: English
Genre: None
Views: 5
Posted by: asoboleva99 on Jun 20, 2013

Hekaton is the in-memory OLTP engine for SQL Server, as a part of XVelocity. In this DB-developer-focused session, we discuss how "Hekaton" tables and natively compiled stored procedures work, and how ACID is achieved from the transaction perspective without locking and latching, as well as the analysis and migration tool.
