How to implement In-Memory OLTP on SQL Server in Linux
Note: The article below is an excerpt from the book SQL Server on Linux, authored by Jasmin Azemović. This book is a handy guide to setting up and implementing your SQL Server solution on the open source Linux platform.
Today we will learn the basics of In-Memory OLTP and how to implement it on SQL Server on Linux, starting with the elements of performance.
Elements of performance
How do you know if you have a performance issue in your database environment? Well, let's put it in these terms: you notice it (the good), users start calling technical support and complaining that everything is slow (the bad), or you don't know about your performance issues (the ugly). Try never to end up in the last category.
The good
Achieving the best performance is an iterative process where you need to define a set of tasks that you will execute on a regular basis and monitor their results. Here is a list that will give you an idea and guide you through this process:
- Establish the baseline
- Define the problem
- Fix one thing at a time
- Test and re-establish the baseline
- Repeat everything
Establishing the baseline is the critical part. In most scenarios, it is not possible without real stress testing. For example: how many users can your system handle on the current configuration? The next step is to measure processing time. Do your queries or stored procedures require milliseconds, seconds, or minutes to execute?
Now you need to monitor your database server using a set of tools and correct methodologies. During that process, you may notice that some queries show elements of performance degradation. This is the point that defines the problem. Let's say that frequent UPDATE and DELETE operations are resulting in index fragmentation. The next step is to fix this issue with REORGANIZE or REBUILD index operations. Test your solution in a controlled environment and then in production. Results can be better, the same, or worse; it depends, and there is no magic answer here. Maybe now something else is creating the problem: disk, memory, CPU, network, and so on. In this step, you should re-establish the old baseline or establish a new one. Measuring performance is a process that never ends. You should keep monitoring the system and stay alert.
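As a sketch of that fragmentation workflow, the following example first checks fragmentation through the sys.dm_db_index_physical_stats DMV and then fixes the worst offenders. The table and index names (dbo.Orders, ix_Orders_Date) are hypothetical, and the 5%/30% thresholds are just a common rule of thumb:

-- Check fragmentation of all indexes in the current database
SELECT OBJECT_NAME(ips.object_id) AS table_name,
    i.name AS index_name,
    ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON ips.object_id = i.object_id AND ips.index_id = i.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Light fragmentation (roughly 5-30%): reorganize
ALTER INDEX ix_Orders_Date ON dbo.Orders REORGANIZE;

-- Heavy fragmentation (over roughly 30%): rebuild
ALTER INDEX ix_Orders_Date ON dbo.Orders REBUILD;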
The bad
If you are in this category, then you probably have an issue with establishing a baseline and with alerting. So, users become your alerts, and that is a bad thing. The rest of the steps are the same, except re-establishing the baseline. But this can be your wake-up call to move yourself into the good category.
The ugly
This means that you don't know, or you don't want to know, about performance issues. The "best" case scenario here is finding out about them from a headline on some news portal, and that is the ugly thing. Every decent DBA should try to be light years away from this category. What do you need to start working with performance measuring, monitoring, and fixing?
Here are some tips that can help you:
- Know the data and the app
- Know your server and its capacity
- Use dynamic management views (DMVs):
  - sys.dm_os_wait_stats
  - sys.dm_exec_query_stats
  - sys.dm_db_index_operational_stats
- Look for top queries by reads, writes, CPU, and execution count (see the sample query after this list)
- Put everything into LibreOffice Calc or another spreadsheet application and do some basic comparative math
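Here is a minimal sketch of such a query, using the documented sys.dm_exec_query_stats and sys.dm_exec_sql_text DMVs to list the top queries by total CPU (worker) time; swap the ORDER BY column to rank by reads, writes, or execution count instead:

-- Top 10 cached queries by total CPU time
SELECT TOP 10
    qs.execution_count,
    qs.total_worker_time AS total_cpu_time,
    qs.total_logical_reads,
    qs.total_logical_writes,
    st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;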
Fortunately, there is something in the field that can make your life really easy. It can boost your environment to warp speed (I am a Star Trek fan).
What is In-Memory OLTP?
The SQL Server In-Memory OLTP feature is unique in the database world, for a very simple reason: it is built into the database engine itself. It is not a separate database solution, and this has some major benefits. One of them is that in most cases you don't have to rewrite an entire SQL Server application to see performance benefits. On average, you will see a 10x speed improvement while testing the new In-Memory capabilities, and sometimes up to a 50x improvement, but it all depends on the amount of business logic that is handled in the database via stored procedures. The more logic sits in the database, the greater the performance increase; the more business logic sits in the app, the less opportunity there is for a performance increase. This is one of the reasons for always separating the database world from the rest of the application layer.
In-Memory OLTP has built-in compatibility with regular disk-based tables. This way you can optimize the memory you have for the most heavily used tables and leave the others on disk. It also means you won't have to go out and buy expensive new hardware to make large In-Memory databases work; you can tune In-Memory OLTP to fit your existing hardware.
In-Memory OLTP was introduced in SQL Server 2014. One of the first companies to use this feature during the development of the 2014 version was Bwin, an online gaming company. With In-Memory OLTP they improved their transaction speed 16x, without investing in expensive new hardware.
The same company achieved 1.2 million requests per second on SQL Server 2016 with a single machine using In-Memory OLTP: https://blogs.msdn.microsoft.com/sqlcat/2016/10/26/how-bwin-is-using-sql-server-2016-in-memory-oltp-to-achieve-unprecedented-performance-and-scale/
Not every application will benefit from In-Memory OLTP. If an application is not suffering from performance problems related to concurrency, IO pressure, or blocking, it’s probably not a good candidate. If the application has long-running transactions that consume large amounts of buffer space, such as ETL processing, it’s probably not a good candidate either.
The best applications for consideration are those that run high volumes of small, fast transactions with repeatable query plans, such as order processing, reservation systems, stock trading, and ticket processing. The biggest benefits will be seen on systems that suffer performance penalties from tables with concurrency issues caused by a large number of users and locking/blocking. Applications that make heavy use of tempdb for temporary tables can also benefit from In-Memory OLTP by creating those tables as memory-optimized and performing the expensive sorts, groups, and selective queries on them.
In-Memory OLTP quick start
An important thing to remember is that any database that will contain memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is used for storing the checkpoint files needed by SQL Server to recover the memory-optimized tables. Here is a simple DDL statement to create a database that is prepared for In-Memory tables:
CREATE DATABASE InMemorySandbox
ON
PRIMARY (NAME = InMemorySandbox_data,
    FILENAME = '/var/opt/mssql/data/InMemorySandbox_data_data.mdf'),
FILEGROUP InMemorySandbox_fg
    CONTAINS MEMORY_OPTIMIZED_DATA
    (NAME = InMemorySandbox_dir,
    FILENAME = '/var/opt/mssql/data/InMemorySandbox_dir')
LOG ON (NAME = InMemorySandbox_log,
    FILENAME = '/var/opt/mssql/data/InMemorySandbox_data_data.ldf')
GO
The next step is to alter the existing database and configure it to access memory-optimized tables. This part is helpful when you need to test and/or migrate current business solutions:
-- First, we need to check the compatibility level of the database
USE AdventureWorks
GO

SELECT T.compatibility_level
FROM sys.databases AS T
WHERE T.name = Db_Name();
GO
-- Change the compatibility level
ALTER DATABASE CURRENT
SET COMPATIBILITY_LEVEL = 130;
GO
-- Modify the transaction isolation level
ALTER DATABASE CURRENT SET
MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;
GO
-- Finally, create the memory-optimized filegroup
ALTER DATABASE AdventureWorks
ADD FILEGROUP AdventureWorks_fg CONTAINS
MEMORY_OPTIMIZED_DATA;
GO

ALTER DATABASE AdventureWorks ADD FILE
(NAME = 'AdventureWorks_mem',
FILENAME = '/var/opt/mssql/data/AdventureWorks_mem')
TO FILEGROUP AdventureWorks_fg;
GO
How to create a memory-optimized table?
The syntax for creating memory-optimized tables is almost the same as the syntax for creating classic disk-based tables. You will need to specify that the table is a memory-optimized table, which is done using the MEMORY_OPTIMIZED = ON clause.
A memory-optimized table can be created with one of two DURABILITY values: SCHEMA_AND_DATA (the default) or SCHEMA_ONLY.
If you define a memory-optimized table with DURABILITY=SCHEMA_ONLY, changes to the table's data are not logged and the data is not persisted on disk. However, the schema is persisted as part of the database metadata. A side effect is that an empty table will be available after the database is recovered during a restart of the SQL Server on Linux service.
The following table is a summary of the key differences between the two DURABILITY options:

| | SCHEMA_AND_DATA | SCHEMA_ONLY |
|---|---|---|
| Schema persisted on disk | Yes | Yes |
| Data persisted on disk | Yes | No |
| Changes written to the transaction log | Yes | No |
| Data available after restart | Yes | No (table is empty) |
When you create a memory-optimized table, the database engine generates DML routines just for accessing that table and loads them as DLL files. SQL Server itself does not perform the data manipulation; instead, it calls the appropriate DLL.
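If you want to see these generated modules yourself, the documented sys.dm_os_loaded_modules DMV can list them; here is a minimal sketch (run it after you have created at least one memory-optimized table):

-- List the natively generated DLLs loaded for memory-optimized tables
SELECT name, description
FROM sys.dm_os_loaded_modules
WHERE description = 'XTP Native DLL';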
Now let’s add some memory-optimized tables to our sample database:
USE InMemorySandbox
GO
-- Create a durable memory-optimized table
CREATE TABLE Basket(
BasketID INT IDENTITY(1,1)
PRIMARY KEY NONCLUSTERED,
UserID INT NOT NULL INDEX ix_UserID
NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000),
CreatedDate DATETIME2 NOT NULL,
TotalPrice MONEY) WITH (MEMORY_OPTIMIZED=ON)
GO
-- Create a non-durable table
CREATE TABLE UserLogs (
SessionID INT IDENTITY(1,1)
PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=400000),
UserID INT NOT NULL,
CreatedDate DATETIME2 NOT NULL,
BasketID INT,
INDEX ix_UserID
NONCLUSTERED HASH (UserID) WITH (BUCKET_COUNT=400000))
WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)
GO
-- Add some sample records
INSERT INTO UserLogs VALUES
(134, SYSDATETIME(), NULL),
(965, SYSDATETIME(), NULL)
GO

INSERT INTO Basket VALUES
(231, SYSDATETIME(), 536),
(256, SYSDATETIME(), 6547),
(432, SYSDATETIME(), 23.6),
(134, SYSDATETIME(), NULL)
GO
-- Check the content of the tables
SELECT SessionID, UserID, BasketID
FROM UserLogs
GO

SELECT BasketID, UserID
FROM Basket
GO
What is a natively compiled stored procedure?
This is another great feature that comes with the In-Memory OLTP package. In a nutshell, it is a classic SQL stored procedure, but compiled into machine code for blazing-fast performance. Natively compiled procedures are stored as native DLLs, enabling faster data access and more efficient query execution than traditional T-SQL.
Now you will create a natively compiled stored procedure to insert 1,000,000 rows into Basket:
USE InMemorySandbox
GO

CREATE PROCEDURE dbo.usp_BasketInsert @InsertCount int
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC
WITH
(TRANSACTION ISOLATION LEVEL = SNAPSHOT,
LANGUAGE = N'us_english')
DECLARE @i int = 0
WHILE @i < @InsertCount
BEGIN
INSERT INTO dbo.Basket VALUES (1, SYSDATETIME(), NULL)
SET @i += 1
END
END
GO

EXEC dbo.usp_BasketInsert 1000000
GO
The insert part should be blazing fast. Again, it depends on your environment (CPU, RAM, disk, and virtualization); my insert was done in less than three seconds on an average machine. But a significant improvement should already be visible. Execute the following SELECT statement and count the number of records:

SELECT COUNT(*) FROM dbo.Basket
GO
In my case, counting one million records took less than a second. It is really hard to achieve this performance on any kind of disk. Let's try another query: we want to know how much time it will take to find the top 10 records where the insert time was longer than 10 microseconds:
SELECT TOP 10 BasketID, CreatedDate
FROM dbo.Basket
WHERE DATEDIFF
(MICROSECOND, '2017-05-30 15:17:20.9308732', CreatedDate) > 10
GO
Again, the query execution time was less than a second. Even if you remove TOP and try to get all the records, it will still take less than a second (in my scenario). The advantages of In-Memory tables are more than obvious.
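If you want to measure these timings precisely rather than by feel, you can use the standard SET STATISTICS TIME option; here is a minimal sketch (your numbers will differ depending on hardware and virtualization):

-- Report CPU and elapsed time for each statement that follows
SET STATISTICS TIME ON
GO

SELECT COUNT(*) FROM dbo.Basket
GO

SET STATISTICS TIME OFF
GO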
We learned the basic concepts of In-Memory OLTP and how to implement it on new and existing databases. We also got to know that a memory-optimized table can be created with two DURABILITY values, and finally, we created In-Memory tables and a natively compiled stored procedure.
If you found this article useful, check out the book SQL Server on Linux, which covers advanced SQL Server topics and demonstrates the process of setting up a SQL Server database solution in the Linux environment.