Tag: performance

  • SQL Server 2016 & Azure Query Store

    I hadn’t been following the news much regarding SQL Server 2016, so when I did some reading the other day I was quite pleasantly surprised by some of the new features announced (not to mention that it looks like SSMS will finally be getting some much needed love). 🙂

    We’ve been having some frustrating intermittent performance issues recently, and I was struggling to gain much insight into what the issue was (since we’re using Azure SQL Database, the scope for troubleshooting is a little narrower than for on-premises SQL servers).  So when I read about the “Query Store” feature available in SQL Server 2016 and Azure SQL Database (v12) I got quite excited.

    What is it?

    I’ll keep this short and sweet since there are already a few good posts out there about the Query Store. Basically this feature allows you to track down queries which have “regressed” (i.e. a query was performing well, and then all of a sudden it turned to crap for no apparent reason).

    Not only can you track them down, you can now “pin” the old (i.e. good) execution plan.  Essentially you’re overriding the optimiser and telling it that you in fact know better.
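
    Enabling it in the first place is just a database-level setting. As a minimal sketch (the database name is made up, and the options shown are only the ones I’d tweak first – check what your build/tier actually supports):

    ALTER DATABASE [MyAppDb] SET QUERY_STORE = ON;

    -- Optionally control how much history is kept and how often runtime stats are aggregated
    ALTER DATABASE [MyAppDb] SET QUERY_STORE (
        OPERATION_MODE = READ_WRITE,
        MAX_STORAGE_SIZE_MB = 100,
        INTERVAL_LENGTH_MINUTES = 60
    );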

    Sweet! How do I do it?

    You could do this before now, by forcing plans using USE PLAN query hints, etc.  But the Query Store and its related shiny new UI make it soooo much easier and “trackable”.

    [Image: Dashboard displaying a summary of regressed queries in SQL Server, highlighting query execution durations and plans.]

    As I said before though, I’m not going to go into details about how to use it. I used this post to figure out how it works, how to force plans, how to monitor how it’s going, etc.

    Ok, so how did it help?

    Our problem was that we were seeing intermittent DTU spikes (remember, we’re on Azure, so this means we were maxing out our premium-tier database’s resources in some way, whether CPU, Disk I/O, etc). We tracked it down to a heavy stored procedure call which was running well 99% of the time, but would get a “bad plan” every now and then.  So we would see a spike in our app response time in New Relic, I’d jump into a query window and run an sp_recompile on this proc, and usually the problem would go away (until the next time).
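
    For the record, that stopgap was nothing fancier than the following (the proc name here is made up):

    EXEC sp_recompile N'dbo.MyHeavyProc'; -- kicks out the cached plan so the next execution compiles a fresh one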

    Obviously this wasn’t a sustainable approach, so I needed to either rewrite the proc to make it more stable, tweak some indexes, or force a plan.  I fired up the new “Regressed Queries” report (shown above) and it quickly highlighted the problem query.  From there it was a case of selecting the “good” plan, and hitting the “Force Plan” button. Well… I don’t trust buttons so I actually ran the TSQL equivalent, sys.sp_query_store_force_plan.
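
    If you’d rather script it too, the gist is to look up the query and plan IDs in the Query Store catalog views and then pin the plan you want. A rough sketch (the IDs are obviously made up, and which columns you care about will vary):

    -- See which plans Query Store has captured, slowest first
    SELECT q.query_id, p.plan_id, rs.avg_duration, rs.last_execution_time
    FROM sys.query_store_query AS q
    JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
    JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
    ORDER BY rs.avg_duration DESC;

    -- Pin the "good" plan for a given query...
    EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 113;

    -- ...and un-pin it again if it all goes pear-shaped
    EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 113;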

    [Image: Tracked queries in SQL Server Management Studio, with execution plan and query performance metrics.]

    Some interesting observations

    In the above image you can see the forced plan (circles with ticks in them). What seems to happen, and what initially threw me, is that when you force a plan SQL generates a new plan which matches the forced plan, but which the Query Store picks up as a different plan.  That’s why you see the ticks in the circles only up until the point you actually pin the plan, after which you get a new, un-ticked plan.  At first I thought this meant it wasn’t working, but it does indeed stick to the forced plan.

    Other uses

    I’ve also found the reports very useful even when not resorting to forcing plans.  In several cases I’ve found queries which weren’t performing as well as they should have been, altered the query or some underlying indexes, and then watched the (usually better) new plans settle in.  The image below shows one example: this query was a bit all over the place until I slightly altered an existing index (added 1 included column), and it has since settled on a better, more stable plan (in purple).

    [Image: SQL Server query performance over time, showing various plan IDs with distinct colors, highlighting performance fluctuations and trends.]
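
    For what it’s worth, the index tweak amounted to recreating an existing index with one extra included column; something along these lines (all object names here are made up):

    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        INCLUDE (OrderStatus)   -- the newly added included column
        WITH (DROP_EXISTING = ON, ONLINE = ON);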

    Cheers,
    Dave

  • CURSORs and WHILE-Loops

    [Image: can of worms]

    Yes, I know this is a giant (some might say pointless, and I wouldn’t totally disagree) can of worms I’m opening here, but stay with me for a minute.

    Why?

    We were discussing this at work a few months ago, and I was one of those people who followed “conventional wisdom” which says that SQL cursors are evil, and should be replaced with WHILE-loops if a set-based alternative isn’t possible (or practical).

    So I thought I’d do some testing for myself.  I know this has been done more than once before (I won’t mention any articles specifically, but they’re easy to find using your search engine of choice) – but I like to figure things out for myself rather than read a blog and take the author’s word as infallible.  Some of the other tests I found online also only concentrated on one specific metric when measuring performance (e.g. physical I/O), or were too obviously biased for my taste.

    Living in the Real World

    There are also what I refer to as “non-real-world” arguments against one or the other, for example:

    • Cursors are best because they’re easier/quicker to write (usually fewer variables and/or temp tables required).
    • While-loops are best because they’re inherently faster, since they don’t have to make use of any of the internal mechanisms that cursors use to allow backwards and forwards data-set traversal.

    The reason I call these “non-real-world” problems, is that I don’t think any DBA or developer worth their salt is going to worry too much about how many more characters one takes to type than the other, and I’ve very seldom (if ever) needed to use anything but a simple forward-only cursor – so these problems aren’t representative of the majority of use-cases in my experience.

    Obviously if you take something like this super-simplified while-loop below, it would be difficult to write a cursor that would perform faster – but why the hell would you?

    DECLARE @i INT = 0;
    WHILE @i <= 1000000 SET @i += 1; -- 405ms on my laptop

    What?

    Alright, down to business. Using the AdventureWorks database on my laptop (running 2008R2 build 10.50.2500.0), I wrote a simple test script which iterates through each SalesOrderHeader row (for online orders) and counts the number of SalesOrderDetail rows for each SalesOrderID.  I know… for someone who was just talking about “non-real-world” examples, this is pretty unrealistic – but the point is that it generates a measurable amount of I/O and CPU load, and I can reproduce the logic easily using both a while-loop and a cursor.

    Here’s the code for the while-loop test:

    ------------------------------------------------
    -- WHILE LOOP (using counter)
    ------------------------------------------------
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;
    GO
    SET NOCOUNT ON;
    GO
    IF OBJECT_ID('tempdb..#SOID') IS NOT NULL DROP TABLE #SOID;
    CREATE TABLE #SOID (ID INT IDENTITY, SalesOrderID INT NOT NULL);
    DECLARE @i INT = 1, @ii INT, @SalesOrderID INT, @LineCount INT;

    INSERT #SOID (SalesOrderID)
    SELECT SalesOrderID
    FROM Sales.SalesOrderHeader
    WHERE OnlineOrderFlag = 1;

    SET @ii = @@ROWCOUNT;

    WHILE @i <= @ii
    BEGIN
        SET @SalesOrderID = (SELECT SalesOrderID FROM #SOID WHERE ID = @i);

        SET @LineCount = ISNULL(@LineCount, 0) + ISNULL((
            SELECT COUNT(*)
            FROM Sales.SalesOrderDetail
            WHERE SalesOrderID = @SalesOrderID
            ), 0);

        SET @i += 1;
    END

    IF @LineCount <> 60398
    BEGIN
        RAISERROR('Bad Total',16,1);
        PRINT @LineCount;
    END
    GO

    Even for a simple while-loop there are multiple ways to write the logic – I’ve chosen a simple counter here, but didn’t see massive variances in the performance of the 2 or 3 methods I tested.

    Here’s the code I used to test cursors:

    ------------------------------------------------
    -- CURSOR
    ------------------------------------------------
    --DECLARE cursor_name CURSOR [ LOCAL | GLOBAL ]
    -- [ FORWARD_ONLY | SCROLL ]
    -- [ STATIC | KEYSET | DYNAMIC | FAST_FORWARD ]
    -- [ READ_ONLY | SCROLL_LOCKS | OPTIMISTIC ]
    -- [ TYPE_WARNING ]
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;
    GO
    SET NOCOUNT ON;
    GO
    DECLARE @SalesOrderID INT, @LineCount INT;

    DECLARE SOID CURSOR TYPE_WARNING
    FOR
    SELECT SalesOrderID
    FROM Sales.SalesOrderHeader
    WHERE OnlineOrderFlag = 1;

    OPEN SOID;
    FETCH NEXT FROM SOID INTO @SalesOrderID;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @LineCount = ISNULL(@LineCount, 0) + ISNULL((
            SELECT COUNT(*)
            FROM Sales.SalesOrderDetail
            WHERE SalesOrderID = @SalesOrderID
            ), 0);

        FETCH NEXT FROM SOID INTO @SalesOrderID;
    END

    CLOSE SOID;
    DEALLOCATE SOID;

    IF @LineCount <> 60398
    BEGIN
        RAISERROR('Bad Total',16,1);
        PRINT @LineCount;
    END
    GO

    This represents the simplest test (i.e. just a plain cursor declaration with no options other than TYPE_WARNING).  I used this as a baseline, to which I added various combinations of cursor options, such as STATIC vs DYNAMIC, LOCAL vs GLOBAL, etc.

    To measure performance I profiled the execution of the main batch (i.e. not including the checkpoint, clearing the caches, etc) and recorded CPU, Duration, Reads, & Writes over 5 executions per test run (I actually ran each more than just 5 times, but results were so consistent that I only recorded the last 5 executions of each test).
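
    If you want to repeat something similar without setting up a trace, SET STATISTICS TIME and IO will give you comparable per-statement CPU and read figures (not what I used here, but close enough for a quick look):

    SET STATISTICS IO ON;    -- logical/physical reads per statement
    SET STATISTICS TIME ON;  -- parse/compile and execution CPU + elapsed time
    GO

    -- ... run the while-loop or cursor batch here ...

    GO
    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;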

    Show me the money

    [Image: Cursor Duration]

    The surprising thing for me (in this test case) was how poorly the while-loop performed.  The graph to the right shows the various cursor-based test runs, all of which completed in less than 1 second (click the image to view a larger copy).

    The while-loop consistently took 30 seconds to run.

    By the way, please feel free to point out any mistakes I’ve made.

    [Image: Average Reads]

    The reads show a similar story regarding the stark difference between the while-loop and any of the cursor tests.  The cursor read I/O figures are in the graph to the right, and the while-loop averaged well over 1.8 million reads (it dwarfed the others on the graph so I removed it).

    I guess what I find interesting is that many of the tests people have run only look at logical or physical I/O, and don’t take CPU or overall duration into account.  If my data is sitting in memory anyway, I don’t really care about those logical reads – but the difference between a query taking 100 vs 1000 ms of my CPU’s time is very important to me.  Am I neglecting to take anything else (important) into account?

    Conclusion

    I don’t have one…

    These tests are too fickle to make any concrete observations – you could change just a few lines of code and get completely different results to what I’m seeing.

    Personally, I will continue to use both where appropriate.  Before anyone has a heart attack let me clarify that I would always first use a set-based solution before resorting to either.

    But… my rule of thumb is that I use CURSORS if I’m performing data manipulation, and a WHILE loop if I’m performing an action until some condition is met.  For cursors, I tend to use the “LOCAL STATIC FORWARD_ONLY READ_ONLY” options, just because they roll off the tongue so nicely. If you’re on the SQL Server team at Microsoft and you’re reading this – in the next version of SQL could you please consolidate these options down to a single “GO_BABY_GO” option instead? 😉
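
    In case it’s useful, this is roughly the shape I reach for by default (the query and variable are just placeholders):

    DECLARE @SalesOrderID INT;

    DECLARE cur CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY
    FOR
        SELECT SalesOrderID
        FROM Sales.SalesOrderHeader
        WHERE OnlineOrderFlag = 1;

    OPEN cur;
    FETCH NEXT FROM cur INTO @SalesOrderID;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- per-row work goes here
        FETCH NEXT FROM cur INTO @SalesOrderID;
    END
    CLOSE cur;
    DEALLOCATE cur;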

    As a final note, and highlighting why a set-based alternative is always better: running this query “properly” (i.e. set-based, not row-by-row) gave me just over 900 reads, and completed in 29 milliseconds. No contest really.
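
    For completeness, the set-based equivalent is just a single aggregate over the join – presumably something along these lines:

    DECLARE @LineCount INT;

    SELECT @LineCount = COUNT(*)
    FROM Sales.SalesOrderDetail AS d
    JOIN Sales.SalesOrderHeader AS h
        ON h.SalesOrderID = d.SalesOrderID
    WHERE h.OnlineOrderFlag = 1;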

    Cheers,

    Dave

  • Benchmarking I/O: The test rig

    Work has been a little crazy lately, but I’ve still been wanting to do some I/O performance testing since we’re due to get some new servers in soon and I wanted to be able to compare apples with oranges. So this past weekend I decided to combine something I’ve been meaning to learn more about (PowerShell) with some SQL scripts based on those described by Paul Randal (blog | twitter) in his post about testing Fusion-IO ioDrive Duo SSDs.

    Just to be clear before going any further, I’ve tested these scripts in small scale (i.e. on my laptop) and they worked well, but I’m happy to have any bugs and/or crappy code pointed out to me in the comments. 🙂 Thanks.

    So, because I tend to over-engineer things, I laid out my requirements early on to prevent any self-induced scope-creep. These were:

    • It must accept parameters for anything the user would need to change
    • It must create the test database and any other required objects
    • It should automatically re-run the test the specified number of times (to allow averaging of results)
    • It should automatically record test results without requiring me to intervene between test runs
    • If possible, it should be able to automatically change test configurations (e.g. number of database data files) and re-run the tests without intervention

    The jury’s still out on how many of these goals I achieved, but let’s dive in anyway. I’m not going to post snippets of all the code – otherwise it’ll just get too bulky.  Instead I’ll highlight the more important bits, and then upload the whole shebang at the end for you to download.

    The PowerShell script basically controls the flow of the test process, but there are 5 SQL scripts which do the actual heavy lifting. The first 2 files create the database (with the specified number of data files, and of the specified size), as well as a few tables used in the test:

    -------------------------------------------------------------------------- 
    -- Create test database 
    --------------------------------------------------------------------------
    SET @cmd = N'
    	IF DB_ID(''' + @db_name + ''') IS NOT NULL DROP DATABASE [' + @db_name + ']; 
    	CREATE DATABASE [' + @db_name + '] ON PRIMARY (
    		NAME = N''' + @db_name + ''', 
    		SIZE = ' + CAST(@data_file_size_mb AS NVARCHAR) + 'MB , 
    		FILEGROWTH = 250MB, 
    		FILENAME = N''' + @db_data_path + @db_name + '.mdf'')'; 
    
    -- Add additional files 
    WHILE @i < $(NoOfDataFiles) 
    BEGIN 
    	SET @cmd = @cmd + CHAR(10) + ',(NAME = N''' + @db_name + CAST(@i AS NVARCHAR) 
    		+ ''', SIZE = ' + CAST(@data_file_size_mb AS NVARCHAR) + 'MB , FILEGROWTH = 250MB, FILENAME = N''' 
    		+ @db_data_path + @db_name + CAST(@i AS NVARCHAR) + '.ndf'')'; 
    		
    	SET @i = @i + 1; 
    END 
    
    SET @cmd = @cmd + CHAR(10) + ' LOG ON (NAME = N''' + @db_name + '_log'', SIZE = ' 
    	+ CAST(CASE WHEN $(TargetDbSizeGB) <= 1 THEN 128 WHEN $(TargetDbSizeGB) > 1 AND $(TargetDbSizeGB) <= 5 
    		THEN 512 WHEN $(TargetDbSizeGB) > 5 AND $(TargetDbSizeGB) <= 10 THEN 2048 ELSE 4096 END AS NVARCHAR) 
    		+ 'MB, FILEGROWTH = 100MB, FILENAME = N''' + @db_log_path + @db_name + '_log.ldf''); ALTER DATABASE [' 
    		+ @db_name + '] SET RECOVERY SIMPLE;'; 
    
    EXEC sp_executesql @cmd, N'';

    The main work table is a copy & paste from Paul’s post mentioned earlier, and has a clustered index on a GUID column which forces random I/O. I guess you could swap that for an integer/identity column to turn it into a sequential I/O test instead.
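
    I won’t reproduce Paul’s script verbatim, but the table is roughly this shape – the NEWID() default on the clustered key is what scatters the inserts (column names and sizes here are illustrative):

    CREATE TABLE dbo.MyBigTable (
        c1 UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),   -- random GUIDs => random insert points
        c2 DATETIME         NOT NULL DEFAULT GETDATE(),
        c3 CHAR(400)        NOT NULL DEFAULT 'a',       -- padding to bulk out the row
        CONSTRAINT PK_MyBigTable PRIMARY KEY CLUSTERED (c1)
    );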

    The 3rd SQL script performs the inserts, and is copied from a related post from Thomas Kejser (blog) – I just added a bit that works out how many rows to insert based on the target database size specified.

    SELECT @counter = 0, @start = GETDATE();

    -- Batch the inserts into transactions of roughly 1,000 rows each,
    -- rather than paying for a log flush on every single-row commit
    WHILE (@counter < @no_of_rows_to_insert_per)
    BEGIN
        IF @counter % 1000 = 0
            BEGIN TRAN;

        INSERT INTO MyBigTable DEFAULT VALUES;
        SET @counter = @counter + 1;

        IF @counter % 1000 = 999
            COMMIT TRAN;
    END;

    SET @end = GETDATE();
    INSERT INTO Results (duration) VALUES (CAST(DATEDIFF(SS, @start, @end) AS INT));

    The 4th script is directly from this post of Paul’s on wait stats – it’s definitely one worth adding to your toolbox.
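
    Paul’s script does a fair bit more (snapshotting, percentages, and filtering out a long list of benign waits), but at its core it’s just reading sys.dm_os_wait_stats; a bare-bones version looks something like this:

    SELECT TOP (10)
        wait_type,
        waiting_tasks_count,
        wait_time_ms,
        signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP')  -- just a few of the benign waits
    ORDER BY wait_time_ms DESC;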

    Finally, the 5th script aggregates each test run’s results into one row, so that the test rig can automatically blow everything away and run again without losing any results.

    One of the biggest challenges I faced (as the PowerShell n00b that I am) was figuring out how to asynchronously launch multiple processes and wait for them to finish.  In “batch file land” I would’ve just thrown together some “FOR %%i IN… DO START sqlcmd….” magic, but I wouldn’t allow myself to use anything but PowerShell for the flow control.  After some time spent searching around I found Start-Job which allowed me to launch a given number of instances of SQLCMD and then wait for them to all finish by calling Wait-Job (obviously!). 😉

    ###################################################################### 
    # Kick off the specified number of insert processes ($NoOfInsertProcesses) 
    ###################################################################### 
    $sj = {
        param($SQLInstanceName, $TestDBName, $InsertRowsPath, $TargetDbSizeGB, $ApproxRowByteSize, $NoOfInsertProcesses)
        sqlcmd -S "$SQLInstanceName" -d "$TestDBName" -i "$InsertRowsPath" -v TargetDbSizeGB=$TargetDbSizeGB ApproxRowByteSize=$ApproxRowByteSize NoOfInsertProcesses=$NoOfInsertProcesses
    }

    $(for ($x = 1; $x -le $NoOfInsertProcesses; $x++) {
        Start-Job $sj -ArgumentList $SQLInstanceName, $TestDBName, $InsertRowsPath, $TargetDbSizeGB, $ApproxRowByteSize, $NoOfInsertProcesses
    }) | Wait-Job # Wait for all insert processes to finish before continuing

    Funnily enough, one of my main “Aha!” moments was had while reading this post by Aaron Bertrand (blog | twitter) who isn’t your typical hard-core PowerShell or Windows Server blogger, but instead a SQL Server MVP – so either it was a sign I was on the right path, or Google knows far too much about my reading habits. 🙂

    Okay – please feel free to download the scripts, tear them apart, point and laugh, etc (although, if you point and laugh, please at least tell me why so I can learn from it). As I said before, I haven’t run them on anything apart from my laptop (which I think would’ve taken several years to perform the full-blown test), so I’d be interested to hear of your experiences.

    IOTestScripts

    In the coming weeks I’m hoping to benchmark several of our existing servers (sitting on top of various storage platforms, such as SAN – both solid state & fibre channel, local disk, NAS, and RAM disks).

    Cheers,

    DB Dave