Benchmarking I/O: The test rig

Work has been a little crazy lately, but I’ve still been wanting to do some I/O performance testing since we’re due to get some new servers in soon and I wanted to be able to compare apples with oranges. So this past weekend I decided to combine something I’ve been meaning to learn more about (PowerShell) and some SQL scripts based on those described by Paul Randal (blog | twitter) in his post about testing Fusion-IO ioDrive Duo SSD’s.

Just to be clear before going any further, I’ve tested these scripts in small scale (i.e. on my laptop) and they worked well, but I’m happy to have any bugs and/or crappy code pointed out to me in the comments. 🙂 Thanks.

So, because I tend to over-engineer things, I laid out my requirements early on to prevent any self-induced scope-creep. These were:

  • It must accept parameters for anything the user would need to change
  • It must create the test database and any other required objects
  • It should automatically re-run the test the specified number of times (to allow averaging of results)
  • It should automatically record test results without requiring me to intervene between test runs
  • If possible, it should be able to automatically change test configurations (e.g. number of database data files) and re-run the tests without intervention

The jury’s still out on how many of these goals I achieved, but let’s dive in anyway. I’m not going to post snippets of all the code – otherwise it’ll just get too bulky.  Instead I’ll highlight the more important bits, and then upload the whole shebang at the end for you to download.

The PowerShell script basically controls the flow of the test process, but there are 5 SQL scripts which do the actual heavy lifting. The first 2 files create the database (with the specified number of data files, and of the specified size), as well as a few tables used in the test:

The main work table is a copy & paste from Paul’s post mentioned earlier, and has a clustered index on a GUID column which forces random I/O. I guess you could change this to an integer/identity column to change this to sequential I/O instead.

The 3rd SQL script performs the inserts, and is copied from a related post from Thomas Kejser (blog) – I just added a bit that works out how many rows to insert based on the target database size specified.

The 4th script is directly from this post of Paul’s on wait stats – its definitely one worth adding to your toolbox.

Finally, the 5th script aggregates each test run’s results into one row, so that the test rig can automatically blow everything away and run again without losing any results.

One of the biggest challenges I faced (as the PowerShell n00b that I am) was figuring out how to asynchronously launch multiple processes and wait for them to finish.  In “batch file land” I would’ve just thrown together some “FOR %%i IN… DO START sqlcmd….” magic, but I wouldn’t allow myself to use anything but PowerShell for the flow control.  After some time spent searching around I found Start-Job which allowed me to launch a given number of instances of SQLCMD and then wait for them to all finish by calling Wait-Job (obviously!). Winking smile

Funnily enough, one of my main “Aha!” moments was had while reading this post by Aaron Bertrand (blog | twitter) who isn’t your typical hard-core PowerShell or Windows Server blogger, but instead a SQL Server MVP – so either it was a sign I was on the right path, or Google knows far too much about my reading habits. Smile

Okay – please feel free to download the scripts, tear them apart, point and laugh, etc (although, if you point and laugh, please at least tell me why so I can learn from it). As I said before, I haven’t run them on anything apart from my laptop (which I think would’ve taken several years to perform the full-blown test), so I’d be interested to hear of your experiences.

IOTestScripts

In the coming weeks I’m hoping to benchmark several of our existing servers (sitting on top of various storage platforms, such as SAN – both solid state & fibre channel, local disk, NAS, and RAM disks).

Cheers,

DB Dave

2 Comments

  1. Ross

    Hi Dave

    The FOR construct you used is valid, but a bit old school 🙂

    I prefer using foreach, I think it improves readability:

    1..$NoOfInsertProcesses | foreach {
    Start-Job $sj …
    }

    I’m interested in those results 😛

Leave a Reply