Tag: script

  • SSIS Data Flow Plus!

    In my previous blog post I talked about BIML, and how it might revolutionise my approach to creating ETL processes. It’s pretty cool, and very powerful, but there is a bit of a learning curve, so I decided to look for a different way to achieve the same thing, one that required less upskill time, and preferably less development time too.

    So, the ideal solution will:

    • be quick to build initially, and easy to maintain in the long run.
    • allow for parallel data loads to make the best use of the available resources.
    • allow for ad-hoc changes to the load or schema without having to open, make changes to, and re-deploy the SSIS package.

    I briefly tested several other methods (most of which involved generating large amounts of dynamic SQL and executing it against the source and/or destination), but instead decided to try out an SSIS add-on package called “Data Flow Task Plus”, which I’d never heard of before.

    What is it?

    A company called CozyRoc has developed a set of new components, and extensions to existing components within SSIS, making them a whole lot more powerful than what you get out of the box. This is nothing new; you can develop your own components relatively easily if you so choose (even I’ve dabbled with this, many moons ago, trying to read CSV files with annoying formatting “features”).

    Data Flow Plus lets you configure dynamic data flows. You can control various options via package or project parameters, which means less time spent opening packages to edit them when your source or destination schema changes. Basically this means you can create “schema-less” ETL packages which will just transfer data from a source table to a destination table, even if you add or remove (or change) columns!  Too good to be true, right?

    The Pudding

    As they say, the proof is in the pudding, so here’s some pudding… figuratively speaking. Nothing like some green ticks in SSIS to make your afternoon!

    That’s the end result of my proof-of-concept, but don’t worry, I’ll step you through it.

    First things first, you’ll need to go to the CozyRoc website and download the package, either the 32 or 64-bit version depending on your requirements.

    Once that’s done and you open Visual Studio, you’ll notice a bunch of new components in your SSIS Toolbox. The only one I’m covering here though is the new Data Flow Task Plus (highlighted), although I may cover more in future as there are a couple that sound interesting (like parallel foreach loops!).

    New Plan

    So my plan is to have table metadata stored in a table on the destination (Azure Data Warehouse) database, which is queried by the package and stored in package variables. I’ll then iterate over the list of tables, do my ETL (depending on what kind of load I’m doing), and finally load the data from the source system. Sounds simple enough (… and it is), so let’s get started.

    And yeees I know this isn’t really much of an “ETL” process… but “ELT” doesn’t roll off the tongue as easily. :-p 

    Here’s a SQL script to set up this proof-of-concept if you want to follow along. It creates 2 databases (a source and a destination), as well as a table to store metadata about the tables I want loaded from one to the other.

    CREATE DATABASE DWSource; 
    GO 
    CREATE DATABASE DWDestination; 
    GO
    USE DWDestination;
    -- DROP TABLE LoadConfiguration 
    CREATE TABLE dbo.LoadConfiguration ( 
        LoadStream TINYINT NOT NULL, 
        TableName NVARCHAR(100) NOT NULL, 
        SqlCreateStmt NVARCHAR(MAX) NOT NULL, 
        IndexColumnName NVARCHAR(100) NOT NULL, 
        LoadType NVARCHAR(20) NOT NULL, 
        ColumnListToLoad NVARCHAR(MAX) NOT NULL 
        )
    -- These are very simplified versions of a few tables in our (Timely’s) database. You'll need to create them in the source database if you want to test this yourself.
    INSERT LoadConfiguration VALUES (1, 'Booking', REPLACE('CREATE TABLE [dbo].[Booking]( 
        [BookingId] [int] NOT NULL, 
        [CustomerId] [int] NOT NULL, 
        [StartDate] [datetime] NOT NULL, 
        [EndDate] [datetime] NOT NULL, 
        [Price] [money] NULL, 
        [BusinessId] [int] NOT NULL 
    )','NOT NULL','NULL'), 'BookingId', 'Full', 'BookingId, CustomerId, StartDate, EndDate, Price, BusinessId')
    INSERT LoadConfiguration VALUES (1, 'Business', REPLACE('CREATE TABLE [dbo].[Business]( 
        [BusinessId] [int] NOT NULL, 
        [Name] [nvarchar](100) NOT NULL, 
        [DateCreated] [datetime] NOT NULL, 
        [Description] [nvarchar](max) NULL 
    )','NOT NULL','NULL'), 'BusinessId', 'Full', 'BusinessId, Name, DateCreated')
    INSERT LoadConfiguration VALUES (1, 'Customer', REPLACE('CREATE TABLE [dbo].[Customer]( 
        [CustomerId] [int] NOT NULL, 
        [BusinessId] [int] NOT NULL, 
        [FirstName] [nvarchar](50) NULL, 
        [LastName] [nvarchar](50) NULL, 
        [DateCreated] [datetime] NOT NULL 
    )','NOT NULL','NULL'), 'CustomerId', 'Full', 'CustomerId, BusinessId, FirstName, LastName, DateCreated')
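
    If you want to follow along, you’ll also need the three tables to exist in the source database. The create statements are exactly the ones embedded above, just run against DWSource instead; I’ve also thrown in a couple of made-up rows so there’s something to load:

    USE DWSource;
    GO
    CREATE TABLE [dbo].[Booking](
        [BookingId] [int] NOT NULL,
        [CustomerId] [int] NOT NULL,
        [StartDate] [datetime] NOT NULL,
        [EndDate] [datetime] NOT NULL,
        [Price] [money] NULL,
        [BusinessId] [int] NOT NULL
    )
    CREATE TABLE [dbo].[Business](
        [BusinessId] [int] NOT NULL,
        [Name] [nvarchar](100) NOT NULL,
        [DateCreated] [datetime] NOT NULL,
        [Description] [nvarchar](max) NULL
    )
    CREATE TABLE [dbo].[Customer](
        [CustomerId] [int] NOT NULL,
        [BusinessId] [int] NOT NULL,
        [FirstName] [nvarchar](50) NULL,
        [LastName] [nvarchar](50) NULL,
        [DateCreated] [datetime] NOT NULL
    )
    GO
    -- A few dummy rows (values made up) so the load has something to move
    INSERT dbo.Business VALUES (1, 'Demo Business', GETDATE(), 'Just for testing');
    INSERT dbo.Customer VALUES (1, 1, 'Jane', 'Doe', GETDATE());
    INSERT dbo.Booking VALUES (1, 1, GETDATE(), DATEADD(HOUR, 1, GETDATE()), 50, 1);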

    With this proof-of-concept I want to test that I can create tables, prepare them, and then load only the columns that I want loaded.

    Variables & Expressions

    A small but important part of creating a package like this is making sure you get your variable expressions right – i.e. making the various SQL statements and values you use as dynamic as possible. As an example, here are my variables for this little package. Note the expression column and how values are stitched together when it comes to building the SQL commands used by the various components (there’s a worked example of what these evaluate to just after the list below).

    From top-to-bottom, we’ve got:

    • ColumnListToLoad – this is the list of columns from the source table that I want loaded into the destination table.
    • IndexColumnName – the name of the “ID” column that I can use to tell where to load from if doing an incremental load. In the real world I’ll probably make the package handle either IDs or DateTime columns, because with some tables it will make more sense to load based on a load date.
    • IndexColumnValue – if doing an incremental load, then this variable will be populated with the max IndexColumnId already loaded into the data warehouse.
    • LoadSettings – the System.Object variable which will hold the full result set of the initial SQL query, and feed it into the ForEach loop container. Nom nom nom…
    • LoadType – whether we’re doing a Full or Incremental load. Could cater for other load types here too.
    • SQL_DeleteStatement – a SQL delete statement based on an expression. If doing an incremental load then this will delete any data that may exist after the current max IndexColumnValue, which should help prevent duplicates.
    • SQL_DropStatement – a SQL table drop statement. Probably didn’t need to be a fully dynamic expression, but for some reeeeaally important or large tables, you may want to disable accidental drops by putting something harmless in this variable for those specific tables.
    • SQL_LoadStatement – a SQL select statement which will pull the data from the source table. This select statement will make use of the ColumnListToLoad variable, as well as the SQL_WhereClause variable if performing an incremental load.
    • SQL_MaxIdValueStatement – SQL statement to get the max Id value and populate the IndexColumnValue variable.
    • SQL_WhereClause – snippet of SQL depending on whether we’re performing an incremental load, and the value of the IndexColumnValue variable.
    • SqlCreateStatement – The SQL create table statement for the destination table. In this example it’s just an exact copy of the source table. I tend to pull production data across into tables matching the source schema, even if my “ColumnListToLoad” variable means that I’m only loading a subset of columns. This means that if I need to add columns to the load later, I don’t need to change the create scripts.
    • TableName – the name of the source (and in this case, destination) table.
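
    To make that a bit more concrete, here’s roughly what I’d expect those expressions to evaluate to at runtime for the Booking table during an incremental load (the “1000” is just a made-up example of the max BookingId already in the warehouse):

    -- SQL_MaxIdValueStatement (run against the destination): how far have we already loaded?
    SELECT MAX(BookingId) FROM dbo.Booking;   -- populates IndexColumnValue, e.g. 1000

    -- SQL_DeleteStatement (destination): clear out anything at or past that point to avoid duplicates
    DELETE FROM dbo.Booking WHERE BookingId >= 1000;

    -- SQL_LoadStatement (source): ColumnListToLoad + TableName + SQL_WhereClause stitched together
    SELECT BookingId, CustomerId, StartDate, EndDate, Price, BusinessId
    FROM dbo.Booking
    WHERE BookingId >= 1000;

    Exactly which comparison operators you use is up to your expressions, of course; the point is that every piece of SQL gets assembled from those variables.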

    The Package

    Here are the steps in my package (and a chance for you to admire my l33t Windows Snipping tool handwriting skillz!). Note that I’m not going to go into a whole lot of detail here, because the purpose of this post isn’t to cover all things SSIS. Instead I’ll link to other sites which explain each step or series of steps more clearly.

    1. Select from the [LoadConfiguration] table, and stick the result-set into an object variable (there’s an example query just after this list).

    2. Use a ForEach container to loop through each ‘row’ in the above object variable, assigning the individual values to variables scoped to the container.

    3. There are separate sequence containers for Full and Incremental loads. Their disabled states are set via an Expression which is based on the value from the [LoadType] column grabbed from the [LoadConfiguration] table above. So, if we’re doing a full load, the Incremental load container will be disabled, and vice versa. Another (possibly better) way of doing this would be to use precedence constraints with expressions to control the path of execution.

    4. As above, but for the ‘Incremental’ [LoadType] value…

    5. Load data using the new Data Flow Task Plus component. The best way to figure out how to do this is to watch the (rather dry) video from CozyRoc on this page. But basically it involves setting up the component just like you would the normal data flow task, then removing all columns from the outputs and inputs (using the advanced editor), leaving only a single “placeholder/dummy” column. This placeholder column is brilliantly named “THUNK_COLUMN”.
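
    And for reference, the query behind step 1 is nothing fancy; something along these lines, with the ForEach loop’s variable mappings lined up against the column order (the LoadStream filter only matters if you split tables across parallel streams):

    SELECT  TableName, SqlCreateStmt, IndexColumnName, LoadType, ColumnListToLoad
    FROM    dbo.LoadConfiguration
    WHERE   LoadStream = 1          -- one stream per package/loop if running loads in parallel
    ORDER BY TableName;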

    Here’s another good blog post on a more complex setup using this component and SharePoint.

    Conclusion

    Dunno… haven’t finished implementing the real thing yet. But the proof of concept is working well, and it went together pretty quickly, so I’m positive this will work, I think…

    I’ll update this post with my thoughts once I’ve got it all working. As usual please let me know if I’ve made any glaring mistakes, or if you’ve got some awesome ideas on how to improve this process further.

    Cheers,
    Dave

  • BIML, where have you been all my life?

    I’ve used the BIDS Helper Visual Studio add-on for years now, and I’ve seen and heard of BIML, but it’s one of those things I’ve never needed to look into any further than that.  Until I discovered that it’s something that would’ve saved me hours of tedious SSIS work!

    What is it?

    BIML (Business Intelligence Mark-up Language), or more specifically, BIMLScript, is sort of a mashup of XML and C# code nuggets, allowing you to create SSIS and SSAS packages.  This is very much the condensed “DBDave” version – check out the official site for a much more eloquent explanation of what it is.

    Basic Example

    When you open your SSIS project in Visual Studio with BIDS Helper installed, right-clicking on the project gives you the option of adding a BIML file:

    Screenshot of Visual Studio Solution Explorer showing the option to add a new BIML file under the 'BIML Test' project.

    It’ll create a new file under “Miscellaneous” in your project. Go ahead and open it up and you’ll see something like this:

    Screenshot of a Visual Studio interface showing a BIML script file titled 'BimScript.biml' open on the left and the Solution Explorer on the right, highlighting the BIML Test project structure.

    You can “execute” a BIMLScript by right-clicking on it, and selecting “Generate SSIS Packages”:

    A screenshot showing the context menu options in Visual Studio with the 'Generate SSIS Packages' option highlighted, under the 'BimlScript' folder within the 'Miscellaneous' section.

    Now we can jump in the deep end and paste the following into this new BIML script:

    <Biml xmlns="http://schemas.varigence.com/biml.xsd"> 
        <Connections> 
            <Connection Name="SourceConn" ConnectionString="Data Source=.;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" /> 
            <Connection Name="DestinationConn" ConnectionString="Data Source=.;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" /> 
        </Connections> 
        <Projects> 
            <PackageProject Name="BIMLTest"> 
                <Parameters> 
                    <Parameter Name="DateFrom" DataType="DateTime">2016-01-01</Parameter> 
                    <Parameter Name="DestinationDatabase" DataType="String">tempdb</Parameter> 
                    <Parameter Name="DestinationServer" DataType="String">localhost</Parameter> 
                    <Parameter Name="DoTruncate" DataType="Boolean">false</Parameter> 
                    <Parameter Name="SourceDatabase" DataType="String">tempdb</Parameter> 
                    <Parameter Name="SourceServer" DataType="String">localhost</Parameter> 
                </Parameters> 
                <Packages> 
                    <Package PackageName="BIMLTestPackage" /> 
                </Packages> 
            </PackageProject> 
        </Projects> 
        <Packages> 
            <Package Name="BIMLTestPackage" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive"> 
                <Connections> 
                    <Connection ConnectionName="SourceConn"> 
                        <Expressions> 
                            <Expression ExternalProperty="InitalCatalog">@[$Project::SourceDatabase]</Expression> 
                            <Expression ExternalProperty="ServerName">@[$Project::SourceServer]</Expression> 
                        </Expressions> 
                    </Connection> 
                    <Connection ConnectionName="DestinationConn"> 
                        <Expressions> 
                            <Expression ExternalProperty="InitialCatalog">@[$Project::DestinationDatabase]</Expression> 
                            <Expression ExternalProperty="ServerName">@[$Project::DestinationServer]</Expression> 
                        </Expressions> 
                    </Connection> 
                </Connections> 
                <Tasks> 
                    <Container Name="Truncate Destination Table" ConstraintMode="Parallel"> 
                        <Expressions> 
                            <Expression ExternalProperty="Disable">!(@[$Project::DoTruncate])</Expression> 
                        </Expressions> 
                        <Tasks> 
                            <ExecuteSQL Name="Truncate Table" ConnectionName="DestinationConn"> 
                                <DirectInput> 
                                    TRUNCATE TABLE dbo.DWDestinationTableExample; 
                                </DirectInput> 
                            </ExecuteSQL> 
                        </Tasks> 
                    </Container> 
                    <Container Name="Load Table" ConstraintMode="Linear"> 
                        <Tasks> 
                            <Dataflow Name="Load dbo.DWDestinationTableExample"> 
                                <Transformations> 
                                    <OleDbSource Name="Source" ConnectionName="SourceConn"> 
                                        <DirectInput> 
                                            SELECT * FROM dbo.DWSourceTableExample WHERE KeyDate >= ?; 
                                        </DirectInput> 
                                        <Parameters> 
                                            <Parameter Name="0" VariableName="BIMLTest.DateFrom" /> 
                                        </Parameters> 
                                    </OleDbSource> 
                                    <OleDbDestination Name="Destination" ConnectionName="DestinationConn" KeepIdentity="true" UseFastLoadIfAvailable="true" MaximumInsertCommitSize="100000"> 
                                        <ExternalTableOutput Table="dbo.DWDestinationTableExample"> 
                                        </ExternalTableOutput> 
                                    </OleDbDestination> 
                                </Transformations> 
                            </Dataflow> 
                        </Tasks> 
                    </Container> 
                </Tasks> 
            </Package> 
        </Packages> 
    </Biml>

    What the… ?!?

    Yeah, okay, let’s step through this to figure out what it does.  I’ll show you what each bit of code results in too, which might help make it more tangible/understandable:

    <Connections> 
            <Connection Name="SourceConn" ConnectionString="Data Source=.;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" /> 
            <Connection Name="DestinationConn" ConnectionString="Data Source=.;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" /> 
        </Connections>

    First we set up the connections that will exist within the package. These are just connections to tempdb on my local SQL instance for testing. This bit results in the following:

    Screenshot of Connection Managers in Visual Studio displaying DestinationConn and SourceConn.

    Next up, we specify the project and some project parameters that we’re going to use within the package:

    <Projects> 
        <PackageProject Name="BIMLTest"> 
            <Parameters> 
                <Parameter Name="DateFrom" DataType="DateTime">2016-01-01</Parameter> 
                <Parameter Name="DestinationDatabase" DataType="String">tempdb</Parameter> 
                <Parameter Name="DestinationServer" DataType="String">localhost</Parameter> 
                <Parameter Name="DoTruncate" DataType="Boolean">false</Parameter> 
                <Parameter Name="SourceDatabase" DataType="String">tempdb</Parameter> 
                <Parameter Name="SourceServer" DataType="String">localhost</Parameter> 
            </Parameters> 
            <Packages> 
                <Package PackageName="BIMLTestPackage" /> 
            </Packages> 
        </PackageProject> 
    </Projects>

    There are some gotchas regarding project parameters in BIML when using BIDS Helper to check and run your BIMLScript, so keep that in mind.  As per this example, you need to specify the project parameter definitions in here, even if they already exist within your project.

    So because of these issues, I found it simpler just to make sure the parameters already exist, like this:

    Screenshot of the project parameters window in SQL Server Data Tools, displaying parameters for a BIMLTestPackage including Name, Data Type, and Value columns.

    Now we create the package itself, and substitute in some of the package parameters, which in this case we’re using to replace parts of the connection strings for our source and destination connections.

    <Packages>
    <Package Name="BIMLTestPackage" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive">
    <Connections>
    <Connection ConnectionName="SourceConn">
    <Expressions>
    <Expression ExternalProperty="InitalCatalog">@[$Project::SourceDatabase]</Expression>
    <Expression ExternalProperty="ServerName">@[$Project::SourceServer]</Expression>
    </Expressions>
    </Connection>
    <Connection ConnectionName="DestinationConn">
    <Expressions>
    <Expression ExternalProperty="InitialCatalog">@[$Project::DestinationDatabase]</Expression>
    <Expression ExternalProperty="ServerName">@[$Project::DestinationServer]</Expression>
    </Expressions>
    </Connection>
    </Connections>

    This is the same as this part in the user interface:

    Screenshot of the connection manager property expressions in the SSIS designer, with the InitialCatalog and ServerName properties mapped to the project parameters.

    Finally we add the meat to this SSIS sandwich; the components that perform the actual transformation and/or loading of data.

    <Tasks>
    <Container Name="Truncate Destination Table" ConstraintMode="Parallel">
    <Expressions>
    <Expression ExternalProperty="Disable">!(@[$Project::DoTruncate])</Expression>
    </Expressions>
    <Tasks>
    <ExecuteSQL Name="Truncate Table" ConnectionName="DestinationConn">
    <DirectInput>
    TRUNCATE TABLE dbo.DWDestinationTableExample;
    </DirectInput>
    </ExecuteSQL>
    </Tasks>
    </Container>
    <Container Name="Load Table" ConstraintMode="Linear">
    <Tasks>
    <Dataflow Name="Load dbo.DWDestinationTableExample">
    <Transformations>
    <OleDbSource Name="Source" ConnectionName="SourceConn">
    <DirectInput>
    SELECT * FROM dbo.DWSourceTableExample WHERE KeyDate >= ?;
    </DirectInput>
    <Parameters>
    <Parameter Name="0" VariableName="BIMLTest.DateFrom" />
    </Parameters>
    </OleDbSource>
    <OleDbDestination Name="Destination" ConnectionName="DestinationConn" KeepIdentity="true" UseFastLoadIfAvailable="true" MaximumInsertCommitSize="100000">
    <ExternalTableOutput Table="dbo.DWDestinationTableExample">
    </ExternalTableOutput>
    </OleDbDestination>
    </Transformations>
    </Dataflow>
    </Tasks>
    </Container>
    </Tasks>
    </Package>
    </Packages>
    </Biml>

    We’ve got an “Execute SQL” component running a truncate of the destination table first.  However, we only want this to run if we’ve set our project parameter “DoTruncate” to true.

    Screenshot of the SSIS package 'BIMLTestPackage' in Visual Studio, showing the 'Truncate Destination Table' task and its properties on the right.

    And lastly a Data Flow task to move data.  This is done using a SQL query with a parameter for a “KeyDate” column, as an illustration of what you might do in a real-life situation.

    Load Table transformation in SSIS package for dbo_DWDestinationTableExample
    Screenshot showing the Set Query Parameters dialog in SQL Server Integration Services (SSIS) with parameter mapping for a Data Flow task.
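
    By the way, if you want to actually execute the generated package, the two example tables need to exist in tempdb first. The column definitions below are just my guess; the only thing the package really cares about is the KeyDate column used in the WHERE clause:

    USE tempdb;
    GO
    CREATE TABLE dbo.DWSourceTableExample (
        Id        int IDENTITY(1,1) NOT NULL,
        KeyDate   datetime NOT NULL,
        SomeValue nvarchar(100) NULL
    );
    CREATE TABLE dbo.DWDestinationTableExample (
        Id        int IDENTITY(1,1) NOT NULL,   -- KeepIdentity="true" keeps the source Id values
        KeyDate   datetime NOT NULL,
        SomeValue nvarchar(100) NULL
    );
    GO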

    Cool! Now what??

    So that’s BIML in a very small nutshell.  Even if that’s all you’re doing with it (i.e. creating pretty basic packages) I think it’s worth doing since it makes source control of your packages SOOOOOO much nicer!

    Imagine getting a pull request from a developer who’s made some SSIS changes, and simply being able to diff the BIML scripts to see exactly what they’ve changed!? 🙂

    But wait, there’s more…

    In the scenario that led me to discover BIML, I wanted to create a “dynamic” SSIS package that was driven by metadata stored in a database. In other words, I could maintain a table with a list of table names that I wanted “ETL’d” from my production system to my data warehouse, and my magic SSIS package would pick up changes, new tables added, etc. without me needing to open and edit one monstrous package.

    This is where the power of BIMLScript and its C# nuggets really shines. It lets you drop in complicated logic in C# code to control and mould the output of the BIML. So you could look up a list of tables to load, then iterate over that list, creating packages per table. Check out this post for a lot more detail (and examples) on how to achieve this.
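
    The “look up a list of tables” part doesn’t need to be anything fancy, either; the C# nugget could run a query as simple as this (against the source database, or against a config table like the one in my Data Flow Plus post) and loop over the results, emitting a package per row:

    SELECT  s.name AS SchemaName,
            t.name AS TableName
    FROM    sys.tables AS t
    JOIN    sys.schemas AS s ON s.[schema_id] = t.[schema_id]
    WHERE   t.is_ms_shipped = 0
    ORDER BY s.name, t.name;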

    That’s it for now. There’s lots of more detailed examples around if you look for them (Google is your friend), and I just wanted to highlight the possibilities which I didn’t realise were there before. Hopefully you find it as useful as I did.

    Cheers,
    Dave

  • Keeping track of your SQL Server index “inventory”

    Something’s always bugged me when it comes to managing indexes in SQL Server: keeping track of when I created or dropped them.

    You can already check just about everything you could want to know about indexes via handy DMVs. Need to figure out how much your indexes are being used? Take a look at sys.dm_db_index_usage_stats. Or how about which indexes you might need to add to improve performance? Easy, just query sys.dm_db_missing_index_group_stats! You get the idea…

    But what about the create date of an index, or when it was last rebuilt, or even when you dropped an index? For this you need to roll up your sleeves and roll your own solution.

    How to skin this cat?

    There are a few ways we can do this.

    Scheduled job

    The first way I used to do this was to just have a scheduled job running fairly regularly (every 15 – 30 minutes) which checked for any changes to indexes in the database since the last time it ran. Any new ones would be added to the table, changes would be recorded, and dropped indexes would be noted as such. In fact, I used a version of Kimberly Tripp’s “improved sp_helpindex” to gather and store the index information in a nice format (i.e. with separate columns for included columns, compression, etc.).

    This is what the “guts” of the proc look like, just to give you an idea:

    DECLARE TimelyTables CURSOR FAST_FORWARD FOR
        SELECT DISTINCT
               QUOTENAME(ss.name) + '.' + QUOTENAME(st.name) AS TableName,
               st.[object_id]
        FROM   sys.tables AS st
        JOIN   sys.schemas AS ss ON st.[schema_id] = ss.[schema_id]
        WHERE  st.is_ms_shipped = 0;

    OPEN TimelyTables;
    FETCH NEXT FROM TimelyTables INTO @TableName, @ObjectId;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        TRUNCATE TABLE #tt_Indexes;

        RAISERROR('Table: %s (%i)', 10, 1, @TableName, @ObjectId);

        INSERT #tt_Indexes
        EXEC   dbo.sp_helpindex2 @TableName;

        IF @@ROWCOUNT > 0
        BEGIN
            INSERT #index_history ([object_id], table_name, index_id, is_disabled, index_name, index_description, index_keys, included_columns, filter_definition, [compression])
            SELECT @ObjectId, @TableName, t.index_id, t.is_disabled, t.index_name, t.index_description, t.index_keys, t.included_columns, t.filter_definition, t.[compression]
            FROM   #tt_Indexes AS t;
        END

        FETCH NEXT FROM TimelyTables INTO @TableName, @ObjectId;
    END

    Outside of this loop you can then do your MERGE comparison between “current” indexes, and what was recorded in the “index history” table from the previous run.
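
    Here’s a rough sketch of what that comparison might look like. The table and column names are trimmed-down placeholders rather than the exact ones from my proc, so treat it as a starting point:

    -- #index_history holds the snapshot just collected; dba.IndexInventory is the permanent record.
    MERGE dba.IndexInventory AS tgt
    USING #index_history AS src
        ON  tgt.[object_id] = src.[object_id]
        AND tgt.index_name  = src.index_name
    WHEN NOT MATCHED BY TARGET THEN                 -- a brand new index
        INSERT ([object_id], table_name, index_name, index_keys, included_columns, created_date)
        VALUES (src.[object_id], src.table_name, src.index_name, src.index_keys, src.included_columns, GETDATE())
    WHEN MATCHED AND (tgt.index_keys <> src.index_keys
                  OR ISNULL(tgt.included_columns, '') <> ISNULL(src.included_columns, '')) THEN
        UPDATE SET index_keys       = src.index_keys,       -- definition has changed
                   included_columns = src.included_columns,
                   modified_date    = GETDATE()
    WHEN NOT MATCHED BY SOURCE AND tgt.dropped_date IS NULL THEN
        UPDATE SET dropped_date = GETDATE();                -- index has disappeared, so flag it as dropped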

    DDL trigger

    DDL triggers are nothing new, but they can be very useful for auditing schema and login changes, etc. So it makes sense that they’re ideally suited to creating an inventory of your indexes (if not all database objects that you might be interested in). In fact, you can even quite easily create your own poor-man’s source control system, but that’s a different kettle of fish.

    The idea behind DDL triggers is that you specify which ‘events‘ you want them to fire for at a database or server level.  In my case, working with Azure, I’m only interested in database level events.  In fact, in this case I’m only interested in recording the details of any CREATE INDEX, ALTER INDEX, or DROP INDEX statements.  Which looks like this:

    CREATE TRIGGER trg_IndexChangeLog ON DATABASE 
        FOR CREATE_INDEX, ALTER_INDEX, DROP_INDEX 
    AS 
    ...

    Now we just insert the EVENTDATA() output into our logging table, like so:

    IF OBJECT_ID('dba.IndexChangeLog') IS NOT NULL 
    BEGIN
    	DECLARE @data XML; 
    	SET @data = EVENTDATA(); 
    
    	INSERT	dba.IndexChangeLog(eventtype, objectname, objecttype, sqlcommand, loginname) 
    	VALUES	( 
    			@data.value('(/EVENT_INSTANCE/EventType)[1]', 'varchar(50)'), 
    			@data.value('(/EVENT_INSTANCE/ObjectName)[1]', 'varchar(256)'), 
    			@data.value('(/EVENT_INSTANCE/ObjectType)[1]', 'varchar(25)'), 
    			@data.value('(/EVENT_INSTANCE/TSQLCommand)[1]', 'varchar(max)'), 
    			@data.value('(/EVENT_INSTANCE/LoginName)[1]', 'varchar(256)') 
    			); 
    END;
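
    The trigger assumes the dba.IndexChangeLog table already exists. It isn’t shown above, but a minimal definition matching those columns (plus a timestamp, because you’ll want one) would be something like:

    -- Assumes a 'dba' schema already exists (CREATE SCHEMA dba; if not)
    CREATE TABLE dba.IndexChangeLog (
        IndexChangeLogId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        eventtype        varchar(50)   NOT NULL,
        objectname       varchar(256)  NOT NULL,
        objecttype       varchar(25)   NOT NULL,
        sqlcommand       varchar(max)  NOT NULL,
        loginname        varchar(256)  NOT NULL,
        eventdate        datetime      NOT NULL DEFAULT (GETDATE())
    );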

    EVENTDATA is basically an XML document that contains the important bits you might want to record for auditing purposes… like who ran what, when.  This is what’s available:

    <EVENT_INSTANCE>
        <EventType>event </EventType>
        <PostTime>date-time</PostTime>
        <SPID>spid</SPID>
        <ServerName>name </ServerName>
        <LoginName>login </LoginName>
        <UserName>name</UserName>
        <DatabaseName>name</DatabaseName>
        <SchemaName>name</SchemaName>
        <ObjectName>name</ObjectName>
        <ObjectType>type</ObjectType>
        <TSQLCommand>command</TSQLCommand>
    </EVENT_INSTANCE>

    Best of both worlds?

    The above two methods are ones that I’ve already used successfully and have experience with – and each one offers pros and cons. I like the detail and “query-ability” of the data I get from the scheduled job solution, but the DDL trigger is simpler and doesn’t rely on scheduled jobs running. The trigger also clearly has a big advantage in that it picks up the actual event as it happens, whereas the job may miss stuff between executions. This may or may not be important to you.

    There may be a nice way of combining the two though. Using Event Notifications, or DDL triggers to insert a payload onto a Service Broker queue, you could have a live and asynchronous system which gathers more detail than what’s available in the EVENTDATA. For example, you could have an activated procedure on the end of the SB queue which uses the index name to look up and record additional details. Let me know in the comments if you give this a go, or if you can see any gotchas.
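
    If you do want to experiment with that, the Service Broker plumbing is roughly this (a sketch only; it assumes Service Broker is enabled in the database, and the activated procedure that RECEIVEs from the queue and does the enrichment isn’t shown):

    -- A queue to hold the event messages, and a service bound to the standard event notification contract
    CREATE QUEUE dba.IndexEventQueue;
    CREATE SERVICE IndexEventService ON QUEUE dba.IndexEventQueue
        ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);

    -- Send the same index events to the queue instead of handling them in a trigger
    CREATE EVENT NOTIFICATION IndexEventNotification
        ON DATABASE
        FOR CREATE_INDEX, ALTER_INDEX, DROP_INDEX
        TO SERVICE 'IndexEventService', 'current database';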

    Cheers,
    Dave

     

  • PowerShell script to update Azure firewall rule

    *Updated 2017-06-28

    I’ve recently moved house, and as a result had to change my broadband plan from cable to ADSL (sad face). This also means I’ve gone from having a fixed IP address to a dynamically assigned one. Usually this wouldn’t be a problem, except when it comes to connecting to the several Azure servers that I manage on a daily basis. Now I need to use the Azure Portal to manually change each server’s firewall settings at least once or twice a week. Painful…

    So I quickly threw together this PS script to do the job for me and thought others out there might find it useful too.

    How’s it work?

    The script accepts an array of Azure SQL Server names, finds your external IP address using ipinfo.io, and then loops through the list of servers. You’ll need to provide a default rule name or modify the function call to pass it in (maybe include it in the array if it’s different for each server?).

    It then checks the current IP address of the specified rule and, if it’s different to your external IP address, updates the firewall rule for you. #Magic

    Import-Module SQLPS -DisableNameChecking 
    Import-Module Azure 
    
    # Run Get-AzurePublishSettingsFile first to download the publish settings file for your Azure subscription
    # Full instructions here: https://docs.microsoft.com/en-us/powershell/module/azure/get-azurepublishsettingsfile?view=azuresmps-4.0.0
    
    Import-AzurePublishSettingsFile "C:\My_oresome_path\Sweet-as-publish-settings-file.publishsettings" # <-- put the path to your publish settings file here
    
    # Now just add your server names to this array... or get fancy and look them up somehow, 
    # whether from a simple text file or something more exotic.
    [array]$AzureServers = @('servername01','servername02','servername03','servername04'); 
    
    # Just a little function to get your current external/public IP address
    function Get-MyIpAddress
    {
        $ip = Invoke-RestMethod http://ipinfo.io/json | Select -exp ip
        return $ip;
    }
    
    # This function does the work of changing the whitelist if necessary
    function Update-MyAzureFirewallRule 
    {
        Param (
            [Parameter(Mandatory=$True,ValueFromPipeline=$True,ValueFromPipelinebyPropertyName=$True)]
            [string]$ServerName,
            [string]$RuleName = 'Put_Your_Rule_Name_Here',
            [string]$IpAddress
            )
    
        # Gets the current rule (so we can see what the IP address is currently set to)
        $CurrentRule = Get-AzureSqlDatabaseServerFirewallRule -RuleName $RuleName -ServerName $ServerName; 
        $CurrentIp = $CurrentRule.StartIpAddress
    
        # If your current IP doesn't match what's in the whitelist, then update it
        if ($CurrentIp -ne $IpAddress)
        {
            Write-Host "Setting firewall rule '$RuleName' on server '$ServerName' to IP address '$IpAddress' (was '$CurrentIp')..."
            Set-AzureSqlDatabaseServerFirewallRule -StartIPAddress $IpAddress -EndIPAddress $IpAddress -RuleName $RuleName -ServerName $ServerName;
        }
        
    }
    
    if ($IpAddress = Get-MyIpAddress)
    {
        Write-Host "Your IP address is $IpAddress"
    
        foreach ($s in $AzureServers) 
        {
            Update-MyAzureFirewallRule -ServerName $s -IpAddress $IpAddress;
        }
    }

    This post provided the inspiration, which I then tweaked to suit my needs. Like I said, it’s quick-and-dirty, so use at your own risk. 😉  I’m no PowerShell guru either, so feel free to let me know if you improve on it.

    Cheers,
    Dave

  • SQL Server version information from build numbers

    Just a quick post in case anyone else finds this useful; for a report I’ve created, I wanted to have SQL Server versions displayed in a nice, readable form, rather than the usual build number format (e.g. 11.0.3128.0).

    Server dashboard report

    So I found the very useful SQL Server Builds page (thanks to whoever maintains that), and proceeded to write a function which accepts the build number, and spits out a friendlier “version string”, down to the service pack and cumulative update level of resolution. (I didn’t need anything lower than that, but you could easily add it in yourself).

    Here’s the file, feel free to download, use, hack to pieces, etc. 😉
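
    If you’d rather roll your own, the basic shape of it is a scalar function that carves up the build number and maps the pieces to friendly names. Here’s a cut-down sketch (the function name is just for illustration, and only the major versions are filled in; the SP/CU mappings from the SQL Server Builds page are left for you to add):

    CREATE FUNCTION dbo.fn_SqlVersionName (@BuildNumber nvarchar(32))
    RETURNS nvarchar(100)
    AS
    BEGIN
        -- Split e.g. '11.0.3128.0' into its major/minor parts
        DECLARE @Major int = CONVERT(int, PARSENAME(@BuildNumber, 4)),
                @Minor int = CONVERT(int, PARSENAME(@BuildNumber, 3)),
                @Name  nvarchar(100);

        SET @Name = CASE
                        WHEN @Major = 13                 THEN 'SQL Server 2016'
                        WHEN @Major = 12                 THEN 'SQL Server 2014'
                        WHEN @Major = 11                 THEN 'SQL Server 2012'
                        WHEN @Major = 10 AND @Minor = 50 THEN 'SQL Server 2008 R2'
                        WHEN @Major = 10                 THEN 'SQL Server 2008'
                        WHEN @Major = 9                  THEN 'SQL Server 2005'
                        ELSE 'Unknown (' + @BuildNumber + ')'
                    END;

        -- Service pack / cumulative update resolution would go here, mapping the
        -- build (third) part of the version to 'SP1', 'SP2 CU3', etc.
        RETURN @Name;
    END;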


    Cheers,
    Dave

  • Customising SQL Management Studio 2012

    I’m setting up my new work laptop, and decided to have a look around the net to see what the cool kids are doing these days as far as customising SQL Server Management Studio 2012 goes. Surprisingly not a lot seems to have changed (i.e. changing the font, adding keyboard shortcuts, and using a trendy dark theme are still the most common tweaks).

    Aesthetics

    So I found a dark theme that I actually might give a chance (I’ve tried a few, but always seem to revert to the default after a few hours).  I’m still not sold on the dark development environment – but there are certainly enough colour combinations available from sites like http://studiostyl.es/.

    SQL Server Management Studio 2012

    Productivity

    I also added my usual battery of keyboard shortcuts: sp_helptext, sp_whoisactive, and a wrapper script that I unimaginatively call sp_helpindex2 (which auto-detects the version of SQL running, and then calls the corresponding version of Kimberly Tripp’s improved sp_helpindex proc).
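
    The wrapper itself is nothing clever, by the way; it’s basically a version check followed by a call to whichever flavour of the proc applies. Something along these lines (the inner proc names are placeholders for whichever versions you’ve actually installed):

    CREATE PROCEDURE dbo.sp_helpindex2
        @objname nvarchar(776)
    AS
    BEGIN
        -- Major version of the instance we're connected to, e.g. 11 for SQL Server 2012
        DECLARE @Major int = CONVERT(int,
            PARSENAME(CONVERT(nvarchar(32), SERVERPROPERTY('ProductVersion')), 4));

        IF @Major >= 11
            EXEC dbo.sp_helpindex_2012plus @objname;   -- placeholder name for the 2012+ flavour
        ELSE
            EXEC dbo.sp_helpindex_2008 @objname;       -- placeholder name for the 2005/2008 flavour
    END;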

    Some other minor tweaks include scripting options (Tools -> Options -> SQL Server Object Explorer -> Scripting), for example adding descriptive headers, checking for object existence, etc.

    There are loads of other tweaks you can make – do yourself a favour and take the time to read through every page of options… I’m sure you’ll find more ways of improving your environment.

    Templates

    I then needed to get my templates set up (which, as I detailed in my previous post, I use to store frequently used scripts and code snippets). This takes a little jiggery-pokery since SSMS doesn’t natively let you specify a new location for these files.

    What you need to know before starting is how Management Studio handles these template directories; in the following default path, C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems, there will be a directory called “SQL” which contains all the default templates.

    SSMS Template Directories

    When you open the templates pane in SSMS it compares the contents of this directory to your user template directory (e.g. C:\Users\username\AppData\Roaming\Microsoft\SQL Server Management Studio\11.0\Templates, or explorer window #2 in the screenshot above), and “syncs” up any differences. I’m sure this is over-simplifying it, but it’s a good-enough explanation for the purposes of this post. Window #3 shown above is the directory containing my custom scripts library in my SkyDrive folder.

    So what you want to do is automagically “replace” the contents of #1 with #3 – SSMS will take care of keeping #2 in sync (I hope…).

    To do this you need to create a symbolic link in #1 that points to #3. To learn about symbolic links, read this. They are essentially shortcuts… but on steroids. Here’s the command I use to set up my environment:

    cd "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems"
    ren Sql Sql_Old
    mklink /D Sql "C:\My Stuff\SkyDrive\Scripts\SQL\SSMS_Templates\Sql"

    Obviously your paths (and username) will differ, so make the appropriate changes before trying this yourself.  Please be sure about what you’re doing (never just run something from the Internet without 100% understanding what it does). I take no responsibility if you somehow manage to brick your system – YMMV, etc, etc.

    All going to plan, you’ll now have SSMS using your own source of templates. 🙂

    Cheers,
    Dave