DROP/CREATE vs. CREATE/ALTER in SQL Server

Over the last two years, as I’ve been working more closely on our application database (a database with nearly 20 years of history), one simple question keeps coming up as I work on old scripts and prepare new changes.

Should we use a DROP/CREATE or CREATE/ALTER pattern for Stored Procedures, Functions, Views, and Triggers?

Because of the tooling we had used previously, every script I had worked with consistently used the DROP/CREATE pattern when doing updates. For example:

IF OBJECT_ID('dbo.Proc123', 'P') IS NOT NULL
   DROP PROC dbo.Proc123
GO
CREATE PROC dbo.Proc123
...
GO

GRANT EXECUTE ---
GO

And I never questioned it. When you manage hundreds of client databases that all share the same schema and ship updates to on-premises clients multiple times a year, it’s just easier to make sure things are cleaned up and then create the PROC/trigger/etc. again.

But then I started seeing a new pattern from a number of DBAs and teachers. I think I first noticed it with Adam Machanic’s sp_WhoIsActive and soon after with Brent Ozar’s sp_Blitz family of scripts. They used a different approach similar to this:

IF OBJECT_ID('dbo.Proc123', 'P') IS NULL
  EXEC ('CREATE PROC dbo.Proc123 AS SELECT ''stub version, to be replaced''') 
GO

ALTER PROC dbo.Proc123
  ...
GO
 
GRANT EXECUTE --- 
GO

Knowing these folks to be particularly smart about many things SQL, I figured there was a reason, although my Google-fu only turned up a few opinions. As best as I could tell, there was no hard-and-fast rule for what was ‘right’.

In most cases, the reason people tended toward the CREATE/ALTER approach was to retain any GRANTs previously issued on the object. That makes a lot of sense and is compelling enough for me to start switching.

But then, during Erin Stellato‘s PASS 2017 pre-con on Query Store, she mentioned the most compelling reason I’ve heard thus far. It’s also something that I’m sure translates to other DMV-based tuning data in versions prior to SQL Server 2016 that don’t have Query Store.

Query Store works by tracking the identity of an object (its Object_ID) along with the performance metrics related to that object. If you are working to tune something like a PROC and want to track its performance in Query Store, you’ll be shooting yourself in the foot if you DROP/CREATE the PROC each time: the recreated PROC gets a new Object_ID, and the results of your changes won’t be tied to the metrics Query Store has already collected for the old one. It turns out that this is also mentioned as a best practice in the Query Store documentation.

While this should seem obvious once you hear it, this alone was a compelling reason to switch our processes and begin using CREATE/ALTER instead.
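
If you want to see that linkage for yourself, the Query Store catalog views key runtime stats to the object’s ID. Here’s a minimal sketch (assuming SQL Server 2016+ with Query Store enabled on the database; dbo.Proc123 is the same placeholder name used above):

-- After a DROP/CREATE the proc gets a new Object_ID, so this query returns
-- nothing for the new object until fresh stats accumulate under that new ID.
SELECT q.query_id, p.plan_id, rs.count_executions, rs.avg_duration, rs.avg_cpu_time
FROM sys.query_store_query q
JOIN sys.query_store_plan p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
WHERE q.object_id = OBJECT_ID('dbo.Proc123');
GO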

One final note

There is a newer feature, introduced in SQL Server 2016 SP1, called ‘CREATE OR ALTER’. It lets you ask the SQL engine to do the CREATE/ALTER logic for you so that you don’t have to write the CREATE check yourself when rolling out changes. Unfortunately, we support versions prior to SQL 2016, so we cannot yet make this our default pattern.  Someday!
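
On a server that supports it, the stub-then-ALTER dance above collapses into a single statement (same placeholder PROC as before):

CREATE OR ALTER PROC dbo.Proc123
AS
  SELECT 'real body goes here';
GO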

Long-Running Query Alerts With sp_WhoIsActive

Who doesn’t love sp_WhoIsActive? Nearly every single DBA talk I have attended in the last two years mentions it as a “must have!” I do the same during my DBA Newbie talk at SQL Saturdays.

About 18 months ago I came across this post by Tara Kizer and started logging activity every 30 seconds on most of our servers using sp_WhoIsActive. That simple job has helped us diagnose the root cause of server performance issues numerous times and I often query the results just to get a quick idea of queries that are consistent resource hogs. Our application is heavy on ad hoc reporting and this gives us a window into reports being run in ways we didn’t expect so that we can evaluate and tune the base queries better.
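
For context, the 30-second collection step itself looks roughly like this (a sketch modeled on Tara’s approach; the table name matches what the setup script later in this post creates, and the 7-day purge is my own assumption):

[sql]

EXEC sp_WhoIsActive
@get_transaction_info = 1,
@get_outer_command = 1,
@get_plans = 1,
@format_output = 0,
@destination_table = 'DBA_stats.dbo.WhoIsActiveLogging';

DELETE FROM DBA_stats.dbo.WhoIsActiveLogging
WHERE collection_time < DATEADD(DAY, -7, GETDATE());

[/sql]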

Our SaaS environment has hundreds of databases and thousands of users. Inevitably, one of two things usually causes us to scramble when server performance tanks.  One of those ad hoc reports runs amok and consumes more resources than we have, or a partner uses an API in a way we never intended, suddenly bringing the server to a halt. Each of these occurrences provides a learning opportunity to improve our application and processes and prevent repeat failures. But we’re not perfect… and I hate when a customer informs us of a problem before we know about it.

I’ve tried a number of suggestions for monitoring long-running queries, from the actual server performance alert to our commercial monitoring software. In many cases I just didn’t have the control over the alerts I wanted (or the alert fired inconsistently) and I kept thinking I could come up with a way to do this with sp_WhoIsActive that was scriptable across all of my servers and fairly low maintenance.

On my first attempt, I assumed I was missing the “right way” to use sp_WhoIsActive from a scheduled SPROC.  When run from within a SPROC, the context of the SPROC and sp_WhoIsActive are different. Therefore, you can’t use a local temp table or table variable to pass information back and forth. Instead, the only solution I found was to use a global temp table and that just “felt wrong”. In my search for solutions I found this post by Kendra Little on outputting the results of sp_WhoIsActive to a global temp table for further processing and that set my mind at ease a bit to choose a similar route.

Armed with all of the above, I’ve created the first version of my long-running query email alert. I’m sure there will be some updates and improvements in the coming months, but I wanted to share this in case someone else finds it useful as a starting example.

First Things First – Initial Setup

This solution expects a standard database, called DBA_stats, on all of our servers to hold monitoring and utility information, an idea taken from that post by Tara Kizer above.  That database holds both my regular sp_WhoIsActive output table (what I log every 30 seconds) and a second table that holds a log of long-running queries.

That second table has a few additional columns that help me manage simple alerting, sending an email at a low threshold (5 minutes) and again at a high threshold (30 minutes) if the query is still running.

***NOTE: The following ASSUMES that you already have sp_WhoIsActive installed. If not, go do that first!  🙂

[sql]

USE master
GO

SET NOCOUNT ON;
DECLARE @retention int = 7,
@destination_table varchar(500) = 'WhoIsActiveLogging',
@longrunning_table VARCHAR(500) = 'LongRunningQueries',
@destination_database sysname = 'DBA_stats',
@schema varchar(max),
@SQL nvarchar(4000),
@createSQL NVARCHAR(500),
@alterSQL NVARCHAR(2000),
@parameters nvarchar(500),
@exists bit;

SET @destination_table = @destination_database + '.dbo.' + @destination_table;
SET @longrunning_table = @destination_database + '.dbo.' + @longrunning_table;

IF (DB_ID(@destination_database) IS NULL)
BEGIN;
PRINT 'Creating stats database: ' + @destination_database;
SET @createSQL = N'CREATE DATABASE ' + @destination_database + ';

ALTER DATABASE ' + @destination_database + ' SET RECOVERY SIMPLE;';

EXEC(@createSQL);
END;

--create the logging table
IF OBJECT_ID(@destination_table) IS NULL
BEGIN;

PRINT 'Creating periodic logging table: ' + @destination_table;

EXEC sp_WhoIsActive @get_transaction_info = 1, @get_outer_command = 1, @get_plans = 1, @format_output=0, @return_schema = 1, @schema = @schema OUTPUT;
SET @schema = REPLACE(@schema, '<table_name>', @destination_table);
EXEC(@schema);
END;

--create the long-running query table
IF OBJECT_ID(@longrunning_table) IS NULL
BEGIN;

PRINT 'Creating long-running queries table: ' + @longrunning_table;

EXEC sp_WhoIsActive @get_transaction_info = 1, @get_outer_command = 1, @get_plans = 1, @format_output=0, @return_schema = 1, @schema = @schema OUTPUT;
SET @schema = REPLACE(@schema, '<table_name>', @longrunning_table);
EXEC(@schema);

SET @alterSQL = N'
ALTER TABLE '+ @longrunning_table + ' ADD
id INT IDENTITY CONSTRAINT PKC_ID PRIMARY KEY CLUSTERED,
email_sent BIT CONSTRAINT DF_email_sent DEFAULT 0,
email_time DATETIME NULL,
email2_sent BIT CONSTRAINT DF_email2_sent DEFAULT 0,
email2_time DATETIME NULL;

CREATE NONCLUSTERED INDEX IX_SessionID_LoginName_DatabaseName_StartTime ON '+ @longrunning_table +' (session_id, login_name, database_name, start_time);
';

EXEC(@alterSQL);
END;

--create index on collection_time
SET @SQL = 'USE ' + QUOTENAME(@destination_database) + '; IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(@destination_table) AND name = N''cx_collection_time'') SET @exists = 0';
SET @parameters = N'@destination_table varchar(500), @exists bit OUTPUT';
EXEC sp_executesql @SQL, @parameters, @destination_table = @destination_table, @exists = @exists OUTPUT;

IF @exists = 0
BEGIN;
SET @SQL = 'CREATE CLUSTERED INDEX cx_collection_time ON ' + @destination_table + '(collection_time ASC)';
EXEC (@SQL);
END;
GO

[/sql]

Readability Matters – Pretty HTML Emails

Once we start getting data into the long-running queries table, I want to send email alerts. Raw query output doesn’t look good in email, so we need something to turn it into an HTML table. I looked at a number of simple solutions for this task and landed on this example from StackOverflow.com.  The code below has a few tweaks and some CSS styles to help it look more readable in email.

[sql]

USE DBA_Stats
GO

PRINT 'Creating Stored Procedure QueryToHtmlTable';
GO

IF OBJECT_ID('QueryToHtmlTable','P') IS NOT NULL
DROP PROCEDURE QueryToHtmlTable
GO
-- Description: Turns a query into a formatted HTML table. Useful for emails.
-- Any ORDER BY clause needs to be passed in the separate ORDER BY parameter.
-- =============================================
CREATE PROC QueryToHtmlTable
(
@query nvarchar(MAX), --A query to turn into HTML format. It should not include an ORDER BY clause.
@orderBy nvarchar(MAX) = NULL, --An optional ORDER BY clause. It should contain the words 'ORDER BY'.
@html nvarchar(MAX) = NULL OUTPUT --The HTML output of the procedure.
)
AS
BEGIN
SET NOCOUNT ON;

IF @orderBy IS NULL BEGIN
SET @orderBy = ''
END

SET @orderBy = REPLACE(@orderBy, '''', '''''');

DECLARE @realQuery nvarchar(MAX) = '
DECLARE @headerRow nvarchar(MAX);
DECLARE @cols nvarchar(MAX);

SELECT * INTO #tableSQL FROM (' + @query + ') sub;

SELECT @cols = COALESCE(@cols + '', '''''''', '', '''') + ''['' + name + ''] AS ''''td''''''
FROM tempdb.sys.columns
WHERE object_id = object_id(''tempdb..#tableSQL'')
ORDER BY column_id;

SET @cols = ''SET @html = CAST(( SELECT '' + @cols + '' FROM #tableSQL ' + @orderBy + ' FOR XML PATH(''''tr''''), ELEMENTS XSINIL) AS nvarchar(max))''

EXEC sys.sp_executesql @cols, N''@html nvarchar(MAX) OUTPUT'', @html=@html OUTPUT

SELECT @headerRow = COALESCE(@headerRow + '''', '''') + ''<th>'' + name + ''</th>''
FROM tempdb.sys.columns
WHERE object_id = object_id(''tempdb..#tableSQL'')
ORDER BY column_id;

SET @headerRow = ''<tr>'' + @headerRow + ''</tr>'';

SET @html = ''<html>
<head>
<style type="text/css">
table {
color: #333;
font-family: Helvetica, Arial, sans-serif;
width: 2000px;
border-collapse: collapse;
border-spacing: 0;
}

td, th {
border: 1px solid transparent; /* No more visible border */
height: 30px;
}

th {
background: #DFDFDF; /* Darken header a bit */
font-weight: bold;
}

td {
background: #FAFAFA;
text-align: center;
}

/* Cells in even rows (2,4,6…) are one color */
tr:nth-child(even) td { background: #F1F1F1; }

/* Cells in odd rows (1,3,5…) are another (excludes header cells) */
tr:nth-child(odd) td { background: #FEFEFE; }
</style>
</head>
<body>
<table width="2000" border="1">'' + @headerRow + @html + ''</table>
</body>
</html>'';
';

EXEC sys.sp_executesql @realQuery, N'@html nvarchar(MAX) OUTPUT', @html=@html OUTPUT;
END;
GO

[/sql]
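
Before wiring it into the alert job, a quick smoke test of the procedure looks like this (the query is just an illustration; any SELECT without an ORDER BY clause works):

[sql]

DECLARE @html NVARCHAR(MAX);

EXEC QueryToHtmlTable
@query = N'SELECT TOP (10) name, create_date FROM sys.databases',
@orderBy = N'ORDER BY name',
@html = @html OUTPUT;

SELECT @html; -- or hand it straight to msdb.dbo.sp_send_dbmail with @body_format = 'html'

[/sql]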

Doing the Work – Finding Long-Running Queries

This solution runs sp_WhoIsActive every minute and saves the output into a global temp table. From there, I look for any processes that have been running for more than the low threshold setting. Any of the processes that have not been identified and stored previously get logged, output to an HTML table, and an email alert sent.

Next, I take a second look at the table for anything that’s been running longer than the high threshold.  If a second email alert has not been sent for these processes, we output the same data and send the email. If two alerts have already been sent for these processes, I don’t do anything else at the moment. One of the next updates to this script will send an alert to our DevOps notification system for anything running longer than some final threshold (or maybe just the high threshold).

[sql]

USE DBA_stats;
GO

IF OBJECT_ID('LongRunningQueriesAlert','P') IS NOT NULL
DROP PROCEDURE LongRunningQueriesAlert
GO

PRINT 'Creating Stored Procedure: LongRunningQueriesAlert';
GO

CREATE PROCEDURE [dbo].[LongRunningQueriesAlert]
@email_Subject VARCHAR(255) = 'Long-Running Queries on ',
@low_threshold_min VARCHAR(2) = '5',
@high_threshold_min VARCHAR(2) = '30',
@dbmail_profile VARCHAR(128) = 'DB Alerts',
@email_recipients VARCHAR(500) = '[email protected]'
AS
BEGIN

SET NOCOUNT ON;

DECLARE @server_name VARCHAR(255),
@schema NVARCHAR(4000),
@SQL NVARCHAR(4000),
@lrq_table VARCHAR(255),
@html NVARCHAR(MAX),
@low_threshold_subject VARCHAR(255),
@high_threshold_subject VARCHAR(255);

SET @server_name = @@SERVERNAME;
SET @email_Subject = @email_Subject + @server_name;

 

/*
Setting this to a global temp table so that it's available to the select after it is
created and we insert data
*/
SET @lrq_table = QUOTENAME('##LongRunningQueries_' + CAST(NEWID() as varchar(255)));

--create the logging temp table
IF OBJECT_ID(@lrq_table) IS NULL
BEGIN;
EXEC sp_WhoIsActive @get_transaction_info = 1,
@get_outer_command = 1,
@get_plans = 1,
@format_output=0, -- Don't format output so that it works in an email
@return_schema = 1,
@schema = @schema OUTPUT;
SET @schema = REPLACE(@schema, '<table_name>', @lrq_table);

EXECUTE sp_executesql @schema;
END;

-- Run WhoIsActive again and put results into the table
EXEC sp_WhoIsActive @get_transaction_info = 1,
@get_outer_command = 1,
@get_plans = 1,
@format_output=0,
@destination_table = @lrq_table,
@not_filter = 'PITTPROCWIN01', @not_filter_type = 'host';

/*
Insert any new long-running queries that haven’t existed before

The WHERE clause below is very specific at the moment and not very flexible.
Improvements to what we ignore and how we specify it are needed.
*/
SET @SQL = N'
INSERT INTO LongRunningQueries ([session_id], [sql_text], [sql_command], [login_name], [wait_info], [tran_log_writes], [CPU], [tempdb_allocations], [tempdb_current], [blocking_session_id], [reads], [writes], [physical_reads], [query_plan], [used_memory], [status], [tran_start_time], [open_tran_count], [percent_complete], [host_name], [database_name], [program_name], [start_time], [login_time], [request_id], [collection_time])
SELECT tempLRQ.*
from ' + @lrq_table + N' tempLRQ
LEFT JOIN LongRunningQueries LRQ ON
LRQ.session_id = tempLRQ.session_id
AND LRQ.login_name = tempLRQ.login_name
AND LRQ.database_name = tempLRQ.database_name
AND LRQ.start_time = tempLRQ.start_time
WHERE LRQ.session_id IS NULL
AND tempLRQ.start_time < DATEADD(MINUTE,-' + @low_threshold_min + N',GETDATE())
AND tempLRQ.database_name NOT in (''master'',''msdb'',''tempdb'',''DBA_Stats'')
AND tempLRQ.program_name NOT LIKE ''%Service Broker%''
AND tempLRQ.program_name <> ''SQBCoreService.exe''';

EXEC sp_executesql @SQL;

/*
Now send the emails for any new long-running queries
*/
---- Using the new SPROC, format the output as HTML for the email,
EXEC QueryToHtmlTable @query = N'SELECT id, LRQ.session_id, LRQ.sql_text,LRQ.blocking_session_id, LRQ.reads, LRQ.writes, LRQ.status, LRQ.host_name, LRQ.database_name, LRQ.program_name,
LRQ.start_time FROM dbo.LongRunningQueries LRQ
WHERE LRQ.email_sent = 0' ,
@orderBy = N'' ,
@html = @html OUTPUT

IF(LEN(@html) > 1)
BEGIN
SET @low_threshold_subject = @email_Subject + ' - >' + @low_threshold_min + ' minute(s)';
---- Now send the email
EXEC msdb.dbo.sp_send_dbmail
@profile_name = @dbmail_profile,
@recipients = @email_recipients,
@subject = @low_threshold_subject,
@attach_query_result_as_file = 0,
@importance = 'Normal',
@body = @html,
@body_format='html'

/*
Update the table to specify that all new queries have had a notification sent
*/
UPDATE dbo.LongRunningQueries SET email_sent = 1, email_time = GETDATE() WHERE email_sent = 0;
END

 

/*
Now get a list of queries that are still running after the second threshold time has elapsed.
Someone REALLY needs to get on these.
*/
CREATE TABLE #HighThresholdQueries (
id INT
);

SET @SQL = N'INSERT INTO #HighThresholdQueries ( id )
SELECT id FROM dbo.LongRunningQueries LRQ
INNER JOIN ' + @lrq_table + N' tempLRQ ON
LRQ.session_id = tempLRQ.session_id
AND LRQ.login_name = tempLRQ.login_name
AND LRQ.database_name = tempLRQ.database_name
AND LRQ.start_time = tempLRQ.start_time
WHERE tempLRQ.start_time < DATEADD(MINUTE,-'+ @high_threshold_min + N',GETDATE())
AND lrq.email2_sent = 0';

EXEC sp_executesql @SQL;

/*
Now send the emails for any long-running queries that have persisted
past the high threshold
*/
-- Reset the variable for reuse
SET @html = '';

---- Using the new SPROC, format the output as HTML for the email,
EXEC QueryToHtmlTable @query = N'SELECT id, LRQ.session_id, LRQ.sql_text,LRQ.blocking_session_id, LRQ.reads, LRQ.writes, LRQ.status, LRQ.host_name, LRQ.database_name, LRQ.program_name,
LRQ.start_time FROM dbo.LongRunningQueries LRQ
WHERE LRQ.id in (select id from #HighThresholdQueries)' ,
@orderBy = N'' ,
@html = @html OUTPUT

IF(LEN(@html) > 1)
BEGIN
SET @high_threshold_subject = @email_Subject + ' - >' + @high_threshold_min + ' minute(s)';
---- Now send the second email
EXEC msdb.dbo.sp_send_dbmail
@profile_name = @dbmail_profile,
@recipients = @email_recipients,
@subject = @high_threshold_subject,
@attach_query_result_as_file = 0,
@importance = 'High',
@body = @html,
@body_format='html'

/*
Update the table to track that a second email has been sent for a query that has
been running for an extended period of time
*/
UPDATE LongRunningQueries SET email2_sent = 1, email2_time = GETDATE() WHERE id in (select id from #HighThresholdQueries) AND email_sent = 1 AND email2_sent = 0;

END

/*
Drop Temporary Tables
*/
DROP TABLE #HighThresholdQueries;

SET @SQL = N'DROP TABLE ' + @lrq_table;
EXEC sp_executesql @SQL;

END;
GO

[/sql]

Ready, Set – Schedule The Job

With all three pieces in place, set up a scheduled job to call the LongRunningQueriesAlert SPROC on a regular basis. I run it every minute and haven’t seen an impact at this point. This lets me know about a problem within a minute of it crossing any of the thresholds. Adjust as you see fit.
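
For reference, here is a minimal sketch of the SQL Agent job I mean (the job and schedule names are just examples; adjust the owner, category, and notification options to your environment):

[sql]

USE msdb;
GO

EXEC dbo.sp_add_job @job_name = N'DBA - Long-Running Query Alert';

EXEC dbo.sp_add_jobstep
@job_name = N'DBA - Long-Running Query Alert',
@step_name = N'Run LongRunningQueriesAlert',
@subsystem = N'TSQL',
@database_name = N'DBA_stats',
@command = N'EXEC dbo.LongRunningQueriesAlert;';

EXEC dbo.sp_add_jobschedule
@job_name = N'DBA - Long-Running Query Alert',
@name = N'Every minute',
@freq_type = 4, -- daily
@freq_interval = 1,
@freq_subday_type = 4, -- units of minutes
@freq_subday_interval = 1;

EXEC dbo.sp_add_jobserver @job_name = N'DBA - Long-Running Query Alert';
GO

[/sql]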

One of the biggest initial benefits has actually been on our development boxes. I’ve been notified a number of times when SQL run in development unknowingly leaves transactions open. That might seem mundane, but it’s been very helpful as the team has been growing.

Rinse and Repeat

Having run this for a few weeks, I plan to make the following changes to my scripts given what I know about our environment.

  1. Adjust the threshold times to be a little longer, maybe as high as 10 minutes for the first alert. We have a few jobs that could run that long in big databases. As expected, these queries tend to cause an alert email almost every day.
  2. Modify the schedule to not include our maintenance window in most cases. The script will catch anything that persists from the maintenance period as soon as I start it up again in the early morning.
  3. Make the filtering more easily injectable. sp_WhoIsActive provides some filtering options that I’m not yet taking advantage of, and some of the filtering will come from improvements to my own queries.

Shipping Database Changes

T-SQL Tuesday is a monthly event where SQL Server bloggers write a post based on a subject chosen by the month’s host. This month’s #tsql2sday event is being hosted by James Anderson (b|t) and this month’s topic is: Shipping Database Changes.

This month’s #tsql2sday topic couldn’t have come at a better time.  We’re once again faced with some of the biggest changes in recent memory to hundreds of client databases (many that we host, but some that we don’t), so I’m anxious to see what others are doing.  But if I’m going to take, it only feels right to try and give something back.

I’ve been at EnergyCAP for almost 13 years.  The database and application were started ~18 years ago by a team at Enron (it’s a long story…), who worked to transition the product from Sybase to SQL Server.  And although the product was never used by a paying Enron customer, in three years of development they went from version zero of the database to version 39, when our CEO re-acquired the IP.

From then until now, we’ve tried a lot, learned a lot and still have plenty of things we “wish” we could do better.  (insert your favorite Animaniac gazing wide-eyed at the “town of Continuous Integration” with little hearts bubbling in the air…).

Anyway, I’ll compare our journey to the basic stages of primary education.  Somehow, that seems a fitting comparison as I think about where we’ve been, what we’ve learned along the way, and where we anxiously hope to be in years to come.

Preschool Bliss

The initial process was, honestly, pretty simple.  For two reasons.  There weren’t many clients to support yet (most of which never touched the database outside of the application) and our expectations were pretty low.  The process looked like this (generally):

Step 1: Change model in Erwin based on developer needs.
Step 2: Export the “create new” database script
Step 3: Run a myriad of batch files that would take the new script, a source database for “core-pop” data, and some other mojo, resulting in an “upgrade” script that would:

  • Rename all of the existing tables to ‘TablenameOld’
  • Recreate all tables (including any new ones)
  • Select/Insert all data from old table to new table (turn off identities, etc.)
  • Add constraints, indexes, triggers and SPROCs.
  • Delete old tables
  • Verify any “core-pop” data (defined properties/data that our app requires) and insert/update as appropriate
  • Set appropriate permissions

Step 4: Take “upgrade” script and run it on client databases.
Step 5: Pray… Wait! That was supposed to be step 4!!

This was not a transactional script… of course who would want the entire database to be whacked and recreated in a transaction… but I digress.  Our only defense was to make sure we had a backup (or our self-hosted clients did) before the script ran.  Once we were past statement #1, all bets were off.

And we felt the pain of that.  We didn’t upgrade often because it took a long time AND correcting data when things bombed was cumbersome.  There were MANY times a client called in need of immediate assistance because the script failed and their backup wasn’t good.  Fun times and all of that.  From 2004 (when I showed up) until 2010 there were 7 database modifications, slightly more than one a year.

The Grade School Days

At some point in 2010 we finally said “enough is enough” (no, really, I’m pretty sure that’s exactly how the conversation went after a long night of database upgrades… er, recreations).

Although I don’t believe we had used RedGate SQL Compare yet (was it even around in 2010??), our DBA took on the challenge of moving away from the “recreate as upgrade” process and doing a better job of tracking what had actually changed.  Because he’s wicked smart with this SQL stuff, he took the first set of scripts and modified them to pull in some hand-coded upgrade parts of his own, fetch and recreate the SPROCs, and mash in the triggers exported from the modeling software, eventually producing a more traditional “upgrade” script.  I can still remember the first time I ran an actual “upgrade” script on our largest database… and promptly called him in fits of joy because it only took ~10 minutes. (again, insert googly-eyed, heart-spewing Animaniac here…)

It was good progress.

Some things we learned along the way that improved our ability to scale to hundreds of databases with more confidence included:

  • Making sure the beginning of the scripts actually verified they were running against the correct version of the database (a rough sketch of that kind of guard follows this list).  Did you know it turns out not all clients read instructions? No… seriously…
  • Finally taking this “upgrade” approach allowed us to decouple app releases from database version updates.  If we needed to make incremental changes to the database as we worked on the new release (we had long release cycles at the time) we could create incremental upgrade scripts that were easily rolled up into one master upgrade when a new version of the product shipped.
  • At some point in here, we also introduced better error checking and transactional changes.  Since the changes were usually smaller in scale, we could gain some control over “getting out safely” when things failed. It didn’t always work, but it was a start.
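
For what it’s worth, the version check at the top of an upgrade script doesn’t have to be fancy. A rough sketch (the version table and expected value here are hypothetical, not our actual schema):

[sql]

IF NOT EXISTS (SELECT 1 FROM dbo.DatabaseVersion WHERE VersionNumber = 39)
BEGIN
RAISERROR('This upgrade requires database version 39. Aborting.', 16, 1);
SET NOEXEC ON; -- keep the remaining batches from running
END
GO

[/sql]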

The Middle School Days (Current Status)

Starting around 2012 we investigated using RedGate SQL Compare to help with this process.  Since we generally script out the initial implementation of the changes before updating the model, it was easy to make a new database with our “create new” script, apply those changes, and then do the compare.  In doing so, a lot of the error checking we had introduced previously was a standard part of the scripts the tool created.  As I’ve said before, this wasn’t a perfect solution, but it got us another step further in the processes.

In general, the biggest issues we hit continue to be client customizations to the database (even ones we sanction) and an ever growing set of core-pop data that we manage and have to proactively defend against client changes.  This is an area we just recently admitted we need to take a long, hard look at and figure out a new paradigm.

I should mention that it was also about this time that we were finally able to proactively get our incremental changes into source control.  All of our final scripts were in source somewhere, but the ability to use SQL Compare and SQL Source Control allowed our developers to finally be a second set of eyes on the upgrade process.  No longer were we weeding through 50K lines of SQL upgrade script just to try and find what changed.  Diffing whole scripts doesn’t really provide any good context… especially when we couldn’t guarantee that the actions in the script were in the same order from release to release.  This has been another huge win for us.

I’m not a fan of everything RedGate, but these two products have been invaluable to us.  Even for cases when a client messes up an upgrade and we’re stuck with a partial setup, using SQL Compare has saved our bacon many times.

Looking Towards High School

In all, I’d say that we’ve been in a protracted time of middle school.  Being small and resource constrained, it took us longer to get through SQL pre-algebra and composition classes than some.  We’ve had our fits of wishing we were up walking the halls with the big kids, doing automated database deployments or including more of the development staff in the process for backup.

In fact, right now we’re about to make some of the most drastic changes to our schema and process in all of the time I’ve been here and it’s given us another chance to talk about our dreams and desires for the process.  We’ve even included our small (but growing) DevOps team into more of the discussions, looking for ways we can help each other.

What about you?

What resonates with you about our journey?  Are you a little further along and have tips?  Please… share!

In the meantime, I can’t wait to read all of the other posts on this topic to see what we might be able to glean.

Easy, Free Lead Magnets for WordPress

Do you want to grow your blog by creating a Lead Magnet?

Connect with new potential readers by offering something in exchange for their contact information, a name and email address.

When my wife, Laura, was ready to offer her first lead magnet landing page, I thought it would be an easy task.  I thought, “Surely, there’s a decent, free WordPress plugin that allows me to upload a file somewhere, provide a nice custom form, and sync to Mailchimp.”

Sounds easy, right??

Although there are many lead magnet plugins, most of them didn’t work exactly like we wanted, nor were they free.  Other successful bloggers embed a form from their for-fee marketing websites.  We weren’t ready to start paying monthly fees just to put a PDF online.

So, I decided to create a way for Laura to offer multiple Lead Magnets for free.

There are posts out there on how to accomplish this. Many of the posts discuss how to modify your welcome emails in programs like Mailchimp and attach your file to the RSS email.  While that works and is a decent option for an initial welcome email, it doesn’t allow you to have multiple landing pages and lead magnets. In fact, none of the free or low-cost plugins that focus only on lead magnets manage multiple downloads/attachments and sync with email list providers.  (I’m sure there is one out there, but after an hour of trying multiple plugins I decided to blaze our own trail.)

But Wait! Before You Get to Work…

One potential risk of using this method is that emails will be coming from whatever server your WordPress site is hosted on.  In some cases, the main email providers (Gmail, Yahoo, AOL, etc.) will be suspicious of email coming from a random server claiming to be from you.  There are ways around this, too, but it is generally “safer” to set up a trusted SMTP email provider and use them.  Any of the main SMTP providers have plugins that make it easy to integrate with them, and then any email that WordPress sends will use that new connection.  Some of the popular options are:

  • GMail SMTP (may require a developer or Apps for Business account with Google)
  • Mailgun (Account required, but you can send up to 10,000 emails free each month!)
  • SendGrid (Account required. Starts at $9.95/month)

Free WordPress Lead Magnet Setup

Our solution was to use Contact Form 7 to create the embedded form.

Typically, bloggers use this plugin to put a contact form on their website so that visitors can send them a message.  But that’s not the only way it can be used.  With a few simple changes you can have the email sent back to the user, with your lead magnet attached to it.  Sweet!

Before you start making your lead magnet form, make sure you’ve uploaded the lead magnet file through the “media” module in WordPress.  Once uploaded, copy the file URL. You’ll attach this URL to the email that WordPress will send to all of the people who sign up at your site.

Setup

  1. Find the new “Contact” item in your WordPress Admin menu and click “Add New”
  2. On the Form tab, remove the Subject and Message items from the contact form.  All you want are the Email (“your-email”) and Name (“your-name”) fields.

  3. On the Mail tab, set up the email that will get sent when someone submits the form.  By default, this is set up to send a submitted form to you, the blog owner.  In this case, however, you’ll configure it to send an email back to the user with an email attachment.  Use the following setup as a guide (a minimal sketch of both tabs follows this list).
    • To: Use the included ‘mail-tags’ and set it to “[your-name] <[your-email]>”
    • From: Put your name and email. This is what will show in the visitor’s email client
    • Subject: Make it compelling, personal, and related to the content they registered for!
    • Additional Headers: An easy way to see who is registering for your download is to include you as a BCC.  Once you’re satisfied it’s coming through and working, you could edit this and remove yourself.  If it’s popular, you’ll get a lot of email.
    • Message Body: As most people learn, the more personal you can make this, the better. Email clients like GMail are really good at filtering out stuff that looks like an advertisement. Whatever you type in this box is exactly what they’ll see in their email.
    • File Attachments: THIS IS THE KEY!  Take the URL of the file that you copied earlier and paste it in this box.  The trick is that you need to make this a URL relative to your website.  Depending on your setup you may have to play around a little, but in most cases, deleting everything before the word “uploads” will probably work.  The great thing is that when you save the form, it will tell you if it can’t find the file based on the path you gave it.
  4. Save the form!  Assuming you don’t have any errors, you’ll get a shortcode at the top of the screen to copy into your lead magnet Post/Page.

  5. Head over to your lead magnet post and paste the shortcode into the appropriate place on the page.

  6. Finally, save your post and give it a look!
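
For reference, here is roughly what the two tabs end up containing (the sender address, subject, and file path are placeholders; the [text*], [email*], and mail-tag syntax is standard Contact Form 7):

Form tab:

<label>Your Name [text* your-name]</label>
<label>Your Email [email* your-email]</label>
[submit "Send Me the Download"]

Mail tab:

To: [your-name] <[your-email]>
From: Laura <laura@example.com>
Subject: Your free download is here!
File Attachments: uploads/2017/05/lead-magnet.pdf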

Additional Tips

Save your leads

In step #3 above, I show how you can put your email into the form as a Blind Carbon Copy (BCC:).  While this gives you some assurance that the forms are indeed being downloaded, it generates a lot of email with the same attachment over and over again… all looking as if they’re coming from you.  It gets old quickly.

Instead, consider taking that BCC: tag out and installing a plugin that will save all of your form submissions in your WordPress DB.  We chose Contact Form Submissions.  It’s not a perfect plugin, but it does the job and allows us to know that things are still working.  The biggest problem is that it looks like all of the submissions are coming from you because, generally, CF7 is used to allow visitors to submit forms to you.  Not a big deal for us.

Integrate with your email list

Obviously the point of getting leads to download your stuff is so that they’ll give you their contact information.  At some regular interval (or with other plugins that I might cover later), you need to make sure these leads are getting added to your email lists.  They don’t do you much good just hanging out in your WordPress Admin screen.  🙂

Now go… lead those magnets!!!

What? All of the constraints are disabled? (T-SQL Tuesday #88)

T-SQL Tuesday is a monthly event where SQL Server bloggers write a post based on a subject chosen by the month’s host. And although this is only the second actual post of this blog (fret ye not, there’s more on the way), I’m delighted to finally be participating more in the SQL community that has helped me tremendously over the years.

This month’s #tsql2sday event is being hosted by Kennie Nybo Pontoppidan (b|t) and this month’s topic is: The daily (database-related) WTF.

The topic couldn’t be more timely for me, actually.  My first post detailed the beginning of a long journey into overhauling our 18-ish year old DB schema and architecture for our flagship application.  Now undergoing its third major revision, the EnergyCAP database has served us well for many years and thousands of clients… but the time is right for digging in and pulling this classic into the age of modern SQL Server.  I have a queue of blog posts about how we’ve gone about this work and how we’re solving problems anew (for us at least), but one discovery during the process still causes me to shake my head.

So sit right back and you’ll hear a tale… (a tale of a fateful trip… sorry, couldn’t resist)

The lay of the land

Our database isn’t huge by most accounts, but at 331 tables in a standard database, it’s not small either.  There was significant funding behind the original design back in the good old Enron days (yes, that Enron…) and so they made plenty of good design choices that still serve us well today.

As you’d expect, one of those choices was being diligent about foreign key setup and maintenance, largely facilitated by a data modeling tool.  At the time they chose ErWin.  All of these years later, I can’t tell you exactly why they chose it or if there were any other options considered.  I don’t even know anyone on the team from 1999 that went about taking a solid database design and bringing it to life with ErWin, modeling the design documents in the application and then back out into creation and migration scripts.  Most of it was executed well.

Except for that one thing that never sat well with me… that one thing that I never really spent time digging into. (Thank you, Mr. Hindsight…)

Although we had Foreign Key constraints neatly setup and maintained in the schema, helping us to draw beautiful ER diagrams, all of our data integrity logic was held in triggers.  I also never thought to ask why all of our FKs had names like “r_0013”.

The WTF moment

And so imagine my surprise a few months ago, having finally freed up some time to seriously dig into our DB, when I came across one of our typical migration scripts trying to understand some recent changes.  I noticed something I’d never seen before in a part of the script that was trying to modify a Primary Key for some testing we were doing.  First we had to drop some Foreign Keys so that we could modify the Primary Key indexes.  But then I saw this as the FKs were being added back in:

[sql]ALTER TABLE [dbo].[abc] WITH NOCHECK ADD CONSTRAINT…[/sql]

Huh?  “WITH NOCHECK”?  Um… has that always been there?  Was this just a fluke because these were one-off testing scripts?  Surely we wouldn’t disable all of our constraints, right?  I mean, besides data integrity there are lots of potential small Query Engine wins wrapped up in all of those constraints.

Still looking at the screen somewhat baffled, I picked up my phone and dialed the extension of our Developer/part-time DBA.

“Afternoon Joe! Hey, I have another one of those ‘out of left field’ questions for you.”  (I pause to give him time to roll his eyes.)

“I’m looking at these testing scripts and I noticed that we were adding back all of the Foreign Keys with NOCHECK.  Have we always been doing that or is this just some stuff in this script?”

After a slight pause Joe responds, “Um, well, yeah…” It’s apparent that he’s quickly getting the gist of my question.  I’m also sure he’s wondering why I’m asking this now after being around this joint for years.  “All of our FK constraints have always been created with NOCHECK.  I guess it was just an ‘old school’ way of doing the DB design back in the day… relying on triggers for the data integrity and better, human readable constraint failure messages.  We’ve just never changed things over all these years.”

Wow.  721 Foreign Key constraints and not one of them is enabled.  The Query Engine can’t rely on one of them to help it make a decision.  We can’t be assured that any of the related data is 100% reliable.

“OH!” I say, realizing why those triggers have never really sat well with me.  “In all of these years, Joe, I honestly never made that connection… why we have all of the data integrity logic in triggers.  Let me ask you something else.  What about the names like ‘r_0013’ or ‘IF345TableABC’.  I’ve never asked what the significance of the numbering scheme was for FKs and Indexes?”

“Well, there isn’t any.  Those are just auto-generated from the default templates that came with ErWin.” (crickets…)

“Really?  All of these years I assumed there was some secret code or standard that helped us determine those names… and it’s just defaults? Man, I’m going to have to digest this one for a bit.  Thanks for humoring me… as always.”
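
For the curious, sys.foreign_keys makes this easy to quantify in your own database. A minimal sketch that lists every foreign key the engine has disabled or can’t trust:

[sql]

SELECT OBJECT_NAME(fk.parent_object_id) AS table_name,
fk.name AS constraint_name,
fk.is_disabled,
fk.is_not_trusted
FROM sys.foreign_keys fk
WHERE fk.is_disabled = 1 OR fk.is_not_trusted = 1
ORDER BY table_name, constraint_name;

[/sql]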

But wait, it gets better (or worse)!

Perspective really matters when a discovery like this is being made.  At the same time I was working through all of this we were spinning up a new project and one engineer downloaded the newest copy of ErWin to try.  He quickly created a model for the DB we needed to get the project rolling and within a few hours created the initial scripts.  To be honest, I didn’t examine the scripts closely at the time.

This past Friday, however, I had to dig into that database to fix a design problem, so I opened up the creation script that was generated from ErWin.  Mind you, the developer hadn’t changed any templates or default settings.  Sure enough, the output followed a pattern very similar to what the nearly 20-year-old version of ErWin produced.

  • Indexes have (nearly) useless names
  • Foreign key constraints are named with an incrementing number rather than any helpful information AND the cascading logic is missing
  • All true constraint logic is done in triggers

At least the FK constraints aren’t disabled! 🙂

Where do we go from here?

This “new” revelation is just one item in a line of DB updates we’re making and testing right now.  While re-enabling the constraints and making sure all business logic persists as expected (sans triggers) isn’t trivial, it’s now near the top of our list.  Some of the trigger logic is legitimate, so we have to figure out how to carefully unravel this puzzle.
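
When we do get there, the fix itself is a one-liner per constraint (the table and constraint names below are the same placeholders used earlier; the doubled CHECK is intentional: WITH CHECK validates the existing rows so the constraint becomes trusted, and CHECK CONSTRAINT re-enables it):

[sql]ALTER TABLE [dbo].[abc] WITH CHECK CHECK CONSTRAINT r_0013;[/sql]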

Ultimately we hope to better utilize the strengths of the SQL Server Query Engine by working with it.  We also hope to clean up some issues with referential integrity that have gone undetected because triggers were sometimes disabled for big data loads leaving bad data behind.

But more than anything, it’s caused us to talk more openly about the tools we use, how far along we are in “mastering” those tools, and whether there is a way to better measure “good” design from “bad” design.

And of course, check any program defaults. Sigh.

Tuning a large, legacy SQL Server DB – A Journal

The Backstory

As noted in the About page, I’ve worked at EnergyCAP for over 12 years (as of this post).  We’re a leader in our industry… and have been for more than 35 years.  But with any long-term history comes long-term technical debt, especially when not handled well team-to-team, year-to-year.  For us one of the major pieces of technical debt comes in the form of our SQL Server database.

About 21 years ago Enron acquired the original company, Omnicomp, and infused new life into the product and technology that the original application was built on.  SQL Server 6.5 was chosen as the storage engine and a new staff of DBAs was tasked with designing the schema and DB logic to support the application.  At the time it was a C++ installed Windows application.

After Enron collapsed, our CEO, Steve Heinz,  was able to buy back the IP out of bankruptcy and start a new company to continue serving loyal customers.  Enron gave us a great foundation, but unfortunately most of the talent was gone and the company was bootstrapped from the ground up to get to where we are today.

It’s the “where we are today” part that has become a bit problematic at times.  Don’t get me wrong, I still believe we’ve held to many good design tenets over the years.  The folks in Houston, and those that have stayed with us since, have continued to hold fast to many good choices even when others advocated for “faster” ways.  But EnergyCAP is still a complex database that hasn’t had a thorough review in about 15 years.  We’ve known instinctively for years that there has to be room for improvement, but resources and time just always seem to slip away from digging in and doing the hard work.  And oh man, it’s hard work… eh?

Sizing up the Data

To give you an idea of what we’re dealing with, here are some basic stats about the database that backs our core application.

  • There are currently 351 tables in an unaltered DB schema
  • There are 1367 indexes (yeah… do that math…)
  • There are 27 tables that have 8 or more indexes… the top 5 of which account for 116 of those indexes.  Obviously most of these are foreign key indexes and, as it turns out, they are never, ever used (a sketch of how to check this with the index-usage DMV follows this list). 🙁
  • This translates into ~15 tables using 75% of the reserved data, although the features a client uses can skew this some
  • Batch requests/sec on our busiest servers only runs at about 50 at the 95th percentile, week over week. Again, you’ll see later why this doesn’t jibe with user experience
  • Although the database is generally in 3NF, has solid FK constraints and other design principles, all of the constraints are disabled and integrity is enforced with triggers. (the astute among you will immediately see the problem here… more on this later)
  • Reporting on data is the #1 thing our clients use EnergyCAP for. We have nearly 400 “canned” reports that ship with the software, each of which has at least 20 different filters that can be set.  Trying to do OLAP well while other users expect OLTP success in day-to-day work has proven to be difficult for us.
  • The majority of clients have their own copy of the database, although we have modified the database over the years to allow multi-tenant deployments. Most multi-tenant deployments have less than 100 clients per database.
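
Here is a sketch of how to check for never-used nonclustered indexes with sys.dm_db_index_usage_stats (the DMV resets on a service restart, so confirm decent uptime before trusting the numbers):

[sql]

SELECT OBJECT_NAME(i.object_id) AS table_name,
i.name AS index_name,
ISNULL(us.user_seeks, 0) AS user_seeks,
ISNULL(us.user_scans, 0) AS user_scans,
ISNULL(us.user_lookups, 0) AS user_lookups,
ISNULL(us.user_updates, 0) AS user_updates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats us
ON us.object_id = i.object_id
AND us.index_id = i.index_id
AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
AND i.type_desc = 'NONCLUSTERED'
AND ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0) + ISNULL(us.user_lookups, 0) = 0
ORDER BY table_name, index_name;

[/sql]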

The Problem

So why am I spending time to document my research and journey as I attempt to improve something that’s generally been solid for thousands of clients for 15+ years?  It’s simple.

In all of my time at EnergyCAP, holding numerous positions and heading the development team for 8+ years, I’ve never been satisfied with some of the answers we tell ourselves (and clients) when things aren’t performing as well as we think they should.  I can say in good faith that we’ve never lied to a client about the issues.  What I’ve come to learn is that we simply didn’t understand some parts of SQL Server as well as we thought we did, and even after further research we sometimes came away with the wrong answer.

Admittedly, the problems are not always the fault of the database.  Particularly when a client hosts the application themselves, we tend to see more issues than when it’s in our environment.  This is usually because the client has no dedicated technical resources for maintaining the database and it’s simply one server among many.

But still, there are far too many times we see error logs come through and we know that some of the core features of the application just aren’t performing as well as they should be under the given load.

In short, our servers often appear to be underutilized, but our client experience at some of the busiest parts of the day “feel” otherwise… and I’ve never been satisfied with it!

The Catalyst

And maybe more importantly… why now?  If I’ve been with the company for so long, worked with most of our clients in some form or fashion, and I’ve had a leading role in forming the product… what’s changed to make me dive in now?

In January 2016 I attended my second CodeMash and was able to make it for the Pre-Compiler sessions.  One of the all-day training sessions I sat in on was about SQL Server.  Over the course of six hours, ideas and concepts that ranged from basic server setup to queries that revealed information on server internals totally transformed the way I thought about SQL Server.  I left that session unwilling to look at our problems and give another status quo answer moving forward.

In fact, I realized that I had been lying to myself for years about how much I really knew about the core technology of our application.  I had relied too long on my experience and natural understanding of how the server worked.  I was hooked on digging in more than I ever had.

And the most helpful part of this new journey is that we host hundreds of clients in a SaaS environment which has allowed us to see first hand how some of the challenges above can lead to less-than-desirable performance and user experience.  We have a direct proving ground for any changes.  Sweet!

The Journey Begins Now

And now it’s time to dive in.  Over the last 15 months I’ve taken whatever time I could get to dig into the database side of EnergyCAP.  I’ve learned a ton, attended more classes, read hundreds of blog posts, watched Pluralsight courses, went to my first SQL Saturday event, started to put my voice out there on the #sqlhelp thread on Twitter… determined that I’d do my best to contribute back to the community that has done such an amazing job pushing me forward.  At the same time I’m pretty certain that I’m still just trying to pass SQL Kindergarten at this point.  Nothing I say here will be earth shattering… and there’s a really good chance some of it will still be plain wrong.

But hopefully, as I document this first project and others that are spurred on from it, someone will gain insight from my journey, improve upon it, and pass it on.

So grab your comfy hiking boots and some snacks… because this is going to take a while.

Topics I plan to cover include the following.  As the posts are made active, I’ll update the links below as necessary.

  • Performance monitoring, what we’ve tried, and how we’ve landed on using Splunk for now
  • Examining Indexes to simplify the database and improve IO
  • The tooling I’ve found helpful
  • Service Broker and how it totally transformed how legacy processes are run
  • How we’ve tested the changes over time and how we’ve measured “success”