Shipping Database Changes

T-SQL Tuesday is a monthly event where SQL Server bloggers write a post based on a subject chosen by the month’s host. This month’s #tsql2sday event is being hosted by James Anderson (b|t) and this month’s topic is: Shipping Database Changes.

This month’s #tsql2sday topic couldn’t have come at a better time.  We’re once again faced with some of the biggest changes in recent memory to hundreds of client databases (many that we host, but some that we don’t), so I’m anxious to see what others are doing.  But if I’m going to take, it only feels right to try to give something back.

I’ve been at EnergyCAP for almost 13 years.  The database and application were started ~18 years ago by a team at Enron (it’s a long story…). They worked to transition the product from Sybase to SQL Server.  And although the product was never used by a paying Enron customer, in three years of development they went from version zero of the database to version 39 by the time our CEO re-acquired the IP.

From then until now, we’ve tried a lot, learned a lot and still have plenty of things we “wish” we could do better.  (insert your favorite Animaniac gazing wide-eyed at the “town of Continuous Integration” with little hearts bubbling in the air…).

Anyway, I’ll compare our journey to the basic stages of primary education.  Somehow, that seems a fitting comparison as I think about where we’ve been, what we’ve learned along the way, and where we anxiously hope to be in years to come.

Preschool Bliss

The initial process was, honestly, pretty simple, for two reasons: there weren’t many clients to support yet (most of whom never touched the database outside of the application), and our expectations were pretty low.  The process looked like this (generally):

Step 1: Change model in Erwin based on developer needs.
Step 2: Export the “create new” database script
Step 3: Run a myriad of batch files that would take the new script, a source database for “core-pop” data, and some other mojo, resulting in an “upgrade” script that would (a rough sketch of one table’s slice follows these steps):

  • Rename all of the existing tables to ‘TablenameOld’
  • Recreate all tables (including any new ones)
  • Select/Insert all data from old table to new table (turn off identities, etc.)
  • Add constraints, indexes, triggers and SPROCs.
  • Delete old tables
  • Verify any “core-pop” data (defined properties/data that our app requires) and insert/update as appropriate
  • Set appropriate permissions

Step 4: Take “upgrade” script and run it on client databases.
Step 5: Pray… Wait! That was supposed to be step 4!!
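For anyone who hasn’t lived through one of these, here’s a rough, hypothetical sketch of what one table’s slice of that generated “upgrade” script looked like (the table and column names are made up, and the real scripts repeated this pattern for every table in the database):

[sql]
-- 1. Move the existing table (and its PK name) out of the way
EXEC sp_rename 'dbo.Meter', 'MeterOld';
EXEC sp_rename 'dbo.PK_Meter', 'PK_MeterOld';

-- 2. Recreate the table from the new model (including any new columns)
CREATE TABLE dbo.Meter (
    MeterID   INT IDENTITY(1,1) NOT NULL,
    MeterCode VARCHAR(32)       NOT NULL,
    MeterName VARCHAR(100)      NULL    -- hypothetical new column
);

-- 3. Copy the data across, preserving identity values
SET IDENTITY_INSERT dbo.Meter ON;
INSERT INTO dbo.Meter (MeterID, MeterCode, MeterName)
SELECT MeterID, MeterCode, NULL
FROM dbo.MeterOld;
SET IDENTITY_INSERT dbo.Meter OFF;

-- 4. Re-add constraints, indexes, triggers, and permissions
ALTER TABLE dbo.Meter ADD CONSTRAINT PK_Meter PRIMARY KEY CLUSTERED (MeterID);

-- 5. Drop the old copy
DROP TABLE dbo.MeterOld;
[/sql]

Multiply that by a few hundred tables and you start to see why step 5 above was so important.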

This was not a transactional script… of course who would want the entire database to be whacked and recreated in a transaction… but I digress.  Our only defense was to make sure we had a backup (or our self-hosted clients did) before the script ran.  Once we were past statement #1, all bets were off.

And we felt the pain of that.  We didn’t upgrade often because it took a long time AND correcting data when things bombed was cumbersome.  There were MANY times we had a client call in need of immediate assistance because the script failed and their backup wasn’t good.  Fun times and all of that.  From 2004 (when I showed up) until 2010 there were 7 database modifications, slightly more than one a year.

The Grade School Days

At some point in 2010 we finally said “enough is enough” (no, really, I’m pretty sure that’s exactly how the conversation went after a long night of database “upgrade” recreations).

Although I don’t believe we had used RedGate SQL Compare yet (was it even around in 2010??), our DBA took the challenge to move away from the “recreate as upgrade” process and to start doing a better job tracking what had actually changed.  Because he’s wicked smart with this SQL stuff, he took the first set of scripts and modified them to take some hand-coded upgrade parts of his own, fetch and recreate the SPROCs, mash in the triggers that were exported from the modeling software, and eventually produce a more traditional “upgrade” script.  I can still remember the first time I ran an actual “upgrade” script on our largest database… and promptly called him in fits of joy because it only took ~10 minutes. (again, insert googly-eyed, heart spewing, Animaniac here…)

It was good progress.

Some things we learned along the way that improved our ability to scale to hundreds of databases with more confidence included:

  • Making sure the beginning of the scripts actually verified they were running against the correct version of the database.  Did you know it turns out not all clients read instructions? No… seriously…
  • Finally taking this “upgrade” approach allowed us to decouple app releases from database version updates.  If we needed to make incremental changes to the database as we worked on the new release (we had long release cycles at the time) we could create incremental upgrade scripts that were easily rolled up into one master upgrade when a new version of the product shipped.
  • At some point in here, we also introduced better error checking and transactional changes.  Since the changes were usually smaller in scale, we could gain some control over “getting out safely” when things failed. It didn’t always work, but it was a start (a minimal sketch of the pattern follows this list).
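To make that a little more concrete, here’s a minimal, hypothetical sketch of the guard-and-transaction pattern I’m describing (the version table name and version numbers are made up, and our real scripts are considerably longer):

[sql]
-- Refuse to run against the wrong schema version
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE VersionNumber = 42)
BEGIN
    RAISERROR('This script upgrades version 42 databases only. Aborting.', 16, 1);
    RETURN;
END;

-- Roll everything back automatically if any statement fails
SET XACT_ABORT ON;
BEGIN TRY
    BEGIN TRANSACTION;

    -- ... the actual schema changes go here ...
    ALTER TABLE dbo.Meter ADD MeterTypeID INT NULL;

    -- Stamp the new version as the last step
    UPDATE dbo.SchemaVersion SET VersionNumber = 43;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    PRINT 'Upgrade failed: ' + ERROR_MESSAGE();
END CATCH;
[/sql]

It won’t save you from everything (multi-batch scripts and a few operations still need special handling), but it beats praying after statement #1.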

The Middle School Days (Current Status)

Starting around 2012 we investigated using RedGate SQL Compare to help with this process.  Since we generally script out the initial implementation of the changes before updating the model, it was easy to make a new database with our “create new” script, apply those changes, and then do the compare.  Better yet, a lot of the error checking we had introduced previously came standard in the scripts the tool created.  As I’ve said before, this wasn’t a perfect solution, but it got us another step further in the process.

In general, the biggest issues we hit continue to be client customizations to the database (even ones we sanction) and an ever-growing set of core-pop data that we manage and have to proactively defend against client changes.  This is an area we just recently admitted we need to take a long, hard look at so we can figure out a new paradigm.

I should mention that it was also about this time that we were finally able to proactively get our incremental changes into source control.  All of our final scripts were in source somewhere, but the ability to use SQL Compare and SQL Source Control allowed our developers to finally be a second set of eyes on the upgrade process.  No longer were we weeding through 50K lines of SQL upgrade script just to find what changed.  Diffing whole scripts doesn’t really provide any good context… especially when we couldn’t guarantee that the actions in the script were in the same order from release to release.  This has been another huge win for us.

I’m not a fan of everything RedGate, but these two products have been invaluable to us.  Even for cases when a client messes up an upgrade and we’re stuck with a partial setup, using SQL Compare has saved our bacon many times.

Looking Towards High School

In all, I’d say that we’ve been in a protracted time of middle school.  Being small and resource-constrained, it took us longer to get through SQL pre-algebra and composition classes than some.  We’ve had our fits of wishing we were up walking the halls with the big kids, doing automated database deployments or including more of the development staff in the process for backup.

In fact, right now we’re about to make some of the most drastic changes to our schema and process in all of the time I’ve been here and it’s given us another chance to talk about our dreams and desires for the process.  We’ve even included our small (but growing) DevOps team into more of the discussions, looking for ways we can help each other.

What about you?

What resonates with you about our journey?  Are you a little further along and have tips?  Please… share!

In the meantime, I can’t wait to read all of the other posts on this topic to see what we might be able to glean.

Easy, Free Lead Magnets for WordPress

Do you want to grow your blog by creating a Lead Magnet?

Connect with new potential readers by offering something in exchange for their contact information: a name and email address.

When my wife, Laura, was ready to offer her first lead magnet landing page, I thought it would be an easy task.  I thought, “Surely, there’s a decent, free WordPress plugin that allows me to upload a file somewhere, provide a nice custom form, and sync to Mailchimp.”

Sounds easy, right??

Although there are many lead magnet plugins, most of them didn’t work exactly the way we wanted, nor were they free.  Other successful bloggers embed a form from their for-fee marketing websites.  We weren’t ready to start paying monthly fees just to put a PDF online.

So, I decided to create a way for Laura to offer multiple Lead Magnets for free.

There are posts out there on how to accomplish this. Many of the posts discuss how to modify your welcome emails in programs like Mailchimp and attach your file to the RSS email.  While that works and is a decent option for an initial welcome email, it doesn’t allow you to have multiple landing pages and lead magnets. In fact, none of the free or low-cost plugins I tried that focus only on lead magnets manage multiple downloads/attachments and sync with email list providers.  (I’m sure there is one out there, but after an hour of trying multiple plugins I decided to blaze our own trail.)

But Wait! Before You Get to Work…

One potential risk of using this method is that emails will be coming from whatever server your WordPress site is hosted on.  In some cases, the main email providers (Gmail, Yahoo, AOL, etc.) will be suspicious of email coming from a random server claiming to be from you.  There are ways around this, too, but it is generally “safer” to set up a trusted SMTP email provider and use it.  All of the main SMTP providers have plugins that make it easy to integrate with them, and then any email that WordPress sends will use that new connection.  Some of the popular options are:

  • GMail SMTP (may require a developer or Apps for Business account with Google)
  • Mailgun (Account required, but you can send up to 10,000 emails free each month!)
  • SendGrid (Account required. Starts at $9.95/month)

Free WordPress Lead Magnet Setup

Our solution was to use Contact Form 7 to create the embedded form.

Typically, bloggers use this plugin to put a contact form on their website so that visitors can send them a message.  But that’s not the only way it can be used.  With a few simple changes you can have the email sent back to the user and you can attach your lead magnet to it.  Sweet!

Before you start making your lead magnet form, make sure you’ve uploaded the lead magnet file through the “media” module in WordPress.  Once uploaded, copy the file URL. You’ll attach this URL to the email that WordPress will send to all of the people who sign up at your site.

Setup

  1. Find the new “Contact” item in your WordPress Admin menu and click “Add New”
  2. On the Form tab, remove the Subject and Message items from the contact form.  All you want is the Email (“your-email”) and Name (“your-name”) fields.

  3. On the Mail tab, set up the email that will get sent when someone submits the form.  By default, this is set up to send a submitted form to you, the blog owner.  In this case, however, you’ll configure it to send an email back to the user with an email attachment.  Use the following setup as a guide (a worked example appears after these steps).
    • To: Use the included ‘mail-tags’ and set it to “[your-name] <[your-email]>”
    • From: Put your name and email. This is what will show in the visitor’s email client.
    • Subject: Make it compelling, personal, and related to the content they registered for!
    • Additional Headers: An easy way to see who is registering for your download is to include yourself as a BCC.  Once you’re satisfied it’s coming through and working, you could edit this and remove yourself.  If it’s popular, you’ll get a lot of email.
    • Message Body: As most people learn, the more personal you can make this, the better. Email clients like GMail are really good at filtering out stuff that looks like an advertisement. Whatever you type in this box is exactly what they’ll see in their email.
    • File Attachments: THIS IS THE KEY!  Take the URL of the file that you copied earlier and paste it in this box.  The trick is that you need to make this a relative path rather than the full URL.  Depending on your setup you may have to play around a little, but in most cases, deleting everything before the word “uploads” will probably work.  The great thing is that when you save the form, it will tell you if it can’t find the file based on the path you gave it.
  4. Save the form!  Assuming you don’t have any errors, you’ll get a shortcode at the top of the screen to copy into your lead magnet Post/Page.

  5. Head over to your lead magnet post and paste the shortcode into the appropriate place on the page.

  6. Finally, save your post and give it a look!
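For reference, here’s roughly what the trimmed Form tab ends up looking like once the Subject and Message items are removed (the label text is whatever you want; the form-tags are the part Contact Form 7 cares about):

    <label> Your Name [text* your-name] </label>
    <label> Your Email [email* your-email] </label>
    [submit "Send me the download"]

And on the Mail tab, the File Attachments box ends up holding just the relative path, something like uploads/2017/05/my-lead-magnet.pdf (that file name is only an example… use the path to whatever you uploaded in the Media library).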

Additional Tips

Save your leads

In step #3 above, I show how you can put your email into the form as a Blind Carbon Copy (BCC:).  While this gives you some assurance that the lead magnet is indeed being delivered, it generates a lot of email with the same attachment over and over again… all looking as if they’re coming from you.  It gets old quickly.

Instead, consider taking that BCC: tag out and installing a plugin that will save all of your form submissions in your WordPress DB.  We chose Contact Form Submissions.  It’s not a perfect plugin, but it does the job and allows us to know that things are still working.  The biggest problem is that it looks like all of the submissions are coming from you because, generally, CF7 is used to allow visitors to submit forms to you.  Not a big deal for us.

Integrate with your email list

Obviously the point of getting leads to download your stuff is so that they’ll give you their contact information.  At some regular interval (or with other plugins that I might cover later), you need to make sure these leads are getting added to your email lists.  They don’t do you much good just hanging out in your WordPress Admin screen.  🙂

Now go… lead those magnets!!!

What? All of the constraints are disabled? (T-SQL Tuesday #88)

T-SQL Tuesday is a monthly event where SQL Server bloggers write a post based on a subject chosen by the month’s host. And although this is only the second actual post of this blog (fret ye not, there’s more on the way), I’m delighted to finally be participating more in the SQL community that has helped me tremendously over the years.

This month’s #tsql2sday event is being hosted by Kennie Nybo Pontoppidan (b|t) and this month’s topic is: The daily (database-related) WTF.

The topic couldn’t be more timely for me, actually.  My first post detailed the beginning of a long journey into overhauling our 18-ish year old DB schema and architecture for our flagship application.  Now undergoing its third major revision, the EnergyCAP database has served us well for many years and thousands of clients… but the time is right for digging in and pulling this classic into the age of modern SQL Server.  I have a queue of blog posts about how we’ve gone about this work and how we’re solving problems anew (for us at least), but one discovery during the process still causes me to shake my head.

So sit right back and you’ll hear a tale… (a tale of a fateful trip… sorry, couldn’t resist)

The lay of the land

Our database isn’t huge by most accounts, but at 331 tables in a standard database, it’s not small either.  There was significant funding behind the original design back in the good old Enron days (yes, that Enron…) and so they made plenty of good design choices that still serve us well today.

As you’d expect, one of those choices was being diligent about foreign key setup and maintenance, largely facilitated by a data modeling tool.  At the time they chose ErWin.  All of these years later, I can’t tell you exactly why they chose it or if there were any other options considered.  I don’t even know anyone on the team from 1999 that took a solid database design and brought it to life with ErWin, modeling the design documents in the application and then generating the creation and migration scripts back out of it.  Most of it was executed well.

Except for that one thing that never sat well with me… that one thing that I never really spent time digging into. (Thank you, Mr. Hindsight…)

Although we had Foreign Key constraints neatly set up and maintained in the schema, helping us to draw beautiful ER diagrams, all of our data integrity logic was held in triggers.  I also never thought to ask why all of our FKs had names like “r_0013”.

The WTF moment

And so imagine my surprise a few months ago, having finally freed up some time to seriously dig into our DB, when I came across one of our typical migration scripts while trying to understand some recent changes.  I noticed something I’d never seen before in a part of the script that was modifying a Primary Key for some testing we were doing.  First we had to drop some Foreign Keys so that we could modify the Primary Key indexes.  But then I saw this as the FKs were being added back in:

[sql]ALTER TABLE [dbo].[abc] WITH NOCHECK ADD CONSTRAINT…[/sql]

Huh?  “WITH NOCHECK”?  Um… has that always been there?  Was this just a fluke because these were one-off testing scripts?  Surely we wouldn’t disable all of our constraints, right?  I mean, besides data integrity there are lots of potential small Query Engine wins wrapped up in all of those constraints.

Still looking at the screen somewhat baffled, I picked up my phone and dialed the extension of our Developer/part-time DBA.

“Afternoon Joe! Hey, I have another one of those ‘out of left field’ questions for you.”  (I pause to give him time to roll his eyes.)

“I’m looking at these testing scripts and I noticed that we were adding back all of the Foreign Keys with NOCHECK.  Have we always been doing that or is this just some stuff in this script?”

After a slight pause Joe responds, “Um, well, yeah…” It’s apparent that he’s quickly getting the gist of my question.  I’m also sure he’s wondering why I’m asking this now after being around this joint for years.  “All of our FK constraints have always been created with NOCHECK.  I guess it was just an ‘old school’ way of doing the DB design back in the day… relying on triggers for the data integrity and better, human readable constraint failure messages.  We’ve just never changed things over all these years.”

Wow.  721 Foreign Key constraints and not one of them is enabled.  The Query Engine can’t rely on one of them to help it make a decision.  We can’t be assured that any of the related data is 100% reliable.

“OH!” I say, realizing why those triggers have never really sat well with me.  “In all of these years, Joe, I honestly never made that connection… why we have all of the data integrity logic in triggers.  Let me ask you something else.  What about the names like ‘r_0013’ or ‘IF345TableABC’?  I’ve never asked what the significance of the numbering scheme was for FKs and Indexes.”

“Well, there isn’t any.  Those are just auto-generated from the default templates that came with ErWin.” (crickets…)

“Really?  All of these years I assumed there was some secret code or standard that helped us determine those names… and it’s just defaults? Man, I’m going to have to digest this one for a bit.  Thanks for humoring me… as always.”
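(If you’re curious where your own database stands on this, the catalog views make it an easy thing to check.  Here’s a minimal sketch of that kind of query, using nothing but sys.foreign_keys; it isn’t from our scripts, just a quick way to take inventory.)

[sql]
-- Foreign keys that are disabled or that the optimizer can't trust
SELECT
    OBJECT_NAME(fk.parent_object_id) AS table_name,
    fk.name                          AS foreign_key_name,
    fk.is_disabled,
    fk.is_not_trusted
FROM sys.foreign_keys AS fk
WHERE fk.is_disabled = 1
   OR fk.is_not_trusted = 1
ORDER BY table_name, foreign_key_name;
[/sql]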

But wait, it gets better (or worse)!

Perspective really matters when a discovery like this is being made.  At the same time I was working through all of this we were spinning up a new project and one engineer downloaded the newest copy of ErWin to try.  He quickly created a model for the DB we needed to get the project rolling and within a few hours created the initial scripts.  To be honest, I didn’t examine the scripts closely at the time.

This past Friday however, I had to dig into the database to fix a design problem and so I opened up the creation script that was generated from ErWin.  Mind you, the developer hadn’t changed any templates or default settings.  Sure enough, I saw a pattern very similar to the one produced by the version of ErWin that’s nearly 20 years old.

  • Indexes have (nearly) useless names
  • Foreign key constraints are named with an incrementing number rather than any helpful information AND the cascading logic is missing
  • All true constraint logic is done in triggers

At least the FK constraints aren’t disabled! 🙂

Where do we go from here?

This “new” revelation is just one in a line of DB updates we’re making and testing right now.  While re-enabling the constraints and making sure all business logic persists as expected (sans triggers) isn’t trivial, it’s now near the top of our list.  There is some trigger logic that is legitimate, so we have to figure out how to carefully unravel this puzzle.
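When we do get there, the T-SQL side of the fix is conceptually simple; the hard part is cleaning the data first so the re-validation succeeds.  A minimal sketch, using the same placeholder table name from the script above:

[sql]
-- Re-enable AND re-validate in one shot (note the deliberate double CHECK),
-- so the optimizer can trust the constraints again
ALTER TABLE [dbo].[abc] WITH CHECK CHECK CONSTRAINT ALL;

-- Confirm nothing on the table is left disabled or untrusted
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE parent_object_id = OBJECT_ID('dbo.abc');
[/sql]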

Ultimately we hope to better utilize the strengths of the SQL Server Query Engine by working with it.  We also hope to clean up some issues with referential integrity that have gone undetected because triggers were sometimes disabled for big data loads, leaving bad data behind.

But more than anything, it’s caused us to talk more openly about the tools we use, how far along we are in “mastering” those tools, and whether there is a way to better measure “good” design from “bad” design.

And of course, check any program defaults. Sigh.

Tuning a large, legacy SQL Server DB – A Journal

The Backstory

As noted in the About page, I’ve worked at EnergyCAP for over 12 years (as of this post).  We’re a leader in our industry… and have been for more than 35 years.  But with any long-term history comes long-term technical debt, especially when not handled well team-to-team, year-to-year.  For us, one of the major pieces of technical debt comes in the form of our SQL Server database.

About 21 years ago Enron acquired the original company, Omnicomp, and infused new life into the product and technology that the original application was built on.  SQL Server 6.5 was chosen as the storage engine and a new staff of DBAs was tasked with designing the schema and DB logic to support the application.  At the time, it was an installed C++ Windows application.

After Enron collapsed, our CEO, Steve Heinz, was able to buy back the IP out of bankruptcy and start a new company to continue serving loyal customers.  Enron gave us a great foundation, but unfortunately most of the talent was gone and the company was bootstrapped from the ground up to get to where we are today.

It’s the “where we are today” part that has become a bit problematic at times.  Don’t get me wrong, I still believe we’ve held to many good design tenets over the years.  The folks in Houston, and those that have stayed with us since, have continued to hold fast on many good choices even when others advocated for “faster” ways.  But EnergyCAP is still a complex database that hasn’t had a thorough review in about 15 years.  We’ve known instinctively for years that there has to be room for improvement, but resources and time just always seem to slip away from digging in and doing the hard work.  And oh man, it’s hard work… eh?

Sizing up the Data

To give you an idea of what we’re dealing with, here are some basic stats about the database that backs our core application.

  • There are currently 351 tables in an unaltered DB schema
  • There are 1367 indexes (yeah… do that math…)
  • There are 27 tables that have 8 or more indexes… the top 5 of which account for 116 of those indexes.  Obviously most of these are foreign key indexes and, as it turns out, are never, ever used (see the sketch after this list). 🙁
  • This translates into ~15 tables using 75% of the reserved data, although the features a client uses can skew this some
  • Batch requests/sec on our busiest servers only run at about 50 for the 95th percentile week over week. Again, you’ll see later why this doesn’t jibe with user experience
  • Although the database is generally in 3NF, with solid FK constraints and other good design principles, all of the constraints are disabled and integrity is enforced with triggers. (The astute among you will immediately see the problem here… more on this later.)
  • Reporting on data is the #1 thing our clients use EnergyCAP for. We have nearly 400 “canned” reports that ship with the software, each of which has at least 20 different filters that can be set.  Trying to do OLAP well while other users expect OLTP success in day-to-day work has proven to be difficult for us.
  • The majority of clients have their own copy of the database, although we have modified the database over the years to allow multi-tenant deployments. Most multi-tenant deployments have less than 100 clients per database.
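As promised in the list above, here’s a minimal sketch of the kind of DMV query that surfaces those never-used indexes.  Keep in mind that sys.dm_db_index_usage_stats resets on an instance restart, so the numbers only mean something after a representative stretch of activity:

[sql]
-- Nonclustered indexes with no reads since the last instance restart
SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name                   AS index_name,
    us.user_updates          AS write_cost_with_no_read_benefit
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
  AND ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0) + ISNULL(us.user_lookups, 0) = 0
ORDER BY us.user_updates DESC;
[/sql]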

The Problem

So why am I spending time to document my research and journey as I attempt to improve something that’s generally been solid for thousands of clients for 15+ years?  It’s simple.

In all of my time at EnergyCAP, holding numerous positions and heading the development team for 8+ years, I’ve never been satisfied with some of the answers we tell ourselves (and clients) when things aren’t performing as well as we think they should.  I can say in good faith that we’ve never lied to a client about the issues.  What I’ve come to learn is that we simply didn’t understand some parts of SQL Server as well as we thought we did, and even after further research came away with the wrong answer.

Admittedly the problems are not always the fault of the database.  Particularly when a client hosts their own application we tend to see more issues than when it’s in our environment.  This is usually because the clients have no dedicated technical resources for maintaining the database and it’s simply one among many.

But still, there are far too many times we see error logs come through and we know that some of the core features of the application just aren’t performing as well as they should be under the given load.

In short, our servers often appear to be underutilized, but our client experience at some of the busiest parts of the day “feel” otherwise… and I’ve never been satisfied with it!

The Catalyst

And maybe more importantly… why now?  If I’ve been with the company for so long, worked with most of our clients in some form or fashion, and I’ve had a leading role in forming the product… what’s changed to make me dive in now?

In January 2016 I attended my second CodeMash and was able to make it for the Pre-Compiler sessions.  One of the all-day training sessions I sat in on was about SQL Server.  Over the course of six hours, ideas and concepts that ranged from basic server setup to queries that revealed information on server internals totally transformed the way I thought about SQL Server.  I left that session unwilling to look at our problems and give another status quo answer moving forward.

In fact, I realized that I had been lying to myself for years about how much I really knew about the core technology of our application.  I had relied too long on my experience and natural understanding of how the server worked.  I was hooked on digging in more than I ever had.

And the most helpful part of this new journey is that we host hundreds of clients in a SaaS environment, which has allowed us to see firsthand how some of the challenges above can lead to less-than-desirable performance and user experience.  We have a direct proving ground for any changes.  Sweet!

The Journey Begins Now

And now it’s time to dive in.  Over the last 15 months I’ve taken whatever time I could get to dig into the database side of EnergyCAP.  I’ve learned a ton, attended more classes, read hundreds of blog posts, watched Pluralsight courses, gone to my first SQL Saturday event, and started to put my voice out there on the #sqlhelp thread on Twitter… determined that I’d do my best to contribute back to the community that has done such an amazing job pushing me forward.  At the same time I’m pretty certain that I’m still just trying to pass SQL Kindergarten at this point.  Nothing I say here will be earth shattering… and there’s a really good chance some of it will still be plain wrong.

But hopefully, as I document this first project and others that are spurred on from it, someone will gain insight from my journey, improve upon it, and pass it on.

So grab your comfy hiking boots and some snacks… because this is going to take a while.

Topics I plan to cover include the following.  As the posts are made active, I’ll update the links below as necessary.

  • Performance monitoring, what we’ve tried, and how we’ve landed on using Splunk for now
  • Examining Indexes to simplify the database and improve IO
  • The tooling I’ve found helpful
  • Service Broker and how it totally transformed how legacy processes are run
  • How we’ve tested the changes over time and how we’ve measured “success”