Tuesday, July 22, 2014

Back in the game...

Well, that was a long hiatus.  In the last 5 months I've completed 3 short-term DBA contracts and been everywhere from Chester to Manchester, Stoke-on-Trent to Coventry, Trafford to Milton Keynes.  What with my busy day-to-day role and being Dad to my three Lords of Misrule - my kids - I haven't found time to actually sit down and write.  So here I am, in a charming(!) hotel room in a town called Binley, finally finding time to re-acquaint myself with my keyboard.

Luckily, my American employers over at http://www.mssqltips.com have been very understanding about my complete lack of output for the last 6 months.  I'm making amends, I've just sent over a new, fresh article on using OLE automation extended stored procedures in SQL Server - procedures which will allow you to access OLE objects such as FileSystemObject, enabling you to call methods and get properties from these objects back into SQL Server.  A little easier than using PowerShell, and hopefully of use to some data hacks out there.  Watch this space.

What else this week?  Once again, I find myself supporting unsupported SQL Server installations - my primary database servers this time are 2008, which isn't too bad, but there's a few linked servers including SQL Server 2000.  I've just unlearned how to use DATABASEPROPERTYEX, I've no wish to go rooting through archived TechNet articles again, but that's the way it goes sometimes.

I was shocked to find out Monsters Inc. is 13 years old!  The actress who played Boo is now approaching 15, a far cry from the 2-year-old who had to be chased around the set with a mic as she wouldn't stand in one place long enough to say her lines.  This film is more than three times older than my youngest child ... I can feel the grey hairs growing already.  I'll soon be buying one of those horrible grey hair cardigans and perfecting the comb-over, then there's no turning back.

I was sad that I couldn't attend SQLBits XII (http://www.sqlbits.com/) this year, despite registering to attend months ago.  What with the timing of a new contract and my recent wedding, it simply wasn't possible to go.  I'm hoping to catch up with some live sessions over on the website and see if I can get hold of last year's, too.

Recently I read a great book by Isaiah Hankel PhD, called 'Black Hole Focus'.  I'm a sucker for self-help books, and recognise my weakness, but this truly is a great volume.  One key lesson I took from it was how to motivate yourself - Dr. Hankel suggests creating a motivation board, a board that you hang on your wall with your goals in a brainstorm around your name, or the words 'My Goals'.  A little cheesy, yes, but I'm giving it a try - I got the Pritt-Stick and scissors out on Sunday night and created my first collage since art lessons at school.  And ... it works.  I've hung it in front of my office desk (at home) where I'm now spending three days a week.  When you're looking at it every day, it really helps you to focus, and I've got a clear plan for the next few months.

Next up, identifying those skills which are going rusty and getting a plan together to toughen up my training in them.  I'm thinking data science, particularly with respect to statistics and using R.  I have a strong suspicion the new 'BI', the new 'Big Data' is going to be 'Data Science', and analysts who can interpret data, analyse it and draw conclusions (MI, in other words, and exactly what I'm lined up to do for the next three months) are going to be in high demand.

Tuesday, February 25, 2014

Getting Around Strict GROUP BY Restrictions in SQL Server

SQL Server adheres to the ANSI-SQL standard, in that column names in queries with GROUP BY clauses (aggregate) queries must be included in the GROUP BY or in the aggregate clause itself.  

However, sometimes you don't want to group by a certain column.  Why would that be so?  Say, for example, you want to query the msdb.dbo.backupset column for information on your last successful backup for each database, and you also want to return the backup size for that backup, too.  Grouping the database name on MAX(backup_date) is fine, but throw in the backup size and you'll end up with a separate row for each database name, distinct backup size and maximum backup date.  I'll show you what I mean.  

Let's try writing a query to return the database name, backup size and the last backup date, grouped by database name.

SELECT   database_name, backup_size, MAX(backup_finish_date) 
FROM   msdb.dbo.backupset 
GROUP BY  database_name, backup_size



All very well ... except we have multiple rows per database.  So, let's try the next logical step - putting a MAX(backup_size) into the mix:

SELECT   database_name, MAX(backup_size), MAX(backup_finish_date) 
FROM   msdb.dbo.backupset 
GROUP BY  database_name



Did this work?  On the face of it, yes.  But let's not be so hasty - let's query the whole dataset on our interesting columns to see if the information presented is, in fact, true:

SELECT database_name, backup_size, backup_finish_date 
FROM msdb.dbo.backupset 


No!  As you can see, the former query with two MAXes returned a row saying that SANDBOX was last backed up on 2014-02-25 20:22:28.000 with a size of 9325877248 bytes.  This is not true.  The last backup date was 2014-02-25 20:22:28.000 with a backup size of 2727645184 bytes, as seen below.  The double MAXes have mangled the result set, displaying incorrect data!  Why?  Because we simply concatenated both MAX values - in the former query, they are independent, and we need them to be coupled.



Why is this query so difficult to write?  This is because T-SQL has a strict adherence to the GROUP BY standards defined in ANSI-SQL unlike some other database engines, such as MySQL.  The way to get around this is to query back on the data in a different way - a self-join, but on a slightly altered GROUP BY syntax.  This will effectively allow us to JOIN one aggregated query on the data (by backup_finish_date) on another aggregated query (by backup_size), as follows:


SELECT b1.database_name, b2.backup_size, b1.last_backup_date
FROM (
SELECT b.database_name, 
MAX(b.backup_finish_date) [last_backup_date]
FROM msdb.dbo.backupset b
GROUP BY b.database_name ) b1 
INNER JOIN 
(
SELECT b.database_name, b.backup_size, 
b.backup_finish_date 
FROM msdb.dbo.backupset b ) b2
ON b1.database_name = b2.database_name 
AND b1.last_backup_date = b2.backup_finish_date 




Why does this work?  Basically, the first sub-query contains the last backup date per database.  By inner joining on both the database name and the last backup date to the ungrouped result set from msdb.dbo.backupset, we can bring the backup size for that database name and backup date (the maximum per database) into the result set, resulting in a listing of database name, backup size and last backup date (with the backup size correct for the backup date).

This is one way of getting around the GROUP BY restrictions in SQL Server - I'd be keen to hear more.  If you have any alternatives, please feel free to leave a comment below.


Thursday, February 20, 2014

Update - New Consultancy Launched!

I am pleased to announce the expansion of Optimal Computing into database consultancy - find the site over at http://www.optimalcomputing.co.uk .  You can get in touch by e-mailing enquiries@optimalcomputing.co.uk or calling on +44 (0) 161 731 0037.  Optimal Computing specialises in data management, with an offering at the moment on SIMS application support for schools and colleges. 

More Stuff Published!

The good people over at www.MSSQLTips.com have been kind enough to publish many of my articles over the last couple of years, and the articles have been building up.  Find my profile over here --> http://www.mssqltips.com/sqlserverauthor/100/derek-colley/ .


I Went Contracting!

Took the leap in September and I'm contracting now, instead of being a permanent DBA.  On my second contract so far and doing well, having fun and earning money.  If you know of any contracts in the North West of England, do get in touch - you can find out all about me on my LinkedIn profile which redirects from here -> http://www.derekcolley.co.uk



Get In Touch!
I might be an introverted, slightly aggressive, permanently angry misanthrope in real life but that's just my DBA side showing through :-)  No, really, I'm uber-social and welcome contact from anyone in the IT arena - from 'let's be friends' (recruiters, normally) to 'Help!  It's all falling down around my ears!'.  You can reach me at derek@derekcolley.co.uk or follow me on Twitter: @dcolleySQL. 




Friday, December 13, 2013

SQL Server - Auditing sessions over time - Code example

So today I had a requirement to provide a quick and dirty way of collecting data on the sessions connected to SQL Server.  Specifically, I had to write a job that collected basic data on the current sessions, and if these sessions had not been previously recorded, record them into a table.  If the session was identical in all ways other than the session ID, then a counter called 'session count' is incremented for that session.

The challenging part here was providing what is basically a mashup of an INSERT and UPDATE statement, sometimes called an UPSERT.  In SQL Server 2008 and later versions, this functionality is provided using the MERGE statement.  My requirement here is to update a table with values if they exist; otherwise, insert them.  

So here's an example of how to use this MERGE statement in the context of my requirement. The example code is below.  You can modify it to suit your purpose - for example, write out to a permanent table, put it in a job, query the table in a different way, monitor requests rather than sessions, etc. or re-write it entirely, or simply use it as an example of MERGE for reference.

There's also a self-join in there which provides the columns on which to match.  This isn't strictly essential but I needed to match on columns then only update the last login time and last request end time if matched.    The outer query simply aggregates the results in the target table to provide a good view of the data.

Note this won't collect data on sessions which are opened and closed between running the collection.  So if you modify this code by e.g. using a permanent table and putting this code into a job (can't use a temp table in this context), run the job frequently.  If you run it e.g. every 15 minutes and a session is created and destroyed between the job execution intervals, you won't capture the data.



 -- ---------------------------------------------------------------------------------------------  
 -- Script to audit sessions on the server, accumulate session counts per distinct group of   
 -- values from sys.dm_exec_sessions, will ignore duplicate sessions (by session_id and unique   
 -- set of values), will increment when a new session is started that has the key values   
 -- matching, will insert when a brand new session is opened. Uses MERGE.  
   
 -- Run inside a stored procedure, or on the query window, multiple times over the course of X   
 -- hours to get a full list of sessions.  
 -- Add filters if necessary on the final SELECT.   
 -- Change #Logins to a permanent table if required.  
   
 -- Works only in SQL Server 2008, R2, 2012 and 2014.  
   
 -- Author: Derek Colley, 13/12/2013  
 -- ---------------------------------------------------------------------------------------------  
   
 -- TRUNCATE TABLE #Logins -- for debug  
   
 IF NOT EXISTS ( SELECT name FROM tempdb.sys.tables WHERE name LIKE ('#Logins%') )  
     CREATE TABLE #Logins (   
         uqid INT IDENTITY(1,1),   
         session_id INT,   
         last_login_time DATETIME,   
         host_name NVARCHAR(128),   
         program_name NVARCHAR(128),   
         client_interface_name NVARCHAR(32),   
         login_name NVARCHAR(128),   
         last_request_end_time DATETIME )  
   
 DECLARE @CurrentSessions TABLE (   
     session_id INT,   
     login_time DATETIME,   
     host_name NVARCHAR(128),   
     program_name NVARCHAR(128),   
     client_interface_name NVARCHAR(32),   
     login_name NVARCHAR(128),   
     last_request_end_time DATETIME )   
   
 INSERT INTO @CurrentSessions  
     SELECT        session_id, login_time, host_name, program_name, client_interface_name,   
                 login_name, last_request_end_time  
     FROM        sys.dm_exec_sessions  
     WHERE        session_id <> @@SPID   
     AND            session_id > 50  
       
 MERGE INTO #Logins   
 USING (   
         SELECT        c.session_id, c.login_time, c.host_name, c.program_name, c.client_interface_name,   
                     c.login_name , c.last_request_end_time  
         FROM        @CurrentSessions c  
         LEFT JOIN    #Logins l   
         ON            c.session_id = l.session_id   
         AND            c.host_name = l.host_name   
         AND            c.program_name = l.program_name   
         AND            c.client_interface_name = l.client_interface_name   
         AND            c.login_name = l.login_name    ) AS CurrentSessions  
     ON        #Logins.session_id = CurrentSessions.session_id   
     AND        #Logins.host_name = CurrentSessions.host_name   
     AND        #Logins.program_name = CurrentSessions.program_name   
     AND        #Logins.client_interface_name = CurrentSessions.client_interface_name   
     AND        #Logins.login_name = CurrentSessions.login_name  
 WHEN MATCHED THEN   
     UPDATE          
     SET            last_login_time = CurrentSessions.login_time,  
                 last_request_end_time = CurrentSessions.last_request_end_time  
 WHEN NOT MATCHED THEN   
     INSERT        (    session_id, last_login_time, host_name, program_name, client_interface_name,   
                     login_name, last_request_end_time )      
     VALUES        (session_id, login_time, host_name, program_name, client_interface_name,   
                 login_name, last_request_end_time );  
   
 SELECT        COUNT(*) [session_count], MAX(last_login_time) [last_login_time], host_name, program_name,   
             client_interface_name, login_name, MAX(last_request_end_time) [last_request_end_time]  
 FROM        #Logins   
 GROUP BY    host_name, program_name, client_interface_name, login_name  
 ORDER BY    COUNT(*) DESC   
   
   
   
        
           
       
   
   




Monday, June 24, 2013

Vendors! Wake up and hire a DBA!


Time to have a rant about third-party vendor databases.  Yes, I'm looking at you Sage, VersionOne, and most especially, YOU, Hybris CRM.

Yes, you.  Mr. 'Let's not use clustered indexes.'  'Let's model our schemata using ORM.'  'Let's call our primary key column 'PK' on almost every table.  Let's not bother naming any of our constraints.  And using stored procedures or functions - that's so 1990s, man.

I mean, why NOT use BIGINT for every single integer column.  Or VARCHAR(255).  No, wait a minute, that's way too efficient.  Let's use NVARCHAR(255) instead.  For everything.

Take a look at this, part of the table definition from a vanilla Hybris CRM installation:


CREATE TABLE [dbo].[addresses](
[hjmpTS] [bigint] NULL,
[TypePkString] [bigint] NOT NULL,
[PK] [bigint] NOT NULL,
[createdTS] [datetime] NOT NULL,
[modifiedTS] [datetime] NULL,
[OwnerPkString] [bigint] NULL,
[aCLTS] [int] NULL,
[propTS] [int] NULL,
[p_dateofbirth] [datetime] NULL,
[p_middlename2] [nvarchar](255) NULL,
[p_streetname] [nvarchar](255) NULL,
[p_contactaddress] [tinyint] NULL,
[titlepk] [bigint] NULL,
[p_phone1] [nvarchar](255) NULL,
[p_remarks] [nvarchar](255) NULL,
[p_firstname] [nvarchar](255) NULL,
[p_phone2] [nvarchar](255) NULL,
[originalpk] [bigint] NULL,
[p_fax] [nvarchar](255) NULL,
[p_shippingaddress] [tinyint] NULL,
[p_streetnumber] [nvarchar](255) NULL,
[p_gender] [bigint] NULL,
[p_url] [nvarchar](255) NULL,
[p_district] [nvarchar](255) NULL,
[countrypk] [bigint] NULL,
[p_lastname] [nvarchar](255) NULL,
-- continues on for a loooooong time...

'p_gender' - BIGINT??? Really?  So there's 2^64 -1, or 18,446,744,073,709,551,615 possible genders, are there?  On what planet?  Or how about 'PK' for the primary key?  In column 3, of all places?  I sincerely hope the user isn't intending to put some long ASP.NET-generated URL in 'p_url', since it's limited to 255 characters.  And 'p_dateofbirth' clearly needs to store time information too, to the millisecond, as that's important when profiling your customers.

I'm dreading the day when we actually start using your hated system in production and your table scans and parameter-sniffed execution plans slam into my servers like a colossal tidal wave of shit.  I can absolutely guarantee that when I ring the vendor to complain I'll get through to some poor sod on a mere handful of Vietnamese Dong per hour telling me 'prease to call back in mornink'.  Believe me when I say I'm beginning to batten down the hatches now, and when 'main screen turn on' I'll be hiding in the server room, watching our monitoring software with my fist in my mouth.

Sage, you're not off the hook.  No matter how hard you insist, a database is not a 'company'.  It's a database.  I admire your overall design, placing your metadata in a separate DB and company / entity-specific information in separate DBs.  I don't appreciate you writing metadata to 'master' and bursting into tears when I remove it.  Nor, frankly, are your table structures much cop either.  It's not hilarious to dress up your SQL Server-related errors in another error wrap, spreading confusion and delay among the various support teams:

'An unexpected error has occurred.  SQL code 208.' (Sage)
=
'Msg 208, Level 16, State 1, Line 1 - Invalid object name ...' (SQL Server)

Why not just return the second message?  The problem will land in the lap of the DBAs anyway, won't it?  Save time!  'An unexpected error has occurred' simply means 'Something went wrong and I (the developer) can't be fucked to write a coherent error message, preferring to let support teams scramble to look up SQL Server error codes in sys.messages until the application dies a death and they switch to something written properly.'

Vendors!  Wake up!  It's time to start reconsidering old-fashioned concepts like 'testing' and 'good design'.  I know we're all about Agile now, but 30 years ago we had good, solid texts like 'The Theory of Relational Databases' (David Maier - free eText here) full of things like ... wait for it ... relations!  Functional dependencies!  Normalisation!  I know, yawn, yawn, why not just let nHibernate create it all for us, yeah, yeah ... But there's good, solid systems out there right now using databases built on these basic principles.  

Vendors, don't give up your day job.  Develop great apps and leave the database administration to the professionals.


The information and views expressed in this article are my own and do not represent the opinion or views of my employer or any third-party. 


Thursday, June 13, 2013

Killer Code - Nested Aggregates using CTEs 

(Or, 'what NOT to do on a production SQL Server box')


Yesterday, I was experimenting with pulling out some figures in SQL Server using windowed functions.  SQL Server 2012 comes with a couple of really neat system functions called LAG and LEAD, and these are useful for selecting values from n rows behind or in front of a particular row, denoted by column.  There's other articles out there that deal with using LAG/LEAD, so I'm not going to cover it here, save to give an example of using this and the requirement I was aiming at.

For reasons of confidentiality I sadly cannot reproduce here the exact code I was using.  However here's a simplified example.  Here's an initial table configuration:

CREATE TABLE dbo.TestData (
DateCreated DATE,
ArbitraryValue NUMERIC(16,2) )

INSERT INTO dbo.TestData
SELECT '2013-06-01', 342.14 UNION ALL
SELECT '2013-06-01', 659.45 UNION ALL
SELECT '2013-06-01', 283.49 UNION ALL
SELECT '2013-06-01', 903.34 UNION ALL
SELECT '2013-06-01', 129.06 UNION ALL
SELECT '2013-06-01', 756.65 UNION ALL
SELECT '2013-06-01', 239.05 UNION ALL
SELECT '2013-06-01', 194.52 UNION ALL
SELECT '2013-06-01', 804.44 UNION ALL
SELECT '2013-06-01', 116.69 


So, what I wanted was to have a third column, produced during a SELECT, which gave me the difference between each ArbitraryValue and the value in the row preceding it.  Using LAG, this was quite easy:

SELECT         DateCreated, ArbitraryValue,
ArbitraryValue - LAG(ArbitraryValue, 1, NULL)
                  OVER ( ORDER BY DateCreated ASC ) [Difference]
FROM dbo.TestData
ORDER BY DateCreated ASC

Next, I decided I wanted a fourth column, to work out the average of the differences - I'm sure there's a mathematical term for this not dissimilar to standard deviation but I'm not a mathematician - and I tried defining a fourth column that averaged the values in all preceding rows using the AVG aggregate, inside a subquery (with DateCreated standing in as a key).  The purpose of this column was to, over the DateCreated column, illustrate the regression to the mean of the differences in the ArbitraryValue column, with the latest value being the 'best' average of the differences available:

SELECT t1.DateCreated, t1.ArbitraryValue,
t1.ArbitraryValue - LAG(t1.ArbitraryValue, 1, NULL) 
          OVER ( ORDER BY t1.DateCreated ASC ) [Difference],
        ( SELECT AVG(t2.ArbitraryValue)
FROM dbo.TestData t2
WHERE t2.DateCreated < t1.DateCreated
        ) [Average_Difference]
FROM         dbo.TestData t1
ORDER BY t1.DateCreated

This works wonderfully - for a small result set.  When I tested this with absolute values in rows, there was no problem.  The execution plan doesn't look bad either - table scans, but then again I haven't defined an index.

So, I modified the query to use aggregates in place of absolute values, as this was the actual requirement.  Let's modify the table definition and content as follows which will illustrate what I mean:

TRUNCATE TABLE dbo.TestData
ALTER TABLE dbo.TestData DROP COLUMN ArbitraryValue
INSERT INTO dbo.TestData ( DateCreated )
SELECT '2013-06-01' UNION ALL SELECT '2013-06-01' 
        UNION ALL SELECT '2013-06-01' UNION ALL 
        SELECT '2013-06-01' UNION ALL SELECT '2013-06-02' 
        UNION ALL SELECT '2013-06-02' UNION ALL 
SELECT '2013-06-03' UNION ALL SELECT '2013-06-03' 
        UNION ALL SELECT '2013-06-03' UNION ALL 
SELECT '2013-06-04' UNION ALL   SELECT '2013-06-05' 
        UNION ALL SELECT '2013-06-05' UNION ALL 
        SELECT '2013-06-05' UNION ALL SELECT '2013-06-06' 
        UNION ALL SELECT '2013-06-06' UNION ALL 
        SELECT '2013-06-06' UNION ALL SELECT '2013-06-06' 
        UNION ALL SELECT '2013-06-07' UNION ALL 
        SELECT '2013-06-07' UNION ALL SELECT '2013-06-08' 
        UNION ALL SELECT '2013-06-08' UNION ALL 
        SELECT '2013-06-08' UNION ALL SELECT '2013-06-09' 
        UNION ALL SELECT '2013-06-09' UNION ALL 
SELECT '2013-06-10' 

So now I have a table with ten distinct values in it, a total of 25 rows.  I now want to modify my SELECT query to give me a count of the number of rows per distinct row (i.e. a row count per distinct day), and calculate the difference, day to day, of these values.  In reality, this example had a direct correlation that a row was inserted into the table for an event E.  I needed to count the events, grouped by day, the differences between these counts, and the average of these differences.  This, unfortunately, didn't work, throwing up an interesting error message:

SELECT    t1.DateCreated, COUNT(t1.DateCreated),
   COUNT(t1.DateCreated) 
           - LAG(COUNT(t1.DateCreated), 1, NULL) 
           OVER ( ORDER BY t1.DateCreated ) [Difference],
(  SELECT  AVG(COUNT(t2.DateCreated))
   FROM    dbo.TestData t2
   WHERE   t2.DateCreated < t1.DateCreated 
        )  [Average_Difference]
FROM    dbo.TestData t1
GROUP BY   t1.DateCreated 


Msg 130, Level 15, State 1, Line 3
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.


This is due to the SELECT AVG(COUNT(t2.DateCreated)).  Fair enough.  SQL Server won't allow this as by itself, this isn't a column - it's a statement, and pointless since AVG(COUNT(t2.DateCreated)) == COUNT(t2.DateCreated) for a single value.  However, not only have I defined a WHERE clause, I've included it as an extra column using the subquery.  While logically this should be fine, it's still a case of 'computer says no'.

I decided to get around this restriction by writing the query as follows, using a CTE for the outer query in place of the subquery, meaning I didn't have to calculate the AVG and COUNT in the same statement:

;WITH t2 (DateCreated, DateCount, Difference) AS (
SELECT t1.DateCreated, COUNT(*),
COUNT(*) - LAG(COUNT(*), 1, NULL) 
        OVER ( ORDER BY t1.DateCreated ) [Difference]
        FROM dbo.TestData t1
GROUP BY t1.DateCreated )
SELECT t3.DateCreated, t3.DateCount, t3.Difference, 
( SELECT AVG(t2.Difference) 
FROM t2 
WHERE t2.DateCreated <= t3.DateCreated 
        )  [Average_Difference]
FROM         t2 t3
ORDER BY t3.DateCreated ASC

For a small result set, this worked beautifully, returning the average difference in the fourth column.  The problem came when I tried to scale it up.  Result sets for sub-10,000 rows take a few seconds at most.  Running this query on a set of 320,000 rows (specified using a WHERE clause which I've omitted from the example above, for clarity) took around 10sec.  Scaling it up a bit, I ran this query on 10,000,000 rows (about a year's worth of data) to get the results back in about 1m 30sec.  And finally, I attempted to run this for three years of data, estimated about 30m rows.

Oh dear. 

Immediate, total devastation.  A quick look at Spotlight showed all 16 cores going flat-out at 100% CPU.  Alarms started going off.  A quick check of IOSphere, the FusionIO software, showed I had managed to achieve what 256 threads of SQLIO at full-tilt could not - complete saturation of FusionIO read-write capability (and bear in mind this can cope with about 24Gbps, around 500k IOPS read, 500k IOPS write - see here for more, http://www.fusionio.com/products/iodrive2-duo/).  I stopped the query immediately, and went for the autopsy.

The execution plan showed massive activity with scalar computations with the nested 'call' to t2.Difference causing the first COUNT(*) (in line 3) and the second COUNT(*) (in line 3) to be executed for each and every row returned.  This, combined with the simple arithmetic in line 3, combined with the cost of working with the CTE, combined with the subquery AVG calculation, created a massive amount of set-based load that smashed into the CPU schedulers like an iron fist.  

I would love to reproduce the execution plans here but shouldn't due to NDA-related reasons.  I've got my three years of average differences by using Excel to generate the fourth column instead of SQL.  And I'm sure (and if anyone's reading this, they may post) there's fifty different ways of refactoring this query.  My next read is going to be Itzik Ben-Gan's 'Microsoft SQL Server 2012 High Performance T-SQL Using Window Functions', so next time I can do it properly.

Lessons learned - don't test on production, no matter how tempting.  And don't nest aggregates.

Wednesday, May 15, 2013

From 0 to sysadmin in 30 seconds...

This is a quick tip for anyone put on the spot like I was today...

Picture the scene.  It's five to five, I'm packing up my laptop and finding my headphones.  I'm about to leave the office.  In a hurry, one of our engineers comes over and explains how the on-site supplier is having a problem accessing an obscure, unsupported SQL Server database.  They can't get access.

So I go over to have a look.  It's a SQL Server 2005 DB on the application server.  I've never seen it before.  The engineer explains how they have a username, but the password is lost.  SQL Server Management Studio isn't installed.  No-one knows how to retrieve or reset the password for the user they know of.

All faces turn to you.  What do you do?

In summary:

Start -> Run -> cmd
sqlcmd -Slocalhost -E
(if you're lucky, you'll get in, as 2005 has local admins as sysadmins by default).
If this doesn't work - sqlcmd -Slocalhost -E -A

You're in.  If you're not, get local admin access on the machine first.  Now...

create login 'me' with password 'Pass1234';
go
exec sp_addsrvrolemember @loginame = 'me', @rolename = 'sysadmin';
go
exit
sqlcmd -Slocalhost -Ume -PPass1234

You're now sysadmin.

alter login 'someLogin' with password = 'new password';
(this is the username they've supplied.  Create a new password).
go
exit

Job done.  Coat on, go home.