Problem when upgrading to SQL 2008 with a renamed 'sa' account

  • mattjk

    Mr or Mrs. 500

    Points: 548

    Hi all,

    I wanted to post about a problem we recently experienced trying to upgrade a SQL 2005 Instance to SQL 2008, and a work-around for it. Both for the benefit of the user's on this forum, but more importantly so that this post gets out onto Google so that others won't have to go through what I did trying to fix it!

    The short version:

    Don't attempt to upgrade an existing SQL install to SQL 2008 with a renamed sa account. The upgrade will fail. Rename your sa account back to "sa", perform the upgrade, then you can re-rename it away from "sa" again.

    If you've already tried to upgrade to SQL 2008 with a renamed sa account, it failed, and now your DB Engine won't start, see "The Workaround" below.

    The long version:

    What we were doing:

    Upgrading a SQL 2005 Standard x64, upgrading to SQL 2008 Standard x64.

    What happened:

    The upgrade install fails on the DB Engine with this error message:

    "Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes."

    The summary installer error log (C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\yyyymmdd_hhmmss\Summary_ _yyyymmddd_hhmmss.txt) showed:

    [font="Courier New"]Detailed results:

    Feature: Database Engine Services

    Status: Failed: see logs for details

    MSI status: Passed

    Configuration status: Failed: see details below

    Configuration error code: 0x4BDAF9BA@1306@24

    Configuration error description: Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes.

    Configuration log: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20080829_104530\Detail.txt[/font]

    Not very useful :-S

    The detailed installer log (C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\yyyymmdd_hhmmss\Summary_ _yyyymmddd_hhmmss.txt) showed - buried a long way down:

    [font="Courier New"]2008-08-29 10:54:01 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ConfigRC and scenario ConfigRC.

    2008-08-29 10:54:01 Slp: Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes.

    2008-08-29 10:54:01 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ConfigRC and scenario ConfigRC.

    2008-08-29 10:54:01 Slp: Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineConfigException: Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes.

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlEngine.SqlServerServiceBase.WaitSqlServerStart(Process processSql)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineDBStartConfig.UpgradeSQLServerSystemDatabases(EffectiveProperties properties, Boolean isConfiguringTemplateDBs, Boolean fShutdown)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineDBStartConfig.Upgrade_ConfigRC(Version fromVersion, EffectiveProperties properties, Boolean shutdownRightAway)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineDBStartConfig.RepairFailedUpgrade(ConfigActionTiming timing, Dictionary`2 actionData)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineDBStartConfig.Repair(ConfigActionTiming timing, Dictionary`2 actionData, PublicConfigurationBase spcb)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlConfigBase.SqlFeatureConfigBase.Execute(ConfigActionScenario scenario, ConfigActionTiming timing, Dictionary`2 actionData, PublicConfigurationBase spcbCurrent)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlConfigBase.SlpConfigAction.ExecuteAction(String actionId)

    2008-08-29 10:54:01 Slp: at Microsoft.SqlServer.Configuration.SqlConfigBase.SlpConfigAction.Execute(String actionId, TextWriter errorStream)

    2008-08-29 10:54:01 Slp: Exception: Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineConfigException.

    2008-08-29 10:54:01 Slp: Source: Microsoft.SqlServer.Configuration.SqlServer_ConfigExtension.

    2008-08-29 10:54:01 Slp: Message: Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes.[/font]

    Again, not very useful.

    Next step was the SQL Server error log - a copy of this is placed in the Setup folder the above installer logs were taken from - e.g. (C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\yyyymmdd_hhmmss\SQLServer_ERRORLOG_yyyy-mm-ddThh.mm.ss.txt). This showed, towards the very end:

    [font="Courier New"]2008-08-29 10:54:01.05 spid7s Executing msdb.dbo.sp_syspolicy_create_purge_job

    2008-08-29 10:54:01.14 spid7s Error: 515, Severity: 16, State: 2.

    2008-08-29 10:54:01.14 spid7s Cannot insert the value NULL into column 'owner_sid', table 'msdb.dbo.sysjobs'; column does not allow nulls. INSERT fails.

    2008-08-29 10:54:01.16 spid7s Error: 912, Severity: 21, State: 2.

    2008-08-29 10:54:01.16 spid7s Script level upgrade for database 'master' failed because upgrade step 'sqlagent100_msdb_upgrade.sql' encountered error 515, state 2, severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the 'master' database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.

    2008-08-29 10:54:01.19 spid7s Error: 3417, Severity: 21, State: 3.

    2008-08-29 10:54:01.19 spid7s Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

    2008-08-29 10:54:01.20 spid7s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.[/font]

    So now I knew vaguely what the problem was - the msdb.sp_syspolicy_create_purge_job stored proc failed, due to an attempt to insert a null SID into the owner_sid column of the msdb.sysjobs table.

    A bit of digging lead me to the SQL 2008 install/upgrade script that is being executed when the error above occurred - E:\Microsoft SQL Server\MSSQL10. \MSSQL\Upgrade\sqlagent100_msdb_upgrade.sql. Another copy also lives in ...MSSQL\Install\.

    Inside this script, the stored proc that failed (sp_syspolicy_create_purge_job) is created. It starts on line 45301.

    Examining this stored proc, the only part of it that tries to insert anything into the sysjobs table is a call is made to msdb.dbo.sp_add_job, starting on line 45332:

    [font="Courier New"]DECLARE @jobId BINARY(16);

    EXEC @ReturnCode = msdb.dbo.sp_add_job

    @job_name=@job_name,

    @enabled=1,

    @notify_level_eventlog=0,

    @notify_level_email=0,

    @notify_level_netsend=0,

    @notify_level_page=0,

    @delete_level=0,

    @owner_login_name=N'sa',

    @job_id = @jobId OUTPUT;[/font]

    And there was our problem.

    MS have hard-coded the name of the sa account into this stored procedure call (@owner_login_name=N'sa'), but we had renamed our sa account to something else. So, when sp_add_job was called, it tries to get the SID of the user with username "sa", which returns null as no user named "sa" exists. It then tries to insert this null SID into the sysjobs table, which causes the original error, because the owner_sid column of sysjobs can't be null.

    I should note that any subsequent attempt to start the DB engine then fails, as it attempts to re-run the same set of upgrade scripts, which again fail and cause the DB engine to shut-down.

    The work-around:

    The obvious work-around is to temporarily rename the sa account back to "sa", and then run a repair install. However, you can't rename sa, as you can't start the DB engine.

    Next I tried hacking the install/upgrade scripts referenced above to manually insert the name of our sa user instead of "sa". This also didn't work as SQL seemed to ignore the change - obviously it is running these scripts from somewhere other than the filesystem (resource DB maybe?).

    I was almost at the point of nuking the install and starting again when I had one final idea:

    I created a domain user with the username "sa", and added this account to our Database Administrators group (should also work if you add this account to any group with sysadmin priviledges on your SQL server).

    I then started up the DB Service and... viola, upgrade scripts executed without a hitch, and the service started! My faux "sa" user was good enough for the scripts to execute just fine.

    Final step was to delete this domain user again, rename the "real" sa account back to "sa", and then run a repair install (to be safe, and also to fix a few things that got broken by the interrupted original setup).

    Once that was all done, everything appears to be normal again.

    I hope that helps someone out there - it took me over a day to get this fixed!

    Cheers,

    Matt

  • Gail Shaw

    SSC Guru

    Points: 1004454

    Could you post this as a bug on Connect please?

    http://connect.microsoft.com

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • mattjk

    Mr or Mrs. 500

    Points: 548

    GilaMonster (8/29/2008)


    Could you post this as a bug on Connect please?

    Done - https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=364566

    Thanks for posting that, I didn't know about Connect as a method of reporting bugs / feedback to MS.

  • Gail Shaw

    SSC Guru

    Points: 1004454

    The PSS engineers have looked into this and posted the following on the subject:

    http://blogs.msdn.com/psssql/archive/2008/09/10/upgrade-for-sql-server-2008-can-fail-if-you-have-renamed-the-sa-account.aspx

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • mattjk

    Mr or Mrs. 500

    Points: 548

    GilaMonster (9/10/2008)


    The PSS engineers have looked into this and posted the following on the subject:

    http://blogs.msdn.com/psssql/archive/2008/09/10/upgrade-for-sql-server-2008-can-fail-if-you-have-renamed-the-sa-account.aspx

    Thanks for posting that, good to see they're working on it.

    Bah, I see you get a mention by name for finding the other places sa is hard-coded, but what do I get for the bug report, diagnosis and work-around... nothing! 🙂

  • Peter Shire

    SSC-Addicted

    Points: 457

    More people than you realize recognize you as the source of this info. We appreciate it. Good job.

  • Gail Shaw

    SSC Guru

    Points: 1004454

    mattjk (9/10/2008)


    Bah, I see you get a mention by name for finding the other places sa is hard-coded, but what do I get for the bug report, diagnosis and work-around... nothing! 🙂

    I can mail Bob and ask him to change the attribution if you like. Was contemplating it anyway. All I did was report it direct.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • mattjk

    Mr or Mrs. 500

    Points: 548

    Thanks for the thoughts guys - Gail, it's OK, my post was tongue-in-cheek 🙂

    I'm doing well on the bugs at the moment though! I was the first person to post the details of the recent 12th August VMWare ESX time-bomb bug on the VMWare forums, and now this one 🙂

  • amybvt

    SSC Rookie

    Points: 49

    We hit this same problem when our sa account was disabled, not renamed.

  • dsslim

    SSC Enthusiast

    Points: 129

    Matt,

    Thanks very much for your write-up. It saves my day !

    Btw, how did you get to try out the Domain\sa in the sysadmin group ?

  • mattjk

    Mr or Mrs. 500

    Points: 548

    dsslim (1/13/2011)


    Thanks very much for your write-up. It saves my day !

    My pleasure.

    Btw, how did you get to try out the Domain\sa in the sysadmin group ?

    I'm not sure I understand what you mean here - can you clarify please?

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply