The Hibernate ORM and SQL Server

  • Comments posted to this topic are about the item The Hibernate ORM and SQL Server

    Thank you,

    Yell McGuyer
    DataYell.com

  • I've never worked on a product or project where there was a DBA. I am typically one of the developers and the DBA, so I have to think about both.

    I have flirted with ORMs from time to time, but I prefer projects that don't use them, where you can easily see what SQL is running and tune it if it turns out to be bad, especially if it's the customer doing the complaining.

    The article reads very much like an in-house system; if only solving problems were always that easy.

  • Unfortunately, my friend, trying to stop this problem before it happens is like trying to ice skate uphill. Database developer and DBA job requisitions are usually created once the damage is already done, when people are running around like chickens with their heads cut off because their code and infrastructure won't scale. Even when they do hire you, your involvement is usually a reactionary measure, and you get left out of the conception and planning phases.

    While there are a handful of places that will proactively involve their data experts early and often, the reality is this: we are to the IT field what proctologists are to the medical field. We get paid a pretty decent amount of money to deal with other people's sh...

    As far as ORMs are concerned, it doesn't help that the same company that makes SQL Server also promotes the use of ORMs (albeit from different teams inside the company).

  • I'll probably never get into using this but thanks for the nice read.

  • I've found that ORMs can be used sensibly and effectively. Hibernate is one of the more powerful ones.

    Unfortunately, I've rarely seen any attempt to configure them beyond a basic install. The inevitable consequence is that they get sullied in the same way that a poorly structured, unindexed data model will sully a database.

  • I agree that Hibernate is extremely powerful and can allow a company to stand up a fully functioning, data-driven app very quickly. My hope is that more companies understand the importance of hiring, and including, database professionals from the beginning of projects. But we all know that is not the typical workflow. Hopefully some of the lessons I learned will help both someone who inherits an application using an ORM and someone who is thinking of building one.

    Thank you,

    Yell McGuyer
    DataYell.com

  • Gabriel P (12/14/2016)


    Unfortunately, my friend, trying to stop this problem before it happens is like trying to ice skate uphill. Database developer and DBA job requisitions are usually created once the damage is already done

    Very true. You don't have to be in this business long to understand that. I will keep fighting the good fight, and cleaning up technical messes with a snow shovel along the way.

    Thank you,

    Yell McGuyer
    DataYell.com

  • Great article. This is exactly the boat we are in now that the database is large (6 TB). Things are starting to grind to a halt, so we're going to put everything on enterprise flash drives to give development time to come up with some new approaches to their queries.

    We've generally had to clear the plan cache once a month, but in the last six months it's gotten down to once or twice a week. I've been thinking of taking some of the worst report-style queries (this is a combined system) and running them with OPTION (RECOMPILE), using parameter values for the largest clients, since our data is heavily skewed.

    I'm also just now beginning to look into filtered statistics for the skewed data. The general advice to let the optimizer do its thing just doesn't work in all cases.
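    A sketch of both ideas, with the table and column names invented for illustration (our real stats would target the biggest clients' IDs):

    -- Filtered statistics scoped to one large, heavily skewed client:
    CREATE STATISTICS ST_WORK_ITEM_BigClient
    ON dbo.WORK_ITEM (STATUS)
    WHERE CLIENT_ID = 42;

    -- Worst report-style query re-run with a recompile, so the optimizer
    -- sees the actual parameter values instead of reusing a skewed plan:
    SELECT COUNT(*)
    FROM dbo.WORK_ITEM
    WHERE CLIENT_ID = 42
      AND STATUS = 'Open'
    OPTION (RECOMPILE);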

  • Indianrock,

    Are you able to convert any of the poorly performing Hibernate queries to better-written stored procedures? Can the app devs consume results from a procedure for some of the worst offenders?

    We were able to relieve quite a bit of database strain by taking the worst-performing Hibernate queries and converting them to well-written sprocs.
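    The pattern was nothing fancy; here is a simplified sketch, with the procedure and schema invented for illustration:

    CREATE PROCEDURE dbo.usp_GetOpenWorkItems  -- hypothetical name
        @ClientId INT
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Hand-written replacement for the generated query: stable parameter
        -- types, only the joins we need, and sargable predicates.
        SELECT wi.WORK_ITEM_ID,
               wi.CREATED_DATE_TIME
        FROM dbo.WORK_ITEM AS wi
        WHERE wi.CLIENT_ID = @ClientId
          AND wi.STATUS = 'Open';
    END;

    The app devs can still call something like this from Hibernate as a native SQL query, so the entity code barely changes.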

    Thank you,

    Yell McGuyer
    DataYell.com

  • Development is considering all options. The SSD thing was just to buy them some time.

    In the meantime I may take a couple of the worst queries (it's a combined OLTP/OLAP system) and look into the stored procedure route.

    Here's one of our worst offenders, where I've been looking into filtered stats and the OPTION (RECOMPILE) hint. Note the outer joins and even a CROSS JOIN. The plan includes a "no join predicate" warning.

    I find long-running plans in the cache that contain specific keywords I know are in the query (a sketch of that cache search is at the end of this post). From there I run a Profiler trace to get the exact call as the app sends it:

    exec sp_executesql N'SELECT TOP 2000 YADA YADA YADA

    Lastly, I have a small home-grown app that converts that to a "runnable" query with the parameter values all filled in. My immediate goal is to find some small wins that don't require code changes.

    SELECT ResolveFollowUpInvalidDataError02.WORK_QUEUE_ITEM_ID AS PrimaryKey,
           ResolveFollowUpInvalidDataError02.TRANS_SEQ_NUM AS VersionNumber,
           SecuredOrganization13.SHORT_NAME AS ClientShortName,
           FollowUpDataError14.ERROR_MESSAGE AS ErrorMessage,
           ResolveFollowUpInvalidDataError02.CREATED_DATE_TIME AS CreatedDateTime,
           Account16.FINANCED_DATE AS FinancedDate,
           Account16.BOOKED_DATE AS BookedDate,
           MIN(DateAdd(Day, FollowUpActivityDefinition212.SCHEDULED_OFFSET_DAYS, Account16.FINANCED_DATE)) AS ScheduledDate,
           Account16.CUSTOM_ATTRIBUTE_1 AS AccountNumber,
           Account16.CUSTOM_ATTRIBUTE_2 AS LoanNumber,
           Account16.CUSTOM_ATTRIBUTE_3 AS LoanSuffix,
           Account16.CUSTOM_ATTRIBUTE_4 AS Branch,
           AccountOwnershipDocSummary17.BUSINESS_UNIT_LONG_NAME AS BusinessUnit,
           AccountProperty15.MODIFIED_MANUFACTURER_ID AS ManufacturerId,
           Account16.CATEGORY AS AccountCategory,
           Account16.SUB_CATEGORY AS AccountSubCategory,
           AccountProperty15.EXPECTED_TITLING_STATE_ABBR AS ExpectedTitlingState,
           RecordedLegalEntity18.FULL_NAME AS BorrowerName,
           FollowupEntity19.ENTITY_CODE AS DealerId
    FROM WORK_QUEUE_ITEM AS ResolveFollowUpInvalidDataError02
    INNER JOIN ORGANIZATION AS SecuredOrganization13
        ON ResolveFollowUpInvalidDataError02.SECURED_ORGANIZATIONORGANIZATION_ID = SecuredOrganization13.ORGANIZATION_ID
    INNER JOIN (ERROR AS FollowUpDataError14
        INNER JOIN (PROPERTY AS AccountProperty15
            INNER JOIN (ACCOUNT AS Account16
                LEFT OUTER JOIN ACCOUNT_OWNERSHIP_DOC_SUMMARY AS AccountOwnershipDocSummary17
                    ON Account16.ACCOUNT_ID = AccountOwnershipDocSummary17.ACCOUNT_ID
                LEFT OUTER JOIN LEGAL_ENTITY AS RecordedLegalEntity18
                    ON Account16.ACCOUNT_ID = RecordedLegalEntity18.ACCOUNT_ID
                LEFT OUTER JOIN LEGAL_ENTITY AS FollowupEntity19
                    ON Account16.FOLLOWUP_ENTITYLEGAL_ENTITY_ID = FollowupEntity19.LEGAL_ENTITY_ID
                CROSS JOIN (PORTFOLIO_SECTION_DEFINITION AS PortfolioSection111
                    INNER JOIN FOLLOW_UP_ACTIVITY_DEFINITION AS FollowUpActivityDefinition212
                        ON PortfolioSection111.PORTFOLIO_SECTION_DEFINITION_ID = FollowUpActivityDefinition212.PORTFOLIO_SECTION_DEFINITION_ID))
                ON AccountProperty15.ACCOUNT_ID = Account16.ACCOUNT_ID)
            ON FollowUpDataError14.ACCOUNT_PROPERTYPROPERTY_ID = AccountProperty15.PROPERTY_ID)
        ON ResolveFollowUpInvalidDataError02.ERROR_ID = FollowUpDataError14.ERROR_ID
    WHERE ((ResolveFollowUpInvalidDataError02.CONCRETE_TYPE IN ('Fdi.Workflow.Po.ResolveFollowUpInvalidCustomerDataError', 'Fdi.Workflow.Po.ResolveFollowUpInvalidDealerDataError')
            AND ResolveFollowUpInvalidDataError02.BUSINESS_PROCESS_STATUS = 'Open'
            AND RecordedLegalEntity18.PRECEDENCE = 1
            AND ResolveFollowUpInvalidDataError02.SECURED_ORGANIZATIONORGANIZATION_ID = 20)
           AND ((SecuredOrganization13.CONCRETE_TYPE IN ('Fdi.Po.SecuredOrganization', 'Fdi.Po.FdiOrganization', 'Fdi.Po.BusinessPartner', 'Fdi.Po.Vendor', 'Fdi.Po.Client', 'Fdi.Po.Jurisdiction'))))
    GROUP BY ResolveFollowUpInvalidDataError02.WORK_QUEUE_ITEM_ID, ResolveFollowUpInvalidDataError02.TRANS_SEQ_NUM,
             SecuredOrganization13.SHORT_NAME, FollowUpDataError14.ERROR_MESSAGE,
             ResolveFollowUpInvalidDataError02.CREATED_DATE_TIME, Account16.FINANCED_DATE, Account16.BOOKED_DATE,
             Account16.CUSTOM_ATTRIBUTE_1, Account16.CUSTOM_ATTRIBUTE_2, Account16.CUSTOM_ATTRIBUTE_3, Account16.CUSTOM_ATTRIBUTE_4,
             AccountOwnershipDocSummary17.BUSINESS_UNIT_LONG_NAME, AccountProperty15.MODIFIED_MANUFACTURER_ID,
             Account16.CATEGORY, Account16.SUB_CATEGORY, AccountProperty15.EXPECTED_TITLING_STATE_ABBR,
             RecordedLegalEntity18.FULL_NAME, FollowupEntity19.ENTITY_CODE
    ORDER BY 3, 6
    OPTION (RECOMPILE)

    /*
    select count(*) from work_queue_item where SECURED_ORGANIZATIONORGANIZATION_ID = 20  -- Result: 3,928,819

    CREATE STATISTICS [WORK_QUEUE_ITEM_SECURED_ORGANIZATIONORGANIZATION_ID_Chase]
    ON [dbo].[WORK_QUEUE_ITEM] ([SECURED_ORGANIZATIONORGANIZATION_ID])
    WHERE ([Client_id] = (20))
    */
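    For reference, the cache search I mentioned above is roughly this (just a sketch; the keyword is only an example):

    SELECT TOP (20)
        qs.execution_count,
        qs.total_elapsed_time / qs.execution_count AS avg_elapsed_us,
        st.text AS query_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE N'%WORK_QUEUE_ITEM%'  -- keyword I know appears in the query
    ORDER BY qs.total_elapsed_time DESC;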

  • Implemented before I was hired. Hopefully buried before I retire.

    My key lookups are due to a past propensity to add columns without considering normalization, a slavish reliance upon surrogate PKs in a domain model, and a failure to define unique constraints (lots of SELECT DISTINCT). I also have to grapple with EAV designs, implicit foreign keys, and the removal of 'duplicates' (quoted due to the lack of a unique constraint). My key lookups tell me which natural key was needed, but a redesign would require NHibernate's mappings to be changed, which the developers dread.
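    If the redesign ever happens, the schema side of the fix is the easy part; a sketch, with a hypothetical table and natural key:

    -- Declaring the natural key the lookups keep pointing at means
    -- the 'duplicates' can no longer occur in the first place:
    ALTER TABLE dbo.ACCOUNT_DOCUMENT  -- hypothetical table
    ADD CONSTRAINT UQ_ACCOUNT_DOCUMENT_NK
        UNIQUE (ACCOUNT_ID, DOCUMENT_TYPE);  -- hypothetical natural key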

    I cannot always deduce the intent of a traced update, because NHibernate was, and largely still is, configured to update every column. After I insisted that 'feature' be turned off for troubleshooting purposes, NHibernate's profiling tool uncovered a bug that resulted in 7 additional, unnecessary updates being submitted. I suspect there are others...

    I have found plan guides can help, apart from statements with that parameter-list-from-hell problem. Because its generated SQL is written for the least common denominator, I don't expect optimal performance, and I do expect avoidable SQL complexities. Using sprocs for the 'complicated' queries is done, but on occasion they have to use OPTIMIZE FOR UNKNOWN. The need for that hint is typically due to sparse columns and skewed distributions, caused by column additions leading to violations of second normal form.
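    For what it's worth, the plan guides take roughly this shape (a sketch; the statement and names are invented, and the @stmt text has to match the cached statement exactly):

    -- Attach a hint to a statement we cannot edit, because NHibernate emits it:
    EXEC sp_create_plan_guide
        @name            = N'PG_WorkItem_OptimizeUnknown',
        @stmt            = N'SELECT WORK_ITEM_ID FROM dbo.WORK_ITEM WHERE CLIENT_ID = 42',
        @type            = N'SQL',
        @module_or_batch = NULL,
        @params          = NULL,
        @hints           = N'OPTION (OPTIMIZE FOR UNKNOWN)';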

    Now that we are going towards continuous deployment, I am hoping to put a stake in its heart. Continuous deployment will require better normalization, and I am fairly certain that will require destabilizing NHibernate (carpe diem ;).

  • It's been a while since I used Hibernate.

    It used to declare its parameters at the length of the input, so a value of "DAVE" would be submitted as NVARCHAR(4), whereas a value of "STEVE" would be submitted as NVARCHAR(5). This meant that what should have been exactly one query plan turned into thousands.
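    A minimal illustration of the effect, using a hypothetical table; the two calls are identical except for the declared parameter length, so each gets its own plan cache entry:

    EXEC sp_executesql
        N'SELECT * FROM dbo.PERSON WHERE FIRST_NAME = @p0',
        N'@p0 nvarchar(4)',  -- length inferred from 'DAVE'
        @p0 = N'DAVE';

    EXEC sp_executesql
        N'SELECT * FROM dbo.PERSON WHERE FIRST_NAME = @p0',
        N'@p0 nvarchar(5)',  -- same query text, different type, new plan
        @p0 = N'STEVE';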

    I wrote it up in http://www.sqlservercentral.com/articles/nHibernate/86913/ and I also ran into a collation problem in http://www.sqlservercentral.com/articles/Performance+Tuning/72874/

    The secret with these tools is to know your enemy. Install them in your test lab, learn how to fix the issues, and become an expert. That way, when any foot-dragging occurs in fixing problems, you can point directly to the solution. In some cases the "impossible to fix" problem can be fixed by setting the appropriate Boolean flag in the config file.

  • SoHelpMeCodd and David Poole: yes yes yes, exactly.

    Lots of SELECT DISTINCTs, usually combined with TOP. IN lists with 50+ elements. Parameters declared at string length. I've made a little progress getting them to declare all character parameters as varchar(8000), but I keep asking them to pin it to the column length.

    Fortunately there are not too many wide tables, but one has grown to 130 columns and 30 indexes at 100 million records. The DEV/QA/Release team does virtually all code and schema changes via their release process.

    We have serious data skew between small and large clients, and estimated vs. actual rows are WAY off.

    This morning I actually started putting a few filtered stats into production and ran some of the big reports with OPTION (RECOMPILE). Normally this would have to be run through several QA environments (where the load is not like prod at all), then a change control board that wouldn't understand what I was doing anyway. For any index changes, cost threshold for parallelism, etc., this process takes months.

    They all seem paralyzed as CPU spikes and customers dial in. There is a plan to put everything on SSDs soon. Today I finally decided to implement some small, stealth changes in prod that I can back out of quickly if need be -- fortunately, I can retire anytime now 🙂

    Then there is all the blocking in the ASPState DB, but that's not ORM-related.

  • Indianrock (12/14/2016)


    My immediate goal is to find some small wins that don't require code changes.

    [worst-offender query and filtered statistics script snipped; see the previous post]

    I totally understand finding some small wins that don't require code changes. We still do that. I had to sign off on a ticket to add yet another index to our largest, widest, and most-used table just to prevent another scan. That makes 14 indexes. All of this just because app code changes for that section are incredibly painful.

    I would love to see the home-grown app that converts it to a "runnable" query with the parameter values all filled in.

    Thank you,

    Yell McGuyer
    DataYell.com

  • SoHelpMeCodd, David.Poole, and Indianrock,

    You are all talking like you are looking right at this system here.

    David.Poole, what you are describing with the plan cache is exactly what is happening. I'm going to read your two articles now.

    Thank you,

    Yell McGuyer
    DataYell.com
