Relational database - rarely more efficient to store snippets of non relational data against a record - thoughts?

  • Hi

    Is it fair to say that as much as you would want a relational database there are rare occasions when it is more efficient (for performance reasons) e.g. to hold a set of integers as a string in a field for splitting in managed app code rather than holding it relationally and either returning multiple records due to a different integer per record OR having to undertake a secondary lookup to the db server.

    What is critical is performance and ideally avoiding a secondary db call.

    I am always thinking relational, however, in this situation could I please ask for your thoughts as it feels against the grain and normal thinking.

    Thanks

    Mark

  • We're talking pretty hypothetical here, but sure, there are situations where normalized storage isn't the most efficient way to manage information. That's actually part of why all the NoSQL and Document databases have come about. Structured storage makes sense most of the time, but not all the time.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I'm not sure I understand what is meant by the application having to make secondary database calls. For example, if you have a SQL Server database containing invoices with multiple line items, then you can query all data elements related to that invoice in one execution, either as a single joined tablular style resultset or a stored proceure call that returns multiple resultsets, one for the invoice header and one containing line items.

    I've seen LINQ or ORM style applications do odd things like loop and make a separate query for each line item, but that has nothing to do with a limitation of SQL Server or relational databases in general; it's a limitation of how ORM tools function or the lack of expertise of the developer who programs them.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks Eric for response.

    Your right, to avoid a secondary call it could be wrapped up in one call (store procedure etc) but in the example you would have say the item name (as a trans in the invoice) duplicated for each integer on the resultset.More records returned to the calling apg the than previously, whereas holding them as a string as an extra field means same number of records returned.

  • I can give you a real world example of where storing a delimited string made the most sense. This is something I built just a few months ago.

    We have a DotNetNuke (DNN) website with lots of custom modules. These custom modules are super robust and let you add your own properties which are exposed as a collection. This is incredibly handy for things like allowing site admins to customize the display and various other things that are "module wide" settings. These settings as you can imagine are designed to hold a single value and behind the scenes the values are stored as nvarchar. So far so good right?

    Well we have a module that displays documents. These documents are stored in another database and each document has a given Authorization level and document type. The way this module works is that it displays a filterable and sortable grid of all documents. However, the settings in the module indicate which authorization levels and which types are allowed to be shown. I could have created my own tables for authorization levels and doc types with a key to the DNN generated instanceID of the installed module. That would have worked and would have been fairly easy, however it would have been a bit overkill and circumvented the custom settings that are available. It also would mean another query each time the module is loaded to get the values from both tables. Instead, what I did was a comma delimited list of both and put each of them into their own custom setting. Using DelimitedSplit8K this is super fast because I know the lists are not every going to be greater than 20-30 (they are closer to 5-8 today).

    So in this example I would argue that yes, in this edge case it is actually more efficient to store a delimited list of values (violating 1NF) than it would be to store these values each in their own tables.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • SQL Server stored procedure calls can accomodate multi-valued input paramters, if that's what you're talking about here. By this, I mean you can pass in a string of multiple invoice numbers, and the procedure returns a resultset containing data for that set of invoices. This input parameter could be in the form of a delimited or XML string.

    If a relational database otherwise seems to be a natural for for your organization's needs, then I wouldn't switch to a NoSQL alternative just to work around this type of case usage requirement.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (7/27/2015)


    SQL Server stored procedure calls can accomodate multi-valued input paramters, if that's what you're talking about here. By this, I mean you can pass in a string of multiple invoice numbers, and the procedure returns a resultset containing data for that set of invoices. This input parameter could be in the form of a delimited or XML string.

    If a relational database otherwise seems to be a natural for for your organization's needs, then I wouldn't switch to a NoSQL alternative just to work around this type of case usage requirement.

    Are you replying to my post or the OP?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I was starting to think that MARS was a good idea but I understand that in SQL 2014 MARS is not allowed on in memory tables so I need to check whether if the table has been implemented on disk or in memory. If on disk I am assuming MARS will still be ok in 2014. We use SQL2008R2 so not so familiar with 2014.

  • Sean Lange (7/27/2015)


    Eric M Russell (7/27/2015)


    SQL Server stored procedure calls can accomodate multi-valued input paramters, if that's what you're talking about here. By this, I mean you can pass in a string of multiple invoice numbers, and the procedure returns a resultset containing data for that set of invoices. This input parameter could be in the form of a delimited or XML string.

    If a relational database otherwise seems to be a natural for for your organization's needs, then I wouldn't switch to a NoSQL alternative just to work around this type of case usage requirement.

    Are you replying to my post or the OP?

    I'm replying to the OP; it's unclear in what context he needs to work with delimited keys. It sounds perhaps like a multi-part question, and at one point he mentioned concern about multiple database calls being required to fetch data for multiple entities.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Hughesy (7/27/2015)


    I was starting to think that MARS was a good idea but I understand that in SQL 2014 MARS is not allowed on in memory tables so I need to check whether if the table has been implemented on disk or in memory. If on disk I am assuming MARS will still be ok in 2014. We use SQL2008R2 so not so familiar with 2014.

    Post sample DDL for the table(s) in question and also more detail about how the data is to be retreived.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (7/27/2015)


    Sean Lange (7/27/2015)


    Eric M Russell (7/27/2015)


    SQL Server stored procedure calls can accomodate multi-valued input paramters, if that's what you're talking about here. By this, I mean you can pass in a string of multiple invoice numbers, and the procedure returns a resultset containing data for that set of invoices. This input parameter could be in the form of a delimited or XML string.

    If a relational database otherwise seems to be a natural for for your organization's needs, then I wouldn't switch to a NoSQL alternative just to work around this type of case usage requirement.

    Are you replying to my post or the OP?

    I'm replying to the OP; it's unclear in what context he needs to work with delimited keys. It sounds perhaps like a multi-part question, and at one point he mentioned concern about multiple database calls being required to fetch data for multiple entities.

    Ahh gotcha. I was kind of thinking the two were intertwined but perhaps not.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • @eric

    I'll try and give more context.

    Let's say current design allows for one or more items (db records and probs no more than say 20) to be printed via an app. Then a new requirement comes along to say if an items sub-item is shared by another item in the original list of items then some extra logic is required before printing.

    Note: an item would always have at least one sub item but probably no more than say 10/15.

    Hope that helps explain it a little more.

  • @eric

    Simplified it for this but basically:-

    ItemId int

    ItemDescription varchar(255)

    Currently, one or more of these items (up to 20 ish) would be loaded into memory in the app.

    The new requirement is then to use an additional table as such:-

    SubItemId int

    SubItemDescription varchar(255)

    where each Item would have one or more SubItems (no more than say 10/15).

    The consumer needs to check to see if in the original selected list of Items there are shared sub-items. If there is then some extra logic needs to occur.

    The debate is whether to use the additional table (SubItem) or someone has suggested for best performance to add an additional field to the Item table called SubItemIds with a delimited list of the subItemId's in a string. The DDL for the Item table would in this option would be:-

    ItemId int

    ItemDescription varchar(255)

    SubItemIds varchar(max) /*pipe separated list if ints*/

    with no need for the subitem table.

    The app in this situation would split the id's and do the same logic check as discussed earlier.

    hope that helps.

  • Based on what you've described above, I see no reason not to simply create a SubItem table.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • How would you retrieve the content? Stored proc with Mars or a secondary db call for the sub items.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply