The Numbers Table

  • Anyone can take any example and change it so it performance is important.

    You're absolutely right! Why do you suppose that it happens to be so easy?

    --
    Adam Machanic
    whoisactive

  • Adam Machanic (11/25/2008)


    Anyone can take any example and change it so it performance is important.

    You're absolutely right! Why do you suppose that it happens to be so easy?

    Is it because anality comes naturally to some people? 😉

  • jacroberts (11/25/2008)


    Adam Machanic (11/25/2008)


    Anyone can take any example and change it so it performance is important.

    You're absolutely right! Why do you suppose that it happens to be so easy?

    Is it because anality comes naturally to some people? 😉

    Very funy indeed, but he does have a point ofcourse. I can think of more then a few temporary implementations that never vanished and instead became entrenched and copied over and over when adding stuff to an existing system.

    Fact is that even small things add up. Besides it is not the direct 'bad' performace or even that of the near future.

    For fun, just imagine people that are new to developing that see solutions that are sub-optimal (and problably without warning comments too) where it could easily have been avoided without much extra time. It is natural for this group (a very large one in fact) to adopt to what they see being used around them (hell most of my knowledge I gathered that way).

    Then the copying starts....and no manager in the world is going to grand time to undo it all unless there is an immediate operational problem. Not long after there will be a problem, and it might not even be in the place you expect. Somewhere else in new code that however well written will just not doesn't work as fast as it could without bad code stamped all over the place.

    Starting to see a pattern here?

    Personaly I won't go anal over non-scheduled run-once stuff. But as soon as it is permanently hooked in a system I do want to see some time spend on it to make sure its not obese, be it in code or execution time!

  • peter (11/25/2008)


    jacroberts (11/25/2008)


    Adam Machanic (11/25/2008)


    Anyone can take any example and change it so it performance is important.

    You're absolutely right! Why do you suppose that it happens to be so easy?

    Is it because anality comes naturally to some people? 😉

    >>Starting to see a pattern here?

    Yes, another one's turned up.

  • jacroberts (11/25/2008)


    peter (11/25/2008)


    jacroberts (11/25/2008)


    Adam Machanic (11/25/2008)


    Anyone can take any example and change it so it performance is important.

    You're absolutely right! Why do you suppose that it happens to be so easy?

    Is it because anality comes naturally to some people? 😉

    >>Starting to see a pattern here?

    Yes, another one's turned up.

    Is it me or do you seem to be in the minority here?

  • Is it me or do you seem to be in the minority here?

    Minority here on SSC where people are a bit more educated about these things, but certainly not in the world at large. I make my living fixing performance messes for my customers, and if this kind of attitude were not common I wouldn't be able to pay my mortgage... So please, keep up the good work and spread the faith! Why worry about performance when you can hire me instead? 😀

    --
    Adam Machanic
    whoisactive

  • Minority here on SSC where people are a bit more educated about these things, but certainly not in the world at large. I make my living fixing performance messes for my customers, and if this kind of attitude were not common I wouldn't be able to pay my mortgage... So please, keep up the good work and spread the faith! Why worry about performance when you can hire me instead? 😀

    Perhaps the best statement I've seen yet!

  • Adam Machanic (11/25/2008)


    Is it me or do you seem to be in the minority here?

    Minority here on SSC where people are a bit more educated about these things, but certainly not in the world at large. I make my living fixing performance messes for my customers, and if this kind of attitude were not common I wouldn't be able to pay my mortgage... So please, keep up the good work and spread the faith! Why worry about performance when you can hire me instead? 😀

    That's how I make my living too. 🙂

  • thisisfutile (11/25/2008)


    Thanks for the professional approach in your response.

    You just made my day. Thanks for taking it the right way and thanks for the nice compliment!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Could you not just us the PK of a large table that already exists int the database instead, seems to work for me in 99% of cases.

  • Could you not just us the PK of a large table that already exists int the database instead, seems to work for me in 99% of cases.

    That "sort of" works "somewhat", but it is bad, read very bad practice!

    First, you still would need a "row_number() over ( ... )" over function, as your primary key even when it is an identity column will contain holes!

    Yes, even when you never delete records...as every failed insert will still increase the identity value. This is even true when the insert is part of a failing trainsaction!!!

    Second, a perfromace issue with that aproach is that the number of records in a single page will be rather low, thus more page reads are required to cover the same number range compared to a dedicated numbers table. The database engine after all also needs to read the other fields stored in your seed table that are in fact of absolutely no use to generate numbers!!!

    I also seen solutions (and I am still using one, but looking for a way out) that use 'master.sys.All_Columns' as a seed table and generate numbers using the row_number() function. While 'master.sys.All_Columns' works fast (is is practically a in-memory table, no physical reads from disk), it cannot be used in every environment.

    Books Online says about this:

    In SQL Server 2005 and later versions, the visibility of the metadata in catalog views is limited to securables that a user either owns or on which the user has been granted some permission. For more information, see Metadata Visibility Configuration.

    Using existing tabes is also not generic as for each database you need a new seed table to "abuse". Abuse it is because you are using a table in a way that others cannot even guess in the slightest by looking at that table. They instead see customers or cars or whatever table you have been using. If someone decides to clear all cars due to some requirement, your function all of a sudden stops working or chnages specifications and things go tits up.

    What is even worse is coding this in each and every query that needs numbers, making changes later, will become a nightmare. Use a inline table valued function that does the reading with a clearly defined interface. If you find you need to change method, change the function instead of all the different queries you use numbers in.

    If you want numbers, use a numbers table or generate the numbers out of thin air in the function.This way you can be sure that it is clear what the table is being used for and the queries become easyer to comprehend as well. The only way this can break is when people delete the table as they don't understand numbers....but that is not your fault unless it is not a DBA that did it and you forgot to set permissions properly!

    I hope this post is clear, and I bet it contains even some things some veterans here were unaware of before 🙂

  • It seems to me we all need a little focus here. Its not bad practice.

    I have a database with a table that already contains a PK with enough rows for my need. I just use this, I dont have to faff around with creating a table.

    Its just very easy and very simple.

    If performance is an issue then you use a table of numbers like you suggest, I think the whole point is use what is suitable for the query and in 99% of the cases this works.

    Also, a DBA should know what tables will have a non-interrupted PK and use those instead of others that may have had deleted records.

  • Richard Hill (11/26/2008)


    It seems to me we all need a little focus here. Its not bad practice.

    I have a database with a table that already contains a PK with enough rows for my need. I just use this, I dont have to faff around with creating a table.

    Its just very easy and very simple.

    If performance is an issue then you use a table of numbers like you suggest, I think the whole point is use what is suitable for the query and in 99% of the cases this works.

    You are taking this the wrong way man!

    I provided clear agruments as to why it is a bad idea.

    1. You have no control over non-static tables with idenity keys (even if you think you do)!

    2. Other people will be unable to guess the table is used for generating numbers just by looking at the table, unless it is a dedicated numbers table!

    3. It is not generic, meaning it has to be different on every database and with a different maximum number range. The range itself is not even documented in any obvious way, nor guaranteed for that matter!

    4. The performance will be lower then with a dedicated numbers table. The reason is the extra fields stored per record that have to be read from disk.

    5. The usage of your 'alternative numbers table' is spread over many queries, if something changes, you need to modify all these queries. Some of these queries can be part of compiled application code, so no easy feat! Use an inline table-valued user defined function (UDF) instead to provide the numbers to your queries and any application that needs it.

    The fact that it works for you does not make it a good idea to spread that practice. I provided vailid points why not to do it that way!

  • Also, a DBA should know what tables will have a non-interrupted PK and use those instead of others that may have had deleted records.

    Besides the point that you cannot be certain of this even as a DBA (see the bold tekst in my original reply)...from a technical angle you must also be very careful to make any assumtions about identity keys!

    Idenitiy keys are keys without intrinsic meaning, other then that they are inserted with ever increasing numbers. Even this later assumption is dangerous for an outsider to make, as identity rules can be reseeded. Yet another thing that can happen which breaks this assumption is the occurance of identity inserts!

    Far fetched....not really in an environment where you are not in full control. Which means unless you have a personal database or one where no other developer or DBA can work on, you cannot make strong assumptions and neither can they unless you both follow explicit agreed upon rules.

    Over time, it will be hard to guarantee the foundaton your code is depending on, and when it breaks it wont be immediately clear where the problem is. With a numbers table at least you can prevent mutation/deletion by non-dba's after you populated the table. You cannot do that with most other tables without breaking stuff.

    Think of it what you will, but my advice is to use a dedicated numbers table and not to create fuzzy dependancies that **might** cause hard to track issues later. A malfunction numbers table can do a lot of damage to your data and when that happends, it might be hidden for quite a while before someone notices. It's a seed for data-coruption spanning large timescales, the worst thing that can happen to any important data!

  • I also seen solutions (and I am still using one, but looking for a way out) that use 'master.sys.All_Columns' as a seed table and generate numbers using the row_number()

    When I need something quick and dirty I'll use master.dbo.spt_values, which exists all the way back to SQL Server 7.0. It has a column called Number which -- amazingly enough -- contains numbers. Filter on type='P' and you'll get numbers from 0-255 in SQL Server 7.0 and 2000, and from 0-2047 in SQL Server 2005 and 2008.

    select number

    from master.dbo.spt_values

    where type = 'P'

    --
    Adam Machanic
    whoisactive

Viewing 15 posts - 61 through 75 (of 106 total)

You must be logged in to reply to this topic. Login to reply