November 28, 2017 at 6:25 am
@chef423 ,
Would you post the CREATE TABLE statement (with all the indexes) for you're Customer table, please? I'd like to generate some test data for all of this because, as a very wise man once said, "One good test is worth a thousand expert opinions".
--Jeff Moden
Change is inevitable... Change for the better is not.
November 28, 2017 at 9:49 am
Jeff Moden - Tuesday, November 28, 2017 5:45 AMI hope you never take it negatively. Especially since I always learn from our discussions. 😉A couple of quick questions.
1. What would you use to maintain the separate table?
2. What are the "negative consequences" of having a persisted computed column, in this particular case?
3. Also, the original problem was to find everyone with a given birth-month. Or, at least that is my impression. I'm probably missing some logic but why are you looking only for people that have the same day of the same month (11/27)?Here's the part of the OP's original post that I'm speaking of.
Looking to write a simple query that will return the Birthday's, by month if I choose a Date Range in the month spcified
So this code (below) because I chose January, will ignore the year and day.
select cust_id, cust_fname, cust_lname, cust_dob from birthdays
where cust_active = 1 and cust_dob between '1995-01-26 00:00:00.000' and '1995-01-27 00:00:00.000'So that query (should) will return all birthdays in the month of Jan, no matter the year or date.
The obvious is the @MonthStart code above, I just dont know how to lay it out correctly.Also, no one can execute your tests because they don't have your "Referral" table.
I hope you never take it negatively. Especially since I always learn from our discussions. 😉
Absolutely not! Don't get me wrong, I like being right more than I like being wrong... but... In general, I try to work with the idea that I might be wrong.
That's not "lack of self confidence" thing, nor is it limited to SQL Server... It's been my experience that the biggest blunders are cause by people who, 1) "balls to bones", believe something that simply isn't so. 2) have egos that won't allow them to say, "I don't know", acknowledge any of their own short comings or admit to their mistakes. 3) have simply stopped making an effort to learn.
The "I might be wrong" thing is just me actively doing my best not to be one of those people...
So... While my ego enjoys being right, I know that being proven wrong is what will make be better tomorrow than I was yesterday.
A couple of quick questions.
1. What would you use to maintain the separate table?
2. What are the "negative consequences" of having a persisted computed column, in this particular case?
3. Also, the original problem was to find everyone with a given birth-month. Or, at least that is my impression. I'm probably missing some logic but why are you looking only for people that have the same day of the same month (11/27)?
a1. I'd be inclined to either run the run it out to the current month and use one of the end of month jobs to add to it over time... or... Given it's size, I might even be open to the idea of running 30 years in advance and let it be a surprise to the guy or gal that has by job 30 years from now.
a2. I knew when I wrote it I'd get called on it... Now that I am called on it, I probably should have used different phrasing... Yea, it's a 3NF violation (the transitive dependency thing) but I was actually thinking about the problems that come of using scalar UDFs when I wrote it... Clearly not applicable here. That said, I would still prefer to avoid the NF violation...
a3. I didn't actually go back the start of the thread. I got drawn into the back & forth between you and Scott and that prompted me to knock out a quick & dirty test just to satisfy my own curiosity. It wasn't necessarily intended to be an answer to the OP's actual question or satisfy his specific requirements.
That said, I did use "b_day" as the left key column due to it's higher selectivity... Looking at it today, swapping the the b_month into the left position would have allowed both "month-day" searches and "moth only" searches... Chalk it up to "midnight tunnel vision"...
Looking to write a simple query that will return the Birthday's, by month if I choose a Date Range in the month spcified
So this code (below) because I chose January, will ignore the year and day.
select cust_id, cust_fname, cust_lname, cust_dob from birthdays
where cust_active = 1 and cust_dob between '1995-01-26 00:00:00.000' and '1995-01-27 00:00:00.000'So that query (should) will return .So that query (should) will return all birthdays in the month of Jan, no matter the year or dateall birthdays in the month of Jan, no matter the year or date.
The obvious is the @MonthStart code above, I just dont know how to lay it out correctly.
See "a3" above... My post was more about my interest in the back & forth between you & Scott than the the OP question...
Also, no one can execute your tests because they don't have your "Referral" table.
Pure laziness on my part... When I wrote it, I was just trying to satisfy my own curiosity with the fewest possible key strokes. I had no intention of actually getting into the conversation. The only reason I made the post was because I thought the massive difference in IO compared minute difference in execution time was interesting.
November 28, 2017 at 11:02 am
As I stated in my comments, IF the queries are (almost) always for birthdays in a month / range of days, a clustered index on a base-dated dob should give the best performance. And if upcoming birthdays are being viewed, presumably you'd want to see them in month-day order. So the code is something like:
where monthday_of_birth >= '19000101' and monthday_of_birth < '19000201'
order by monthday_of_birth
No extra indexes required, no sort required (again, given my stated condition that table usage was (almost) always for month/month-day range.
If you need to limit to certain age(s), the year_of_birth would obviously be compared as well.
Yes, there will be index fragmentation, but since:
1) The huge overhead of additional covering indexes is not needed. If not clus by date, the "tipping point" would be easy to reach (one would expect even 1 month, roughly 8.5% of rows, to easily reach it), forcing a table scans unless you created covering index(es).
2) All tables have some fragmentation, even "sacred" identity-clustered tables, sometimes even large amounts of fragmentation. Although, yes, naturally they are less prone to it. Identity clustering definitely has its place, but it should not be default for any table.
3) Since SELECTs are orders-of-magnitude more frequent than INSERTs, sometimes by 1M to 1, the overall I/Os should still be less. Rebuilds are available, and in my case would be online anyway. The I/O to rebuild periodically should be way less than the I/O to maintain extra indexes and sort the results.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 28, 2017 at 11:07 am
ScottPletcher - Tuesday, November 28, 2017 11:02 AMAs I stated in my comments, IF the queries are (almost) always for birthdays in a month / range of days, a clustered index on a base-dated dob should give the best performance. And if upcoming birthdays are being viewed, presumably you'd want to see them in month-day order. So the code is something like:where monthday_of_birth >= '19000101' and monthday_of_birth < '19000201'
order by monthday_of_birthNo extra indexes required, no sort required (again, given my stated condition that table usage was (almost) always for month/month-day range.
If you need to limit to certain age(s), the year_of_birth would obviously be compared as well.
Yes, there will be index fragmentation, but since:
1) The huge overhead of additional covering indexes is not needed. If not clus by date, the "tipping point" would be easy to reach (one would expect even 1 month, roughly 8.5% of rows, to easily reach it), forcing a table scans unless you created covering index(es).
2) All tables have some fragmentation, even "sacred" identity-clustered tables, sometimes even large amounts of fragmentation. Although, yes, naturally they are less prone to it. Identity clustering definitely has its place, but it should not be default for any table.
3) Since SELECTs are orders-of-magnitude more frequent than INSERTs, sometimes by 1M to 1, the overall I/Os should still be less. Rebuilds are available, and in my case would be online anyway. The I/O to rebuild periodically should be way less than the I/O to maintain extra indexes and sort the results.
But why would you query the Customer table (almost) always for birthdays in a month/range of days? Remember, the op stated the actual table is a Customer table.
November 28, 2017 at 11:40 am
Lynn Pettis - Tuesday, November 28, 2017 11:07 AMScottPletcher - Tuesday, November 28, 2017 11:02 AMAs I stated in my comments, IF the queries are (almost) always for birthdays in a month / range of days, a clustered index on a base-dated dob should give the best performance. And if upcoming birthdays are being viewed, presumably you'd want to see them in month-day order. So the code is something like:where monthday_of_birth >= '19000101' and monthday_of_birth < '19000201'
order by monthday_of_birthNo extra indexes required, no sort required (again, given my stated condition that table usage was (almost) always for month/month-day range.
If you need to limit to certain age(s), the year_of_birth would obviously be compared as well.
Yes, there will be index fragmentation, but since:
1) The huge overhead of additional covering indexes is not needed. If not clus by date, the "tipping point" would be easy to reach (one would expect even 1 month, roughly 8.5% of rows, to easily reach it), forcing a table scans unless you created covering index(es).
2) All tables have some fragmentation, even "sacred" identity-clustered tables, sometimes even large amounts of fragmentation. Although, yes, naturally they are less prone to it. Identity clustering definitely has its place, but it should not be default for any table.
3) Since SELECTs are orders-of-magnitude more frequent than INSERTs, sometimes by 1M to 1, the overall I/Os should still be less. Rebuilds are available, and in my case would be online anyway. The I/O to rebuild periodically should be way less than the I/O to maintain extra indexes and sort the results.But why would you query the Customer table (almost) always for birthdays in a month/range of days? Remember, the op stated the actual table is a Customer table.
Birthday party hosting company? Other special event seller that happens more often on bdays? The real point is to cluster a table based on its actual needs and not on a predefined, super-simplistic rule of thumb. There's no problem with a clus key longer than 4 bytes and/or with multiple columns if that's what the table actually needs.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 29, 2017 at 4:38 am
I'd assume that "birthdays next month" is not only kind of queries which might be performed in the company.
I'd expect also "birthdays next week", "birthdays on Chritmas holidays", etc.
On top of it - it would be desirable if returned recordsets would be sorted like "neares birthdays first".
Considering that I'd create "yearless dob" column, which would make it easy to fulfil any possible requirements of those kinds.
_____________
Code for TallyGenerator
November 29, 2017 at 5:49 am
All the speculation is good... I was hoping the OP would return with the definition of the Customer table so that we can demonstrate the performance of our speculations but seems to have flown the coop on this one.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 31 through 38 (of 38 total)
You must be logged in to reply to this topic. Login to reply