Can anyone explain the design decision here?

Question

Can anyone explain the design decision here?

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 1

This can be simplified - without having to use fnTally to:

   With t(n)
     As (
 Select t.n 
   From (
 Values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) As t(n)
        )
      , dates ([Year])
     As (
 Select Top (year(getdate()) - @StartYear + 1)
        @StartYear + checksum(row_number() over(Order By @@spid)) - 1
   From t t1, t t2, t t3
        )
 Select StartDate = datetimefromparts(d.[Year], 1, 1, 0, 0, 0, 0)
      , d.[Year]
   From dates           d;

This has the same issue that @JeffModen commented on - if we pass in 2022 then we get no rows. Adding a second parameter for the end year solves that problem (as long as @EndYear is greater than or equal to @StartYear):

   With t(n)
     As (
 Select t.n 
   From (
 Values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) As t(n)
        )
      , dates ([Year])
     As (
 Select Top (@EndYear - @StartYear + 1)
        @StartYear + checksum(row_number() over(Order By @@spid)) - 1
   From t t1, t t2, t t3
        )
 Select StartDate = datetimefromparts(d.[Year], 1, 1, 0, 0, 0, 0)
      , d.[Year]
   From dates           d;

This can be handled in other ways - depending on how the function is actually used.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

ScottPletcher SSC Guru Points: 101223 More actions · Answer 2

SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
GO
ALTER FUNCTION [DP].[GetYears] ( @StartYear AS INT )
RETURNS TABLE
AS
RETURN
WITH
tally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
),
tally1000 AS (
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 AS number 
    FROM tally10 c1 CROSS JOIN tally10 c2 CROSS JOIN tally10 c3
)
SELECT DATEADD(YEAR, t.number, StartYear) AS StartDate, 
    YEAR(DATEADD(YEAR, t.number, StartYear)) AS Year
FROM (
    SELECT CAST(DATEADD(YEAR, @StartYear - 1900, 0) AS date) AS StartYear
) AS params
INNER JOIN tally1000 t ON t.number BETWEEN 0 AND DATEDIFF(YEAR, StartYear, GETDATE())
/*end of func*/
GO

This reply was modified 4 years, 10 months ago by ScottPletcher. Reason: Added SETs before function code

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 3

SwePeso wrote:

The point is not using exactly that table, it could be another table. The point was to demonstrate the logic in the original query, in a easier fashion.

Totally understood but... Unless you explain that every time you use it, there will be people that use master.dbo.spt_values for all their code simply because they don't know better.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ScottPletcher SSC Guru Points: 101223 More actions · Answer 4

Nvm, OOPS, I posted this same code earlier.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 5

Apparently, the Op left the building.

With the idea of doing some bullet-proofing to return no rows for "illegal" years instead of producing an error (induced "silent failure) to expand upon the original expected output and to instill the idea of documentation (it was a lack of documentation that prompted the original question on this thread), here's how I handle rewrites of such code... and, yeah, if I were in the OP's position, I'd be doing a rewrite if, for nothing else, to clearly document what the function does. And, yeah, this is the same way I write production code in real life. The only thing missing is the ticket number and description for the ticket I would have had to open to cover my time.

Details are in the code... where they should be.

 CREATE OR ALTER FUNCTION dbo.getYears(@Year INT)
/**********************************************************************************************************************
 Purpose:
 Given a 4 digit year (as an INT) that's less than the current year and between 1753 and 9999 inclusive, return a 
 column of start dates (first of the year) and 4 digit years for each year from the given 4 digit year through the 
 current year.

 For years > the current year or outside the 1753 thru 9999 years, the function will return no rows.
-----------------------------------------------------------------------------------------------------------------------
 Returns:
 StartDate (DATETIME) Note... All times are at midnight. For example: 2017-01-01 00:00:00.000
 Year      (INT)
-----------------------------------------------------------------------------------------------------------------------
 Usage Examples:
--===== Return all years from hardcoded 2017 thru the current year.
 SELECT StartDate, [Year]
   FROM dbo.getYears(2017)
;
--===== Return all years from year defined by a variable thru the current year.
DECLARE @SomeYear INT = 2017 --or SMALLINT
;
 SELECT StartDate, [Year]
   FROM dbo.getYears(@SomeYear)
;
--===== Return all years for each row of another table (Relational Multiplication).
     -- You could also substitute the hardcoded 2017 with a variable.
 SELECT obj.object_id, yr.StartDate, yr.[Year]
   FROM dbo.getYears(2017) yr
  CROSS JOIN sys.objects obj --This represents "another table"
  ORDER BY obj.object_id, yr.[Year]
;
-----------------------------------------------------------------------------------------------------------------------
 Programmer Notes:
 1. There are no dependencies in this function.
 2. This is a high performance iTVF (Inline Table Valued Function) not to be confused with slower performaning Scalar
    or mTVF (Multi-Statement Valued Function).  Show this note to the DBA if they object to any function usage.
-----------------------------------------------------------------------------------------------------------------------
 Revision History:
 Rev 00 - Date Unknown- Author Unknown
                      - Ref: https://www.sqlservercentral.com/forums/topic/can-anyone-explain-the-design-decision-here
 Rev 01 - 18 Feb 2021 - Jeff Moden
                      - Total rewrite for simplicity, add constraints to prevent failures, and document.
                      - Input/Output remains compatible with the original including zero row output under certain 
                        conditions rather than producing errors.
                      - References (for the Tally(N) cte): 
                        1. https://www.sqlservercentral.com/scripts/create-a-tally-function-fntally
                        2. https://www.itprotoday.com/sql-server/virtual-auxiliary-table-numbers
**********************************************************************************************************************/
  RETURNS TABLE WITH SCHEMABINDING AS 
   RETURN WITH
    E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N)) --Creates up to 10 rows
,Tally(N) AS (--==== "Pseudo Cursor" to create numbers from 0 to a max of 9999.
              SELECT TOP (IIF(YEAR(GETDATE()) >= @Year AND @Year BETWEEN 1753 AND 9999, YEAR(GETDATE())-@Year+1, 0))
                     ROW_NUMBER() OVER (ORDER BY @@SPID)-1
                FROM E1 a, E1 b ,E1 c, E1 d) --10^4 = 10000 max. rows
   SELECT  StartDate = DATEADD(yy,@Year-1900+t.N,0)
          ,[Year]    = YEAR(DATEADD(yy,@Year-1900+t.N,0))
     FROM Tally t
;
GO

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Kristen-173977 SSCrazy Eights Points: 8687 More actions · Answer 6

I have an actual TALLY table in my database for Numbers ... is that bad? should I be using a Function?

Obviously there is a limit on the number of values in my table, and I could inadvertently code something that needed more (I can cross join it, but on order to do that I would have to be aware that my usage needed more numbers than were available)

edwardwill SSCertifiable Points: 5969 More actions · Answer 7

Thank you so much everyone for your interesting, thoughtful and useful answers and contributions. I didn't 'leave the building' but I'm in the UK and after work stops, I stop working (I'm 64 and presenteeism no longer holds much appeal). I agree that the function I presented was poorly-documented, potentially buggy and over-engineered, but I still don't understand the syntax! However, I am not asking for anyone to do my homework for me.

Once again, many thanks.

Edward

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 8

Kristen-173977 wrote:

I have an actual TALLY table in my database for Numbers ... is that bad? should I be using a Function?
Obviously there is a limit on the number of values in my table, and I could inadvertently code something that needed more (I can cross join it, but on order to do that I would have to be aware that my usage needed more numbers than were available)

Use of a properly indexed physical Tally Table will frequently cause code that uses it to run faster and with less CPU than using cCTEs (Itzik Ben-Gans method, which I refer to as "Cascading CTEs) because it's small enough to live in memory without getting booted out of memory, especially if it sees frequent use.

The use of a physical Tally Table DOES cause some DBAs a concern because, especially with frequent use, causes a ton of Logical Reads (from memory). The reason why they concern themselves with that is that they (including me) will frequently use the number of Logical Reads as an indication of code that can probably be made better. The cCTE method produces zero reads on it's own and is almost as fast and can be as fast as using a physical Tally Table, and so a lot of people will use the cCTE method buried in a general fnTally function or as inline code in other functions.

I have both in place so I have the choice depending on the situation at hand and will also build a cCTE into a function just so that it doesn't have an external dependency. "It Depends".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 9

edwardwill wrote:

Thank you so much everyone for your interesting, thoughtful and useful answers and contributions. I didn't 'leave the building' but I'm in the UK and after work stops, I stop working (I'm 64 and presenteeism no longer holds much appeal). I agree that the function I presented was poorly-documented, potentially buggy and over-engineered, but I still don't understand the syntax! However, I am not asking for anyone to do my homework for me.
Once again, many thanks. Edward

It's already written. Look at the code I posted with all the comments... there are a couple of references under "Rev 01" in the flowerbox header. The second reference was written by the guy that invented the technique. If that doesn't do it for you, let me know... I have an introduction in one of my articles that does a hand-over-hand explanation the I can point you to.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

edwardwill SSCertifiable Points: 5969 More actions · Answer 10

WITH
    E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N))

I don't understand this syntax (please note that I'm not a DBA - my day job is writing C#). What is the E0(N) at the end? I understand (I think) that E1(N) is declaring a CTE called E1 which contains a single column N. I can more or less figure out the rest of the function. I think.

This reply was modified 4 years, 10 months ago by edwardwill.

ScottPletcher SSC Guru Points: 101223 More actions · Answer 11

SQL works based on tables. By definition, a SQL Server table must have a name.

To look at my own version of this:

SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)

The VALUES clause is used to build a table (or a table structure). The "[AS] table_name" provides a name for that inline table. That's it. The AS is optional, so it can be written:

SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) numbers(number)

In either case, the "numbers" is a table name. That's all it is, and it must be there because SQL requires a name.

The "(number)" provides a column name. For an inline table, the columns must have names too. The SELECT could assign the names, or you can assign the names at the end, with the table name.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 12

@JeffModen - your solution and my solution are very close to the same. If we pass in 2022 as the starting year - both version return an empty set. If we pass in 2021 both version return a single row for 2021...

In my version, I include a checksum to wrap the row_number function - this is to prevent implicit conversions from int to bigint (the return value from row_number is a bigint) that could cause issues with the cardinality estimator.

As far as I can tell - the only portion of your check in the TOP statement that is needed is the check for years between 1753 and 9999. In my version we can address that the same way - or we can change to this and return a date data type: datefromparts(d.[Year], 1, 1).

My version calculates the year in the 'tally' portion so it doesn't have to be re-calculated on output. In your version, you use DATEADD twice and an additional YEAR function.

Any thoughts on these differences?

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

edwardwill SSCertifiable Points: 5969 More actions · Answer 13

Ah thanks. That explains it all. I'm going to see if I can't leverage this to do some CSV parsing (I have a task to deal with a bunch of CSV files and write the results to various tables). It's been a good week for me, learning-wise; thank you all again.

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 14

edwardwill wrote:

WITH
    E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N))
I don't understand this syntax (please note that I'm not a DBA - my day job is writing C#). What is the E0(N) at the end? I understand (I think) that E1(N) is declaring a CTE called E1 which contains a single column N. I can more or less figure out the rest of the function. I think.

E0 is a table "alias" for the derived table created by the VALUE clauses (Table Value Constructor is the correct name) and the (N) is the column name for the column created by the VALUES clause. "E0" is not a part of SQL Server... it's just my naming convention for this type of thing. Same goes for "E1".

If we extract the SELECT and use it as a single query, you can see what it does.

 SELECT *
   FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N)
;

Here are the results. As you can see, all it does is return ten 1's and it doesn't matter if it returns 1's, Nulls, or whatever. The important part is that it returns ten rows as a "Row Source". The E0 comes from the fact that 10^0 (or 10E0) is the ones-position in a number. E1 is 10^1 (or 10E1) and is the tens-position in a number.

Again, the number isn't important. We're using the rows created as a "Pseudo Cursor" to drive the ROW_NUMBER() function instead of a "loop". We are simple generating rows to power a "set based loop" known as a "Pseudo Cursor".

A much more detailed explanation can be found at the following article in the "Generating Rows" section of the article. Here's the link to that article.

https://www.sqlservercentral.com/articles/generating-test-data-part-1-generating-random-integers-and-floats-1

The source of rows can be a table that we don't use any content from or it can be a string of ten ones Cross Joined multiple times (also known as "Relational Multiplication") to create a huge "presence of rows" or "Pseudo Cursor" to use instead of a loop. It works because, behind every SELECT, is a very high speed machine language "loop".

Heh... make sure you read the two articles reference in the REV 01 comment in the code I posted. It will change your life when it comes to doing things in SQL Server. The article above can also have a major impact. Some one once asked me what the most important thing that has has led to my success in SQL Server is. My response was "The ability to very quickly make large amounts of test data for virtually any situation".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 15

edwardwill wrote:

Ah thanks. That explains it all. I'm going to see if I can't leverage this to do some CSV parsing (I have a task to deal with a bunch of CSV files and write the results to various tables). It's been a good week for me, learning-wise; thank you all again.

If you want to split strings - then you can use either STRING_SPLIT or a custom function that can be found on this site. Here is the link to the article: https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2

I linked to the update by Eirikur - you may want to look at the original article: https://www.sqlservercentral.com/articles/tally-oh-an-improved-sql-8k-%e2%80%9ccsv-splitter%e2%80%9d-function

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs