Timing of GETDATE as a column default

Question

Post reply

Timing of GETDATE as a column default

jwminer

Old Hand

Points: 394
More actions
May 21, 2021 at 12:48 am

#3886117
Obscure fact: when your table column has GETDATE (or SYSUTCDATETIME) as a default value, the GETDATE is actually calculated when your Insert query BEGINS executing, not when the row is actually inserted into the table.

How have you handled this in practice when you want the datetime/datetime2 to more closely reflect when the data is inserted into the table? Sometimes the query takes a few seconds to run on queries where you really care about exactly when it was inserted into the table.
- This topic was modified 2 years, 11 months ago by jwminer.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Site Owners SSC Guru Points: 80320 More actions · Answer 1

Thanks for posting your issue and hopefully someone will answer soon.

This is an automated bump to increase visibility of your question.

Chittam SSC Veteran Points: 247 More actions · Answer 2

Hi,

You can use the undocumented function sys.fn_dblog() to identify the insert time from the transaction log.

Please refer to the following link for more details.

https://dba.stackexchange.com/questions/189485/how-can-i-find-time-of-an-insertion#:~:text=2%20Answers&text=You%20can%20find%20INSERT%20time,name)%20in%20the%20result%20set.

Thanks.

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 3

Chittam wrote:

Hi,
You can use the undocumented function sys.fn_dblog() to identify the insert time from the transaction log.
Please refer to the following link for more details.
https://dba.stackexchange.com/questions/189485/how-can-i-find-time-of-an-insertion#:~:text=2%20Answers&text=You%20can%20find%20INSERT%20time,name)%20in%20the%20result%20set.
Thanks.

It'll be interesting to try to match rows up to log file entries using that method.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeffrey Williams SSC Guru Points: 90256 More actions · Answer 4

Since an INSERT is a single transaction - SQL cannot use a datetime other than when that transaction started. It doesn't matter how long the process actually takes - it must be treated as a single transaction, as if it all occurred at the exact same time - even if the insert were to take hours to be fully committed and hardened in the database.

If you really need to know when each row was inserted - then you would have to insert 1 row at a time so that each row's insert is a single transaction.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 5

Gosh... I agree that it's been quite a while but it seems to me that SQL Server 6.5 and 7 worked just fine in this area and I've not had cause to need such functionality since then, but I wonder why they felt compelled to change it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeffrey Williams SSC Guru Points: 90256 More actions · Answer 6

Jeff Moden wrote:

Gosh... I agree that it's been quite a while but it seems to me that SQL Server 6.5 and 7 worked just fine in this area and I've not had cause to need such functionality since then, but I wonder why they felt compelled to change it.

If SQL Server 6.5 and 7 worked that way - then it was incorrect. An insert or update statement is implemented as a single transaction that occurs at a specific time and all rows affected are affected as if the transaction occurred all at once, even if the actual insert or update were to take several hours to complete.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Eddie Wuerch SSChampion Points: 12532 More actions · Answer 7

GETDATE(), GETUTCDATE(), etc., are niladic nondeterministic functions.

A niladic function has no parameters. A nondeterministic function returns different values with the same inputs (the exact results of a GETDATE() call cannot be predicted).

In SQL Server, all niladic nondeterministic functions are fired once per resultset and the value is reused, with only one exception: NEWID(). NEWID() fires for each row in a resultset.

Eddie Wuerch
MCM: SQL

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 8

Jeffrey Williams wrote:

Jeff Moden wrote:
Gosh... I agree that it's been quite a while but it seems to me that SQL Server 6.5 and 7 worked just fine in this area and I've not had cause to need such functionality since then, but I wonder why they felt compelled to change it.
If SQL Server 6.5 and 7 worked that way - then it was incorrect. An insert or update statement is implemented as a single transaction that occurs at a specific time and all rows affected are affected as if the transaction occurred all at once, even if the actual insert or update were to take several hours to complete.

Opinion noted. I strongly disagree but noted. 😀 Like I said, it's been a long time since I've needed such a thing but it would be nice to have the choice. And, don't say it doesn't need to be done. You may not know of a reason to do it but someone else (like the OP) may have a perfectly valid reason to do it just as surely as you'd need RAND() to produce a different value for each row upon insert or, even better, you have a UDF (inline or otherwise) use the return of NEWID() or the return of the output of a stored procedure. While it may not usually be something that people would need to do and would sometimes have to be very careful about how they used such a thing, it is nice to have such options (all of the above, which I've used a lot for more than one reason).

It's like saying that overloading a variable in a SELECT statement is a bad thing... when I briefly used Oracle (only 3 years), that pithy little restriction caused me a huge amount of development pain. I did it all the time in SQL Server and still do because SQL Server didn't have the ability to do string aggregation. Even the FOR XML PATH hack had some major pitfalls that were easily avoided by variable overloading.

When you said "If SQL Server 6.5 and 7 worked that way - then it was incorrect.", the only thing that was incorrect there was you thinking so. You have to understand that, as always and mostly forever, "It Depends". 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 9

Eddie Wuerch wrote:

GETDATE(), GETUTCDATE(), etc., are niladic nondeterministic functions.
A niladic function has no parameters. A nondeterministic function returns different values with the same inputs (the exact results of a GETDATE() call cannot be predicted).
In SQL Server, all niladic nondeterministic functions are fired once per resultset and the value is reused, with only one exception: NEWID(). NEWID() fires for each row in a resultset.

Eddie Wuerch wrote:

GETDATE(), GETUTCDATE(), etc., are niladic nondeterministic functions.
A niladic function has no parameters. A nondeterministic function returns different values with the same inputs (the exact results of a GETDATE() call cannot be predicted).
In SQL Server, all niladic nondeterministic functions are fired once per resultset and the value is reused, with only one exception: NEWID(). NEWID() fires for each row in a resultset.

That's the way it is now. Like I said, I could be incorrect but my memories were that GETDATE() used to return different values in 6.5 and 7.0. I also recall that enough people complained about it that they did fix it in 2000... again... IIRC. In fact, I also recall being disappointed that they did such a thing.

The bad parts of that are that I don't have those old versions to prove it one way or the other any more and that even if I could prove it, it no longer matters because it currently doesn't work that way. GETDATE() is now calculated once per query instead of once per row, just like RAND(), which is the dumbest implementation of all, IMHO. It's just incredible to me that you have to provide a random seed to make the RAND() function produce a different value for each row of a single query. What the hell were they thinking when they made that one??? Why couldn't they make it like NEWID()?

Ah... what do you expect from a company that incorrectly calculated that the year 1900 was a leap year in EXCEL? 😀 I'm thinking that "MS" is actually an acronym for "More_workaroundS" required. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve Jones - SSC Editor SSC Guru Points: 734502 More actions · Answer 10

This is one of those tough design decisions. I might say that until the transaction completes, the row isn't inserted. It shouldn't be visible, though I know you can get around that.

I agree that not calculating the time differently for different rows seems weird. I'd hope that this would be updated when there is a significant insert time, such as an ETL process. If I insert 1mm rows, they clearly weren't all inserted at once, though arguably, this helps me know this was a transaction.

As there are ordering issues for many things, I could see why timing matters, but in those cases, I'd assume when ordering of time matters, there are different transactions in play. Or the insert process needs to handle the ordering with some other column or explicit times.

Jeffrey Williams SSC Guru Points: 90256 More actions · Answer 11

Jeff Moden wrote:

When you said "If SQL Server 6.5 and 7 worked that way - then it was incorrect.", the only thing that was incorrect there was you thinking so. You have to understand that, as always and mostly forever, "It Depends". 😉

Not really - again, an insert or update is considered to have been committed all at once. At the time of the insert or update - all rows affected by that insert were affected at the same time.

The implementation details of how those rows are affected is meaningless. How would you determine the column value of each row? The insert or update would have to be ordered by some mechanism to determine when that row was inserted or updated.

It seems you are looking for a way to identify when a row was either logically or physically inserted or updated. Logically - the row was inserted/updated when the statement was executed. Not when it completed - not during the process - but at the moment the statement was executed.

As for identifying when the row was physically inserted/updated - that isn't possible since we cannot determine when that actually occurs.

Things like IDENTITY, SEQUENCE, NEWID(), NEWSEQUENTIALID() - can create different values because they are not determined by when the insert/update occurred. They are determined by the logical ordering of the data in the set being inserted OR are defined as row identifier functions in SQL Server (and therefore were specifically built to generate a different value for each row in a result set).

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Jeffrey Williams SSC Guru Points: 90256 More actions · Answer 12

Steve Jones - SSC Editor wrote:

I agree that not calculating the time differently for different rows seems weird. I'd hope that this would be updated when there is a significant insert time, such as an ETL process. If I insert 1mm rows, they clearly weren't all inserted at once, though arguably, this helps me know this was a transaction.

Logically - the rows were all inserted at the same time - at the exact moment the insert statement was executed. It really doesn't matter how long it takes to read the data or how long it takes to physically write the data to disk - logically the operation occurs when the statement was executed.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

jwminer Old Hand Points: 394 More actions · Answer 13

The multiple row example is definitely an interesting topic for debate, but the most interesting part of this issue to me is actually with a one row example. Let's say that you have a query doing a one row insert. But let's say the query may take a while to complete (could be a join, a subquery, a long resource wait, etc.). Even if you don't' call the GETDATE explicitly in your query, but let a date column populate with the default column value of GETDATE(), SQL gets the time when the query starts executing the time consuming step of the SELECT part of the query. Putting a "timestamp" column with a default value of GETDATE() falsely gives the assurance of knowing when a row was inserted and I don't think a lot of developers realize this.

Steve Jones - SSC Editor SSC Guru Points: 734502 More actions · Answer 14

That last sentence is often the key. The "don't realize this" is the issue.

The architect of SQL Server overall would say what Jeffrey said. The rows are all inserted at the same time because the transaction as to commit. If you want different times, use different transactions.