record loading

  • I have a table on a Linked Server that I'm pulling data from, and I'd like to know if there is a more efficient way to do it given this table design. I need to do a full reload rather than use "t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30)" in the WHERE clause.

    I know what the min and max of t_stamp are for the table, but that is such a large pull that I was wondering if I could break up the SELECT and grab, say, 50,000 records at a time between the min and max until there are no more records to process, something like the sketch below. The SELECT feeds an INSERT into a staging table for other processing.
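
    Roughly what I have in mind is a loop like this (just an untested sketch; stepping a fixed t_stamp range is a stand-in for a true 50,000-row batch, since the row count per window will vary):

    DECLARE @BatchStart bigint, @MaxStamp bigint,
            @Step bigint = 1000 * 60 * 30;  -- window width in t_stamp units (ms); tune so each pass lands near 50,000 rows

    -- I already know these, so they could be hard-coded instead of queried
    SELECT @BatchStart = MIN(t_stamp), @MaxStamp = MAX(t_stamp)
    FROM rem.db1.dbo.sqlt_data_1_2021_08;

    WHILE @BatchStart <= @MaxStamp
    BEGIN
        INSERT INTO Tag_Data_Capture_staging (EquipID, TagName, CaptureValue, DateRecorded)
        SELECT DISTINCT ql.Line, ql.Tag_Description, sd.floatvalue,
               DATEADD(s, sd.t_stamp/1000, '1969-12-31 20:00:00')
        FROM SM_Query_Lookups ql
        JOIN rem.db1.dbo.sqlt_data_1_2021_08 sd
          ON ql.Tag_ID = sd.tagid
        WHERE sd.t_stamp >= @BatchStart
          AND sd.t_stamp <  @BatchStart + @Step;

        SET @BatchStart += @Step;
    END;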

    Thanks, and let me know if more info is needed.

    SELECT DISTINCT
        ql.Line,
        ql.Tag_Description,
        sd.floatvalue AS 'CaptureValue',
        DATEADD(s, sd.t_stamp/1000, '1969-12-31 20:00:00') AS 'DateRecorded'
    FROM
        SM_Query_Lookups ql,
        rem.db1.dbo.sqlt_data_1_2021_08 sd
    WHERE
        sd.t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30)
        AND ql.Tag_ID = sd.tagid

    CREATE TABLE [dbo].[sqlt_data_1_2021_08](
    [tagid] [int] NOT NULL,
    [intvalue] [bigint] NULL,
    [floatvalue] [float] NULL,
    [stringvalue] [nvarchar](255) NULL,
    [datevalue] [datetime] NULL,
    [dataintegrity] [int] NULL,
    [t_stamp] [bigint] NOT NULL,
    PRIMARY KEY CLUSTERED
    (
    [tagid] ASC,
    [t_stamp] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
  • Take a look at https://techcommunity.microsoft.com/t5/sql-server-blog/sql-server-2016-minimal-logging-and-impact-of-the-batchsize-in/ba-p/385537. You may be able to take advantage of minimal logging to bulk load the data.

    BCP and BULK INSERT are options for loading the table from a file exported out of the source database with bcp. If the table is very large, you can use the batch size to control transaction log growth.
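
    For illustration, a minimal BULK INSERT sketch (the file path, staging table name, and native-format export are assumptions):

    -- Load a bcp-exported file in 50,000-row transactions so the log can clear
    -- between batches (assumes SIMPLE or BULK_LOGGED recovery on the target).
    BULK INSERT dbo.sqlt_data_stage
    FROM 'C:\exports\sqlt_data_1_2021_08.dat'
    WITH (
        DATAFILETYPE = 'native',   -- matches a bcp ... -n export
        BATCHSIZE    = 50000,      -- commit every 50,000 rows
        TABLOCK                    -- needed for minimal logging into a heap
    );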

    How many rows are in the source table?

  • The source table has 2 billion rows, but not all records are chosen, since it does a lookup on ql.Tag_ID = sd.tagid.

    Is there an example of a script that does it in batches of 50K, using the min/max of t_stamp as the driver?

    Thanks.

  • Bruin wrote:

    The source table has 2 billion rows, but not all records are chosen, since it does a lookup on ql.Tag_ID = sd.tagid.

    Is there an example of a script that does it in batches of 50K, using the min/max of t_stamp as the driver?

    Thanks.

    I don't know which table is the linked server table, but joining to it means that all of its rows must be examined.  You need something more selective.

    You apparently have an "accidental" many-to-many join going on, and you need the DISTINCT because of that problem.

    Consider using OPENQUERY so the query runs on the remote server, and give it some decent criteria to pull the data with.  It'll run a lot faster.

    I also recommend that you avoid the old-fashioned comma-separated join syntax in whatever fix you come up with.
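
    For example, something like this (a sketch only; I'm assuming the linked server is the "rem" in your four-part name and that the remote database is db1):

    -- The query text runs as-is on the remote server, so only the filtered
    -- 30-minute window comes back across the wire.
    SELECT *
    FROM OPENQUERY(rem, '
        SELECT tagid, floatvalue, t_stamp
        FROM db1.dbo.sqlt_data_1_2021_08
        WHERE t_stamp >= DATEDIFF_BIG(MS, ''19691231 20:00:00'', GETDATE()) - (1000 * 60 * 30)
    ');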

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is the table name from the Linked Server: rem.db1.dbo.sqlt_data_1_2021_08

    The SM_Query_Lookups table is on the local server. The remote server contains all of the values I'm using for my insert into the staging table on the local server.

  • I don't know the data well enough to say with conviction, but there has to be some criteria you could use in an OPENQUERY, other than a join, to bring the linked server data over in a hurry and then further filter it with a local join.
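
    Something along these lines, perhaps (a sketch; the time window is a placeholder for whatever criteria actually fit the data, and the server/database names follow the four-part name above):

    -- Stage the remote rows using remote-side criteria first...
    SELECT sd.*
    INTO #remote_stage
    FROM OPENQUERY(rem, '
        SELECT tagid, floatvalue, t_stamp
        FROM db1.dbo.sqlt_data_1_2021_08
        WHERE t_stamp >= DATEDIFF_BIG(MS, ''19691231 20:00:00'', GETDATE()) - (1000 * 60 * 30)
    ') sd;

    -- ...then do the lookup join entirely locally.
    SELECT DISTINCT ql.Line, ql.Tag_Description, rs.floatvalue AS CaptureValue,
           DATEADD(s, rs.t_stamp/1000, '1969-12-31 20:00:00') AS DateRecorded
    FROM #remote_stage rs
    JOIN dbo.SM_Query_Lookups ql
      ON ql.Tag_ID = rs.tagid;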

    Of course, with the way things have been going today, I might just be low on caffeine.

    --Jeff Moden



  • In my original query I was using this on the linked server:

    t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30)

    along with the local join.

  • I'm having syntax problems doing the OPENQUERY when pulling values like:

    DATEADD(s, t_stamp/1000, '1969-12-31 20:00:00') as 'DateRecorded'

  • Gosh, Bruin... how do you expect anyone to help with such little information?

    --Jeff Moden



  • Sorry Jeff.

    I have been working with OPENQUERY, trying to get the values I need for my insert into the staging table. The query I have returns some of my values, but it isn't getting the local table values.

    Insert Into Tag_Data_Capture_staging (EquipID, TagName, CaptureValue, DateRecorded)

    Values needed from the query:

    ql.Line, ql.Tag_Description, sd.floatvalue as 'CaptureValue', DATEADD(s, t_stamp/1000, '1969-12-31 20:00:00') as 'DateRecorded'

    Also, I know the min and max of t_stamp for the remote table, and seeing how large the remote table is, I was wondering if I could break up the SELECT and grab, say, 50,000 records between the min and max until there are no more records to process. I'm not sure how to incorporate that into the script.

    SELECT *
    FROM OPENQUERY(srv1, 'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'') AS ''DateRecorded''
                          FROM rem.dbo.sqlt_data_4_2022_12') sd
    LEFT JOIN process.dbo.SM_Query_Lookups ql
           ON ql.Tag_ID = sd.tagid
    OPTION (MAXDOP 1);
  • I'm sure you have heard of

    • Dynamic SQL
    • Synonyms
    • where field in (a,b,c,d)

    All the above can help you with

    • Build a query that fetches only x number of rows (see the sketch after this list)
    • avoid the bad practice of using four-part names on your objects
    • deal with the fact that, in many cases, a join from a local table to a remote table may result in the full remote table being copied locally before the join is applied (you can check whether this is happening by looking at the execution plans on both the local and remote servers while your query is executing)
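
    As a rough illustration of the first point (a sketch only; OPENQUERY won't accept variables, so the whole statement has to be built as a string, and the starting boundary below is an example value):

    DECLARE @from bigint = 1627776000000,   -- lower t_stamp bound for this batch (example value)
            @sql  nvarchar(max);

    -- Embed the batch boundary in the remote query text via dynamic SQL.
    SET @sql = N'SELECT *
    FROM OPENQUERY(srv1, ''SELECT TOP (50000) tagid, floatvalue, t_stamp
                           FROM db1.dbo.sqlt_data_1_2021_08
                           WHERE t_stamp >= ' + CONVERT(nvarchar(20), @from) + N'
                           ORDER BY t_stamp'')';

    EXEC sys.sp_executesql @sql;

    -- For the next batch, set @from to the highest t_stamp just retrieved + 1
    -- (ties on t_stamp across tagids would need extra handling in a real process).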

    the remaining work (e.g. building the process) is really up to you, as you are the one getting paid to do this work.

  • Bruin wrote:

    Sorry Jeff.

    I have been working with OPENQUERY, trying to get the values I need for my insert into the staging table. The query I have returns some of my values, but it isn't getting the local table values.

    Insert Into Tag_Data_Capture_staging (EquipID, TagName, CaptureValue, DateRecorded)

    Values needed from the query:

    ql.Line, ql.Tag_Description, sd.floatvalue as 'CaptureValue', DATEADD(s, t_stamp/1000, '1969-12-31 20:00:00') as 'DateRecorded'

    Also, I know the min and max of t_stamp for the remote table, and seeing how large the remote table is, I was wondering if I could break up the SELECT and grab, say, 50,000 records between the min and max until there are no more records to process. I'm not sure how to incorporate that into the script.

    SELECT *
    FROM OPENQUERY(srv1, 'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'') AS ''DateRecorded''
                          FROM rem.dbo.sqlt_data_4_2022_12') sd
    LEFT JOIN process.dbo.SM_Query_Lookups ql
           ON ql.Tag_ID = sd.tagid
    OPTION (MAXDOP 1);

    You're still joining to the OPENQUERY, which means the OPENQUERY has to read all the rows from the linked server, and there's also no criteria on the linked table. How many rows are you retrieving from the linked table?

    --Jeff Moden



  • How many relevant unique tag_ids are there? Are they relatively static, or are new ones inserted continually?

    Can't you insert those into a table on the linked server (or just hardcode them in an IN clause if there are very few) so that you can avoid the join back to the local table? Even if they change rapidly, pushing them to the linked server when necessary could still be relatively quick if indexed correctly.
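
    For example (a sketch; the remote tag_filter table is hypothetical and would need to exist, ideally indexed on tagid, and the server/database names are assumptions):

    -- Push the relevant tag_ids to a small table on the linked server...
    INSERT INTO srv1.db1.dbo.tag_filter (tagid)
    SELECT DISTINCT Tag_ID
    FROM dbo.SM_Query_Lookups;

    -- ...so the lookup join can run entirely on the remote side.
    SELECT *
    FROM OPENQUERY(srv1, '
        SELECT sd.tagid, sd.floatvalue, sd.t_stamp
        FROM db1.dbo.sqlt_data_1_2021_08 sd
        JOIN db1.dbo.tag_filter tf ON tf.tagid = sd.tagid
    ');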

  • There are over 500 unique tag_ids.

  • Insert Into Tag_Data_Capture_staging (TagName, CaptureValue, DateRecorded)
    SELECT *
    FROM OPENQUERY(srv1, 'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'')
                          FROM rem.ignition.dbo.sqlt_data_1_2021_12
                          WHERE DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'') BETWEEN ''12-09-2021'' AND ''12-10-2021''
                            AND floatvalue IS NOT NULL');

    I did the insert to the local table and used criteria to narrow down the data that is retrieved. Do you see any better way to do the OPENQUERY?
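
    One variation I'm considering is computing the date window as raw t_stamp values up front, so the remote filter stays on the bigint column itself and can use the primary key on (tagid, t_stamp) instead of applying DATEADD to every row (untested, and the boundary values below are illustrative):

    Insert Into Tag_Data_Capture_staging (TagName, CaptureValue, DateRecorded)
    SELECT *
    FROM OPENQUERY(srv1, 'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'')
                          FROM rem.ignition.dbo.sqlt_data_1_2021_12
                          WHERE t_stamp >= 1639022400000   -- 2021-12-09 00:00 (precomputed, illustrative)
                            AND t_stamp <  1639108800000   -- 2021-12-10 00:00 (precomputed, illustrative)
                            AND floatvalue IS NOT NULL');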

    Thanks for your replies.
