Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Fast inserting\reading\deleting of millions of records Expand / Collapse
Author
Message
Posted Monday, March 24, 2014 3:58 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 6:56 AM
Points: 8, Visits: 96
Hi Grant,

I've done some further testing on the inserts and it only takes about 15% longer, so not too bad. I'm bulk loading the data via a C# application, using SqlBulkCopy, in batches of around 10k records.
The explain plan on a simple insert of 1 record shows that 100% of the Operator Cost is with the index.

I've profiled the inserting of all of the data and have the following wait stats although unfortunately this is not an area that I'm totally familiar with.
I've used the query provided by Paul Randall at http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/: -

WaitType			Wait_S	Resource_S	Signal_S	WaitCount	Percentage	AvgWait_S	AvgRes_S	AvgSig_S
PREEMPTIVE_OS_WRITEFILEGATHER 66.59 66.59 0 6 28.71 11.0985 11.0985 0
LATCH_EX 58.84 58.84 0 8 25.37 7.3553 7.355 0.0003
ASYNC_IO_COMPLETION 41.94 41.94 0 10 18.08 4.1937 4.1937 0
ASYNC_NETWORK_IO 28.87 26.64 2.23 53798 12.45 0.0005 0.0005 0
WRITELOG 23.24 22.81 0.43 6255 10.02 0.0037 0.0036 0.0001
IO_COMPLETION 8.29 8.26 0.03 17390 3.57 0.0005 0.0005 0


Thanks again for your support.

Darren
Post #1553938
Posted Monday, March 24, 2014 4:49 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Today @ 1:01 PM
Points: 15,558, Visits: 27,932
That looks like hardware as a bottleneck. It's the writes that are going slow.

----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of: SQL Server 2012 Query Performance Tuning
SQL Server 2008 Query Performance Tuning Distilled
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1553951
Posted Monday, March 24, 2014 5:16 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 1:35 PM
Points: 36,788, Visits: 31,246
I can't speak to the INSERTs but the accidental CROSS JOIN in the SELECT code will beat the hell out of the hardware making it look like a hardware problem.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1553963
Posted Monday, March 24, 2014 5:27 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 6:56 AM
Points: 8, Visits: 96
Hi Jeff,

I've fixed the problem with the CROSS JOIN, it looks like we've had that code in there for a good few years.
Here's the revised code: -

 SELECT
f1.ComponentNo
,f1.SubFormAnalyID
,f1.FeatureAnalyID
,f1.RepeatNo
,f1.SimulationNo
,f1.IsVarParam
,f1.LibItemID
,f1.LibVerNo
,f1.IntValue
,f1.RealValue
,f1.TextValue
,pc.CompAnalyID
,ft.EnumTypeID
,<<@ParentNo>>
FROM dbo.<<@NW_Feature>> AS f1
JOIN dbo.<<@NW_Component>> AS nc ON f1.ComponentNo = nc.ComponentNo
JOIN dbo.PCC_Component AS pc ON nc.PhysID = pc.PhysID
JOIN dbo.Feature AS ft ON f1.FeatureAnalyID = ft.AnalyID
WHERE f1.SimulationNo <= ( SELECT MAX(SimulationNo)
FROM <<@NW_Feature>> AS nwf1
WHERE f1.ComponentNo = nwf1.ComponentNo
AND f1.SubFormAnalyID = nwf1.SubFormAnalyID
AND f1.FeatureAnalyID = nwf1.FeatureAnalyID
AND f1.RepeatNo = nwf1.RepeatNo
)
AND ft.EnumTypeID NOT IN (14,15)
;

Thanks,

Darren
Post #1553970
Posted Monday, March 24, 2014 7:01 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 1:35 PM
Points: 36,788, Visits: 31,246
So do you still need the DISTINCT?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1554016
Posted Monday, March 24, 2014 7:24 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 6:56 AM
Points: 8, Visits: 96
Sorry, no...I did mean to remove that.

Darren
Post #1554020
Posted Monday, March 24, 2014 9:02 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 5:34 AM
Points: 2,859, Visits: 3,187
You mention that a lot of customers will use SQL Express with your application.

I think the capabilities of SQL Express should form part of the design limits on your application. One of these aspects is that SQL Express can use a maximum of 1GB memory and 2CPU cores, and this may well impact the decision regarding a single table or multiple tables.

My gut feel is that multiple tables are a better choice in this environment. This is because:
a) The use of SQL Express implies the use of low-end hardware. Most customers will not have the IO capability to use the capabilities of the paid-for editions of SQL Server.
b) The memory needs for a given table in a multi-table setup are likely to be less than for a single table.
c) You want to make it easy for customers to delete a simulation they do not need. Separate tables are quick to drop, row deletion is not. If you make deletion slow then customers will not do it, then they will hit the 10GB DB size limit in SQL Express.
d) Customers are unlikely to run much in the way of index maintenance. If you delete rows from a table then index maintenance is needed to reclaim space. Dropping tables needs no maintenance.


Original author: SQL Server FineBuild 1-click install and best practice configuration of SQL Server 2014, 2012, 2008 R2, 2008 and 2005. 28 July 2014: now over 30,000 downloads.
Disclaimer: All information provided is a personal opinion that may not match reality.
Concept: "Pizza Apartheid" - the discrimination that separates those who earn enough in one day to buy a pizza if they want one, from those who can not.
Post #1554081
Posted Friday, March 28, 2014 7:56 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 6:56 AM
Points: 8, Visits: 96
Ed,

Thanks for your reply, I think you're correct on all points.

From everyone's feedback so far it looks like what we have at the moment could well be the best solution, but we definitely need to look at re-factoring some of our hideous dynamic SQL.

Regards,

Darren
Post #1555917
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse