Copy between 2 and 3 billion rows from a heap table with no key to a new partitioned table.



mishka-723908
Hello,

I need to copy between 2 and 3 billion rows from a heap table with no identity column to a new partitioned table. The primary key is made up of a smallint value that represents a month, plus a loan_number.

--The partition key is a smallint value.
--I have tried selecting by the partition key, but because of the amount of data, each select takes a long time.
--I also tried an import/export task. It started moving some data, but it is also extremely slow.
--The new table has 24 partitions. Each partition covers 24 values of the partition key, which ranges from 0 to 552.

Here is the definition of the partitioned table. The heap table is the same except that it does not have the identity column or the created_by/created_date attributes.

--Partitioned table
CREATE TABLE dbo.qc_partition
(
   qccq_partition_id         BIGINT IDENTITY(1,1)         
,   pool_id            CHAR(3)         NOT NULL
,   deal_no            CHAR(5)         NOT NULL
,   group_no            CHAR(3)         NOT NULL
,   servicer            VARCHAR(4)      NULL
,   loan_id            CHAR(6)         NOT NULL
,   exloan_id         VARCHAR(18)      NOT   NULL
,   last_int_p         SMALLDATETIME   NULL
,   balance            MONEY         NULL
,   int_rate            NUMERIC(6, 3)   NULL
,   totpmt_due         MONEY         NULL
,   sched_principal      MONEY         NULL   -- type missing in the original post; MONEY is assumed
,   sched_mnth_p         MONEY         NULL
,   mba_stat            VARCHAR(1)      NULL
,   ots_stat            VARCHAR(1)      NULL
,   payment_hist         VARCHAR(12)      NULL
,   exception         VARCHAR(1)      NULL
,   start_date         SMALLDATETIME   NULL
,   end_date            SMALLDATETIME   NULL
,   fc_end_typ         VARCHAR(1)      NULL
,   payoff_d            SMALLDATETIME   NULL
,   payoff_r            VARCHAR(1)      NULL
,   sell_date            SMALLDATETIME   NULL
,   inv_bal            MONEY         NULL
,   next_percent         NUMERIC(6, 3)   NULL
,   loss_val            MONEY         NULL
,   net_rate            NUMERIC(7, 4)   NULL
,   period            SMALLINT      NOT NULL
,   file_name         CHAR(8)         NOT NULL
,   created_by         VARCHAR(70)      NOT NULL   DEFAULT CURRENT_USER
,   created_date         DATETIME      NOT NULL   DEFAULT (GETDATE())
) ON partition_scheme_qc (period)
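
The partition function and scheme behind partition_scheme_qc are not shown in the post. As a rough sketch only, a layout matching the description (24 partitions of 24 consecutive period values each) might look something like the following. The function name pf_qc, the RANGE LEFT choice, the boundary values, and the single [PRIMARY] filegroup are all assumptions, not the poster's actual definitions.

```sql
-- Assumed names and boundaries; not taken from the original post.
-- 23 boundary values produce 24 partitions. With RANGE LEFT each boundary
-- is the highest period in its partition: up to 23, 24-47, ..., 528-551,
-- and a final partition for 552 and above.
CREATE PARTITION FUNCTION pf_qc (SMALLINT)
AS RANGE LEFT FOR VALUES
(  23,  47,  71,  95, 119, 143, 167, 191,
  215, 239, 263, 287, 311, 335, 359, 383,
  407, 431, 455, 479, 503, 527, 551);
GO

-- Maps every partition to one filegroup for simplicity.
CREATE PARTITION SCHEME partition_scheme_qc
AS PARTITION pf_qc ALL TO ([PRIMARY]);
GO
```

With a function like this in place, SELECT $PARTITION.pf_qc(period) reports which partition a given period value lands in.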


Does anyone have any ideas?

Thanks,
Michael
homebrew01
I usually handle large data moves in batches by setting up a loop and copying a set amount on each pass. Since these run during production, I put in a delay to avoid hogging all the CPU. I don't know if this is practical for your situation.

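-- Placeholder range; the original post did not show these declarations.
DECLARE @ID_First INT, @ID_Last INT;
SELECT @ID_First = 1, @ID_Last = 1000000;

InsertMore:
WAITFOR DELAY '00:00:05'; -- 5-second delay to allow other processes some CPU

-- "columns" stands in for the real column list
INSERT TOP (100000) INTO NewTable
SELECT columns
FROM OldTable
WHERE ID BETWEEN @ID_First AND @ID_Last
  AND ID NOT IN (SELECT ID FROM NewTable); -- not already inserted

IF @@ROWCOUNT > 0 GOTO InsertMore;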





mishka-723908
The heap table does not have an ID column; only the partitioned table does.

Thanks for the input.
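
Because the heap has no ID column, the batching idea above needs a different batch key. A minimal sketch, assuming the period column is used as that key and that the source heap is called dbo.qc_heap (a hypothetical name; the thread does not give one):

```sql
-- Sketch only. dbo.qc_heap is a hypothetical name for the source heap.
DECLARE @period SMALLINT, @max_period SMALLINT;
SET @period = 0;
SET @max_period = 552;   -- range of period values given in the first post

WHILE @period <= @max_period
BEGIN
    -- One period per batch. The identity column and the created_by /
    -- created_date defaults are filled in by the target table.
    INSERT INTO dbo.qc_partition
        (pool_id, deal_no, group_no, servicer, loan_id, exloan_id,
         last_int_p, balance, int_rate, totpmt_due, sched_principal,
         sched_mnth_p, mba_stat, ots_stat, payment_hist, [exception],
         start_date, end_date, fc_end_typ, payoff_d, payoff_r, sell_date,
         inv_bal, next_percent, loss_val, net_rate, period, [file_name])
    SELECT
         pool_id, deal_no, group_no, servicer, loan_id, exloan_id,
         last_int_p, balance, int_rate, totpmt_due, sched_principal,
         sched_mnth_p, mba_stat, ots_stat, payment_hist, [exception],
         start_date, end_date, fc_end_typ, payoff_d, payoff_r, sell_date,
         inv_bal, next_percent, loss_val, net_rate, period, [file_name]
    FROM dbo.qc_heap
    WHERE period = @period;

    SET @period = @period + 1;

    WAITFOR DELAY '00:00:05';   -- same pause as in the loop above
END;
```

Without an index on period, each pass still has to scan the whole heap, which matches the slowness described in the first post, so this is only a starting point.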
Jeff Moden
I think you've made a mistake in choosing what to partition by. My recommendation would be to partition by month or year of the column that best represents the age of the row. Hopefully, the older the row, the fewer updates it will actually have, and THAT is what will save your butt at reindexing time as the table grows.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we don't actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesn't mean you should.

mishka-723908
Hello,

First of all, sorry for creating a duplicate topic. It was a user error that I noticed only after I created the second one, and I was not able to delete it.

The current partition key does represent months: 1 = '01/1989', 2 = '02/1989', 3 = '03/1989', and so on. Any ideas on what I can do to copy the data to the partitioned table?

Thank you,
mishka-723908
Hello,

Does anyone have any ideas on the copying of the data in my situation?

Thanks
Jeff Moden
mishka-723908 (1/10/2013)
Hello,

Does anyone have any ideas on the copying of the data in my situation?

Thanks


Yes. First, get rid of that partition key you're using and base it on the date, instead.

Then, split one month at a time off the original source table and load it into the final table. I believe you should drop most of the partitions you made in the target table, because there's a trick to loading a whole "split" table in just milliseconds if you have a new empty partition.

I haven't done this in a while, so I'd have to do just what I recommend you do: look up the method in Books Online.
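
The "trick" referred to here is most likely partition switching with ALTER TABLE ... SWITCH, which moves a whole table into an empty partition as a metadata operation. A minimal sketch under stated assumptions: dbo.qc_stage (a staging table already loaded with one period) and pf_qc (the partition function behind partition_scheme_qc) are hypothetical names, and the layout is assumed to be one partition per period.

```sql
-- Sketch only. Assumptions (none of these come from the thread):
--   dbo.qc_stage : staging table with the same columns and indexes as
--                  dbo.qc_partition, on the same filegroup as the target
--                  partition, already loaded with the rows for period 1
--   pf_qc        : the partition function behind partition_scheme_qc
--   The target partition for period 1 is empty and holds only that period.

-- 1. A trusted CHECK constraint proves that every staged row belongs
--    in the target partition.
ALTER TABLE dbo.qc_stage WITH CHECK
    ADD CONSTRAINT ck_qc_stage_period CHECK (period = 1);
GO

-- 2. Switch the staged rows into the matching partition. No rows are
--    copied; only the metadata changes.
ALTER TABLE dbo.qc_stage
    SWITCH TO dbo.qc_partition PARTITION $PARTITION.pf_qc(1);
GO
```

The rows still have to be copied from the heap into the staging table once, so the switch saves time on the final load step rather than on the initial read.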

--Jeff Moden

Lynn Pettis
Jeff Moden (1/10/2013)
Yes. First, get rid of that partition key you're using and base it on the date, instead.

Then, split one month at a time off the original source table and load it into the final table. I believe you should drop most of the partitions you made in the target table, because there's a trick to loading a whole "split" table in just milliseconds if you have a new empty partition.

I haven't done this in a while, so I'd have to do just what I recommend you do: look up the method in Books Online.


I agree with Jeff. You may find it by looking for a topic discussing SLIDING WINDOWS, or something to that effect.

Lynn Pettis

Jeff Moden
Ugh! Almost forgot because it's been a while. What are you planning on using for the PK on the new table? I ask because, on partitioned tables, any unique index must include the partitioning column. That can lead to some pretty nasty FK problems if you intend for other tables to point to the partitioned table using DRI.
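
To illustrate the point, here is a minimal sketch of a primary key that includes the partitioning column, and of what a referencing table then has to carry. It assumes that (period, exloan_id) is unique, which the thread does not confirm; the constraint names and the child table dbo.qc_note are hypothetical.

```sql
-- Sketch only: assumes (period, exloan_id) uniquely identifies a row.
-- The partitioning column (period) is part of the key, so the index can
-- be created on the same partition scheme as the table.
ALTER TABLE dbo.qc_partition
    ADD CONSTRAINT PK_qc_partition
    PRIMARY KEY CLUSTERED (period, exloan_id)
    ON partition_scheme_qc (period);
GO

-- Hypothetical child table. Its foreign key has to reference the whole
-- key, so it must carry period as well as exloan_id.
CREATE TABLE dbo.qc_note
(
    period     SMALLINT     NOT NULL
,   exloan_id  VARCHAR(18)  NOT NULL
,   note_text  VARCHAR(200) NULL
,   CONSTRAINT FK_qc_note_qc_partition
        FOREIGN KEY (period, exloan_id)
        REFERENCES dbo.qc_partition (period, exloan_id)
);
GO
```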

--Jeff Moden

mishka-723908
Thank you guys for the replies.

Unfortunately, the current table does not have a date column that we can use. I assume the designer of the table used the "period" attribute because that is how the data is received from the vendor. The period does represent months: Period-1 = '01/1989', Period-2 = '02/1989', Period-3 = '03/1989'. At any given time we can insert, delete, or update data for any of the periods. All periods will be used equally for reporting, so we will not be using a sliding window.
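
The period-to-month mapping described here can be written as a date calculation. A small sketch, assuming the pattern continues unchanged beyond the three examples given (period 1 = January 1989, one month per period):

```sql
-- Sketch only: assumes every period is exactly one month after the
-- previous one, starting with period 1 = January 1989.
SELECT p.period,
       DATEADD(MONTH, p.period - 1, CAST('19890101' AS DATETIME)) AS period_start
FROM (SELECT 1 AS period
      UNION ALL SELECT 2
      UNION ALL SELECT 3) AS p;
```

Since the mapping only ever moves forward, ordering rows by period orders them by month in the same way a date column would.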


--This is what I currently have for indexing. The primary key is (period, external_loan_id), but I was not sure of the best way to create it on the partitioned table.
CREATE NONCLUSTERED INDEX IX_period_external_loan_id
ON qc_partition (period, external_loan_id)
ON partition_scheme_qc (period)
GO



--Current Table
CREATE TABLE dbo.qc_partition
(
   pool_id CHAR(3) NOT NULL
, deal_no CHAR(5) NOT NULL
, group_no CHAR(3) NOT NULL
, servicer VARCHAR(4) NULL
, loan_id CHAR(6) NOT NULL
, exloan_id VARCHAR(18) NOT NULL
, last_int_p SMALLDATETIME NULL
, balance MONEY NULL
, int_rate NUMERIC(6, 3) NULL
, totpmt_due MONEY NULL
, sched_principal MONEY NULL -- type missing in the original post; MONEY is assumed
, sched_mnth_p MONEY NULL
, mba_stat VARCHAR(1) NULL
, ots_stat VARCHAR(1) NULL
, payment_hist VARCHAR(12) NULL
, exception VARCHAR(1) NULL
, start_date SMALLDATETIME NULL
, end_date SMALLDATETIME NULL
, fc_end_typ VARCHAR(1) NULL
, payoff_d SMALLDATETIME NULL
, payoff_r VARCHAR(1) NULL
, sell_date SMALLDATETIME NULL
, inv_bal MONEY NULL
, next_percent NUMERIC(6, 3) NULL
, loss_val MONEY NULL
, net_rate NUMERIC(7, 4) NULL
, period SMALLINT NOT NULL
, file_name CHAR(8) NOT NULL
) ON [PRIMARY]


--Partitioned table
CREATE TABLE dbo.qc_partition
(
qc_partition_id BIGINT IDENTITY(1,1)
, pool_id CHAR(3) NOT NULL
, deal_no CHAR(5) NOT NULL
, group_no CHAR(3) NOT NULL
, servicer VARCHAR(4) NULL
, loan_id CHAR(6) NOT NULL
, exloan_id VARCHAR(18) NOT NULL
, last_int_p SMALLDATETIME NULL
, balance MONEY NULL
, int_rate NUMERIC(6, 3) NULL
, totpmt_due MONEY NULL
, sched_principal MONEY NULL -- type missing in the original post; MONEY is assumed
, sched_mnth_p MONEY NULL
, mba_stat VARCHAR(1) NULL
, ots_stat VARCHAR(1) NULL
, payment_hist VARCHAR(12) NULL
, exception VARCHAR(1) NULL
, start_date SMALLDATETIME NULL
, end_date SMALLDATETIME NULL
, fc_end_typ VARCHAR(1) NULL
, payoff_d SMALLDATETIME NULL
, payoff_r VARCHAR(1) NULL
, sell_date SMALLDATETIME NULL
, inv_bal MONEY NULL
, next_percent NUMERIC(6, 3) NULL
, loss_val MONEY NULL
, net_rate NUMERIC(7, 4) NULL
, period SMALLINT NOT NULL
, file_name CHAR(8) NOT NULL
, created_by VARCHAR(70) NOT NULL DEFAULT CURRENT_USER
, created_date DATETIME NOT NULL DEFAULT (GETDATE())
) ON partition_scheme_qc (period)
GO

