Quickest and most efficient ways to handle data

  • Hey everyone..I would like to start a discussion to understand what are the different quickest and most efficient ways to handle the following scenarios:

    1) How to move trillion records

    a) From one SQL Server table to another SQL Server table

    b) From SQL Server table to Excel or CSV file

    c) From Excel or csv file into SQL server table

    2) How to count trillion records in SQL server

    Please provide any details if you can..Thank you!!

  • rockstar283 (9/17/2013)


    Hey everyone..I would like to start a discussion to understand what are the different quickest and most efficient ways to handle the following scenarios:

    1) How to move trillion records

    a) From one SQL Server table to another SQL Server table

    I would question why you need to move a trillion rows of data. It is already in a table so there better be a REALLY good reason to move it.

    b) From SQL Server table to Excel or CSV file

    You don't. Excel can't come close to handling a trillion rows.

    c) From Excel or csv file into SQL server table

    You can't...see above.

    2) How to count trillion records in SQL server

    Unless you have a different number of fingers and toes I would suggest COUNT(*)...but you might be prepared to wait for a bit.

    There is my 2ยข.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 โ€“ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for the guidance mate..I had an interview today and the interviewer asked me the same questions..I answered it the best I could and suggested everything from count(*),@@rowcount,ranking functions to bulk copy, select * into etc..but he was not looking completely satisfied by my answers..so it got me thinking..

  • rockstar283 (9/17/2013)


    Thanks for the guidance mate..I had an interview today and the interviewer asked me the same questions..I answered it the best I could and suggested everything from count(*),@@rowcount,ranking functions to bulk copy, select * into etc..but he was not looking completely satisfied by my answers..so it got me thinking..

    I kind of wondered if these were interview questions. Seems a strange thing to ask since not too many places have a trillion rows of data. That is a massive amount of data. Reason I made my comment about not moving without a good reason is because even in batches that would take a crazy amount of time and need a crazy amount of disc space. I mean a trillion rows of just a single int would be 4,000,000,000,000 bytes. That is not small amount of disc space. I have never worked with anything anywhere near that large but if I did I would only move that data as an absolute last resort. It would likely be easier to remove the rows no longer needed and rename the table. Questions like that are a bit bizarre to me.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 โ€“ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • If I was asked that in an interview, my immediate reaction would be to ask 'for what purpose?' Depending on why those rows need to be moved, I could probably suggest alternatives that did not involve moving that many rows around, because it's not something you'd ever want to do.

    A trillion rows, at minimum, is just under a TB of data (if each row is a single byte, it probably isn't). Where's that going to be stored? How's it going to be moved? How fast is the network? What's the point?

    p.s. Count_Big()

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I completely agree with you..even I will never build anything like..from the moment when you realize that the table is going to be that big..one should immediately think about partitioning it..its stupid to keep that much data in one table and then move it..the most insane thing in the interview was that after this question he asked me about 300 Terabytes of data ๐Ÿ˜€ ๐Ÿ˜€ ๐Ÿ˜€ ๐Ÿ˜€ :hehe:

  • rockstar283 (9/17/2013)


    from the moment when you realize that the table is going to be that big..one should immediately think about partitioning it.

    No.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • rockstar283 (9/17/2013)


    the most insane thing in the interview was that after this question he asked me about 300 Terabytes of data ๐Ÿ˜€ ๐Ÿ˜€ ๐Ÿ˜€ ๐Ÿ˜€ :hehe:

    Not at all insane if they have that volume of data. That is certainly a lot of data but VLDBs are different than trillion row tables.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 โ€“ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Answers I have found so far:

    Quickest way to move the data:

    1) SSIS

    2) DELETE FROM dbo.Table1 OUTPUT deleted.* INTO dbo.Table2

    3) SELECT * INTO dbo.Table2 from dbo.Table2 where 1=0

    Quickest way to count the number of records:

    1) SP_Spaceused

    Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database.

    USE AdventureWorks2012;

    GO

    EXEC sp_spaceused N'Purchasing.Vendor';

    GO

    2) Count(*)

    3) select SCHEMA_NAME(ST.schema_id)+'.'+ST.name TableName,

    SP.rows RowCnt

    from

    sys.tables ST

    inner join sys.partitions SP

    on (ST.object_id = SP.object_id

    and SP.index_id in (0,1))

    order by

    RowCnt desc

    4) SELECT SUM (row_count)

    FROM sys.dm_db_partition_stats

    where (index_id=0 or index_id=1)

    and object_name(object_id) = 'YourTableName'

    5) select Count_Big(*) from HumanResources.Employee

    Thanks to all who helped me to formulate above answers

    if anyone else has any other thoughts..please pitch in:-)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply