SQL Server data export to CSV with a 100,000-row file size limit

  • I'm using SQL statements to query data from multiple tables and exporting the results into CSV files, then importing those files somewhere else.

    Is there a way to export the results into CSV files limited to approximately 100,000 rows each? Once that limit is reached, the export should create a new CSV file and increment a number, so the first file is cusexp_{datetime}_1, then cusexp_{datetime}_2, cusexp_{datetime}_3, and so on.

    Say the results return 1,000,000 rows. Export them to CSV files of approximately 100,000 rows each, using the filename format cusexp_{datetime}_{iterationNumber}. Don't split the same [Customer ID] (c.id) across multiple files. So in this case I'd end up with around 10 or 11 CSV files.

    Here is a rough sample of what I'm selecting.

    select
         [Customer Name] = c.customername
        ,[Customer ID] = c.id
        ,[Item Purchased] = o.itemname
        ,[Item Number] = o.itemNum
        ,[Item Description] = o.itemDesc
    from mystore.dbo.customermstr c
    join mystore.dbo.items o
        on c.id = o.customerid
    where c.status = 'true';
  • I could also insert the results into a temp table with a counter column, then export the data from that table, but I'm not sure if that will help.

  • You can give these a try:

    a) The SQL Server Import and Export Wizard.

    b) Use Excel directly: create a connection to the instance and retrieve the data into Excel itself.

    c) Use SSIS.


  • I would add a RANK or DENSE_RANK to your query based on the customer ID and create a table to hold the results. You can then use that ranking column as the start/end boundary for each file, with a simple query passing in the start and end rank.

    -- Stage the full result set once, with a rank per customer
    drop table if exists dbo.tmpCustomerOrders;

    select
         [Customer Name] = c.customername
        ,[Customer ID] = c.id
        ,[Item Purchased] = o.itemname
        ,[Item Number] = o.itemNum
        ,[Item Description] = o.itemDesc
        ,CustomerRank = dense_rank() over (order by c.id)
    into dbo.tmpCustomerOrders
    from mystore.dbo.customermstr c
    join mystore.dbo.items o
        on c.id = o.customerid
    where c.status = 'true';
    -- Pull one file's worth of customers at a time
    declare @startRank int = 1
          , @endRank int = 100000;

    select
         co.[Customer Name]
        ,co.[Customer ID]
        ,co.[Item Purchased]
        ,co.[Item Number]
        ,co.[Item Description]
    from dbo.tmpCustomerOrders co
    where co.CustomerRank >= @startRank
      and co.CustomerRank < @endRank;

    Using whichever tool you want to export the results, build a loop that passes in the start and end rank for each iteration. Get the maximum value of CustomerRank to identify when you have reached the end.

    Calculate the increment to be used for each loop, which can be based on the average number of items purchased or on the total number of customers. For example, if the average number of items purchased is 2, increment by 50,000 each loop; or, if the total number of customers is 120,000 and you want 12 files, increment by 10,000 each loop.

    If this needs to be a repeatable process, I would use SSIS. For an ad hoc process, I would use a version 7 (or higher) PowerShell script with Export-Csv, along the lines of the sketch below.
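
    Here is a minimal PowerShell 7 sketch of that loop. It assumes the SqlServer module (for Invoke-Sqlcmd) is installed and that the dbo.tmpCustomerOrders table from the script above has been populated; the server name, database name, and 50,000 increment are placeholders to adjust for your data.

    # Sketch only: requires the SqlServer module and a populated dbo.tmpCustomerOrders.
    Import-Module SqlServer

    $server    = 'MyServer'    # placeholder instance name
    $database  = 'mystore'     # placeholder database name
    $increment = 50000         # ranks per file, e.g. 100,000 rows / 2 items per customer
    $stamp     = Get-Date -Format 'yyyyMMdd_HHmmss'

    # The max rank tells the loop when to stop.
    $maxRank = (Invoke-Sqlcmd -ServerInstance $server -Database $database `
        -Query 'select MaxRank = max(CustomerRank) from dbo.tmpCustomerOrders;').MaxRank

    $startRank = 1
    $fileNum   = 1
    while ($startRank -le $maxRank) {
        $endRank = $startRank + $increment
        $query = "select co.[Customer Name], co.[Customer ID], co.[Item Purchased], " +
                 "co.[Item Number], co.[Item Description] " +
                 "from dbo.tmpCustomerOrders co " +
                 "where co.CustomerRank >= $startRank and co.CustomerRank < $endRank;"

        # Select-Object keeps only the data columns; Invoke-Sqlcmd returns DataRow
        # objects whose extra properties Export-Csv would otherwise include.
        Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $query |
            Select-Object 'Customer Name', 'Customer ID', 'Item Purchased',
                          'Item Number', 'Item Description' |
            Export-Csv -Path "cusexp_${stamp}_$fileNum.csv" -NoTypeInformation

        $startRank = $endRank
        $fileNum++
    }

    Because dense_rank gives every row for a given customer the same rank, the rank-range predicate guarantees a customer never straddles two files.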

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • FlyingSquid wrote:

    I'm using SQL statements to query data from multiple tables and exporting the results into CSV files, then importing those files somewhere else.

    I have to ask "Why" here.  What is the ultimate use of those files going to be?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Based on the 100k row limit, I would nearly bet this is to generate files to load into a MySQL instance using the LOAD DATA LOCAL statement (which is the wrong way to do it for very big loads!).

    To the OP: can you reply to both Jeff's question and my own with regard to this?

    • What are you doing with the files after they are generated?
    • Why the 100k row limit?
    • Most importantly, why the need to have all rows for a customer in the same file?
    • And can the 100k row limit be slightly exceeded?

