OPENROWSET & BULK INSERT Skip First X Rows, Skip Last Y Rows

  • Hi all,

    I have a requirement to read csv files skipping the First X Rows and Last Y Rows using OPENROWSET (preferably) or BULK INSERT.

    We don't have control over the incoming files but the patterns are fixed. Ex: File1.CSV has always 3 rows at the top & 10 rows at the end that we need to skip.

    I understand there may be other tools (ex: SSIS etc.) that are best suited to handle these types of issues but I am curious to see if other members faced this issue and how you have overcome those issues.

    Both these have a LASTROW option but it requires us to know the row count upfront. Ideally, there is an option like LASTROW = -10 which skips the last 10 rows.

    Appreciate your help.

    Thanks

  • This was removed by the editor as SPAM

  • You can do something like the following link contains to "pre-process" a file.  If you can use xp_CmdShell (it's NOT the security risk most will say it is if you do it right), it can all be under the control of SQLServer from an SQL Agent Job or whatever.

    Another way to do it would be to bulk insert it as whole rows and then pluck what you need.  If that's a CSV or TSV file, you export "the good stuff" to a file and then import it as a delimited file directly into a table.  Or, you could split it once you've imported the whole rows.  Handling stray quotes like what Excel produces when a delimiter is found in a cell is pretty easy to handle, as well.  And, I also use the ACE drivers and a little T-SQL dity that will read spreadsheets and auto-magically self-adjust for monly changes as new columns are added, etc, without even having to look at the spreadsheet.

    If you don't actually know how to do the things I suggest, it's a bit too much to go through on a thread like this without a sample file to work with.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Thanks for your note. Can you please reshare the link? It didn't come through in your post.

    Here are two sample data sets that I made up:

    Objective: Has Header. Ignore First Row & Last Five Rows and load the file into SQL

    -- Example 1:

    month: 9/1/2021 - 9/28/2021

    hdr1,hdr2,hdr3,hdr4, hdr5

    1,2,3,4,5

    6,7,8,9,10

    11,12,13,14,15

    16,17,18,19,20

    data filters: filter1, filter2, filter3

    report name: dummy report

    4 data rows

    sales $1000

    produced: 9/29/2021

    -- Example 2: No header, Inconsistent quotes. Assumption: Format file will be supplied

    month: 9/1/2021 - 9/28/2021

    1,2,3,4,5,100 state st

    6,7,8,9,10, "3755, president drive"

    11,12,13,14,15, 200 state st

    16,17,18,19,20, 300 state st

    data filters: filter1, filter2, filter3

    report name: dummy report

    4 data rows

    sales $1000

    produced: 9/29/2021

    Thanks!

  • This was removed by the editor as SPAM

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply