Have txt/csv files, but header need to find from which row it will start

  • DBA.k

    Ten Centuries

    Points: 1085

    I have data in text files as attachments/below. SO header row position will be change and delimiter cama(,) semicolon(;) .. etc.. I need to load in each file as separate table from header and data only.. can you please give me solution any stored procedure /SSIS package code to find header row and based on that need to create table dynamically and load the data into thattable..

    Kindly help me on this task

    -----------------------------------------------------------------------------------

    This is first file

    today date: 4/28/202

    ID,Name,Sal

    10,mahi,2000

    20,ravi,3000

    30,raju,1000

    -----------------------------------------------------------------------------------------------------

    Total emp: 20


    CREATE TABLE [dbo].[NewTable1](

    [18/10/2020 date] [nvarchar](255) NULL

    )

    GO

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'This is test file')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (NULL)

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'20-20-2020')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (NULL)

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'ID,name,sal')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'10,mahi,2000')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'20,ravi,1000')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'30,ram,3000')

    INSERT [dbo].[NewTable1] ([18/10/2020 date]) VALUES (N'total emp:3')

    As per above script data will store in a single column with cama separate or any other delimiter which is semicolon ; etc.. so first i need to find out in which row header will be there and based on header (ID,name,sal') i need to create table schema dynamically.. so may be in another table i have more columns that has to automatically create table with that header..

    Kindly help on this ....

    • This topic was modified 2 months, 1 week ago by  DBA.k.
    • This topic was modified 2 months, 1 week ago by  DBA.k.
    Attachments:
    You must be logged in to view attached files.
  • Thom A

    SSC Guru

    Points: 98644

    If the delimiter can constantly change, seems like you should be fixing what ever process is creating the files, rather than trying to dynamic handle a random and unknown delimiter.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • DBA.k

    Ten Centuries

    Points: 1085

    yes thanks.. please help me how to handle to find header row and create table with that header and load data.. also how can we handle dynamic  delimiter ..

  • Thom A

    SSC Guru

    Points: 98644

    DBA.k wrote:

    yes thanks.. please help me how to handle to find header row and create table with that header and load data.. also how can we handle dynamic  delimiter ..

    Like I said, why not fix what ever creates those files?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • DBA.k

    Ten Centuries

    Points: 1085

    correct but noting in my hand to fix source data.. different clients will provide different  sources.. requirement is we have to load that different sources need to load  into database DB table.. SO that i am trying to build dynamic SSIS package to find header row and based on that create tables and load data from different delimiter files

  • Thom A

    SSC Guru

    Points: 98644

    Are files from a specific client at least consistent? Then you can build the package so that the delimiter is based on the client. if not, I suggest at least feeding back to the clients to tell them to be consistent with their files.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • Phil Parkin

    SSC Guru

    Points: 244589

    Your request seems to be the equivalent of "I would like a way of importing delimited text files. For each file, I do not know what the delimiter is, nor what the structure of the file is, nor on which row the header appears.  Each file should be imported into its own table, which should be created dynamically, based on the file's structure. Data types for the dynamic table should be inferred from the data which is contained in the file."

    Is that correct?

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Jeff Moden

    SSC Guru

    Points: 996661

    You've provided one example file.  Could you provide a couple more that represent some of the other files that have different delimiters,  etc?

    Also, how many "columns" do these files generally have?  I realize they can vary but I need a feel of the number of columns they have before I recommend a method.  It would also be helpful to know if the maximum line length of these files is <= 8000 characters including the delimiters.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • m-scally

    SSC Journeyman

    Points: 99

    I have done this a couple of ways over the years.

    1. Using C# Script Task in SSIS with Arrays for the column headers and looping to find the header row. This was for csv files where the header names and positions were dynamic.
    2. Importing the file as Base64 and using SQL to split the file.

    What version of SQL Server are you using?

    If you're on 2016 + you can use STRING_SPLIT() but on earlier versions Jeff's Tally Table solution DelimitedSplit8K function is very efficient for strings up to 8000 characters.

     

  • DBA.k

    Ten Centuries

    Points: 1085

    Yes boss you are correct, really file will come may be cama , delimiter or semicolon delimiter etc... also header will be there in one 3rd row and some other file header will be there in some 7 th row.. but any how dynamically i should pick the header and delimiter load data in separate table with that file name..

  • DBA.k

    Ten Centuries

    Points: 1085

    Thanks Jeff..


    This is test file

    ...

    fhgjhjlj,nnhhjhjj

    ID,name,sal

    10,mahi,2000

    20,ravi,1000

    30,ram,3000

    total : 3


    This is test file

    ...

    fhgjhjlj;nnhhjhjj

    ID;name;sal;city

    10;mahi;2000;HYD

    20;ravi;1000;ban

    30;ram;3000;Mum

    total : 3


    Please assume like above 2 files in txt files that i need to load in DB as table structure or else this complete data is there in tables single column that also find in sql server how to find header row and create table with that row..

  • Jeff Moden

    SSC Guru

    Points: 996661

    I'm at work now and can't help just now.  I'll have a go at it tonight after work.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DBA.k

    Ten Centuries

    Points: 1085

    Thank you.. Jeff .. Please help on this when ever you have time.. The other thread i unsubscribed..

  • DBA.k

    Ten Centuries

    Points: 1085

    I tried to C# code and something skip the rows and load data , based on variable.. but that one i configure in Package configuration .. but the requirement is we are not able to open file and see the header row to provide manually in Configuration.. that should ready and find the header automatically.. You have any idea to change in script task in below link .. please help me

    http://www.techbrothersit.com/2016/03/how-to-create-tables-dynamically-from.html

  • Jeff Moden

    SSC Guru

    Points: 996661

    Just a little feedback... I finally have some time today to work on this.  Actually, I started on it last night.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply