take some data from excel through sql

  • Hi

    I have a problem

    I want to have a .csv File with the following format , some fields like ‘actors’ , ‘directors’ , ‘date’ and ‘date of pomotion’

    The problem is that I am receiving an excel with many tabs (each tab is for a given day) with the following format (fields)

    field : “movie” and data like below

    actor#Logan Lerman#Emma Watson#Ezra Miller#director#Stephen Chbosky#

    field : “description” and data like below

    Romantic drama (2012)

    And

    tab : 2015-06-01

    how can I use an sql to take

    a.from field : “movie” to take the names after the actor and put as actors

    b.field : “movie” to take the names after the director and put as directors

    c.from field : “description” to take date which is inside the parenthesis --2012

    d.from each tab : to take the date and put as date of promotion

    thanks

  • georgheretis (5/17/2016)


    Hi

    I have a problem

    I want to have a .csv File with the following format , some fields like ‘actors’ , ‘directors’ , ‘date’ and ‘date of pomotion’

    The problem is that I am receiving an excel with many tabs (each tab is for a given day) with the following format (fields)

    field : “movie” and data like below

    actor#Logan Lerman#Emma Watson#Ezra Miller#director#Stephen Chbosky#

    field : “description” and data like below

    Romantic drama (2012)

    And

    tab : 2015-06-01

    how can I use an sql to take

    a.from field : “movie” to take the names after the actor and put as actors

    b.field : “movie” to take the names after the director and put as directors

    c.from field : “description” to take date which is inside the parenthesis --2012

    d.from each tab : to take the date and put as date of promotion

    thanks

    Are you saying that you're able to import the data from the spreadsheet directly or are you doing something else to import these "fields"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'd use SSIS for this, although it'd be rather complicated.

    I think I'd create an SSIS package with a connection to your Workbook. I'd then use a data flow with a custom script component as a source to grab the worksheet names within the workbook and load them into a recordset.

    I'd then use a ForEach Container to loop through the worksheets, with a data flow in the loop to load in the data (assuming the layout is the same on all worksheets). You can then use a Derived Column in the data flow to add the worksheet name into your data flow as a column.

    that is all fairly advanced SSIS development, and not something I'd recommend for anyone without quite a bit of SSIS development experience... but I don't know how you'd approach this from a purely T-SQL standpoint.

  • fisrt of all i finf a tool that converts the excel in csv and each tab to a separated csv file. so if the original excel file haw 40 tabs with data , now i have 40 csv files( with the date as a atb name like 2015-01-01)

    Actually i created a stored procedure and using a bulk insert i insert the csv file to the sql server. I am going to insert it then in tha table with the format i want

    and then using reporting services i am going to have th csv file

    but i am not expert in ssis so i want to do all transformations i sql server

    my main point is can i use t- sql

    to take from

    actor#Logan Lerman#Emma Watson#Ezra Miller#director#Stephen Chbosky#

    first the #Logan Lerman#Emma Watson#Ezra Miller#

    and then #Stephen Chbosky#

    also if there is way to take the date from --- >Romantic drama (2012)

    and last the date from the csv file (or tab name) if the csv filename is 'youtube 2015-01-01.csv"

    to take the 2015-01-01

    thanks in advance

  • Excellent, I'm glad you found a solution.

    I'm curious to know what tool you are using to convert the Excel worksheets to CSV files. That could be handy.

    for the string parsing, you should be able to do that in SQL. You'll probably want to use the SubString and CharIndex functions to parse the string values the way you want.

    You can find more information about T-SQL string functions here:

    https://msdn.microsoft.com/en-us/library/ms181984.aspx

  • i am using the tool "XLS to CSV Converter' see attachment

  • georgheretis (5/18/2016)


    i am using the tool "XLS to CSV Converter' see attachment

    Any chance of you attaching that as a ZIP file instead of a RAR file? Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Darn, it looks like that uses the MS Office API to connect to the workbook and do the conversion. I was hoping for a command-line utility that does not require Office to be installed, and doesn't use the ACE provider with its infuriating data type guessing behavior.

  • sestell1 (5/18/2016)


    Darn, it looks like that uses the MS Office API to connect to the workbook and do the conversion. I was hoping for a command-line utility that does not require Office to be installed, and doesn't use the ACE provider with its infuriating data type guessing behavior.

    Whenever I do any kind of import, it's usually as a character based column anyway so that nothing is left behind ... not even bad rows (which I want in a table, not in a file). With that, I almost always use IMEX=1 when I'm using ACE drivers to import from Excel and so don't have the problem you speak of.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes, I do as well. Unfortunately, clients are forever finding new ways to break that with inconsistent layouts, missing header row values, header values that span columns, etc...

    Another unfortunate side-effect I've run into using IMEX=1 and a header row to force text is you get the values as they are formatted in Excel instead of the actual value. I've had spreadsheets where numbers are formatted using accounting formats or with decimal resolution hidden, or dates formatted without the year displaying.

    Most of my development time with Excel as a source is spend working around the ACE driver. It would be so much easier if there was an SSIS provider for Excel that just provided the values as text and let me deal with formatting them within SSIS.

    (fixed spelling error)

  • sestell1 (5/18/2016)


    Darn, it looks like that uses the MS Office API to connect to the workbook and do the conversion. I was hoping for a command-line utility that does not require Office to be installed, and doesn't use the ACE provider with its infuriating data type guessing behavior.

    Python?

    Use Python to load the Excel file via Pandas module, transform & clean it in Pandas data frames and then output the results in any format you want including CSV. The Python script becomes a command-line utility that can run on Windows if you install Python 2.7 or greater in the environment. Then use SSIS/SQL to execute that Python script.

    http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html

    If you need examples, let me know.

  • xsevensinzx (5/19/2016)


    sestell1 (5/18/2016)


    Darn, it looks like that uses the MS Office API to connect to the workbook and do the conversion. I was hoping for a command-line utility that does not require Office to be installed, and doesn't use the ACE provider with its infuriating data type guessing behavior.

    Python?

    Use Python to load the Excel file via Pandas module, transform & clean it in Pandas data frames and then output the results in any format you want including CSV. The Python script becomes a command-line utility that can run on Windows if you install Python 2.7 or greater in the environment. Then use SSIS/SQL to execute that Python script.

    http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html

    If you need examples, let me know.

    Ooooh, very interesting!

    Thank you, I will definitely be looking into this!

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply