Home Forums SQL Server 2012 SQL 2012 - General Database import from text/csv file and many to many relationship RE: Database import from text/csv file and many to many relationship

  • zqckfair (7/28/2013)


    Thank you for this very well explained answer and that useful link.

    Sorry for the ambiguity, but unfortunately I'm also having problem with the import, in fact, it is my main problem.

    I'm just trying to import a CSV file to the database and getting the result shown on the second image, which is as you can see not a excel sheet (the first one is) but the real sql table, from the visual studio 2012 sql table manager, so that's what I wanted to say, the fastest way that I found to import all my articles was to add them one by one using VS2012 SQL Manager as I don't know how to import the CSV file..)

    Here is that CSV file :

    Article ID;Tags ID

    1500;7,8,9,5

    1501;4,3

    1502;8,1

    1503;8,9

    ....

    Columns are ";" separated and tags id are "," separated.

    So if I've to import this CSV file to "MyTable"

    Article ID;Tags ID

    Jeff;Moden,Zidane,Headbutt

    5;1,4,6

    This would be the structure expected after the import :

    But I do not have a clue how to do it.

    No problem. Erland's excellent article is to die with several great methods with sample code and great explanations.

    If you're interested in a "T-SQL Only" solution, we can do that quite simply, as well...

    In the following, there's a lot of simple techniques being used. That is, they're "simple" for people that are familiar with them. There's no way that I could explain them all to a "newbie" to a level that I'd like to without writing a book on the subject. With that in mind, I explained the sections at a high level and a mid level. For details on each command, I strongly recommend that you look them up in Books Online (especially the Bulk Insert... there so much more you can do with it including sequestering bad rows in a separate file without aborting the whole run, etc). I also strongly recommend that you do a Google search on "SQL Injection" and "DOS Injection" to see not only what you need to watch out for, but to understand why so many DBAs go bananas at the mere mention of dynamic SQL. The method I used prevents both DOS and SQL Injection.

    First things first. You provided two listings as to what a file may contain. I saved them as two separate files in my C:\Temp directory (you can use any drive directory or UNC that SQL Server has access to but I used C:\Temp) for demonstration purposes. I called them "JBMDemo1.csv" and "JBMDemo2.txt" respectively. Notice the differences in extensions. I just wanted to prove that the extension means nothing for these imports. If you want to run the demonstration, as is, you'll need to create these two files (you can easily copy and paste into NOTEPAD to do this).

    So, the first file, JBMDemo1.csv, contains the following:

    Article ID;Tags ID

    1500;7,8,9,5

    1501;4,3

    1502;8,1

    1503;8,9

    The second file, JBMDemo2.txt, contains the following:

    Article ID;Tags ID

    Jeff;Moden,Zidane,Headbutt

    5;1,4,6

    You also mentioned a "final" table called "MyTable". Here's the code I used to create that table for this demonstration. Note that I had to use VARCHAR for both columns because of the content of the files you provided. If you want to run the demonstration, as is, you'll need to execute the following code to create the "dbo.MyTable" table.

    CREATE TABLE dbo.MyTable

    (

    ArticleID VARCHAR(50),

    CategorieID VARCHAR(50)

    )

    ;

    Now, here's the stored procedure I created to demo all of this. Running the following code will create a stored procedure called "dbo.ImportArticleCategory". You can call it anything you want but that's what I called it for this demonstration. You'll need to change it to point to the correct table if you put it into production because I pointed it at the demonstration table called "dbo.MyTable", which we created above.

    I also did the normal thing that I do with some fairly heavy commenting. Again, being a "newbie", you should lookup the things that I used in Books Online and study them. And be sure to Google for both "DOS Injection" and "SQL Injection". EVERYONE should become keenly aware of what they are and how to prevent them because it's STILL the leading attack vector in the world of software.

    I also used your latest column names.

    Here's the code for the proc...

    CREATE PROCEDURE dbo.ImportArticleCategory

    /******************************************************************************

    Purpose:

    This stored procedure accepts a full file path, checks it for both DOS and

    SQL "Injection", imports the file to a temporary staging table, parses the

    category tags, and inserts the results into a table called "MyTable".

    Usage Examples:

    EXEC dbo.ImportArticleCategory 'C:\Temp\SomeFile.csv';

    EXEC dbo.ImportArticleCategory @pFilePath;

    Programmer's Notes'

    1. Example input file might look like the following. Semi-Colon is the

    delimiter. The CSV column will be parsed.

    Article ID;Tags ID

    1500;7,8,9,5

    1501;4,3

    1502;8,1

    1503;8,9

    Jeff;Moden,Zidane,Headbutt

    5;1,4,6

    1. (continued)The data stored in the "MyTable" table from that file will be

    the following:

    ArticleID CategorieID

    --------- -----------

    1500 7

    1500 8

    1500 9

    1500 5

    1501 4

    1501 3

    1502 8

    1502 1

    1503 8

    1503 9

    Jeff Moden

    Jeff Zidane

    Jeff Headbutt

    5 1

    5 4

    5 6

    2. If the given file path contains any DOS or SQL Injection, the proc will

    only identify that it has completed so that it gives an attacker no hints.

    3. If the given file path does not exist, an error to that nature will be

    returned.

    4. The file path may be either a "drive path" or a "UNC".

    Revision History:

    Rev 00 - 28 Jul 2013 - Jeff Moden - Initial creation and unit test.

    ******************************************************************************/

    --===== Declare the parameters for this proc.

    (@pFilePath VARCHAR(8000))

    AS

    --=============================================================================

    -- Presets

    --=============================================================================

    --===== Environmental presets

    SET NOCOUNT ON; --Suppress the autodisplay of rowcounts to prevent them

    --from being interpreted as errors by a GUI.

    SET XACT_ABORT ON; --Not required here but a good habit to get into for

    --when you use explict transactions to short-circuit

    --when an error occurs.

    --===== Declare obviously named variable(s)

    DECLARE @SQL VARCHAR(8000)

    ;

    --===== Conditionally drop the staging table to make reruns in SSMS Easier.

    -- This section may be commented out for production.

    IF OBJECT_ID('tempdb..#ImportStaging','U') IS NOT NULL

    DROP TABLE #ImportStaging

    ;

    --===== Create the staging table

    CREATE TABLE #ImportStaging

    (

    ArticleID VARCHAR(50),

    TagsID VARCHAR(8000)

    )

    ;

    --=============================================================================

    -- Load the given file into the staging table using the two columns that

    -- are separated by the semi-colon.

    --=============================================================================

    --===== Create the dynamic SQL using the given filepath.

    -- The WHERE clause prevents both SQL and DOS injection.

    SELECT @SQL = '

    BULK INSERT #ImportStaging

    FROM ''' + @pFilePath + '''

    WITH (

    FIELDTERMINATOR = '';'',

    FIRSTROW = 2,

    TABLOCK

    )

    ;'

    WHERE @pFilePath NOT LIKE '%[^a-zA-Z0-9\:. ]%'

    ;

    --===== Execute the dynamic SQL from above to do the actual file load

    -- into the staging table

    EXEC (@SQL)

    ;

    --=============================================================================

    -- Using the splitter function previously shown, split the tags for each

    -- ArticleID to produce the list you wanted just like before except that

    -- we're inserting the result into "MyTable" (for this example).

    --=============================================================================

    INSERT INTO dbo.MyTable

    (ArticleID, CategorieID)

    SELECT s.ArticleID, CategorieID = split.Item

    FROM #ImportStaging s

    CROSS APPLY dbo.DelimitedSplit8K(s.TagsID,',') split

    ;

    GO

    Ok, if you've been following along by running the code I've provided so far, run the following code to verify that the dbo.MyTable is empty.

    SELECT * FROM dbo.MyTable;

    That returns the following:

    ArticleID CategorieID

    --------- -----------

    (0 row(s) affected)

    Now, to import the first file (C:\Temp\JBMDemo1.csv), just execute the stored procedure with the proper path in quotes, like this...

    EXEC dbo.ImportArticleCategory 'C:\Temp\JBMDemo1.csv';

    If you run the following code (again), you'll see that the file has been successfully imported into the "MyTable" table.

    SELECT * FROM dbo.MyTable;

    Results:

    ArticleID CategorieID

    --------- -----------

    1500 7

    1500 8

    1500 9

    1500 5

    1501 4

    1501 3

    1502 8

    1502 1

    1503 8

    1503 9

    (10 row(s) affected)

    Now, use the proc to import the next file like this...

    EXEC dbo.ImportArticleCategory 'C:\Temp\JBMDemo2.txt';

    When we run the SELECT on MyTable again, we can see that the second file's information has been imported without changing anything previously imported.

    SELECT * FROM dbo.MyTable;

    Results:

    ArticleID CategorieID

    --------- -----------

    1500 7

    1500 8

    1500 9

    1500 5

    1501 4

    1501 3

    1502 8

    1502 1

    1503 8

    1503 9

    Jeff Moden

    Jeff Zidane

    Jeff Headbutt

    5 1

    5 4

    5 6

    (16 row(s) affected)

    I'm sure that you have a million questions but, as you can see, this "problem" is really easy to crack with that DelimitedSplit8K splitter and some relatively easy code. Study the things I used in "Books Online" (press the {f1} key in SSMS to get there) and you're well on your way to shaking off the "newbie fur". 😛

    Don't forget to Google and study about "DOS Injection" and "SQL Injection".

    Let us know if you have more questions about this.

    P.S. You could also use "SSIS" for this but someone else will have to cover how to do that because, well, I just use T-SQL to do things like this because it's easier for me. In other words, I've never used "SSIS" for such an easy thing and certainly can't explain how to do it with "SSIS". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)