Create Table and Bulk Insert

Question

Create Table and Bulk Insert

urso47

SSC Eights!

Points: 979
More actions
December 15, 2012 at 11:05 pm

#268112

Hi guys,
I have a .csv file wich has the following information (exactly like this):
List Name: User
Format Version:1.2.5.0
Date:12/11/2012 12:00:34
Equipment.:L6867000214
IP:172.21.12.62
User,Name,Total,B&W
[1600],[technical],76,76
[6611],[Daise],559,559
[900301],[Richard],0,0
I want to execute a create table and a bulk insert transaction but I don't know how to insert the information of the top lines of the .csv file like "Equipment.:L6867000214" and "IP:172.21.12.62". I need to create a collumn called "Equipment", other collumn called "IP" and insert this information (repeat "L6867000214" and "172.21.12.62" in all records bulk inserted). The result must be like this:
IP,Equipment.,User,Name,Total,B&W, Color
172.21.12.62,L6867000214,[1600],[technical],76,76
172.21.12.62,L6867000214,[6611],[Daise],559,559
172.21.12.62,L6867000214,[900301],[Richard],0,0
I have no ideia how to do that. Could anyone here help me out on this issue?
Thanks a lot!

Viewing 15 posts - 1 through 15 (of 46 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 1

urso47 (12/15/2012)
Hi guys,
I have a .csv file wich has the following information (exactly like this):
List Name: User
Format Version:1.2.5.0
Date:12/11/2012 12:00:34
Equipment.:L6867000214
IP:172.21.12.62
User,Name,Total,B&W
[1600],[technical],76,76
[6611],[Daise],559,559
[900301],[Richard],0,0
I want to execute a create table and a bulk insert transaction but I don't know how to insert the information of the top lines of the .csv file like "Equipment.:L6867000214" and "IP:172.21.12.62". I need to create a collumn called "Equipment", other collumn called "IP" and insert this information (repeat "L6867000214" and "172.21.12.62" in all records bulk inserted). The result must be like this:
IP,Equipment.,User,Name,Total,B&W, Color
172.21.12.62,L6867000214,[1600],[technical],76,76
172.21.12.62,L6867000214,[6611],[Daise],559,559
172.21.12.62,L6867000214,[900301],[Richard],0,0
I have no ideia how to do that. Could anyone here help me out on this issue?
Thanks a lot!

Do you ever have multiple "List Name" sections with their matching user data lists (ie: full multiple sets of what you posted) in the same file? Also, what do the brackets signify in the data? Do they simply signify alpha-numeric information as opposed to just numeric information?

And, do you really want a CSV output as you've indicated or are you just using the commas to show which columns in a final table you want?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

urso47 SSC Eights! Points: 979 More actions · Answer 2

Hi Jeff,

Thank you for your response!

Yes, I do have multiple files, and the most important information are the Date, Equipment and the IP. This 5 lines header (considering that it is a file header) comes with each file from different equipments, dates and IP adresses. One file has one header only (5 lines header).

The brackets are just an alpha-numeric information, we can eather leave it or delete it.

Actually, I have to consolidate 6 or 7 different kind of cvs files this one that comes from each vendor and consolidate it in one excel report in order to calculate totals X units prices per client. The comas are to separate the collumns. This one has this kind of header wich makes it harder for me to execute the bulk insert transaction (I am a newbie here). So, probably, I will keep asking for your help for a while.

For the output, if possible, I would like to open it in a customized excel file. If not, I will just copy and paste it, but I wanted to automate it as much as I can.

Thanks a lot!

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 3

Perfect and understood.

I've done things like this a lot so I'll give it a shot. In the mean time, could you post the CREATE TABLE statment for the target table? Thanks.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 4

Here we go... this is just one way to skin this particular cat. If you end up with some really large files or files with line lengths longer than 8k bytes, there are some faster ways to do it all in T-SQL that will require either a CLR splitter or a bit of help from xp_CmdShell, but this method should work just fine for you.

1. To test this code, I used what you posted to create a "urso47Test01.txt" file in C:\Temp. I've attached that file to this post if anyone else wants to "play" with the code. That ZIP file contains everything you need for this example.

2. The code has a couple of dependencies that are very easily resolved. Just read the attached code to find out what those dependencies are. I've included those objects in the attachments, as well.

3. Other than that, just about anything else I say would be superfluous because I document just about everything you need to know in the code. Even the BCP format file is documented within. READ THAT DOCUMENTATION because you may need to make a change before you run the code. The documentation can stay within the file.

4. Last but not least, if you don't know what a "BCP Format File" is, Books Online (Press the {f1} key to get there) is your friend.

5. If you need help running this stored procedure over many files, please post back and we'll give you a leg up... especially since you're not using SSIS to do this.

6. I didn't take the time to export the data to an Excel file. That can't really be easily done from T-SQL unless you can use xp_CmdShell. Typically, its much easier to modify the stored proc to simply capture all of the data you want in a table and then tell Excel to import it or create a "pass through" query to load the data into the spreadsheet.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

urso47 SSC Eights! Points: 979 More actions · Answer 5

Great Jeff!!!

Thanks a lot! I saw it and it seems very, very complex for me to understand rigth away. I will take sometime later on to test the code you provided.

As soon as I test it, I will let you know.

Thanks again!!!!

Andre

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 6

urso47 (12/16/2012)
Great Jeff!!!
Thanks a lot! I saw it and it seems very, very complex for me to understand rigth away. I will take sometime later on to test the code you provided.
As soon as I test it, I will let you know.
Thanks again!!!!
Andre

Thanks for the feedback, Andre. When you're reading the code, just remember the simple stuff it's doing. Other than checking for the position of data to make sure the file is formatted correctly, it just ...

1. Reads the file into a single column.

2. Parses the header info and pivots it into a single line table.

3. Parses the detail info and adds the header information to it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

urso47 SSC Eights! Points: 979 More actions · Answer 7

Hi Jeff,

First of all, I am impressed how helpfull you were creating this code, it is totally amazing! I am so happy to find people like you helping people like me that knows just a little bit.

I created the directory C:Temp/, copied the files into it, executed the 2 files GetFileType01 and The New DelimitedSplit8K Function but I can't even find the table to apply a select or find this SP 🙁

Maybe it is too much to ask but, could you tell me how to use the files? I read the instructions but I still not able to find out how to use it...

Thanks a lot again!

PS.:

1- I have only the weekend to study and practice but I will do my best to do it on weekdays either.

2- Once the table was created, I think would be easier just add the new files into it (use "select into" instead "create table", what do you think?).

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 8

urso47 (12/23/2012)
Hi Jeff,
First of all, I am impressed how helpfull you were creating this code, it is totally amazing! I am so happy to find people like you helping people like me that knows just a little bit.
I created the directory C:Temp/, copied the files into it, executed the 2 files GetFileType01 and The New DelimitedSplit8K Function but I can't even find the table to apply a select or find this SP 🙁
Maybe it is too much to ask but, could you tell me how to use the files? I read the instructions but I still not able to find out how to use it...
Thanks a lot again!
PS.:
1- I have only the weekend to study and practice but I will do my best to do it on weekdays either.
2- Once the table was created, I think would be easier just add the new files into it (use "select into" instead "create table", what do you think?).

The C:\Temp directory was just an example of a "data directory". My bad for not being a bit more clear.

1. You've already done this... From SSMS, execute the "The New DelimitedSplit8K Function.sql" file in the same database as where you intend all of this work to be done.

2. I think you've already done this, as well... From SSMS, execute the "GetFileType01.sql" file in the same database as where you intend all of this work to be done.

3. Copy the "RawData8000.fmt" file to C:\Temp on the server. We'll get back to this in a minute but let's get the test file running first. This is the file necessary to be able to do the imports and allow the rows to be numbered.

4. Copy the "urso47Test01.txt" file to C:\Temp on the server. This is just a test file with the data from your original post in it. It's just for test purposes.

5. We're all set now. Run the following command in SSMS.

EXEC dbo.GetFileType01 'C:\Temp\urso47Test01.txt';

You should see the following display which just shows that the original data was loaded and parsed correctly.

ListNameFormatVersionDate EquipmentIP UserNumberUserNameTotalB&W

User 1.2.5.0 2012-12-11 12:00:34.000L6867000214172.21.12.621600 technical76 76

User 1.2.5.0 2012-12-11 12:00:34.000L6867000214172.21.12.626611 Daise 559 559

User 1.2.5.0 2012-12-11 12:00:34.000L6867000214172.21.12.62900301 Richard 0 0

6. C:\Temp isn't a good place to keep anything permanently. What I recommend is that you copy the "RawData8000.fmt" to the same directory as where your real files are.

7. Now, find the following line in the dbo.GetFileType01 stored procedure...

FORMATFILE = ''C:\Temp\RawData8000.fmt'',

... and change the C:\Temp\ part of that to wherever you just copied the RawData8000.fmt format file to.

8. Pick one of your data files from that same directory and substitute either the full file path or full UNC between the quotes to load the file, parse it, and display it.

EXEC dbo.GetFileType01 'Put full file path or UNC here'

Once you have that working, we'll move onto the next steps.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

urso47 SSC Eights! Points: 979 More actions · Answer 9

Hi Jeff,

Hope you had a great X'mas Eve!

I have to agree with you, this whole code is like a every kid's dream ;-)!

It worked perfectly, even when I don't get whole understanding right now, but I will learn when practicing. I didn't even know about this "CROSS APPLY " and " Item = REPLACE" function. Thank you for teach me step by step like a kid, because for me most part of the code still very hard to understand.

Now, I would like to ask you how could I execute this SP in dozens of files within the same directory (c:temp in this case). Is there a way to call all the files within this directory regardless the file name?

I found the SP under Programmability, but not the table for the "GetFileType01", so I think we have to do an INSERT INTO SomeTable, am I right?

And finally, for some other files some fields will be bigger than 8K, specially one field that brings http adresses longer than 500 or 600 characters and I am having troubles even using the T-SQL or a SQL Server Import Export Tools, it comes with the error message:

Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data conversion for column "cDocumentName" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".

(SQL Server Import and Export Wizard)

Error 0xc020902a: Data Flow Task 1: The "output column "cDocumentName" (58)" failed because truncation occurred, and the truncation row disposition on "output column "cDocumentName" (58)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component. (SQL Server Import and Export Wizard)

Error 0xc0202092: Data Flow Task 1: An error occurred while processing file "C:\MyDocs\SimpressDB\Equitrac\Equitrac.csv" on data row 2.

(SQL Server Import and Export Wizard)

Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Source - Equitrac_csv" (1) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. (SQL Server Import and Export Wizard)

When I use a T-SQL transaction to do it, I found out the this field gets splited and part of it goes to the next fields, even using VARCHAR(MAX) or TEXT configuration in this field. I don't have the "header problem" with this files, but now I faced this problem with a very big/long information to insert in this other table field.

Always thank you, Jeff!

Andre

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 10

urso47 (12/25/2012)
Hope you had a great X'mas Eve!

Thank you and Merry Christmas!

It worked perfectly, even when I don't get whole understanding right now, but I will learn when practicing. I didn't even know about this "CROSS APPLY " and " Item = REPLACE" function. Thank you for teach me step by step like a kid, because for me most part of the code still very hard to understand.

Press the {f1} key to bring up "Books Online". That and this website should become your best friends.

Now, I would like to ask you how could I execute this SP in dozens of files within the same directory (c:temp in this case). Is there a way to call all the files within this directory regardless the file name?

Yes... that's the next "lesson". What is the name of "that" directory?

I found the SP under Programmability, but not the table for the "GetFileType01", so I think we have to do an INSERT INTO SomeTable, am I right?

Correct. We first needed to show you how to parse the data. Next, we'll show you how to insert it into a table and how to do it for many files.

And finally, for some other files some fields will be bigger than 8K, specially one field that brings http adresses longer than 500 or 600 characters and I am having troubles even using the T-SQL or a SQL Server Import Export Tools, it comes with the error message:

I need the complete file layout in order to be able to help there. A sample file would also be very helpful.

When I use a T-SQL transaction to do it, I found out the this field gets splited and part of it goes to the next fields, even using VARCHAR(MAX) or TEXT configuration in this field. I don't have the "header problem" with this files, but now I faced this problem with a very big/long information to insert in this other table field.

Again, I'd need to know the layout for such a file including what the delimiters are (comma, tab, or something else). I'd also need to know if the large column has any column delimiters embedded in the data itself and if the column is encapsulated in any type of "text qualifier".

Always thank you, Jeff!
Andre

My pleasure. If you want to save some time, always include things like a record layout, the CREATE TABLE statement for the target table, and, if you can and without violating any private information or "company proprietary informatio", at least the first 10 lines from the files you'll be working with. Obviously, a file like those containing headers will need to have more lines included to cover the header and about 10 lines of data.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

urso47 SSC Eights! Points: 979 More actions · Answer 11

Thanks Jeff,

I will use a directory at:

C:\Temp\Eq_files\Eq_sample_1.csv and Eq_sample_2.csv

C:\Temp\sync_files\sync_sample_1.csv and sync_sample_2.csv

C:\Temp\pcut_files\pcut_sample_1.csv and pcut_sample_2.csv

C:\Temp\sm_files\V151110456_print.csv and V1511100466_print.csv

All the files are zipped in a folder (sample files, table structure image and t-code). Pcut_files and sm_files have headers. Eq_files have a big length problematic field.

For the pcut_files, the first header row is unnecessary. If possible, I would like to have an Identity field with an Identity Increment for all those tables.

I tried CREATE TABLE t_eq_temp then BULK INSERT on it in order to SELECT INTO t_equitrac table but it didn't work, I can't even remove the double quotes, and the cDocumentName field gets truncated... For the eq_sample_1.csv a BULK INSERT works with all fields as VARCHAR (MAX) but, I still have the unnecessary double quotes and the sDocumentName gets splited and part of it goes to the next fields. The real files have more than 100k rows, I can send you by e-mail if you prefer. Is there a way to avoid duplicated rows in case I try to insert the same file over again?

All files are in original format now, and to be honest, I didn't even imagine that we would go this further with this help and I am very glad for that.

Many Thanks again!

Andre

urso47 SSC Eights! Points: 979 More actions · Answer 12

urso47

SSC Eights!

Points: 979

December 27, 2012 at 2:05 am

#1571768

Up!!!:-D

urso47 SSC Eights! Points: 979 More actions · Answer 13

Hi Celko,

Thank you for your message. I just submitted a download form and a message said that an Astera Software representative will be in contact shortly.

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 14

CELKO (12/27/2012)
I sued Monarch years ago,...

Why did you need to sue them?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)