Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

How to import non-delimited text file Expand / Collapse
Author
Message
Posted Wednesday, July 3, 2013 12:07 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, November 20, 2014 9:16 AM
Points: 8, Visits: 46
Hi, all. I (will) have a non-delimited text file containing 35 lines each for over 5000 records. Below is an example of lines 1-5 and 30-35 of one record. I need two of the 35 values (SubnetMask and SubnetDescription) for each of the 5000+ records in this long, continuous file. Just looking for high-level ideas on the best way to handle this. All my other processes are in SSIS, so will ultimately integrate this import/parse solution in the flow. Any thoughts appreciated.

SubnetAddress=10.16.224.128
SubnetName=FOS-Salzburg Kasernenstraase #823 VPN
SubnetMask=255.255.255.128
NetworkAddress=10.0.0.0
LocationID=895
...
SubnetDescription=CHINA-BEIJING-C025 F2 XIANGJIANG CNVASX1706
DHCPOptionTemplate=
DHCPPolicyTemplate=
PrimaryInterface=No
AllowDHCPClientsModifyDynamicObjectResourceRecords=Same As in Global Policies
Post #1470186
Posted Wednesday, July 3, 2013 1:53 PM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 11:35 AM
Points: 4,482, Visits: 3,938
This is a pretty interesting problem. I don't know if this is the best way to handle it, but here's one approach. Let me admit up-front that I've not tried this, but it seems feasible. The only gotcha I see is if the values themselves will contain =, which is addressed below.

I'd start by creating an XML format file for it and an inbox table to receive the data. Import the file using BULK INSERT into the inbox table. If the = doesn't appear in the values themselves, the inbox table can consist of 2 columns - name and value. If it does, use one column and split the string on the first = in the field. An identity field in the inbox table can help with keeping the sequence of the rows straight.

Next, delete everything from the table where the name isn't one of the two you need.

You now have a table of rows containing everything you need. Use the identity field to sort (so things are paired up correctly) and you can insert it from the inbox table into a permanent table for long-term storage.

Like I said, I've not tried this. If there's a better way, I'm definitely interested in it.



Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Post #1470215
Posted Wednesday, July 3, 2013 2:05 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 10:08 AM
Points: 3,937, Visits: 8,927
Here's an idea, you might have to work on it.
First, insert into a dummy table with an identity column and a varchar column (the length might depend on your file.
Then, with that dummy table you can work something around like this. I used a CTE to represent the dummy table.

WITH Dummy(ID, String) AS(
SELECT 1,'SubnetAddress=10.16.224.128' UNION ALL
SELECT 2,'SubnetName=FOS-Salzburg Kasernenstraase #823 VPN' UNION ALL
SELECT 3,'SubnetMask=255.255.255.128' UNION ALL
SELECT 4,'NetworkAddress=10.0.0.0' UNION ALL
SELECT 5,'LocationID=895' UNION ALL
SELECT 30,'SubnetDescription=CHINA-BEIJING-C025 F2 XIANGJIANG CNVASX1706' UNION ALL
SELECT 31,'DHCPOptionTemplate=' UNION ALL
SELECT 32,'DHCPPolicyTemplate=' UNION ALL
SELECT 33,'PrimaryInterface=No' UNION ALL
SELECT 34,'AllowDHCPClientsModifyDynamicObjectResourceRecords=Same As in Global Policies'
),
Split AS(
SELECT LEFT( String, CHARINDEX( '=', String) - 1) columnname,
SUBSTRING( String, CHARINDEX( '=', String) + 1, 8000) value,
ROW_NUMBER() OVER( PARTITION BY LEFT( String, CHARINDEX( '=', String) - 1) ORDER BY ID) rn
FROM Dummy
)
SELECT MAX( CASE WHEN columnname = 'SubnetMask' THEN value END) AS SubnetMask,
MAX( CASE WHEN columnname = 'SubnetDescription' THEN value END) AS SubnetDescription
FROM Split
GROUP BY rn




Luis C.
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1470220
Posted Wednesday, July 3, 2013 2:21 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, November 20, 2014 9:16 AM
Points: 8, Visits: 46
Wow, I very much appreciate this input. I follow what both of you are saying and never would have thought of either. I will attempt to implement both and see which one I like best. Will be a good learning experience as I'm fairly new to SSIS. Again, thank you.
Post #1470227
Posted Wednesday, July 3, 2013 3:16 PM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 11:35 AM
Points: 4,482, Visits: 3,938
I really like the approach Luis took to split the data. You could use the BULK INSERT into a single field from my approach and then split the values using the his CTE. I think his Dummy table is meant to implement the physical inbox table with a single field from my example in a reproducible form.


Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Post #1470247
Posted Wednesday, July 3, 2013 3:20 PM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 11:35 AM
Points: 4,482, Visits: 3,938
I should have pointed out that you don't have to use SSIS for this. You could implement this is a SQL procedure using the BULK INSERT statement. The only caveat here is that the file you're importing from must be accessible in the file system on the server. The permission required to use BULK INSERT is the server role bulk_admin or better.


Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Post #1470248
Posted Wednesday, July 3, 2013 3:31 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 10:08 AM
Points: 3,937, Visits: 8,927
You're right Ed, I called it a dummy table and you called it inbox table but we're referring to the same thing. There's not much difference on our methods, it's mostly a way on which they're expressed.

SSIS would be optional. It can be done with or without it depending on personal preferences or policies.



Luis C.
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1470253
Posted Wednesday, July 3, 2013 4:17 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, November 20, 2014 9:16 AM
Points: 8, Visits: 46
Thanks for the clarification guys, helps a lot. I can, indeed, access the files from the SQL Server so a bulk insert should work well. I will implement via a stored proc and simply call it from within my SSIS flow, where this and other imports will be automated. Part of my SSIS flow will be to copy the text file from a network share to a local folder. Thanks to your input, I now know how to proceed and don't have to worry about this over this 4th-of-July weekend. Much appreciated. :)
Post #1470260
Posted Wednesday, July 3, 2013 7:59 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Monday, November 24, 2014 4:53 AM
Points: 3,422, Visits: 5,368
Similar to Luis's suggestion:

WITH Dummy(ID, String) AS(
SELECT 1,'SubnetAddress=10.16.224.128' UNION ALL
SELECT 2,'SubnetName=FOS-Salzburg Kasernenstraase #823 VPN' UNION ALL
SELECT 3,'SubnetMask=255.255.255.128' UNION ALL
SELECT 4,'NetworkAddress=10.0.0.0' UNION ALL
SELECT 5,'LocationID=895' UNION ALL
SELECT 30,'SubnetDescription=CHINA-BEIJING-C025 F2 XIANGJIANG CNVASX1706' UNION ALL
SELECT 31,'DHCPOptionTemplate=' UNION ALL
SELECT 32,'DHCPPolicyTemplate=' UNION ALL
SELECT 33,'PrimaryInterface=No' UNION ALL
SELECT 34,'AllowDHCPClientsModifyDynamicObjectResourceRecords=Same As in Global Policies'
)
SELECT SubnetMask=MAX(CASE [key] WHEN 'SubnetMask' THEN [value] END)
,SubnetDescription=MAX(CASE [key] WHEN 'SubnetDescription' THEN [value] END)
FROM Dummy a
CROSS APPLY (
SELECT [key]=LEFT(String, CHARINDEX('=', String)-1)
,[value]=RIGHT(String, LEN(String)-CHARINDEX('=', String))) b




My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1470283
Posted Monday, July 8, 2013 11:40 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, November 20, 2014 9:16 AM
Points: 8, Visits: 46
Dwain.c, your code generates a single output row, not one for each record (comprising 35 rows) in the source file.
Post #1471302
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse