SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Parsing a summary / detail flat file


Parsing a summary / detail flat file

Author
Message
Brandie Tarvin
Brandie Tarvin
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14791 Visits: 9003
My team has a flat file from another system that we need to parse and import into 2 tables. We're currently importing this into a staging table then parsing the staging table RBAR style in WHILE loop. As you can imagine, this is causing us an extraordinary amount of pain. It's taking several days to process one file and the bigger they get...

Each line on the file has a record id. There's 1 (the header), 2 (the vendor info), and 3 (the details). A vendor can have 1-N number of 3 records listed after it but (and here's the kicker) none of the detail records have any identifying information that connect it to the vendor info in record 2. The only way we know they are connect is by the order.

Example Data:

1MyFile05122013
2VendorID123 StartDate Terminated AgreementNum AnotherCol AnotherCol2
3ContractNum SaleDate Product Region Col1 Col2 Col3
2VendorID456 StartDate Terminated AgreementNum AnotherCol AnotherCol2
3ContractNum SaleDate Product Region Col1 Col2 Col3
3ContractNum SaleDate Product Region Col1 Col2 Col3
3ContractNum SaleDate Product Region Col1 Col2 Col3
3ContractNum SaleDate Product Region Col1 Col2 Col3
2VendorID789 StartDate Terminated AgreementNum AnotherCol AnotherCol2
3ContractNum SaleDate Product Region Col1 Col2 Col3
3ContractNum SaleDate Product Region Col1 Col2 Col3

So all the 3 records contain all the details of the vendor that are above it. But again, there's no identifying information between the 3 records and the 2 records. To add to this burden, the report is a rolling 3 month report of all records, so we can't trucate the tables and start over from scratch because it's not inception to date. Also, we have to delete out the current month (though I'm working to get that changed).

Our Staging table looks like this:


CREATE TABLE [dbo].[Staging](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Extract_Record] [varchar](2000) NULL,
[UNID] [int] NULL
) ON [PRIMARY]

GO



Because we can't immediately tell what record is what and we don't want to mess up the order, we insert all the lines into the Extract_Record column and parse everything out later with substrings. We use UNID to create a unifying ID for all these records and insert them into their tables.

Forgive me for scrubbing the heck out of all my column names, but I'm erroring on the side of protecting our business. Here's what our code looks like:



BEGIN
DECLARE @i INT,
@max INT,
@rec SMALLINT,
@newid INT,
@TransactionID INT,
@unid INT;

--Below code sets unid to identify complete record sets

SELECT @i=1,@max=MAX(id),@newid=0
FROM dbo.Staging;

WHILE @i<=@max
BEGIN
SELECT @rec=CONVERT(SMALLINT,LEFT(extract_RI,1))
FROM dbo.Staging
WHERE id=@i;

IF @rec>1 AND @rec<4
BEGIN
IF @rec=2
BEGIN
SET @newid=@newid+1;

UPDATE dbo.Staging
SET unid=@newid
WHERE id=@i;
END
ELSE
BEGIN
UPDATE dbo.Staging
SET unid=@newid
WHERE id=@i;
END
END

SET @i=@i+1;
END

SELECT @i=1,@unid=NULL, @rec=NULL;

WHILE @i<=@max
BEGIN
SELECT @unid=unid
FROM dbo.Staging
WHERE unid IS NOT NULL and id=@i;

IF @unid IS NOT NULL
BEGIN
SELECT @rec=CONVERT(SMALLINT,LEFT(extract_RI,1))
FROM dbo.Staging
WHERE id=@i;

IF @rec=2
BEGIN
--SELECT @RI=Substring(extract_RI, 2,10)
--FROM dbo.Staging
--WHERE id=@i;

--Below code deletes current month data. Reporting team only requires previous months data
WITH CurrentMonth AS
(select * from dbo.Staging
WHERE LTRIM(RTRIM(Substring(Extract_Record, 97,6))) = Substring(CONVERT(varchar,GETDATE(),112),1,6))
DELETE FROM dbo.Staging FROM dbo.Staging ser
INNER JOIN CurrentMonth cm
ON ser.UNID = cm.UNID

SELECT @TransactionID=@@IDENTITY;

--Below code loads differential data from staging table to final reporting table

INSERT INTO Summary (Col1,
Col2,
Col3,
Col4,
Col5,
Col6,
Col7,
Col8,
Col9
)
SELECT
LTRIM(RTRIM(Substring(Extract_Record, 2,4))) AS Col1,
LTRIM(RTRIM(Substring(Extract_Record, 6,10))) AS Col2,
LTRIM(RTRIM(Substring(Extract_Record, 16,3))) AS Col3,
LTRIM(RTRIM(Substring(Extract_Record, 19,20))) AS Col4,
LTRIM(RTRIM(Substring(Extract_Record, 39,8))) AS Col5,
LTRIM(RTRIM(Substring(Extract_Record, 47,10))) AS Col6,
LTRIM(RTRIM(Substring(Extract_Record, 57,10))) AS Col7,
LTRIM(RTRIM(Substring(Extract_Record, 67,30))) AS Col8,
LTRIM(RTRIM(Substring(Extract_Record, 97,8))) AS Col9
FROM dbo.Staging ri
LEFT OUTER JOIN Summary rit
ON LTRIM(RTRIM(Substring (ri.Extract_Record, 2,4))) = LTRIM(RTRIM(rit.Col1))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 6,10))) = LTRIM(RTRIM(rit.Col2))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 16,3))) = LTRIM(RTRIM(rit.Col3))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 19,20)))= LTRIM(RTRIM(rit.Col4))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 39,8))) = LTRIM(RTRIM(rit.Col5))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 47,10)))= LTRIM(RTRIM(rit.Col6))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 57,10)))= LTRIM(RTRIM(rit.Col7))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 67,30)))= LTRIM(RTRIM(rit.Col8))
AND LTRIM(RTRIM(Substring(ri.Extract_Record, 97,8)))= LTRIM(RTRIM(rit.Col9))
WHERE id=@i
AND rit.Col1 IS NULL
AND rit.Col2 IS NULL
AND rit.Col3 IS NULL
AND rit.Col4 IS NULL
AND rit.Col5 IS NULL
AND rit.Col6 IS NULL
AND rit.Col7 IS NULL
AND rit.Col8 IS NULL
AND rit.Col9 IS NULL;

SELECT @TransactionID=@@IDENTITY;
END
ELSE IF @rec=3
BEGIN

--Below code loads differential data from staging table to final reporting table

INSERT INTO Detail (TransactionID,
Col1,
Col2,
Col3,
Col4,
Col5,
Col6,
Col7,
Col8,
Col9,
Col10,
Col11,
Col12,
Col13,
Col14,
Col15,
Col16)

SELECT @TransactionID,
LTRIM(RTRIM(Substring(Extract_Record, 2,4))) AS Col1,
LTRIM(RTRIM(Substring(Extract_Record, 6,4))) AS Col2,
LTRIM(RTRIM(Substring(Extract_Record, 10,20))) AS Col3,
LTRIM(RTRIM(Substring(Extract_Record, 30,8))) AS Col4,
LTRIM(RTRIM(Substring(Extract_Record, 38,24))) AS Col5,
LTRIM(RTRIM(Substring(Extract_Record, 62,8))) AS Col6,
LTRIM(RTRIM(Substring(Extract_Record, 70,8))) AS Col7,
LTRIM(RTRIM(Substring(Extract_Record, 78,24))) AS Col8,
LTRIM(RTRIM(Substring(Extract_Record, 102,10))) AS Col9,
LTRIM(RTRIM(Substring(Extract_Record, 112,4))) AS Col10,
LTRIM(RTRIM(Substring(Extract_Record, 116,10))) AS Col11,
LTRIM(RTRIM(Substring(Extract_Record, 126,4))) AS Col12,
LTRIM(RTRIM(Substring(Extract_Record, 130,24))) AS Col13,
LTRIM(RTRIM(Substring(Extract_Record, 154,3))) AS Col14,
LTRIM(RTRIM(Substring(Extract_Record, 157,17))) AS Col15,
LTRIM(RTRIM(Substring(Extract_Record, 174,5))) AS Col16
FROM dbo.Staging ri2
LEFT OUTER JOIN sap.tblRITransactionDetail rit2
ON LTRIM(RTRIM(Substring(ri2.Extract_Record, 2,4))) = LTRIM(RTRIM(rit2.Col1))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 6,4))) = LTRIM(RTRIM(rit2.Col2))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 10,20))) = LTRIM(RTRIM(rit2.Col3))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 30,8))) = LTRIM(RTRIM(rit2.Col4))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 38,24)))= LTRIM(RTRIM(rit2.Col5))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 62,8))) = LTRIM(RTRIM(rit2.Col6))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 70,8))) = LTRIM(RTRIM(rit2.Col7))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 78,24))) = LTRIM(RTRIM(rit2.Col8))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 102,10))) = LTRIM(RTRIM(rit2.Col9))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 112,4))) = LTRIM(RTRIM(rit2.Col10))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 116,10))) = LTRIM(RTRIM(rit2.Col11))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 126,4))) = LTRIM(RTRIM(rit2.Col12))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 130,24))) = LTRIM(RTRIM(rit2.Col13))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 154,3))) = LTRIM(RTRIM(rit2.Col14))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 157,17)))= LTRIM(RTRIM(rit2.Col15))
AND LTRIM(RTRIM(Substring(ri2.Extract_Record, 174,5))) = LTRIM(RTRIM(rit2.Col16))
WHERE id=@i
AND rit2.Col1 IS NULL
AND rit2.Col2 IS NULL
AND rit2.Col3 IS NULL
AND rit2.Col4 IS NULL
AND rit2.Col5 IS NULL
AND rit2.Col6 IS NULL
AND rit2.Col7 IS NULL
AND rit2.Col8 IS NULL
AND rit2.Col9 IS NULL
AND rit2.Col10 IS NULL
AND rit2.Col11 IS NULL
AND rit2.Col12 IS NULL
AND rit2.Col13 IS NULL
AND rit2.Col14 IS NULL
AND rit2.Col15 IS NULL
AND rit2.Col16 IS NULL;

END
END

SET @i=@i+1;
END
END



So, messy, slow and very very annoying. I've got a few thoughts on how we can start to fix it, but before I share I wanted to see what everyone else was thinking. I'm hoping someone has an ephiphany that can help.

So, any ideas?

Just an FYI: there's no fixing the input file in our immediate future. Just getting this much information was like pulling teeth and the other team takes months upon months (if even that soon) to work changes through their SDLC. So I have to do what I can to mitigate the load issue now with the file that I have.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Sioban Krzywicki
Sioban Krzywicki
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1865 Visits: 8090
Can you use SSIS?
I've done something similar recently with SSIS, so if you can use it I think you have a good option.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
Lynn Pettis
Lynn Pettis
SSC-Dedicated
SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)

Group: General Forum Members
Points: 39894 Visits: 38563
Also, could you mock-up some sample data and provide a destination for the data? Could help figure out how to approach the problem.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Brandie Tarvin
Brandie Tarvin
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14791 Visits: 9003
Stefan Krzywicki (5/13/2013)
Can you use SSIS?
I've done something similar recently with SSIS, so if you can use it I think you have a good option.


Yes, we can use SSIS. In fact we're using SSIS to load the file into our staging table and run the proc that does the update.

I've got a few meetings, but after those I'll dummy up some data and sanitze the destination tables and post them in a bit.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Sioban Krzywicki
Sioban Krzywicki
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1865 Visits: 8090
Brandie Tarvin (5/13/2013)
Stefan Krzywicki (5/13/2013)
Can you use SSIS?
I've done something similar recently with SSIS, so if you can use it I think you have a good option.


Yes, we can use SSIS. In fact we're using SSIS to load the file into our staging table and run the proc that does the update.

I've got a few meetings, but after those I'll dummy up some data and sanitze the destination tables and post them in a bit.


I have two files I need to process every week that have a similar format. Descriptive lines that I don't need to import and multiple header rows that are interspersed with corresponding data rows. I use a conditional split to check the line type and divert it to the proper parsing and insert, which I think you do with the IF @rec=2 line.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
Brandie Tarvin
Brandie Tarvin
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14791 Visits: 9003
Duh. Forgot there is a record 4 line which is the last line of the file. Anyway, sample data attached and CREATE TABLE code as follows:


/****** Object: Table [dbo].[Detail] Script Date: 05/13/2013 10:56:50 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Detail]') AND type in (N'U'))
DROP TABLE [dbo].[Detail]
GO

/****** Object: Table [dbo].[Summary] Script Date: 05/13/2013 10:56:50 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Summary]') AND type in (N'U'))
DROP TABLE [dbo].[Summary]
GO

/****** Object: Table [dbo].[Detail] Script Date: 05/13/2013 10:56:51 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[Detail](
[TransactionDetailID] [int] IDENTITY(1,1) NOT NULL,
[TransactionID] [int] NULL,
[Col1] [varchar](4) NOT NULL,
[Col2] [varchar](4) NOT NULL,
[Col3] [varchar](20) NULL,
[Col4] [varchar](8) NULL,
[Col5] [varchar](24) NULL,
[Col6] [varchar](8) NULL,
[Col7] [varchar](8) NULL,
[Col8] [varchar](24) NOT NULL,
[Col9] [varchar](10) NOT NULL,
[Col10] [varchar](4) NOT NULL,
[Col11] [varchar](10) NOT NULL,
[Col12] [varchar](4) NOT NULL,
[Col13] [varchar](24) NOT NULL,
[Col14] [varchar](3) NOT NULL,
[Col15] [varchar](17) NOT NULL,
[Col16] [varchar](5) NOT NULL,
CONSTRAINT [PK_Detail_TransactionDetailID] PRIMARY KEY NONCLUSTERED
(
[TransactionDetailID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

/****** Object: Table [dbo].[Summary] Script Date: 05/13/2013 10:56:51 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[Summary](
[TransactionID] [int] IDENTITY(1,1) NOT NULL,
[Col1] [varchar](4) NOT NULL,
[Col2] [varchar](10) NOT NULL,
[Col3] [varchar](3) NOT NULL,
[Col4] [varchar](20) NOT NULL,
[Col5] [varchar](8) NOT NULL,
[Col6] [varchar](10) NOT NULL,
[Col7] [varchar](10) NOT NULL,
[Col8] [varchar](30) NULL,
[Col9] [varchar](8) NULL,
CONSTRAINT [PK_Summary_TransactionID] PRIMARY KEY NONCLUSTERED
(
[TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO



Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Attachments
SampleData.txt (20 views, 1.00 KB)
WayneS
WayneS
SSCrazy Eights
SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)

Group: General Forum Members
Points: 9974 Visits: 10574
Brandie - just to verify... you are using SQL 2008? (I know you posted it in this forum, but we both know that folks sometimes make mistakes...)

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
Links: For better assistance in answering your questions, How to ask a question, Performance Problems, Common date/time routines,
CROSS-TABS and PIVOT tables Part 1 & Part 2, Using APPLY Part 1 & Part 2, Splitting Delimited Strings

Brandie Tarvin
Brandie Tarvin
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14791 Visits: 9003
WayneS (5/13/2013)
Brandie - just to verify... you are using SQL 2008? (I know you posted it in this forum, but we both know that folks sometimes make mistakes...)


Yes, I am using 2008. Plain vanilla Enterprise Edition 2008.

NOT R2, BTW.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Chad Crawford
 Chad Crawford
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2856 Visits: 18718
Yo Jeff, sounds like a candidate for the "Quirky Update"?
Robyn Page's Article
Jeff Moden's Article

I added a second UNID column and ran this:

DECLARE @RollingCounterLevel1 int
, @RollingCounterLevel2 int

UPDATE Staging
SET @RollingCounterLevel1 = UNID = CASE LEFT(Extract_Record, 1) WHEN '1' THEN ID ELSE @RollingCounterLevel1 END
, @RollingCounterLevel2 = UNID2 = CASE LEFT(Extract_Record, 1) WHEN '1' THEN NULL WHEN 2 THEN ID ELSE @RollingCounterLevel2 END



It gave me the correct 1st and 2nd level IDs, you might need to tweak it depending if you want the 2nd level value to be itself or the 1st level value.

As I remember there are several prerequisites to make sure the Quirky Update works (hints, parallelism, indexes, etc.) and MS doesn't "guarantee" it will always work, but Jeff couldn't find a situation where it didn't work as he expected.

Chad
wolfkillj
wolfkillj
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1468 Visits: 2582
Brandie,

I couldn't use your sample data because the text file formatting seemed to be inconsistent from line to line, so I made up my own. This solution assumes that you can get the rows into a table in the correct order, parse the rows into columns, and once you can assign an identifying number to the rows that belong together, you can do what you need to do with them. This takes three full scans of the table to work, but with some divide-and-conquer (temp tables instead of CTEs, etc.) supported by proper indexing, you may be able to get it to perform acceptably.

In the output of this code, rows with the same listNbr belong together.



-- Please remove the extraneous letter "x" from the CREATE keyword. My company filters internet traffic with certain T-SQL keywords in it.

CxREATE TABLE dbo.fileParseTest (rowNbr INT IDENTITY(1,1), rowType int, data varchar(20))

INSERT INTO dbo.fileParseTest(rowType, data)

VALUES (1, 'HEADER')
,(2, 'VENDOR123')
,(3, 'CONTRACTX')
,(3, 'CONTRACTY')
,(3, 'CONTRACT1')
,(3, 'CONTRACT2')
,(3, 'CONTRACT3')
,(3, 'CONTRACT4')
,(3, 'CONTRACT5')
,(3, 'CONTRACT6')
,(3, 'CONTRACT7')
,(3, 'CONTRACT8')
,(3, 'CONTRACT9')
,(2, 'VENDOR456')
,(3, 'CONTRACTZ')
,(2, 'VENDOR789')
,(3, 'CONTRACTA')
,(3, 'CONTRACTB')
,(2, 'VENDORXYZ')
,(3, 'CONTRACT1')

;

WITH cteA AS (

SELECT ROW_NUMBER() OVER (ORDER BY rowNbr) as listNbr
,rowNbr

FROM dbo.fileParseTest

WHERE rowType = 2

)

,cteB AS (

SELECT c1.listNbr, c1.rowNbr AS rowNbrStart, ISNULL(c2.rowNbr - 1, 100000) AS rowNbrEnd

FROM cteA c1

LEFT OUTER JOIN cteA c2
ON c1.listNbr + 1 = c2.listNbr

)

SELECT b.listNbr, fpt.rowNbr, fpt.rowType, fpt.data

FROM cteB b

LEFT OUTER JOIN dbo.fileParseTest fpt
ON fpt.rowNbr BETWEEN b.rowNbrStart and b.rowNbrEnd





listNbr rowType data
1 2 VENDOR123
1 3 CONTRACTX
1 3 CONTRACTY
1 3 CONTRACT1
1 3 CONTRACT2
1 3 CONTRACT3
1 3 CONTRACT4
1 3 CONTRACT5
1 3 CONTRACT6
1 3 CONTRACT7
1 3 CONTRACT8
1 3 CONTRACT9
2 2 VENDOR456
2 3 CONTRACTZ
3 2 VENDOR789
3 3 CONTRACTA
3 3 CONTRACTB
4 2 VENDORXYZ
4 3 CONTRACT1



Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search