Hi Jeff, thanks for your interest. Ok, you asked for it so here goes...
Some background info first. We have an ASP.NET Web Application that serves up a list of names in a dynamic tree control. The nodes of the tree are generated on demand and loaded into the tree dynamically. When this list of names become very big (and indeed it will be), the delay takes too long. So to minimize the delay we will group the names into dynamically created folders. Each folder will contain n number of names based on some passed in limit (In my code below this is the variable @NthRow). This way we can mimimize the number of nodes to fetch yet still support a large number of names.
The final output of our function must be a table of strings in the form of X - Y where X is the starting name and Y is the ending name of the group.
For example, given the following names and @NthRow = 5
Baker, Ms. Jamie
Espinosa, Ms. Jean
Gonzales, Mr. Robert
Holland, Ms. Julia
Jimenez, Ms. Soni
Macdonald, Mr. Mickey
Miller, Ms. Jana
Noriega, Ms. Michelle
Owen, Ms. Alma
Ramirez, Ms. Stephanie
Salas, Ms. Jeanine
Sipes, Mr. Tyler
Tamayo, Ms. Laura
Timmer, Ms. Julie
Trevino, Ms. Yvonne
Young, Ms. Annia
Produce an output table of:
Ba-Ji represents the names Baker, Ms. Jamie thru Jimenez, Ms. Soni
Ma-Ra represents the names Macdonald, Mr. Mickey thru Ramirez, Ms. Stephanie
Sa-Tr represents the names Salas, Ms. Jeanine thru Trevino, Ms. Yvonne
Y-Y represents the name Young, Ms. Annia
Got it? Now for the code that I have so far, which works mind you, but may not be very efficient and/or elegant as I'm a T-SQL newbie. As you can probably tell from my style, I am a developer. This style may or may not lead to very efficient code in T-SQL. Any hoot, please feel free to comment on how I can better follow DBA "best practices" where you see fit.
The code itself is not in a function yet as I wanted to nail down the agloritm first. So I was just working of a script file. below is the content of what I have thus far.
I bet you're sorry you asked.
DECLARE @RowId INT
DECLARE @maxRowId INT
DECLARE @StartName VARCHAR(8000)
DECLARE @EndName VARCHAR(8000)
DECLARE @LastFolderNodeStartNameID INT
DECLARE @i INT
DECLARE @len INT
DECLARE @NodeText VARCHAR(8000)
DECLARE @FullNodeTextCount INT
DECLARE @NthRow INT
-- Temp table variable to store raw query results with an ID
DECLARE @raw TABLE
ID int IDENTITY(1,1),
-- Temp table variable to store long versions of both the starting text and the ending text
DECLARE @FullNodeText TABLE
ID int IDENTITY(1,1),
-- Temp table variable to store results
DECLARE @Results TABLE
-- Initialize tables and variables
SET @NthRow = 100 -- This controls how many nodes will be represented by each folder node. Later this will be passed in as an argument to the function
-- Populate the Raw table
INSERT INTO @raw (RawText)
SELECT RTRIM(LastName) + ', ' + RTRIM(Title) + ' ' + RTRIM(FirstName) + ' ' + RTRIM(MiddleName) + ' ' + RTRIM(Suffix) AS RawText
WHERE KeyHolderID > 0
ORDER BY RawText
-- Populate the FullNodeText table with with every nth and nth + 1 raw text
INSERT INTO @FullNodeText(StartText,EndText)
SELECT R1.RawText, R2.RawText
FROM @raw R1
INNER JOIN @raw R2 ON R1.ID + @NthRow - 1 = R2.ID
WHERE R1.ID % @NthRow-1 = 0
-- How many text nodes were found?
SELECT @MaxRowId = MAX(ID) FROM @raw
-- Were there any remainders?
IF @MaxRowId ((SELECT COUNT(*) FROM @FullNodeText) * @NthRow )
-- In the case we have a remainder, we still need to add the last folder node start and end names to our final results.
-- This occurs when there are remaining node(s) after extracting the nth and nth + 1 row (previously).
-- To do this simply calculate the IDs and insert them into the FullNodeText table.
SET @LastFolderNodeStartNameID = (SELECT COUNT(*) FROM @FullNodeText) * @NthRow + 1
SET @StartName = (SELECT RawText FROM @raw WHERE id = @LastFolderNodeStartNameID)
SET @EndName = (SELECT RawText FROM @raw WHERE id = @MaxRowId)
INSERT INTO @FullNodeText(StartText,EndText) VALUES(@StartName, @EndName)
-- Determine the start and end row IDs (@FullNodeText.ID is an Identity column, auto-incrementing starting at 1)
SELECT @RowId = MIN(ID), @MaxRowId = MAX(ID) FROM @FullNodeText
SET @StartName = ''
SET @EndName = ''
-- Iterate through the FullNodeText table one row at a time.
-- For each row, compare the Start text with the end text one character at a time starting
-- from the left until a difference is detected. When the difference is found insert the
-- strings into another temp table which is our final function results
WHILE @RowId <= @MaxRowId BEGIN
DECLARE @StartText VARCHAR(8000)
DECLARE @EndText VARCHAR(8000)
-- Fetch the values from the @FullNodeText table one row at a time
SELECT @StartName = StartText, @EndName = EndText
WHERE ID = @RowId
-- It is possible to have a single remaining row in which case the
-- start name would
IF @StartName = @EndName
SET @StartText = SUBSTRING(@StartName, 1, 1)
SET @EndText = SUBSTRING(@EndName, 1, 1)
INSERT INTO @Results (ID, NodeText) VALUES(-1, @StartText + '-' + @EndText)
-- compare on character at a time starting at the left until different.
SET @i = 1
SET @len = Len(@EndName)
WHILE @i <= @Len BEGIN
IF (UPPER(LEFT(@StartName, @i))) UPPER((LEFT(@EndName, @i)))
SET @StartText = SUBSTRING(@StartName, 1, @i + 1)
SET @EndText = SUBSTRING(@EndName, 1, @i + 1)
SET @NodeText = @StartText + '-' + @EndText
INSERT INTO @Results (ID, NodeText) VALUES(-1, @NodeText)
SET @i = @i + 1
-- increment loop counter
SET @RowId = @RowId + 1
SELECT * FROM @Results