Need help to query a non-normalized database

  • Hi!

    I need to create a report regarding the victims of crimes.

    I've joined all the tables I need.

    But, now I'm stumped and can't proceed.

    This is a commercial-off-the-shelf database, so I can't normalize it.

    I need to connect the LWCHRG.cnt field (numeric) to the LWNAMES.VictimOfCrime field.

    VictimOfCrime is a free-form text field that contains a series of numbers, separated by commas. There may be some embedded spaces as well.

    I need to create a row in the resultset for each victim-crime combo.

    (See attached sample data.)

    For VIctim#1 in the LWNAMES table, I need to create

    8 rows in the result set (crimes 3,4,6,8,10,12,14,15)

    For VIctim#2 in the LWNAMES table, I need to create

    7 rows in the result set (crimes 1,2,5,7,9,11,13)

    For VIctim#3 in the LWNAMES table, I need to create

    3 rows in the result set (crimes 1,5,2)

    Please help! Our software vendor doesn't have any "canned" reports that will give us what we need.

    So far, none of their TechSupport people have been able to help me.

  • Check out Jeff Moden's splitter to do this: http://www.sqlservercentral.com/articles/Tally+Table/72993/

    Jared
    CE - Microsoft

  • ckelly

    Where did you get the screen shots ? It doesn't look like SSMS

  • piotrka (3/8/2012)


    ckelly

    Where did you get the screen shots ? It doesn't look like SSMS

    Who said the OP is using SSMS? Could be any 1 of many 3rd party GUIs.

    Jared
    CE - Microsoft

  • Just for fun (and a little learning), I made it work with your example:

    SELECT lw.SomeCode, lw.Seq, lw.LWType, lw.Age,lw.Race,lw.Sex,

    SUBSTRING(','+lw.Crimes+',',N+1,CHARINDEX(',',','+lw.Crimes+',',N+1)-N-1) AS Value

    FROM SCRIDB.dbo.Tally t

    CROSS JOIN lwNames lw

    WHERE N < LEN(','+lw.Crimes+',')

    AND SUBSTRING(','+lw.Crimes+',',N,1) = ','

    ;

    You still need to create the Tally table, though. Just follow the instructions in Jeff's article.

    (Wow, Jeff, that was mind-boggling!)

    Pieter

  • this one loks so interesting, but I'm not able to take the time to convert the data in the screenshots into tables myself.

    can you post a few sample rows of data fromt hose two tables?

    from there it's be really easy to prove my theoretical query works;

    here's my first guess:

    SELECT

    LWNAMES.*,

    mySplitData.*,

    LWCHRG.*

    FROM LWNAMES

    CROSS APPLY dbo.DelimitedSplit8K(LWNAMES.[Victim of Crime],',') mySplitData

    LEFT OUTER JOIN LWCHRG

    ON mySplitData.Item = LWCHRG.CNT

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • pietlinden (3/8/2012)


    Just for fun (and a little learning), I made it work with your example:

    SELECT lw.SomeCode, lw.Seq, lw.LWType, lw.Age,lw.Race,lw.Sex,

    SUBSTRING(','+lw.Crimes+',',N+1,CHARINDEX(',',','+lw.Crimes+',',N+1)-N-1) AS Value

    FROM SCRIDB.dbo.Tally t

    CROSS JOIN lwNames lw

    WHERE N < LEN(','+lw.Crimes+',')

    AND SUBSTRING(','+lw.Crimes+',',N,1) = ','

    ;

    You still need to create the Tally table, though. Just follow the instructions in Jeff's article.

    (Wow, Jeff, that was mind-boggling!)

    Pieter

    Thank you for the feedback, Pieter.

    The code above is using the "old" version of a splitter that a whole lot of people before me have used. It has a major performance problem associated with it when the number of elements per row increases. There's a new, much faster, much more linear splitter available as an attachment to the following article.

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ckelly 43038 (3/8/2012)


    Hi!

    I need to create a report regarding the victims of crimes.

    I've joined all the tables I need.

    But, now I'm stumped and can't proceed.

    This is a commercial-off-the-shelf database, so I can't normalize it.

    I need to connect the LWCHRG.cnt field (numeric) to the LWNAMES.VictimOfCrime field.

    VictimOfCrime is a free-form text field that contains a series of numbers, separated by commas. There may be some embedded spaces as well.

    I need to create a row in the resultset for each victim-crime combo.

    (See attached sample data.)

    For VIctim#1 in the LWNAMES table, I need to create

    8 rows in the result set (crimes 3,4,6,8,10,12,14,15)

    For VIctim#2 in the LWNAMES table, I need to create

    7 rows in the result set (crimes 1,2,5,7,9,11,13)

    For VIctim#3 in the LWNAMES table, I need to create

    3 rows in the result set (crimes 1,5,2)

    Please help! Our software vendor doesn't have any "canned" reports that will give us what we need.

    So far, none of their TechSupport people have been able to help me.

    You sent me a PM on this one talking about some sort of problem posting. It doesn't look like you actually have a problem with that.

    As I said in my response to that PM, a .JPG won't allow us to deliver the goods in the form of tested code. Please submit data in the format requested in that PM. That same format is also explained in the first link in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Okay, got the new one to work too... took a little while, so maybe I learned something in the process... (To test on larger datasets???)

    ALTER PROCEDURE [dbo].[uspNormalizeData3]

    AS

    -- Jeff Moden's new and improved Splitter

    SELECT lw.SomeCode, lw.Seq, lw.lwtype, lw.age, lw.race, lw.sex,split.ItemNumber, Item = CAST(REPLACE(QUOTENAME(split.Item,'"'),'"','') AS integer)

    FROM FixLongLat.dbo.lwNames AS lw

    CROSS APPLY dbo.DelimitedSplit8k(lw.Crimes,',') split;

    The whole CAST thing was because it was (strangely enough) returning my value as a string (complete with the quotes) and I wanted a number... But it works now. Does that make me officially S-M-R-T now? (What if I burn my diploma?)

  • Just for fun, I'll post the create table and insert statements that I used to figure this out... no doubt I should have done more (to see how slow the old version of Split8K was), but here it is anyway...

    CREATE TABLE lwNames(

    SomeCodeCHAR(2),

    SeqINT,

    LWTypeCHAR,

    AgeINT,

    Race CHAR,

    SexCHAR,

    CrimesVARCHAR(25)

    );

    Go

    INSERT INTO lwNames(SomeCode, Seq, LWType, Age, Race, Sex, Crimes) VALUES ('VI',1,'S',Null, Null, Null, '3,4,6,8,10,12,14,15');

    INSERT INTO lwNames(SomeCode, Seq, LWType, Age, Race, Sex, Crimes) VALUES ('VI',2,'S',Null, Null, Null, '1,2,5,7,9,11,13');

    INSERT INTO lwNames(SomeCode, Seq, LWType, Age, Race, Sex, Crimes) VALUES ('VI',3,'I',32,'U','F','1,5,2');

    then your SQL statements manipulating the table data (above) go here...

    CREATE PROCEDURE [dbo].[uspNormalizeData3]

    AS

    -- Jeff Moden's new and improved Splitter

    SELECT lw.SomeCode, lw.Seq, lw.lwtype, lw.age, lw.race, lw.sex,split.ItemNumber, Item = CAST(REPLACE(QUOTENAME(split.Item,'"'),'"','') AS integer)

    FROM FixLongLat.dbo.lwNames AS lw

    CROSS APPLY dbo.DelimitedSplit8k(lw.Crimes,',') split

    ;

    GO

    Happy learning!

  • The screen shots are from the software application.

  • Thanks to everyone that has replied to me so far!

    Today, during my workday, I created the attached create-table,insert-rows, and query-code TEXT file.

    I've probably duplicated some of the work that others have done. In my world, we don't "create" tables in the production environment. (A temp table used just during my query should be okay.)

    I kind of understand Jeff Moden's tally table; but not fully.

    Where would I add that code to create it?

    After it's created, how do I use it?

    I apologize for being so obtuse. I'm a visual learner, and have to "see"

    all the resultset for each of the steps, particularly with a totally new concept.

  • ckelly 43038 (3/9/2012)


    Thanks to everyone that has replied to me so far!

    Today, during my workday, I created the attached create-table,insert-rows, and query-code TEXT file.

    I've probably duplicated some of the work that others have done. In my world, we don't "create" tables in the production environment. (A temp table used just during my query should be okay.)

    I kind of understand Jeff Moden's tally table; but not fully.

    Where would I add that code to create it?

    After it's created, how do I use it?

    I apologize for being so obtuse. I'm a visual learner, and have to "see"

    all the resultset for each of the steps, particularly with a totally new concept.

    Personally, for splitter functionality I prefer to use a CTE instead of a physical table.

    Jared
    CE - Microsoft

  • okay, since I read the last how-to instructions, I've been trying to understand and make it work

    Since my tables are live data, I don't need to create them.

    Here's the code that I have in my query window...

    CREATE FUNCTION dbo.DelimitedSplit8K

    -- Define I/O parameters

    (@pString VARCHAR(8000), @pDelimiter CHAR(1))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    -- "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT 0 UNION ALL

    SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    ),

    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT t.N+1

    FROM cteTally t

    WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),

    Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))

    FROM cteStart s

    ;

    go

    SELECT

    lw.name_code, lw.nc_seq, lw.name_type, lw.age, lw.race, lw.sex,

    split.ItemNumber,

    Item = CAST(REPLACE(QUOTENAME(split.Item,'"'),'"','') AS integer)

    FROM [FAYOSSIRMS].[rms].[dbo].[lwnames] as lw

    where name_code like 'VI%'

    CROSS APPLY dbo.DelimitedSplit8K(lw.vic_crime,',') split

    I'm getting an error that says...

    Msg 156, Level 15, State 1, Line 8

    Incorrect syntax near the keyword 'CROSS'

    A question I have is this: Does this code create a table for all the rows that match my "VI" criterion, or is only a handful of records created for the values of the vic_crime field--row by row?

  • that was a minor syntax issue...the where statemetn was occurring before the join;

    change to this:

    SELECT

    lw.name_code, lw.nc_seq, lw.name_type, lw.age, lw.race, lw.sex,

    split.ItemNumber,

    Item = CAST(REPLACE(QUOTENAME(split.Item,'"'),'"','') AS integer)

    FROM [FAYOSSIRMS].[rms].[dbo].[lwnames] as lw

    CROSS APPLY dbo.DelimitedSplit8K(lw.vic_crime,',') split

    where name_code like 'VI%'

    and giv that a whirl!

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply