Multiple self joins with Count function goes wrong

  • Hello everyone 🙂 good site, very comprehensive, though after routing round I’m still having trouble getting to grips with Joins. I’ve developed in Filemaker for many years but I’m now trying to learn SQL. What I’m trying to achieve would be very straight forward in Filemaker but I’ve been going around for circles for days trying to do it with SQL.

    I've created some code to try and demonstrate what's going on. The first file is the main table. The problem occurs in the view (2nd bit of code) when I use the Count / Max functions when the 2nd self Join is introduced (commented out initially)

    use LEIS

    CREATE TABLE dbo.OrderItems

    ( OrderItem int not null IDENTITY (1,1)

    ,OrderNumber int not null

    ,LastItemOnOrder tinyint not null

    ,OrderQty tinyint not null

    ,ItemInStock tinyint not null

    ,pk_lioo varchar(20) not null

    );

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357 , 1 , 1 , 1 , '7032357:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357, 2 , 3 , 1 , '7032357:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357 , 1 , 19 , 2 , '7032357:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7034826 , 1 , 2 , 1 , '7034826:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7034826 , 2 , 1 , 0 , '7034826:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 2 , 1 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 1 , 1 , 2 , '7088650:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 3 , 0 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 1 , 40 , 1, '7088650:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 5 , 2 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 1 , 3 , 1 , '7171003:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 2 , 1 , 1 , '7171003:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 2 , 2 , 2 , '7171003:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7175553 , 1 , 1 , 1 , '7175553:1' )

    Then I want to create a view to validation the data. It all works fine until I introduce the 2nd Join (which i’ve commented out, so try both states) I can see what's going on but I don’t understand why I can’t have two joins in the same select statement. Thanks in advance any pointers (apart from do it Filemaker) would be most appreciated.

    USE LEIS

    GO

    CREATE VIEW dbo.OrderItemsView

    AS

    SELECT fk1.OrderNumber

    ,fk1.LastItemOnOrder

    ,CASE

    WHEN fk1.LastItemOnOrder IN(1,2)

    THEN 'Valid'

    ELSE 'Invalid'

    END LIIOcheck

    ,fk1.OrderQty

    ,CASE

    WHEN fk1.OrderQty BETWEEN 1 AND 9

    THEN 'Valid'

    ELSE 'Invalid'

    END OQcheck

    ,fk1.ItemInStock

    ,CASE

    WHEN fk1.ItemInStock BETWEEN 1 AND 2

    THEN 'Valid'

    ELSE 'Invalid'

    END lIScheck

    ,COUNT(pk2.pk_lioo) countE --this works fine until the 2nd Join is introduced.

    ,CASE

    WHEN COUNT(pk2.pk_lioo) > 1

    THEN 'Invalid'

    ELSE 'Valid'

    END lioocheck

    --,MAX(pk3.OrderQty) MaxQty

    FROM dbo.OrderItems fk1

    left join dbo.OrderItems pk2

    on CAST(fk1.OrderNumber as varchar) + ':1' = pk2.pk_lioo

    --left join dbo.OrderItems pk3 --this causes count results to multiply

    --on fk1.OrderNumber = pk3.OrderNumber

    group by fk1.OrderNumber, fk1.LastItemOnOrder, fk1.ItemInStock, fk1.OrderQty

    go

  • The query with and without the third join works fine for me! What is the error that you are seeing?

  • Wow your first post and you posted all the ddl and sample data perfectly!!! Excellent!!! I wish everyone who posted paid as much attention.

    Now for your issue. I see what you mean about your join aliased pk3. The issue is because you have created a cartersian product. You joined the orders table to itself on OrderNumber. Take a look at the following query that will demonstrate what I mean.

    select * from OrderItems oi --this will return 14 rows as expected

    select * from OrderItems oi

    join orderItems oi2 on oi.OrderNumber = oi2.OrderNumber

    --48 ROWS?!?!?!?!?

    I am not totally sure what you are trying to accomplish in your counts but hopefully the above explanation will help clarify why you are seeing the counts like you are.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (4/13/2012)


    Wow your first post and you posted all the ddl and sample data perfectly!!! Excellent!!! I wish everyone who posted paid as much attention.

    Now for your issue. I see what you mean about your join aliased pk3. The issue is because you have created a cartersian product. You joined the orders table to itself on OrderNumber. Take a look at the following query that will demonstrate what I mean.

    select * from OrderItems oi --this will return 14 rows as expected

    select * from OrderItems oi

    join orderItems oi2 on oi.OrderNumber = oi2.OrderNumber

    --48 ROWS?!?!?!?!?

    I am not totally sure what you are trying to accomplish in your counts but hopefully the above explanation will help clarify why you are seeing the counts like you are.

    as sean said you create the cartersian product (each possible combination of the rows possible where ordernumber = ordernumber) i would try either adding columns to the join or joining on order item. if you can post what you would like your results to look like we can most likely get out what you want from the input you gave us (sample data)


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Thank your time on this. This boils down to my lack of understand of how SQL queries work. I may not have solved this yet but I learning what is probably fundamental to understanding SQL.

    It is possible to set up Cartesian relationships in Filemaker but they don’t have the same effect. If you create a relationship between order number and order number you would still only count the records once and not twice for every match as is happening here.

    I was working on the premise each join was treated separately, what I think is happening is as the query is parsed, SQL creates a temporary table based on the joins, the left join pk2 works fine on its own (the count function works fine) but the left join pk3 relationship seems to set up some sort of outer join which creates the doubling up of the rows (on a match) which gives you a temporary table of 48 rows where there should only be 14 and hence the count no longer works how I would expect it to.

    I have slightly modified the code. We have a table with data entered by the user ([dbo].[OrderItems]) the view ([dbo].[OrderItemsView]) is created temporarily to validate the data entered into the colums (fields) and also checks for ceratin scenarios which should’nt exsist.

    The left join pk2 and the count function are there to determine that there should be only one row (hence combining the two columns OrderNumber and LastItemOnOrder) which has ‘1’ LastItemOnOrder in an OrderNumber set i.e. 7034826

    (the count column in the table is only there to aid bug fixing)

    The left join pk3 and max functions are there to determine if the row has ‘1’ in the LastItemOnOrder it should also be the highest

    (the Min and Max columns in the table are only there to aid bug fixing)

    The data in the view is then inserted into the warehouse table where its stored for reference and used for manual validation (in to spreadsheets by the users)

    I know it may look odd how this has been set up, but its how the system is set up at the moment. My question would be how (if you cant use joins) would you incorporate two selects statements where the current case statements are. Adding subquiries into the create view statement casue and error. How it make sense, this is is quite dificult to put into words. Thanks in advance. (if you’ve read this far and not lost the will to live, hardcore, very comendable 🙂 )

    (Code 1: Creates the user entered data table)

    note: the 2nd join pk3 is not commented out this time so the count function is bad.

    use warehouse

    --drop table dbo.OrderItems

    CREATE TABLE dbo.OrderItems

    ( OrderItem int not null IDENTITY (1,1)

    ,OrderNumber int not null

    ,LastItemOnOrder tinyint not null

    ,OrderQty tinyint not null

    ,ItemInStock tinyint not null

    ,pk_lioo varchar(20) not null

    );

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357 , 1 , 1 , 2 , '7032357:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357, 2 , 3 , 1 , '7032357:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7032357 , 1 , 19 , 3 , '7032357:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7034826 , 1 , 2 , 2 , '7034826:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7034826 , 2 , 1 , 1 , '7034826:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 2 , 1 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 1 , 1 , 5 , '7088650:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 3 , 2 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 1 , 40 , 3 , '7088650:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7088650 , 2 , 5 , 4 , '7088650:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 1 , 3 , 2 , '7171003:1' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 2 , 1 , 1 , '7171003:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7171003 , 2 , 2 , 3 , '7171003:2' )

    INSERT INTO dbo.OrderItems (OrderNumber, LastItemOnOrder, OrderQty, ItemInStock, pk_lioo)

    VALUES (7175553 , 1 , 1 , 1 , '7175553:1' )

    Go

    --SELECT *

    --FROM dbo.OrderItems

    (Code 2: Creates the final (results table)) table warehouse)

    USE [warehouse]

    --drop table dbo.warehouse

    GO

    /****** Object: Table [dbo].[OrderItems] Script Date: 04/17/2012 10:27:10 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[warehouse](

    [warehouseItem] [int] IDENTITY(1,1) NOT NULL

    --,[OrderItem] [int] NOT NULL

    ,[OrderNumber] [int] NOT NULL

    ,[LastItemOnOrder] [tinyint] NOT NULL

    ,[LIIOcheck] [varchar](7) NULL

    ,[OrderQty] [tinyint] NOT NULL

    ,[OQcheck] [varchar](7) NULL

    ,[ItemInStock] [tinyint] NOT NULL

    ,[lIScheck] [varchar](7) NULL

    ,[lioocheck] [varchar](20) NULL

    ,[IISTcheck] [varchar](20) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    [/sql]

    (Code 3: validates the data flags error and inserts into the data warehouse)

    USE [warehouse]

    --DROP VIEW dbo.OrderItemsView

    GO

    /****** Object: View [dbo].[OrderItemsView] Script Date: 04/17/2012 16:00:46 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[OrderItemsView]

    AS

    SELECT

    fk1.OrderNumber

    ,fk1.LastItemOnOrder

    ,CASE

    WHEN fk1.LastItemOnOrder IN(1,2)

    THEN 'Valid'

    ELSE 'Invalid'

    END LIIOcheck

    ,fk1.OrderQty

    ,CASE

    WHEN fk1.OrderQty BETWEEN 1 AND 9

    THEN 'Valid'

    ELSE 'Invalid'

    END OQcheck

    ,fk1.ItemInStock

    ,CASE

    WHEN fk1.ItemInStock BETWEEN 1 AND 2

    THEN 'Valid'

    ELSE 'Invalid'

    END lIScheck

    ,COUNT(pk2.pk_lioo) countE --this works fine until the 2nd Join is introduced.

    ,CASE

    WHEN COUNT(pk2.pk_lioo) > 1

    THEN 'Invalid'

    ELSE 'Valid'

    END lioocheck

    ,CASE

    --WHEN MAX(pk3.ItemInStock) = fk1.ItemInStock and fk1.LastItemOnOrder <> 1

    WHEN MAX(pk3.ItemInStock) > fk1.ItemInStock and fk1.LastItemOnOrder = 1

    THEN 'Invalid'

    ELSE 'Valid'

    END IISTcheck

    ,MAX(pk3.ItemInStock) MaxC

    ,MIN(pk3.ItemInStock) MinC

    FROM dbo.OrderItems fk1

    left join dbo.OrderItems pk2

    on CAST(fk1.OrderNumber as varchar) + ':1' = pk2.pk_lioo

    left join dbo.OrderItems pk3 --this causes count results to multiply

    on fk1.OrderNumber = pk3.OrderNumber

    group by fk1.OrderNumber, fk1.LastItemOnOrder, fk1.ItemInStock, fk1.OrderQty

    GO

    INSERT INTO [warehouse].[dbo].[warehouse] ( [OrderNumber]

    ,[LastItemOnOrder]

    ,[LIIOcheck]

    ,[OrderQty]

    ,[OQcheck]

    ,[ItemInStock]

    ,[lIScheck]

    ,[lioocheck]

    ,[IISTcheck]

    )

    SELECT [OrderNumber]

    ,[LastItemOnOrder]

    ,[LIIOcheck]

    ,[OrderQty]

    ,[OQcheck]

    ,[ItemInStock]

    ,[lIScheck]

    ,[lioocheck]

    ,[IISTcheck]

    FROM dbo.OrderItemsView

    GO

    --Thank you

  • you still have the issue of joining on non unique columns. if you take a look order number 7032357 has 3 rows in the table

    SELECT *

    FROM OrderItems o1

    WHERE OrderNumber = 7032357

    ----------Results----------------------

    OrderItem OrderNumber LastItemOnOrder OrderQty ItemInStock pk_lioo

    ----------- ----------- --------------- -------- ----------- --------------------

    1 7032357 1 1 1 7032357:1

    2 7032357 2 3 1 7032357:2

    3 7032357 1 19 2 7032357:1

    when you add your self join on this is what you get when you join on the ordernumber:

    SELECT *

    FROM OrderItems o1

    INNER JOIN OrderItems o2

    ON o1.OrderNumber = o2.OrderNumber

    WHERE o1.OrderNumber = 7032357

    OrderItem OrderNumber LastItemOnOrder OrderQty ItemInStock pk_lioo OrderItem OrderNumber LastItemOnOrder OrderQty ItemInStock pk_lioo

    ----------- ----------- --------------- -------- ----------- -------------------- ----------- ----------- --------------- -------- ----------- --------------------

    1 7032357 1 1 1 7032357:1 1 7032357 1 1 1 7032357:1

    2 7032357 2 3 1 7032357:2 1 7032357 1 1 1 7032357:1

    3 7032357 1 19 2 7032357:1 1 7032357 1 1 1 7032357:1

    1 7032357 1 1 1 7032357:1 2 7032357 2 3 1 7032357:2

    2 7032357 2 3 1 7032357:2 2 7032357 2 3 1 7032357:2

    3 7032357 1 19 2 7032357:1 2 7032357 2 3 1 7032357:2

    1 7032357 1 1 1 7032357:1 3 7032357 1 19 2 7032357:1

    2 7032357 2 3 1 7032357:2 3 7032357 1 19 2 7032357:1

    3 7032357 1 19 2 7032357:1 3 7032357 1 19 2 7032357:1

    notice that o1.orderitem goes 1 2 3 and o2.orderitem goes 111 for the first three rows. if you join on orderitem you get the following:

    SELECT *

    FROM OrderItems o1

    INNER JOIN OrderItems o2

    ON o1.OrderItem = o2.OrderItem

    WHERE o1.OrderNumber = 7032357

    OrderItem OrderNumber LastItemOnOrder OrderQty ItemInStock pk_lioo OrderItem OrderNumber LastItemOnOrder OrderQty ItemInStock pk_lioo

    ----------- ----------- --------------- -------- ----------- -------------------- ----------- ----------- --------------- -------- ----------- --------------------

    1 7032357 1 1 1 7032357:1 1 7032357 1 1 1 7032357:1

    2 7032357 2 3 1 7032357:2 2 7032357 2 3 1 7032357:2

    3 7032357 1 19 2 7032357:1 3 7032357 1 19 2 7032357:1

    (3 row(s) affected)

    when you join on unique columns you get the results that you want. if you join on multiple columns that together are unique you also would get a similar result as the last query.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • thank you capn.hector, you've demonstrated easily what i was trying to do with words badly. it’s a bit of a woo whooo...d’oh though, i thought it might solve my problem (rather than being a demonstration of what’s going wrong) joining on OrderItem stops the MAX function working correctly which puts us back in a Cartesian....loop i’m going to sleep on it, again thank you for the insight.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply