I have an application that regularly passes queries that are cartesian products. I am trying to get some documentation together to send to the people who use this app so they will understand how bad this is. I am trying to calculate the number of records that would be produced and want to make sure that I am correct.
For Example: I see a query like ...
select B.BANK_SETID, B.BANK_CD, B.BANK_ACCT_KEY, B.PROCESS_INSTANCE from PS_BUS_UNIT_INTFC A, PS_PAYMENT_TBL B, PS_PYMNT_VCHR_XREF C, PS_VOUCHER D, PS_BANK_ACCT_TBL E, PS_BUS_UNIT_TBL_GL F, PS_BUS_UNIT_TBL_GL H, PS_BU_LED_AP_VW I
There is NO where clause!! And, one table is referenced twice. I have no idea why this was built this way.
Here are the number of records in each table:
PS_BUS_UNIT_INTFC has 4 recordsPS_PAYMENT_TBL has 74740 recordsPS_PYMNT_VCHR_XREF has 145521 recordsPS_VOUCHER has 144263 recordsPS_BANK_ACCT_TBL has 28 recordsPS_BUS_UNIT_TBL_GL has 4 recordsPS_BU_LED_AP_VW has 4 records
Since this would produce results of all records of all tables then this query would produce 28,362,348,734,709,280 records. Correct?? That would be 4*74740*145521*144263*28*4*4=28,362,348,734,709,280
yep, it's just multiply all the row counts
don't the users notice that their results don't make sense (I'm assuming they don't)? actually, don't answer that one ...
i've seen in some apps a default behaviour of not allowing this (no on/where clauses) so the user has to deliberately say they want to do a cartesian product
o/w maybe give them prejoined views to work with?