December 4, 2009 at 12:35 pm
As you can see from the code below, this stored procedure that its big and doesn't produce the "preview" I'm hoping for. It seems like it's trying to display the report, but it never does. I have to force the application to close to stop it in its tracks because 15 minutes goes by without any display.
I was told that this kind of 'SQL coding' produces a very large 'cartesian' table that causes the system to take forever, if at all, display the report results on the screen.
ALTER PROCEDURE [dbo].[pp_Catalog_Items]
(
@facility varchar(10),
@commodity_cat_code varchar(1000)
)
AS
SET NOCOUNT ON
SET ANSI_NULLS ON
SET @commodity_cat_code = dbo.format_string(@commodity_cat_code)
DECLARE @SQL VARCHAR(8000)
SET @SQL =
'SELECT DISTINCT
tidcafac.facility, tidcafac.catalog_id, tidcamst.catalog_desc,
tidcafac.catalog_status, tidcafac.stock_type,
tidcafac.auto_reorder_ind, tidcafac.pre_capital_ind,
tidcamst.unit_of_issue, tidcafac.target_maximum,
tidcawhs.qty_in_whse, tidcafac.avg_unit_price,
tidcawhs.loc_zone, tidcawhs.loc_row, tidcawhs.loc_section,
tidcawhs.loc_tier, tidcawhs.loc_bin, tidcamfr.manufacturer_code,
tidcamfr.manuf_part_number, tidcapur.vendor_code,
tidcapur.vendor_suffix, MAX(CAST(tidrclin.rec_line_date As integer)) AS
rec_line_date,
tidcawhs.last_updated_date, tidcafac.lead_time_order,
tidcafac.lead_time_vendor, tidcafac.leadtime_putaway,
tidcafac.repairable_ind, tidcafac.work_order_nbr,
tidcafac.auto_requisition, tidcapur.auto_award,
tidcapur.bpo_candidate, tidcafac.expensed_item_ind,
tidcamst.commodity_cat_code, tidcamst.commodity_type_cod,
tidcamst.commodity_name_cod, tidcafac.cycle_count_code,
tidcafac.abc_analysis_code, tidacdst.cost_center_charge,
tidacdst.activity_id, tidacdst.account_nbr, tidacdst.sub_account_nbr
FROM
cb10.tidcafac,
cb10.tidcamfr,
cb10.tidcamst,
cb10.tidcapur,
cb10.tidcawhs,
cb10.tidrclin,
cb10.tidacdst
WHERE (
(tidcafac.catalog_id = tidcamst.catalog_id_master)
AND (tidcafac.catalog_id = tidcamfr.catalog_id)
AND (tidcafac.q_level = tidcamfr.q_level)
AND (tidcafac.facility = tidcapur.facility)
AND (tidcafac.catalog_id = tidcapur.catalog_id)
AND (tidcafac.q_level = tidcapur.q_level)
AND (tidcafac.facility = tidcawhs.facility(+))
AND (tidcafac.catalog_id = tidcawhs.catalog_id(+))
AND (tidcafac.q_level = tidcawhs.q_level(+))
AND (tidcafac.facility = tidrclin.facility(+))
AND (tidcafac.catalog_id = tidrclin.catalog_id(+))
AND (tidcafac.q_level = tidrclin.q_level(+))
AND ((tidcafac.facility = '''+@facility+'''))
AND ((tidcamst.commodity_cat_code IN ('+@commodity_cat_code+')))
)
GROUP BY
tidcafac.facility, tidcafac.catalog_id, tidcamst.catalog_desc, cafac.catalog_status,
tidcafac.stock_type, tidcafac.auto_reorder_ind, tidcafac.pre_capital_ind,
tidcamst.unit_of_issue, tidcafac.target_maximum, tidcawhs.qty_in_whse,
tidcafac.avg_unit_price, tidcawhs.loc_zone, tidcawhs.loc_row, cawhs.loc_section,
tidcawhs.loc_tier, tidcawhs.loc_bin, tidcamfr.manufacturer_code,
tidcamfr.manuf_part_number, tidcapur.vendor_code, tidcapur.vendor_suffix,
tidcawhs.last_updated_date, tidcafac.lead_time_order, tidcafac.lead_time_vendor,
tidcafac.leadtime_putaway, tidcafac.repairable_ind, tidcafac.work_order_nbr,
tidcafac.auto_requisition, tidcapur.auto_award, tidcapur.bpo_candidate,
tidcafac.expensed_item_ind, tidcamst.commodity_cat_code,
tidcamst.commodity_type_cod, tidcamst.commodity_name_cod,
tidcafac.cycle_count_code, tidcafac.abc_analysis_code,
tidacdst.cost_center_charge, tidacdst.activity_id, tidacdst.account_nbr,
tidacdst.sub_account_nbr
ORDER BY
tidcamst.commodity_cat_code, tidcamst.commodity_type_cod,
tidcamst.commodity_name_cod ';
SET @SQL = 'select * from openquery(PRPCB,''' + REPLACE(@SQL,'''','''''') + ''')'
EXEC (@SQL)
SET NOCOUNT OFF
My basic understanding of SQL is that the SELECT statement extracts the columns (fields) from their respective tables, and the WHERE statements act as 'JOIN' statements that link the tables together based on a 'common' column name. The GROUP BY statement will cause the report to show the 'column names' in the order that I want the users to see on the screen, and the ORDER BY statements are for sorting/displaying the report based on whatever column names are listed.
As for how to go about setting this monster of a query up so that it avoids the 'cartesian' effect, is my dilemma. I've read that creating an 'Index' would be very effective in speeding up the process, but where to place that code is something I don't know. Do I add the 'create index' code to the beginning, middle, or end of the existing stored procedure? Do I even need the 'index' because of something the query is missing/lacking?
I sure could use some helpful assistance with this issue. Once I understand what needs to be done and how it needs to be done, I can tweak the rest of the stored procedures (50+) that are similar to this chaotic sp.
Thanks in advance to anyone who can help me!
December 4, 2009 at 1:47 pm
Okay, couple of misconceptions there.
First, a "cartesian product" is when you join two tables together without anything telling the code which rows go with which. Your code doesn't do that, it has criteria indicating which rows go with which. Except on the tidacdst table. I don't see join criteria on that one, which means it is a cartesian product. If that table has very many rows, it'll slow the whole thing down. Also, it's rare that joining all rows of one table to all rows of another is actually useful in a report, so you'll want to look at that.
Second, I'd move the join criteria into the From clause. Won't change the behavior on this particular query, but it'll make it a lot more readable, and it does have a major effect on some queries.
Third, Group By is for controlling aggregate functions. For example, you want the Max() function on the rec_line_date column, so you have to Group By all the other columns. When you use an aggregate function on a column, like Avg or Max or Min or Sum, you have to tell it where to break up the values. For example, if you had an Orders table, and you wanted the latest order date for each customer, you would group by the CustomerID column and use Max on the order date. Has nothing to do with the sequence of columns in the final result. It's all about aggregating.
Third and a half, you have a Group By for aggregating, which means you don't need Distinct. Group By will already do that for you.
Fourth, indexes aren't something you build in the query. You would add them to the tables. Indexes in database are just like indexes in books. If you look at the back of a non-fiction book, there's usually an index. If you want to find out what pages "Select" is mentioned on, you would look in the index. If it's not in the index, you have to read the whole book to find where it's mentioned. Indexes on tables work the same way. They give SQL Server a shortcut to finding the contents of specific columns.
Fifth, why do this with dynamic SQL and OpenQuery? It looks like that's just to solve the fact that commodity_cat_code can contain a delimited list of values. Is that correct? If so, there are better ways to handle that.
Here's one way to write that query that will make it easier to read and figure out what it's doing:
ALTER PROCEDURE [dbo].[pp_Catalog_Items] (
@facility varchar(10),
@commodity_cat_code varchar(1000))
AS
SET NOCOUNT ON
SET @commodity_cat_code = dbo.format_string(@commodity_cat_code)
SELECT
tidcafac.facility,
tidcafac.catalog_id,
tidcamst.catalog_desc,
tidcafac.catalog_status,
tidcafac.stock_type,
tidcafac.auto_reorder_ind,
tidcafac.pre_capital_ind,
tidcamst.unit_of_issue,
tidcafac.target_maximum,
tidcawhs.qty_in_whse,
tidcafac.avg_unit_price,
tidcawhs.loc_zone,
tidcawhs.loc_row,
tidcawhs.loc_section,
tidcawhs.loc_tier,
tidcawhs.loc_bin,
tidcamfr.manufacturer_code,
tidcamfr.manuf_part_number,
tidcapur.vendor_code,
tidcapur.vendor_suffix,
MAX(CAST(tidrclin.rec_line_date As integer)) AS rec_line_date,
tidcawhs.last_updated_date,
tidcafac.lead_time_order,
tidcafac.lead_time_vendor,
tidcafac.leadtime_putaway,
tidcafac.repairable_ind,
tidcafac.work_order_nbr,
tidcafac.auto_requisition,
tidcapur.auto_award,
tidcapur.bpo_candidate,
tidcafac.expensed_item_ind,
tidcamst.commodity_cat_code,
tidcamst.commodity_type_cod,
tidcamst.commodity_name_cod,
tidcafac.cycle_count_code,
tidcafac.abc_analysis_code,
tidacdst.cost_center_charge,
tidacdst.activity_id,
tidacdst.account_nbr,
tidacdst.sub_account_nbr
FROM
cb10.tidcafac
INNER JOIN cb10.tidcamfr
ON tidcafac.catalog_id = tidcamfr.catalog_id
AND tidcafac.q_level = tidcamfr.q_level
INNER JOIN cb10.tidcamst
ON tidcafac.catalog_id = tidcamst.catalog_id_master
INNER JOIN cb10.tidcapur
ON tidcafac.facility = tidcapur.facility
AND tidcafac.catalog_id = tidcapur.catalog_id
AND tidcafac.q_level = tidcapur.q_level
INNER JOIN cb10.tidcawhs
ON tidcafac.facility = tidcawhs.facility
AND tidcafac.catalog_id = tidcawhs.catalog_id
AND tidcafac.q_level = tidcawhs.q_level
INNER JOIN cb10.tidrclin
ON tidcafac.facility = tidrclin.facility
AND tidcafac.catalog_id = tidrclin.catalog_id
AND tidcafac.q_level = tidrclin.q_level
CROSS JOIN cb10.tidacdst
WHERE
tidcafac.facility = @facility
AND @commodity_cat_code like '%' + tidcamst.commodity_cat_code + '%'
GROUP BY
tidcafac.facility,
tidcafac.catalog_id,
tidcamst.catalog_desc,
cafac.catalog_status,
tidcafac.stock_type,
tidcafac.auto_reorder_ind,
tidcafac.pre_capital_ind,
tidcamst.unit_of_issue,
tidcafac.target_maximum,
tidcawhs.qty_in_whse,
tidcafac.avg_unit_price,
tidcawhs.loc_zone,
tidcawhs.loc_row,
cawhs.loc_section,
tidcawhs.loc_tier,
tidcawhs.loc_bin,
tidcamfr.manufacturer_code,
tidcamfr.manuf_part_number,
tidcapur.vendor_code,
tidcapur.vendor_suffix,
tidcawhs.last_updated_date,
tidcafac.lead_time_order,
tidcafac.lead_time_vendor,
tidcafac.leadtime_putaway,
tidcafac.repairable_ind,
tidcafac.work_order_nbr,
tidcafac.auto_requisition,
tidcapur.auto_award,
tidcapur.bpo_candidate,
tidcafac.expensed_item_ind,
tidcamst.commodity_cat_code,
tidcamst.commodity_type_cod,
tidcamst.commodity_name_cod,
tidcafac.cycle_count_code,
tidcafac.abc_analysis_code,
tidacdst.cost_center_charge,
tidacdst.activity_id,
tidacdst.account_nbr,
tidacdst.sub_account_nbr
ORDER BY
tidcamst.commodity_cat_code,
tidcamst.commodity_type_cod,
tidcamst.commodity_name_cod ;
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 4, 2009 at 2:33 pm
wow! what an eye opener....I really appreciate your feedback... I didn't realize all that you mentioned. I will work with what you said and apply that to this query to make it more 'manageable' and efficient....your example really makes it easy to read and follow.
I will work on it this weekend and reply back to this forum on Monday, not sure when on Monday, and let you know what changes I've made.
December 7, 2009 at 7:37 am
Take the time you need. I wrote a bit.
There are ways to improve the efficiency on this, like parsing the input parameter into a derived table. I didn't want to hit you with everything all at once, but once you get the query running, let me know, and I'll help you with that. (Or someone else on the forum will.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 7, 2009 at 11:34 am
I'm not sure what that looks like or what it means when you refer to 'parsing input parameters into a derived table, but if it means better efficiency, I'm all for it.
After trying your example above, there's one error just won't go away that reads:
'Invalid object name cb10.tidcafac'.
The line it refers to is the 'From' statement following the long 'Select' statement in the beginning.
The parameters in Visual Studio appear to be set up correctly, so I'm not sure what the cause is since 'cb10.tidcafac' was referenced in the pre-existing query and did not trigger any errors.
Any thoughts?
Thanks!
December 7, 2009 at 11:47 am
cb10.(whatever) references a schema, "cb10", and a table or view, "tidcafac" in this case. Does that object exist in the database where you're running this?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 7, 2009 at 12:08 pm
The schema and table are found in the Toad for Oracle application, but the SQL query is being run in the SQL Reporting database of MS SQL Server Management Studio.
When I tried to resolve the error by adding the single quote to the left-side of the of 'Select' statment at the beginning and next to the closing semi-colon of the 'Order By' statement...
DECLARE @SQL VARCHAR(8000)
SET @SQL=
'SELECT
tidcafac.facility,
and at the end...
tidcamst.commodity_name_cod ';
SET @SQL = 'select * from openquery(PRPCB,''' + REPLACE(@SQL,'''','''''') + ''')'
A new error appears stating that reads:
The data types varchar and varchar are incompatible in the modulo operator.
Typical code stuff...fix one error only to get a new one...lol
I didn't want to make a mountain out of a mole hill because I think it's probably something really simple that I'm not noticing in the query.
December 7, 2009 at 12:12 pm
Okay, you lost me now. The tables exist in an Oracle database, but you're running the query in an SQL Server database?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 7, 2009 at 12:25 pm
The way it's set up is that the Oracle application is where the schema and tables exist for the report I'm creating, and I use the SQL Reporting database in SSMS to create the stored procedure and MS Visual Studio to design the report and reference the stored procedure for report parameters/datasets.
Sometimes, I'll run the query in Oracle to see the results for the type of report I need to create, and then it has to be 'tweaked' in SSMS because some syntax items are different from Oracle to SQL SMS.
I hope this helps!
Thanks!
December 8, 2009 at 6:51 am
It clarifies the situation. But I really can't help with an Oracle query. Not qualified to have an opinion on those.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 10 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply