



Hi,
My question is about calculating a computed column in SSIS. Unfortunately, the column is based on another column in the same query which, in turn, is a subquery.
I'd prefer a generic answer, but I know some people can't answer unless they see the script, so here it is.
Table X1 (volunteer scores per week and class):
CREATE TABLE [dbo].[tblPAScores](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [VolunteerID] [varchar](50) NOT NULL,
    [NoWeek] [int] NULL,
    [PA] [numeric](5, 2) NULL,
    [Class] [varchar](50) NULL,
    CONSTRAINT [PK_tblPAScores] PRIMARY KEY CLUSTERED ( [ID] ASC )
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The view on the previous table:
SELECT TOP (100) PERCENT
    VolunteerID, Class, NoWeek, PA,
    (SELECT COUNT(*) AS Expr1
     FROM dbo.tblPAScores
     WHERE (PA > X1.PA) AND (Class = X1.Class) AND (NoWeek = X1.NoWeek)) + 1 AS PA1R,
    (SELECT COUNT(VolunteerID) AS Expr1
     FROM dbo.tblPAScores AS tblPAScores_1
     WHERE (Class = X1.Class) AND (NoWeek = X1.NoWeek)) AS N
FROM dbo.tblPAScores AS X1
ORDER BY NoWeek, VolunteerID
• N: the number of people for a week and class (column 'N')
• PA1R: how many people have a higher PA score than you in that week and class, plus one (i.e. your rank)
In the next column, I need to divide PA1R by N; it's that simple. Why? The goal is, in terms of PA, to calculate what percentage of people do better than you (say, 30%), the same as you (say, 40%) and worse (say, 30%) for a given week and class.
To get this, I need to divide by N (1/N, (PA1R-1)/N, etc.).
Questions:
1) Is it possible to have a computed column based on previous columns which are subqueries?
2) If not, is there an alternative solution (pivot tables or the like)? Do I have to run a mega-subquery (something like subquery 1 / subquery 2)?
Thanks in advance, a.






a_ud (4/19/2013) Hi,
Questions: 1) Is it possible to have a computed column based on previous columns which are subqueries?
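Hi, it is possible to have computed columns in a view. If you tried to use 1/N or PA1R/N directly, you probably got an error message, because a column alias cannot be referenced by another expression in the same SELECT list. One way around that is to repeat the subqueries inside each expression:
CREATE VIEW vwPAScores AS
SELECT TOP (100) PERCENT
    VolunteerID, Class, NoWeek, PA,
    -- PA1R: number of people with a higher PA in the same class and week, plus one
    (SELECT COUNT(*) AS Expr1
     FROM dbo.tblPAScores
     WHERE (PA > X1.PA) AND (Class = X1.Class) AND (NoWeek = X1.NoWeek)) + 1 AS PA1R,
    -- N: number of people in the same class and week
    (SELECT COUNT(VolunteerID) AS Expr1
     FROM dbo.tblPAScores AS tblPAScores_1
     WHERE (Class = X1.Class) AND (NoWeek = X1.NoWeek)) AS N,
    -- 1/N: 1.0 avoids integer division; NULLIF avoids a divide-by-zero error
    -- for rows whose Class or NoWeek is NULL (their count is 0)
    1.0 / NULLIF((SELECT COUNT(VolunteerID) AS Expr1
                  FROM dbo.tblPAScores AS tblPAScores_1
                  WHERE (Class = X1.Class) AND (NoWeek = X1.NoWeek)), 0) AS [1_divide_N],
    -- (PA1R - 1)/N
    (((SELECT COUNT(*) AS Expr1
       FROM dbo.tblPAScores
       WHERE (PA > X1.PA) AND (Class = X1.Class) AND (NoWeek = X1.NoWeek)) + 1) - 1) * 1.0
    / NULLIF((SELECT COUNT(VolunteerID) AS Expr1
              FROM dbo.tblPAScores AS tblPAScores_1
              WHERE (Class = X1.Class) AND (NoWeek = X1.NoWeek)), 0) AS [PA1R-1_divide_N]
FROM dbo.tblPAScores AS X1
ORDER BY NoWeek, VolunteerID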
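As a possible alternative to repeating the subqueries, the two counts can be computed once in a common table expression and then referenced by alias in the outer SELECT. This is only a sketch of that swapped-in technique, not part of the reply above: the CTE name Scored, the alias T, and the output column names one_over_N and better_fraction are hypothetical.

```sql
-- Sketch only: compute PA1R and N once, then reuse the aliases.
-- Scored, T, one_over_N and better_fraction are illustrative names.
WITH Scored AS (
    SELECT X1.VolunteerID, X1.Class, X1.NoWeek, X1.PA,
           (SELECT COUNT(*)
            FROM dbo.tblPAScores AS T
            WHERE T.PA > X1.PA AND T.Class = X1.Class AND T.NoWeek = X1.NoWeek) + 1 AS PA1R,
           (SELECT COUNT(T.VolunteerID)
            FROM dbo.tblPAScores AS T
            WHERE T.Class = X1.Class AND T.NoWeek = X1.NoWeek) AS N
    FROM dbo.tblPAScores AS X1
)
SELECT VolunteerID, Class, NoWeek, PA, PA1R, N,
       1.0 / NULLIF(N, 0)              AS one_over_N,       -- 1/N as a decimal
       (PA1R - 1) * 1.0 / NULLIF(N, 0) AS better_fraction   -- share of people with a higher PA
FROM Scored;
```

Because the aliases PA1R and N belong to the CTE, the outer SELECT can use them in as many expressions as needed.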
Igor Micev, SQL Server developer at Seavus www.seavus.com






Hi again, @a_ud,
By posting your code rather than asking for a generic answer, you've allowed me to give you what should be a better solution. Your view definition could be
SELECT X1.VolunteerID, X1.Class, X1.NoWeek, X1.PA, Y.PA1R, Z.N,
       -- NULLIF guards against N = 0, which happens when Class or NoWeek is NULL
       ((Z.N - Y.PA1R) / (NULLIF(Z.N, 0) * 1.0)) * 100 AS pct
FROM dbo.tblPAScores AS X1
OUTER APPLY (SELECT COUNT(*) + 1 AS PA1R
             FROM dbo.tblPAScores AS X2
             WHERE (X2.PA > X1.PA) AND (X2.Class = X1.Class) AND (X2.NoWeek = X1.NoWeek)) AS Y
OUTER APPLY (SELECT COUNT(X3.VolunteerID) AS N
             FROM dbo.tblPAScores AS X3
             WHERE (X3.Class = X1.Class) AND (X3.NoWeek = X1.NoWeek)) AS Z
The correlated subqueries in the OUTER APPLYs generate columns for PA1R and N for each row, and you can use those columns as many different times and ways as you'd like in the SELECT columns.
One note: you'll see that I set up the percentage calculation with a seemingly superfluous "* 1.0". Since COUNT() returns an int, N and PA1R will both be ints. Therefore, (N - PA1R)/N will return an int, which will always be 0. Adding the "* 1.0" introduces a decimal datatype into the mix so that all the int values will be implicitly converted to decimal values before the calculation is performed, allowing you to get values in the range 0.0 to 1.0 (before multiplying by 100), as expected.
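The effect of the "* 1.0" can be checked in isolation. The following is a minimal sketch using two local variables with made-up values (10 and 4 are not taken from the thread's data); the output column names are also illustrative.

```sql
-- Hypothetical values, chosen only to illustrate the datatype behaviour.
DECLARE @N int = 10, @PA1R int = 4;

SELECT (@N - @PA1R) / @N               AS int_division,     -- int / int truncates: 6 / 10 gives 0
       (@N - @PA1R) / (@N * 1.0)       AS decimal_division, -- divisor becomes decimal, so the fraction survives
       (@N - @PA1R) / (@N * 1.0) * 100 AS pct;              -- same fraction expressed as a percentage
```

Only one operand needs to be decimal for the whole division to be carried out in decimal arithmetic, which is why multiplying the divisor alone is enough.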
Also, you'll see that I removed the TOP(100) PERCENT . . . ORDER BY from my code. It's generally better practice to order the results in the SELECT statement that references your view. You can run into some unexpected performance problems when you write views to return ordered result sets. When the query optimizer expands the view definition into the query, the ORDER BY in the view may confound the optimizer and lead to an undesirable execution plan.
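A sketch of what that looks like in practice, assuming the view was created as dbo.vwPAScores (the schema and name are assumptions here) without TOP (100) PERCENT ... ORDER BY and that it exposes the pct column from the definition above. Note also that SQL Server only guarantees the order of a result set when the outermost query has its own ORDER BY, so an ORDER BY inside the view cannot be relied on for presentation order.

```sql
-- Assumes a view named dbo.vwPAScores defined without TOP ... ORDER BY.
-- The ordering is requested by the query that reads the view.
SELECT VolunteerID, Class, NoWeek, PA, PA1R, N, pct
FROM dbo.vwPAScores
ORDER BY NoWeek, VolunteerID;
```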
Jason Wolfkill Blog: SQLSouth Twitter: @SQLSouth



