March 24, 2017 at 12:12 pm
For a given ReportTable this below query does A, B, C but not D.  I do not know where to change query so that it gives info in D .  Can someone show me how to modify to get D?
A. # of attributes two reports have in common
B. Report A column total # of attributes  
C. # of attributes in common DIVIDED BY Report A total # of attributes   (as percentage)
I also need
D. # of attributes in common DIVIDED BY total  attributes for Report B column (as percentage)
Query
SELECT DISTINCT 
CommonAttributesCount.ReportA, 
CommonAttributesCount.CommonAttributesCount,
CommonAttributesCount.ReportB,
ReportATotal,
(CommonAttributesCount.CommonAttributesCount*1.00)/ReportATotal as reportApercentage,
ReportATotal
FROM
(
--gets me Total Report A Attribute Count
SELECT  DISTINCT 
ReportName as ReportA
,COUNT(AttributeName) ReportATotal
FROM ReportTable
GROUP BY  ReportName
) ReportA_TotalAttributes
LEFT JOIN
(
--lists all the reports in the ReportTable, side by side, next to a column showing how many attributes they have in common
SELECT 
RT1.ReportName AS ReportA,
RT2.ReportName AS ReportB, 
COUNT(*) AS CommonAttributesCount
FROM 
(SELECT DISTINCT ReportName, AttributeName FROM ReportTable) RT1 -- trying to ensure only distinct ReportName/AttributeName combinations returned, but not working
INNER JOIN ReportTable RT2 ON RT2.ReportName > RT1.ReportName AND RT2.AttributeName = RT1.AttributeName 
WHERE RT1.[ReportName] <> ''
GROUP BY RT1.ReportName, RT2.ReportName
) CommonAttributesCount
ON ReportA_TotalAttributes.ReportA=CommonAttributesCount.ReportA 
if you run this, you'll see what I need results to look like this, but mine are currently short the last two columns
SELECT 'COLUMN HEADERS','ReportA', 'CommonAttributesCount', 'ReportB', 'ReportATotal', 'reportApercentage', 'ReportBTotal', 'reportBpercentage'UNION
SELECT 'ROW1','Compliance Details','3','Delivery and Invoice','6','0.50000','4','0.75000' UNION
SELECT 'ROW2','Delivery and Invoice','1', 'Operations Review','4','0.25000','1','1.0000'
Report Table
create table ReportTable (
ReportName varchar (20),
AttributeName varchar(30)
);
insert into ReportTable
Values
('Compliance Details', 'Vendor Name'),
('Compliance Details', 'Area'),
('Compliance Details', 'Area'),
('Compliance Details', 'Area'),
('Compliance Details', 'Region'),
('Compliance Details', 'SubRegion'),
('Delivery and Invoice', 'Customer Name'),
('Delivery and Invoice', 'Area'),
('Delivery and Invoice', 'Region'),
('Delivery and Invoice', 'SubRegion'),
('Operations Review', 'Customer Name'),
('Operations Review', 'Approver'),
('Operations Review', 'Approval Status');'Compliance Details' and 'Delivery and Invoice' share 3 attributes, and 'Delivery and Invoice' and 'Operations Review' share 1. 
--Quote me
March 24, 2017 at 2:18 pm
How are you arriving at the following values in your sample output script?
ReportBTotal
4
1
If it is supposed to be shared attributes, shouldn't those counts be 3 and 1?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 24, 2017 at 2:41 pm
I made mistake and corrected:
CommonAttributes column for shared attributes, and they are 3 and 1.
4 and 1 are the grand total of attributes for ReportB reports
--Quote me
March 24, 2017 at 3:24 pm
polkadot - Friday, March 24, 2017 2:41 PMI made mistake and corrected:
CommonAttributes column for shared attributes, and they are 3 and 1.
4 and 1 are the grand total of attributes for ReportB reports
I don't see how "Compliance Details" has 4 attributes as a grand total. That report has 6 attributes and three shared attributes.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 24, 2017 at 4:02 pm
The underlying table (in real life!) will sometimes show the duplicate fieldnames for a single 
So my sample data shows Area duplicated within Compliance Details and in final results I want to count it once per report.
--Quote me
March 24, 2017 at 9:49 pm
polkadot - Friday, March 24, 2017 2:41 PMI made mistake and corrected:
CommonAttributes column for shared attributes, and they are 3 and 1.
4 and 1 are the grand total of attributes for ReportB reports
I believe the numbers in the sample data to still be inaccurate. The grand total of the ReportB reports should be 4 and 3 based on the names provided in the samples and the count of unique instances of the attribute within that report.
Based on that, here is something that should get you close.
SELECT DISTINCT CommonAttributesCount.ReportA
        , CommonAttributesCount.CommonAttributesCount
        , CommonAttributesCount.ReportB
        , ReportATotal
        , ( CommonAttributesCount.CommonAttributesCount * 1.00 ) / ReportATotal AS reportApercentage
        , ReportATotal
        , rb.ReportBTotal
        , (CommonAttributesCount.CommonAttributesCount * 1.0)/ RB.ReportBTotal AS reportBpercentage
    FROM (
--gets me Total Report A Attribute Count
            SELECT DISTINCT ReportName AS ReportA
                    , COUNT(AttributeName) ReportATotal
                FROM ReportTable
                GROUP BY ReportName
            ) ReportA_TotalAttributes
        LEFT JOIN (
--lists all the reports in the ReportTable, side by side, next to a column showing how many attributes they have in common
                    SELECT RT1.ReportName AS ReportA
                            , RT2.ReportName AS ReportB
                            , COUNT(*) AS CommonAttributesCount
                        FROM ( SELECT DISTINCT ReportName
                                        , AttributeName
                                    FROM ReportTable
                                ) RT1 -- trying to ensure only distinct ReportName/AttributeName combinations returned, but not working
                            INNER JOIN ReportTable RT2
                                ON RT2.ReportName > RT1.ReportName
                                    AND RT2.AttributeName = RT1.AttributeName
                        WHERE RT1.[ReportName] <> ''
                        GROUP BY RT1.ReportName
                            , RT2.ReportName
                    ) CommonAttributesCount
            ON ReportA_TotalAttributes.ReportA = CommonAttributesCount.ReportA
        CROSS APPLY (SELECT rti.ReportName, SUM(rti.rownum) AS ReportBTotal --distinct attributes
                        FROM (SELECT rt.reportname, ROW_NUMBER() OVER (PARTITION BY rt.reportname,attributename ORDER BY rt.reportname) AS rownum
                                FROM dbo.ReportTable rt) rti
                        WHERE rti.rownum = 1
                        GROUP BY rti.ReportName) RB
    WHERE rb.ReportName = CommonAttributesCount.ReportB; 
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 27, 2017 at 5:59 am
I too am not sure about the proper solution, as your original data has duplicated values. Here's what I came up with, after assuming that the duplicates are valid:
CREATE TABLE #ReportTable (
    RowNum int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    ReportName varchar (20),
    AttributeName varchar(30),
);
CREATE NONCLUSTERED INDEX IX_ReportTable_ReportName_AttributeName ON #ReportTable
        (
        ReportName ASC,
        AttributeName ASC
        );
INSERT INTO #ReportTable (ReportName, AttributeName)
    VALUES    ('Compliance Details', 'Vendor Name'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Region'),
            ('Compliance Details', 'SubRegion'),
            ('Delivery and Invoice', 'Customer Name'),
            ('Delivery and Invoice', 'Area'),
            ('Delivery and Invoice', 'Region'),
            ('Delivery and Invoice', 'SubRegion'),
            ('Operations Review', 'Customer Name'),
            ('Operations Review', 'Approver'),
            ('Operations Review', 'Approval Status');
WITH ReportAttributeCount AS (
    SELECT ReportName as ReportA, COUNT(AttributeName) AS ReportAttributeCount
    FROM #ReportTable
    GROUP BY ReportName
),
    CommonAttributesCount AS (
        SELECT RT.ReportName AS ReportA, RT2.ReportName AS ReportB, COUNT(DISTINCT RT.AttributeName) AS CommonAttributeCount
        FROM #ReportTable AS RT
            INNER JOIN #ReportTable AS RT2
                ON RT.ReportName < RT2.ReportName
                AND RT.AttributeName = RT2.AttributeName
        GROUP BY RT.ReportName, RT2.ReportName
)
SELECT DISTINCT
    RAC.ReportA,
    RAC.ReportAttributeCount,
    CACA.CommonAttributeCount,
    CACA.ReportB,
    RACB.ReportAttributeCount AS ReportBAttributeCount,
    (CACA.CommonAttributeCount * 1.00) / RAC.ReportAttributeCount AS reportApercentage,
    (CACA.CommonAttributeCount * 1.00) / RACB.ReportAttributeCount AS reportBpercentage
FROM ReportAttributeCount AS RAC
    INNER JOIN CommonAttributesCount AS CACA
        ON RAC.ReportA = CACA.ReportA
    INNER JOIN ReportAttributeCount AS RACB
        ON CACA.ReportB = RACB.ReportA
    LEFT OUTER JOIN CommonAttributesCount AS CACB
        ON RAC.ReportA = CACA.ReportB;
DROP TABLE #ReportTable;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
March 27, 2017 at 1:22 pm
Jason and Steve, you've both understood and helped me.... for no reason but goodness. thank goodness! Many thanks
--Quote me
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply