• polkadot - Friday, March 24, 2017 2:41 PM

    I made mistake and corrected:
    CommonAttributes column for shared attributes, and they are 3 and 1.
    4 and 1 are the grand total of attributes for ReportB reports

    I believe the numbers in the sample data to still be inaccurate. The grand total of the ReportB reports should be 4 and 3 based on the names provided in the samples and the count of unique instances of the attribute within that report.

    Based on that, here is something that should get you close.


    SELECT DISTINCT CommonAttributesCount.ReportA
            , CommonAttributesCount.CommonAttributesCount
            , CommonAttributesCount.ReportB
            , ReportATotal
            , ( CommonAttributesCount.CommonAttributesCount * 1.00 ) / ReportATotal AS reportApercentage
            , ReportATotal
            , rb.ReportBTotal
            , (CommonAttributesCount.CommonAttributesCount * 1.0)/ RB.ReportBTotal AS reportBpercentage
        FROM (
    --gets me Total Report A Attribute Count
                SELECT DISTINCT ReportName AS ReportA
                        , COUNT(AttributeName) ReportATotal
                    FROM ReportTable
                    GROUP BY ReportName
                ) ReportA_TotalAttributes
            LEFT JOIN (
    --lists all the reports in the ReportTable, side by side, next to a column showing how many attributes they have in common
                        SELECT RT1.ReportName AS ReportA
                                , RT2.ReportName AS ReportB
                                , COUNT(*) AS CommonAttributesCount
                            FROM ( SELECT DISTINCT ReportName
                                            , AttributeName
                                        FROM ReportTable
                                    ) RT1 -- trying to ensure only distinct ReportName/AttributeName combinations returned, but not working
                                INNER JOIN ReportTable RT2
                                    ON RT2.ReportName > RT1.ReportName
                                        AND RT2.AttributeName = RT1.AttributeName
                            WHERE RT1.[ReportName] <> ''
                            GROUP BY RT1.ReportName
                                , RT2.ReportName
                        ) CommonAttributesCount
                ON ReportA_TotalAttributes.ReportA = CommonAttributesCount.ReportA
            CROSS APPLY (SELECT rti.ReportName, SUM(rti.rownum) AS ReportBTotal --distinct attributes
                            FROM (SELECT rt.reportname, ROW_NUMBER() OVER (PARTITION BY rt.reportname,attributename ORDER BY rt.reportname) AS rownum
                                    FROM dbo.ReportTable rt) rti
                            WHERE rti.rownum = 1
                            GROUP BY rti.ReportName) RB
        WHERE rb.ReportName = CommonAttributesCount.ReportB;

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events