what is sql logic needed to categorize objects by how much they have in common with other objects

  • I do not know where to begin to solve this with sql and would like advise for tackling problem/or sample sql or logic, would be highly appreciated.

    End goal: I'd like to make a report with a slicer on '# of Attributes In Common', where depending on selection I can get all reports having 10, 30, 50, 70, 100, etc. attributes in common.   The list should return the name of the report, the attributes it has in common with another report, and the name of the other report.

    Source table shows that certain attributes appear in more than one report
    ReportName--Attributename
    Report 1------->alpha
    Report 1------->beta
    Report 1------->gamma
    Report 2------->beta
    Report 2------->gamma
    Report 3------->alpha

    Desired Result to rearrange above info, showing that Report 1 has 2 attributes in common with Report 2  AND Report 1 has 1 attribute in common with Report 3
    ReportA----------ReportB----------AttributesIncommon
    Report1----------Report2----------beta
    Report1----------Report2----------gamma
    Report1----------Report3----------alpha

    Then I want to be able to control the result with slicer, to list only those reports having X attributes in common.

    Is this possible?

    --Quote me

  • If you'll post some directly useable data -- CREATE TABLE and INSERT statements -- I'll provide you code to do this (or someone else will first).

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Yes, here it is. 

    ReportTable

    create table ReportTable (
    ReportName varchar (20),
    AttributeName varchar(30)
    );

    insert into ReportTable
    Values
    ('Compliance Details', 'Vendor Name'),
    ('Compliance Details', 'Area'),
    ('Compliance Details', 'Region'),
    ('Compliance Details', 'SubRegion'),
    ('Delivery and Invoice', 'Customer Name'),
    ('Delivery and Invoice', 'Area'),
    ('Delivery and Invoice', 'Region'),
    ('Delivery and Invoice', 'SubRegion'),
    ('Operations Review', 'Customer Name'),
    ('Operations Review', 'Approver'),
    ('Operations Review', 'Approval Status');

    'Compliance Details' and 'Delivery and Invoice' share 3 attributes, and 'Delivery and Invoice' and 'Operations Review' share 1. 

    --Quote me

  • Does this problem appear solveable with sql query?

    --Quote me

  • polkadot - Tuesday, March 21, 2017 1:28 AM

    Does this problem appear solveable with sql query?

    it would more sense to focus on the a attribute rather then report. i.e. A single attribute is common among how many reports.
    you can easy determine the common attribute using count() function and use PIVOT on top of it

    Something like this:

    DECLaRE @ReportTable table (
    ReportName varchar (20),
    AttributeName varchar(30)
    );

    insert into @ReportTable
    Values
    ('Compliance Details', 'Vendor Name'),
    ('Compliance Details', 'Area'),
    ('Compliance Details', 'Region'),
    ('Compliance Details', 'SubRegion'),
    ('Delivery and Invoice', 'Customer Name'),
    ('Delivery and Invoice', 'Area'),
    ('Delivery and Invoice', 'Region'),
    ('Delivery and Invoice', 'SubRegion'),
    ('Operations Review', 'Customer Name'),
    ('Operations Review', 'Approver'),
    ('Operations Review', 'Approval Status');

    WITH CTE
    AS
    (
        select *
                , count(ReportName) OVER (Partition by AttributeName) as Attributecount
        from @ReportTable
    )

    Select
        *
    from
    (
        select
                AttributeName, ReportName, Attributecount
        from CTE r
        where r.Attributecount > 1
    ) A
    Pivot
    (
        Count(Attributecount) FOR ReportName IN ([Compliance Details], [Delivery and Invoice], [Operations Review])
    ) A1

    Does it make sense?


  • SELECT RT1.ReportName AS ReportA, RT2.ReportName AS ReportB, RT1.AttributeName AS AttributeInCommon
    FROM ReportTable RT1
    INNER JOIN ReportTable RT2 ON RT2.ReportName > RT1.ReportName AND RT2.AttributeName = RT1.AttributeName

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • thanks twin.devil it is also interesting what you show. I plan to use it.

    ScottPletcher you made the solution but really curious: Since ReportName is text, how do I read the greator than sign in the JOIN? 

    For using this in a report with a slicer, I'm unfortunately still struggling.  Will this be enough so that I can pull into excel and make a slicer called "# of Attributes in common" having buckets such as 10+, 30+, 40+, 50+.  I want to be able to click on one of the slicer buckets, and get a list of all reports having that many attributes in common, and the output in Excel to be
     

    ReportA----------ReportB----------AttributesIncommon
    Report1----------Report2----------beta
    Report1----------Report2----------gamma
    Report1----------Report3----------alpha

    I realize I should be asking What should the underlying query output be to support this desired excel report format.   Can someone please advise?

    --Quote me

  • If you just want a count of the common attributes, you could do this:

    SELECT RT1.ReportName AS ReportA, RT2.ReportName AS ReportB, COUNT(*) AS CommonAttributesCount
    FROM ReportTable RT1
    INNER JOIN ReportTable RT2 ON RT2.ReportName > RT1.ReportName AND RT2.AttributeName = RT1.AttributeName
    ORDER BY  RT1.ReportName, RT2.ReportName

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Great many thanks to both of you.

    --Quote me

  • I have follow up question, but I will post as new topic.

    --Quote me

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply