Distinct Row Query Help

  • Hello all,

    I have a document database, which contains values for:

    DOC_NO

    REV_NO

    FILE_EXTENSION

    There can be multiple "renditions" of the same file and I am trying to query which have a variation in the file type.

    So, basically any records where the DOC_NO and REV_NO would have to be the same as another row, with a different FILE_EXTENSION value.

    I have manged to return the entire list of values, but am having trouble filtering the list down to "unique rows" which have the above criteria.

    Example Data:

    DOC1, VER1, PDF

    DOC1, VER1, PDF

    DOC1, VER1, PDF

    DOC1, VER1, DOC

    DOC2, VER1, DWG

    DOC2, VER2, DWG

    DOC2, VER2, PDF

    DOC3, VER1, XLS

    DOC4, VER1, PDF

    DOC4, VER1, XLS

    DOC4, VER1, DOC

    DOC4, VER2, XLS

    DOC5, VER1, PDF

    DOC5, VER1, DOC

    DOC5, VER2, PDF

    DOC5, VER2, PDF

    DOC5, VER3, XLS

    The values I would be looking for would be;

    DOC1, VER1, PDF

    DOC1, VER1, DOC

    DOC2, VER2, DWG

    DOC2, VER2, PDF

    DOC4, VER1, PDF

    DOC4, VER1, XLS

    DOC4, VER1, DOC

    DOC5, VER1, PDF

    DOC5, VER1, DOC

    Any assistance very much appreciated and thanks in advance for any replies.

  • Open/import you document in Excel (comma delimited). Then go to the Data tab and there you have "Remove Duplicates" button. Click it and job is done.

    Igor Micev,My blog: www.igormicev.com

  • I'm not sure about the performance of this in a large scale, but I think I have the logic working.

    ;with RawData as

    (

    select 'DOC1' as DOC_NO, 'VER1' as REV_NO, 'PDF' as FILE_EXTENSION

    union all select 'DOC1', 'VER1', 'PDF'

    union all select 'DOC1', 'VER1', 'PDF'

    union all select 'DOC1', 'VER1', 'DOC'

    union all select 'DOC2', 'VER1', 'DWG'

    union all select 'DOC2', 'VER2', 'DWG'

    union all select 'DOC2', 'VER2', 'PDF'

    union all select 'DOC3', 'VER1', 'XLS'

    union all select 'DOC4', 'VER1', 'PDF'

    union all select 'DOC4', 'VER1', 'XLS'

    union all select 'DOC4', 'VER1', 'DOC'

    union all select 'DOC4', 'VER2', 'XLS'

    union all select 'DOC5', 'VER1', 'PDF'

    union all select 'DOC5', 'VER1', 'DOC'

    union all select 'DOC5', 'VER2', 'PDF'

    union all select 'DOC5', 'VER2', 'PDF'

    union all select 'DOC5', 'VER3', 'XLS'

    )

    , Multiples as

    (

    selectDOC_NO, REV_NO

    fromRawData

    group byDOC_NO, REV_NO

    having COUNT(distinct FILE_EXTENSION) > 1

    )

    select distinct

    RD.DOC_NO

    ,RD.REV_NO

    ,RD.FILE_EXTENSION

    from

    RawDataRD

    joinMultiplesMon M.DOC_NO = RD.DOC_NO and M.REV_NO = RD.REV_NO

    order by

    RD.DOC_NO

    ,RD.REV_NO

    ,RD.FILE_EXTENSION

  • Thanks Igor, that seems to have done the job, just using Excel...

    I'm sure it's doable in SQL, but as I'm reporting in Excel anyway, it's easy enough to do the filtering in there 🙂

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply