How to indicate rows with identical values in certain column

  • Hello all,

    I wonder if following is somehow possible:

    I'd like to add an extra column to my existing query which would indicate the rows with identical values in certain column.

    Sample table:

    create table #_TEST (

    _ID int,

    _VAL1 int,

    _VAL2 int,

    _VAL3 datetime

    );

    insert into #_TEST (_ID, _VAL1, _VAL2, _VAL3)

    select 1,10,10,'2010-05-15 07:30:00' union all

    select 2,20,10,'2010-05-16 07:30:00' union all

    select 3,10,20,'2010-05-16 07:30:00' union all

    select 4,30,10,'2010-05-16 10:25:20' union all

    select 5,10,40,'2010-05-17 08:20:05' union all

    select 6,20,30,'2010-05-18 14:45:35' union all

    select 7,50,10,'2010-05-18 14:45:35' union all

    select 8,40,10,'2010-05-18 14:45:35'

    select * from #_TEST

    Extra column would then indicate (by having e.g. value '1') that rows 2 and 3 have identical values in column _VAL3. Same indicator would express the same for rows 6, 7 and 8.

    What would be be the best aproach to achieve this?

    Many thanks for any hints,

    Marin

  • marin-231997 (5/21/2010)


    Hello all,

    I wonder if following is somehow possible:

    I'd like to add an extra column to my existing query which would indicate the rows with identical values in certain column.

    Sample table:

    create table #_TEST (

    _ID int,

    _VAL1 int,

    _VAL2 int,

    _VAL3 datetime

    );

    insert into #_TEST (_ID, _VAL1, _VAL2, _VAL3)

    select 1,10,10,'2010-05-15 07:30:00' union all

    select 2,20,10,'2010-05-16 07:30:00' union all

    select 3,10,20,'2010-05-16 07:30:00' union all

    select 4,30,10,'2010-05-16 10:25:20' union all

    select 5,10,40,'2010-05-17 08:20:05' union all

    select 6,20,30,'2010-05-18 14:45:35' union all

    select 7,50,10,'2010-05-18 14:45:35' union all

    select 8,40,10,'2010-05-18 14:45:35'

    select * from #_TEST

    Extra column would then indicate (by having e.g. value '1') that rows 2 and 3 have identical values in column _VAL3. Same indicator would express the same for rows 6, 7 and 8.

    What would be be the best aproach to achieve this?

    Many thanks for any hints,

    Marin

    If you look at column _VAL1, three rows all have the same value of 10.

    Column _VAL1 indicates (by having value '10') that rows 1, 3 and 5 have identical values in column _VAL1.

    The best approach to achieve this is already there, you don't need extra columns 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hello Chris,

    thanks for your input.

    Unfortunately it is not what I'd like to achieve. I need a way to "flag" the rows having identical values in column _VAL3 (kind of making "blocks" of identical timestamps visible).

    Thanks,

    Marin

  • thats a fun one - this is prior to me having any coffee. So its a bit lame, but I believe I can do that with a row_number

    with testcte (_ID, _VAL1, _VAL2, _VAL3, rnum)

    as

    (

    select *,

    row_number () over(partition by _val3 order by _val3) as rnum

    from #_TEST

    )

    select f.*,

    case when coalesce(s.rnum, t.rnum) is not null then 'Dup' else null end as dup

    from testcte f left outer join testcte s

    on f._val3 = s._val3 and f.rnum = s.rnum+1

    left outer join testcte t

    on f._val3 = t._val3 and f.rnum = t.rnum-1

  • marin-231997 (5/21/2010)


    Hello Chris,

    thanks for your input.

    Unfortunately it is not what I'd like to achieve. I need a way to "flag" the rows having identical values in column _VAL3 (kind of making "blocks" of identical timestamps visible).

    Thanks,

    Marin

    So, if you have three rows with value '2' in column _VAL3, and three rows with value '3', what value would you use in your indicator column for each of the six rows?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hello bnordberg,

    well, I'd really like to see how your queries look like after you've had your cup of coffee 🙂 , but this one just hits the nail on the head!

    Many thanks again for excellent solution, I was really stuck...

    Thanks also to Chris for attending to my problem.

    Cheers,

    Marin

  • marin-231997 (5/21/2010)


    Hello bnordberg,

    well, I'd really like to see how your queries look like after you've had your cup of coffee 🙂 , but this one just hits the nail on the head!

    Many thanks again for excellent solution, I was really stuck...

    Thanks also to Chris for attending to my problem.

    Cheers,

    Marin

    No worries.

    SELECT _ID, _VAL1, _VAL2, _VAL3,

    ROW_NUMBER () OVER(PARTITION BY _val3 ORDER BY _val3) AS rnum,

    COUNT(*) OVER(PARTITION BY _val3) AS ValueCount

    FROM #_TEST

    But...what are you going to do with this?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • try this

    create table #_TEST (

    _ID int,

    _VAL1 int,

    _VAL2 int,

    _VAL3 datetime,

    _IndicatorCol bit

    );

    insert into #_TEST (_ID, _VAL1, _VAL2, _VAL3,_IndicatorCol)

    select 1,10,10,'2010-05-15 07:30:00',NULL union all

    select 2,20,10,'2010-05-16 07:30:00',NULL union all

    select 3,10,20,'2010-05-16 07:30:00',NULL union all

    select 4,30,10,'2010-05-16 10:25:20',NULL union all

    select 5,10,40,'2010-05-17 08:20:05',NULL union all

    select 6,20,30,'2010-05-18 14:45:35',NULL union all

    select 7,50,10,'2010-05-18 14:45:35',NULL union all

    select 8,40,10,'2010-05-18 14:45:35',NULL

    UPDATEA

    SETA._IndicatorCol=CASE WHEN A._VAL3=B._VAL3 THEN 1END

    FROM#_TEST A

    INNER JOIN#_TEST B

    ON A._ID = B._ID+1

    UPDATEA

    SETA._IndicatorCol=CASE A._IndicatorCol

    WHEN 1 THEN 1

    ELSE

    CASE WHEN A._VAL3=B._VAL3 THEN 1END

    END

    FROM#_TEST A

    INNER JOIN#_TEST B

    ON A._ID = B._ID-1

    SELECT *from #_TEST

    drop table #_TEST

  • Above solution is for SQL 2000 🙂

  • Hello Chris,

    your sugesstion is really helpful as well!

    Although information in extra column might look kind of useless at first sight, I need it for further processing exactly like it is supplied by your or bnordberg's query.

    Many thanks for your help,

    Marin

  • Thanks Gopi for input!

    Will give it a try as well...

    Cheers,

    Marin

  • I do similiar stuff all the time - I look at prescription drug history and lab values and look for changes that would indicate adverse drug events, drug seeking behavior ...

    Definitly useful stuff.

  • bnordberg (5/21/2010)


    I do similiar stuff all the time - I look at prescription drug history and lab values and look for changes that would indicate adverse drug events, drug seeking behavior ...

    Definitly useful stuff.

    Something like this?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Absolutely

    let me make the example a little more descriptive. Here I will look for people with the same drug prescription on the same day. This is a frequent occurrence where people see many different physicians trying to get Oxycodone prescriptions.

    create table #Prescriptions (

    ID int,

    PatientID varchar(25),

    Drug varchar(50),

    DatePrescribed datetime

    );

    insert into #Prescriptions (ID, PatientID, Drug, DatePrescribed)

    select 1,'Bob','Aspirin','2010-05-15 07:30:00' union all

    select 2,'John','Oxycodone','2010-05-16 07:30:00' union all

    select 3,'John','Oxycodone','2010-05-16 07:30:00' union all

    select 4,'Denny','Aspirin','2010-05-16 10:25:20' union all

    select 5,'Jane','Antibiotics','2010-05-17 08:20:05' union all

    select 6,'Brian','Oxycodone','2010-05-18 14:45:35' union all

    select 7,'Brian','Oxycodone','2010-05-18 16:45:35' union all

    select 8,'Brian','Oxycodone','2010-05-18 18:45:35'

    select * from #Prescriptions

    with testcte (ID, PatientID, Drug, DatePrescribed, rnum)

    as

    (

    select *,

    row_number () over(partition by convert(varchar,DatePrescribed,101) order by DatePrescribed) as rnum

    from #Prescriptions

    )

    select f.*,

    case when coalesce(s.rnum, t.rnum) is not null then 'PossibleDrugSeeker' else null end as DrugSeeker

    from testcte f left outer join testcte s

    on f.patientid=s.patientid and convert(varchar,f.DatePrescribed,101) = convert(varchar,s.DatePrescribed,101) and f.rnum = s.rnum+1

    left outer join testcte t

    on f.patientid=t.patientid and convert(varchar,f.DatePrescribed,101) = convert(varchar,t.DatePrescribed,101) and f.rnum = t.rnum-1

  • ;with testcte (ID, PatientID, Drug, DatePrescribed, rnum)

    as

    (

    select *,

    row_number () over(partition by convert(varchar,DatePrescribed,101) order by DatePrescribed) as rnum

    from #Prescriptions

    )

    select f.*,

    case when coalesce(s.rnum, t.rnum) is not null then 'PossibleDrugSeeker' else null end as DrugSeeker

    from testcte f left outer join testcte s

    on f.patientid=s.patientid and convert(varchar,f.DatePrescribed,101) = convert(varchar,s.DatePrescribed,101) and f.rnum = s.rnum+1

    left outer join testcte t

    on f.patientid=t.patientid and convert(varchar,f.DatePrescribed,101) = convert(varchar,t.DatePrescribed,101) and f.rnum = t.rnum-1

    -- same results:

    SELECT ID, PatientID, Drug, DatePrescribed,

    row_number () over(partition by convert(varchar,DatePrescribed,101) order by DatePrescribed) as rnum,

    CASE WHEN COUNT(*) OVER(PARTITION BY patientid, CONVERT(VARCHAR,DatePrescribed,101), Drug) > 1

    THEN 'PossibleDrugSeeker'

    ELSE NULL END

    FROM #Prescriptions

    ORDER BY ID

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply