duplicate problem after a join

  • Good morning al

    I encounter a duplicate problem in the result of my query

    I have a doubt that the result of the duplicate comes after my join

    Who can help me correct my request to remove the duplicate

    I try with distinct but it does not work

     

    ;with echantillon as (
    select
    Ref_Contrat = v.Contract_ID + '-' + format('0'+ rank() over (partition by v.Contract_ID order by v.sor_ident),'0#')
    ,MaxNumeroVersion = max(v."Numéro Version")
    ,NumAliment = v.Contract_ID
    ,v.sor_ident
    ,v.Client_ID
    from Base_versions v
    inner join Base_Risques r on r.[pol_numpol contract_id]=v.Contract_ID
    inner join Base_mouvements m on m.Contract_ID=v.Contract_ID
    inner join RemonteePerimetreContrats rp on rp.num_aliment=v.Contract_ID
    --where v.Contract_ID = 'VAZP00057'
    group by v.sor_ident ,v.Contract_ID
    ,v.Client_ID
    )

    select

    Ref_Contrat
    ,type_ante ='0' -- 'Obligatoire => Antécédents Auto - Type (durée) ' 'MODIFY Null To 0'
    ,t_crm_ini = FIRST_VALUE ( [Bonus a la derniere echeance] ) OVER (partition by v.Contract_ID order by v.Age_of_the_policy asc)
    ,t_crm = r.bm_rec_and_used_in_tari
    ,c_crm50 = r.nb_of_years_in_bm_50
    ,t_crm_ant = r.[Bonus a la derniere echeance]
    ,ante_assu_date_debut = NULL --'Non Obligatoire'
    ,ante_assu_date_fin = NULL --'Non Obligatoire'
    ,ante_assu_date_stop = NULL --'Non Obligatoire'
    ,dernier_assureur = r.previous_insurer_description
    ,meme_vehicule = 'non'
    ,taux_alcoolemie = r.[Ant_Taux d alcoolemie positif]
    ,sinistre_alcoolemie = NULL --'Non Obligatoire'
    ,duree_retrait_permis = NULL --'Non Obligatoire'
    ,motif_retrait_permis = NULL --'Non Obligatoire'
    ,motif_resiliation = NULL --'Non Obligatoire'
    ,motif_autres = NULL --'Non Obligatoire'
    ,c_anc_ass = NULL --'Non Obligatoire'
    ,controle_alcoolemie = r.[Ant_Test alcool positif ?]
    ,date_controle_alcoolemie = r.[Ant_Date alcoolemie positive]
    ,nature_controle_alcoolemie = NULL --'Non Obligatoire'
    ,consequence_controle_alcoolemie = NULL --'Non Obligatoire'
    ,duree_suspension_alcoolemie = NULL --'Non Obligatoire'
    ,rsa_permis_hors_alcoolemie = NULL --'Non Obligatoire'
    ,motif_suspension_permis = NULL --'Non Obligatoire'
    ,duree_RSA_hors_alcoolemie = NULL --'Non Obligatoire'
    ,resiliation_assureur = NULL --'Non Obligatoire'
    ,resilie_par = NULL --'Non Obligatoire'
    ,autre_resilie_par = NULL --'Non Obligatoire'
    ,nb_mois_ass = NULL --'Non Obligatoire'
    ,num_c =NUll -- 'Obligatoire => Antécédents Auto - Référence de l antécédent'
    ,nombre_assureur = NULL --'Non Obligatoire'
    ,nombre_infraction = NULL --'Non Obligatoire'
    ,nombre_sinistre = NULL --'Non Obligatoire'
    --,' '
    --,r.*

    from
    echantillon

    join Base_versions v
    on v.Contract_ID = echantillon.NumAliment
    and v.sor_ident = echantillon.sor_ident
    and v."Numéro Version" = echantillon.MaxNumeroVersion

    left join Base_Risques r
    on r.[pol_numpol contract_id] = echantillon.NumAliment
    and r.[sor_ident contract_update_id] = echantillon.sor_ident

    inner join Base_clients c
    on c.client_id = echantillon.Client_ID
    where r.bm_rec_and_used_in_tari is not null --'ajout suite au preésence du valeur null VAZP00179-10'

    order by Ref_Contrat
    ;

     

    Sans titre

  • Duplicate rows in a result set can be caused by things like, in correct joins or perhaps the joins are correct but in one table you have a single row that referenced multiple rows in another table. My guess without having access to the data would be that in your resultset at least one data element in the duplicate is different there distinct wont work.

     

     

     

     

     

    ***The first step is always the hardest *******

  • Without seeing your data, it's very difficult to help you.  What happens if you run only the query in the echantillon CTE?  Do you get the duplicate then?

    John

  • if I run the first cte I do not have a duplicate

    the duplicate it is generated when I make this join in my request

     left join Base_Risks r   on r. [pol_numpol contract_id] = v.Contract_ID  and [sor_ident contract_update_id] = sample.sor_ident
  • In which case there must be multiple rows in Base_Risks with the same [pol_numpol contract_id] and [sor_ident contract_update_id] for the value for which you're getting the duplicates.

    John

  • Have you tried throwing a rank on the table that’s causing the duplicates. Then join on that table to with  rank = 1 so you are only getting 1 recorded per join

  • to find out where are duplicates try run a

    select * and check any distinct columns in two "duplicates"

    then you have a candidate to throw a rank on that table like lauro.alvarado.jr said

    or maybe add one more clause in that table join

    -- OPTION 1

    ;with echantillon as (
    ...
    )

    select ...
    from
    echantillon

    join Base_versions v
    on v.Contract_ID = echantillon.NumAliment
    and v.sor_ident = echantillon.sor_ident
    and v."Numéro Version" = echantillon.MaxNumeroVersion

    left join Base_Risques r
    on r.[pol_numpol contract_id] = echantillon.NumAliment
    and r.[sor_ident contract_update_id] = echantillon.sor_ident
    and r.THE_FIELD_IS_CAUSING_THE_DUPLICATES = SOME_KIND_OF_DISTINCT_OTHER_FIELD
    inner join Base_clients c
    on c.client_id = echantillon.Client_ID
    where r.bm_rec_and_used_in_tari is not null --'ajout suite au preésence du valeur null VAZP00179-10'

    order by Ref_Contrat
    ;

    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------


    -- OPTION 2

    ;with echantillon as (
    ...
    )
    , BASE_RISQUES_CTE as (
    select (all the fields you need on select clause and joins clause and where clause)
    from (
    select rank() over (partition by [pol_numpol contract_id], [sor_ident contract_update_id] ORDER BY THE_FIELD_IS_CAUSING_THE_DUPLICATES) ind
    , *
    from Base_Risques
    ) t
    where ind = 1
    )
    select ...
    from
    echantillon

    join Base_versions v
    on v.Contract_ID = echantillon.NumAliment
    and v.sor_ident = echantillon.sor_ident
    and v."Numéro Version" = echantillon.MaxNumeroVersion

    left join BASE_RISQUES_CTE r
    on r.[pol_numpol contract_id] = echantillon.NumAliment
    and r.[sor_ident contract_update_id] = echantillon.sor_ident

    inner join Base_clients c
    on c.client_id = echantillon.Client_ID
    where r.bm_rec_and_used_in_tari is not null --'ajout suite au preésence du valeur null VAZP00179-10'

    order by Ref_Contrat
    ;

     

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply