SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Select into a Text file from a Table and removing dupilcates


Select into a Text file from a Table and removing dupilcates

Author
Message
Neel Singh
Neel Singh
Valued Member
Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)

Group: General Forum Members
Points: 71 Visits: 40
Hi

I have a table of about 21 000 000 rows, I need to export this data to a text file and before exporting I need to remove all duplicates. What would be the fastest and most efficient way to do this task.

Thanks
Peter Lavelle
Peter Lavelle
SSC-Enthusiastic
SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)SSC-Enthusiastic (130 reputation)

Group: General Forum Members
Points: 130 Visits: 1497
Hi Neel,

Could you provide some more details please?

1. What version SQL Server?
2. Table definition (table name, column names, data types, keys, indexes)?
3. Some sample data rows
4. What exactly do you mean by duplicate rows? If three rows are duplicates do you want to delete all of them, the first two or the last two? There are many postings on this site about eliminating duplicate rows.
5. Is this a once-off request or do you want to run this regularly?

PeterL
amar_mazhar
amar_mazhar
SSC Rookie
SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)SSC Rookie (47 reputation)

Group: General Forum Members
Points: 47 Visits: 51
Hi Neel,

I think you can use DTS package to transfer data from table to text file. In Dts packages you can use store procedures or TSQL quesries to control duplicate values.
Neel Singh
Neel Singh
Valued Member
Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)

Group: General Forum Members
Points: 71 Visits: 40
Hi

For inserting the data into the text file, I did a striaght select into the text file using sql query.

For removing Duplicates I used the below query but I'm sure there is a better option.

--select all dupicates into a temp table
select pho_phone_number,count(*) cc
into #temp
from XD_HM_Extract with (nolock)
where data_type = 'XD'
group by pho_phone_number
having count(*) > 1


--select accounts with no duplicate into a temp table
select acc_account_code,
ped_initials,
ped_name_1,
ped_surname,
adr_line_1,
adr_line_2,
adr_line_3,
adr_line_4,
adr_post_code,
que_code,
que_description,
bck_id,
pho_phone_number,
data_type,
status
into #main
from XD_HM_Extract xd with (nolock)
where not exists (select *
from #temp t
where xd.pho_phone_number = t.pho_phone_number)
and data_type = 'XD'




--select all duplicate accounts
select acc_account_code,
ped_initials,
ped_name_1,
ped_surname,
adr_line_1,
adr_line_2,
adr_line_3,
adr_line_4,
adr_post_code,
que_code,
que_description,
bck_id,
xd.pho_phone_number,
data_type,
status
into #dup
from XD_HM_Extract xd with (nolock)
join #temp t with (nolock)
on (xd.pho_phone_number = t.pho_phone_number)
where data_type = 'XD'



--select distinct accounts where phone number is duplicate
select *
into #main1
from #dup xd
where xd.acc_account_code = (select top 1 t.acc_account_code
from #dup t
where xd.pho_phone_number = t.pho_phone_number
order by acc_account_code desc)





--merging my two temp tables
insert into #main
select *
from #main1



/*
drop table #main
drop table #main1
drop table #temp
drop table #dup*/

select *
from #main
Dallas Martin
Dallas Martin
Valued Member
Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)

Group: General Forum Members
Points: 69 Visits: 46
First, you need to understand your data.

Exactly what makes anyone record the "unique" or desired record?

Does the table have an IDENTITY column or a datetime column
that could be used to identity the most recent record for a common
column or set of columns.

For example, your given table has repeating address information
for a pediatrician (ped_?). since the pediatrician's name could
be the same for each of this pediatrician's records then you would
identify the most recent record for the entire set of that's pediatrician's
records using a query like this:

Assume we have an IDENTITY column named: record_id

SELECT * FROM dbo.SomeTable A,
(SELECT ped_Name, MAX(record_id) as MAX_ID
FROM dbo.SomeTable
GROUP BY ped_name) as B
WERE a.record_id=b.MAX_ID
Neel Singh
Neel Singh
Valued Member
Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)

Group: General Forum Members
Points: 71 Visits: 40
Thanks, I will apply that logic, seems much simpler.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search