Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Check and transfer data between two databases in same server using complex conditions


Check and transfer data between two databases in same server using complex conditions

Author
Message
tough1
tough1
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 20
Source DB1.TB1 [ ID, EAN, ASIN, category, NAME, CONDITION]

1, null, 20001, ps2, COD2, new
2, 1002, 20002, xbox, HOLA, new
3, 1003, 20003, xbox, Spider Man, used

and DB1.TB2 [ ID, PRICE]

1, 50
2, 51
3, 61


Destination DB2.TB1 [ KID, EAN, category, NAME, CONDITION, PRICE ]

1013, 1001, ps2, COD2, new, 50
1015, 1002, xbox, HOLA, new, 31


Now I need to transfer the data from source DB1 TB1 and TB2 to destination table with some conditions:

ID is same products int DB1, KID is the actual "ID" we refer to in the organization. each item without this id should be updated with KID_max + 1.

The code must check the existing items in DB2.TB1 and update its price if the items exist else add it to this table using new KID .

I wanted to use `join` but I am not sure where shall I put the `on` condition. Since I could not find any entity that would be common in both, names may be same but not sure if they can be used in this `on` condition. Some items in the DB2.TB1 do not have EAN. Some Items on the DB1.TB1 do not have EAN as well but all do have ASIN.

Just some sample codes that I may use to compare the products and update the destination would be enough for me, I think I will be able to modify the codes if I am able to know the path that I should follow to solve this this type of task.

Info: DB2.TB1 consists of around 53k items and source DB1.TB1 consists 18k items.
This task needs to be done every day so I need to consider some performance issue as well.

I am not sure if the full code will be too much to ask or not (being a beginner,it seems too complex) I would like to know some information on how shall I approach to solve this issue.
tough1
tough1
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 20
Thank you for your response. @Celko,
I agree that I do not have enough knowledge about the RDMS, I am in the learning stage, have tried myself on searching related topics though I have not mentioned here. I don't think it is possible to learn these topics in short period of time so as I am learning as well as working on some of the things.

Our organization supplied me with a piece of software that downloads the products to re-post in our website, its from amazon. The downloading is totally different, we have been using ID2 for identification, which is not related to both EAN and ASIN, so my little knowledge was thinking to compare EAN,ASIN and the NAMEs of the products so that I can update the tables. Not knowing how! I have basic knowledge by now, at least i understand the codes, and sometimes modify it to suit my needs.
As I told in my post it seems quite complex to me just to begin with so wanted to know some approaches how its done in this kind of situation.
vinu512
vinu512
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1149 Visits: 1618
adhikarideep (11/6/2012)
Source DB1.TB1 [ID, EAN, ASIN, NAME, CONDITION]

1,1001,20001,ps2,COD2,new
2,1002,20002,xbox,HOLA,new
3,1003,20003,xbox,Spider Man,used



I understand the requirement but you're Sample Data is not correct. The above quoted part shows that the table has 5 columns but the data you have given is 6 columns.
Please provide correct sample data for a working solution for your requirement.

Vinu Vijayan

For better and faster solutions please check..."How to post data/code on a forum to get the best help" - Jeff Moden ;-)
tough1
tough1
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 20
vinu512 (11/7/2012)
[quote]adhikarideep (11/6/2012)

I understand the requirement but you're Sample Data is not correct. The above quoted part shows that the table has 5 columns but the data you have given is 6 columns.
Please provide correct sample data for a working solution for your requirement.


Thanks for your understanding, I have by now updated my question along with the entity the tables. Hope some suggestions will help me figure out how to go ahead. Looking at it my head is spinning, I do not know if a sql level solution will help or application is supposed to do it.
vinu512
vinu512
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1149 Visits: 1618
I can see that you have edited your sample data, but only the part that I quoted in my last post. From what I can see, there are still issues with your Sample Data. The data in the EAN at the Destination does not match with the data at the Source. I am assuming that its a typo error and going with the values at the source.

You can select the data by joining the tables at the Source ON the Id field in both TB1 and TB2. But, you don't have such a field at the Destination. So....I would say that the design at the Destination is terribly wrong and without making changes to this design it won't be possible to do what you are trying to do.

I would have inserted/updated the data by checking the EAN field in the Destination table. But, as I see....the first row has an EAN value "NULL" and the corresponding row at the Destination has a value which is NOT NULL. Which means that even if we know that the two rows are same but we can't assume that it might be true for all the rest of the data on a larger scale.

So, all you could do here is add an "ID" field to the destination table which would map the existing data at the Destination to the data at the Source. If this is done once then it would be a lot lot easier for you to solve such problems in the future.

Finally, if you think that the NAME field is unique and you want to map the data between the Source and the Destination on the NAME field then you can use the following script:


--Creating Tables

Create table Ex
(ID int,
EAN int,
ASIN BigInt,
category NVarchar(20),
NAME NVarchar(20),
CONDITION NVarchar(20) )

Create table Ex1
(ID int,
PRICE Float)

Create Table Ex2
(KID int,
EAN Int,
category NVarchar(20),
NAME NVarchar(20),
CONDITION NVarchar(20),
PRICE Float )


--Inserting Sample Data

Insert Into Ex
Select 1, null, 20001, 'ps2', 'COD2', 'new'
Union ALL
Select 2, 1002, 20002, 'xbox', 'HOLA', 'new'
Union ALL
Select 3, 1003, 20003, 'xbox', 'Spider Man', 'used'

Insert Into Ex1
Select 1, 50
Union ALL
Select 2, 51
Union ALL
Select 3, 61

Insert Into Ex2
Select 1013, 1001, 'ps2', 'COD2', 'new', 50
Union ALL
Select 1015, 1002, 'xbox', 'HOLA', 'new', 31


--Script for your Requirement

Declare @newKID Int
Select @newKID = MAX(KID) From Ex2
IF Exists(Select Name From Ex Except Select Name From Ex2)
   Begin
      Insert Into Ex2
         Select (@newKID + ROW_NUMBER() Over (Order By ID)) As KID, EAN, category, NAME, CONDITION, PRICE From
            (
                Select a.Id, a.EAN, a.category, a.NAME, a.CONDITION, b.PRICE
                From Ex As a JOIN Ex1 As b ON a.ID = b.ID
            Wink As p
         Where NAME IN (Select Name From Ex Except Select Name From Ex2)
   End
IF EXISTS(Select Name From Ex Intersect Select Name From Ex2)
Begin
   Update Ex2
   Set EAN = p.EAN, category = p.category, NAME = p.NAME, CONDITION = p.CONDITION, PRICE = p.PRICE
   From
   (
       Select c.EAN, a.category, a.NAME, a.CONDITION, b.PRICE, c.KID
       From Ex As a JOIN Ex1 As b ON a.ID = b.ID
       JOIN Ex2 As c On a.NAME = c.NAME
   Wink As p
   Where Ex2.EAN = p.EAN
End



You can make the necessary changes in the Script according to your actual data after testing it on the sample data. Here I use Ex and Ex1 as Source tables and Ex2 As the Destination Table.

This would not cause performance issues even while inserting a hud load of data and would work in a matter of seconds......as I have avoide using Cursors for Row by Row insertions.

Hope it helps.

Vinu Vijayan

For better and faster solutions please check..."How to post data/code on a forum to get the best help" - Jeff Moden ;-)
tough1
tough1
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 20
vinu512 (11/7/2012)


So, all you could do here is add an "ID" field to the destination table which would map the existing data at the Destination to the data at the Source. If this is done once then it would be a lot lot easier for you to solve such problems in the future.



Thanks a lot vinu512, for understanding my situation.
among all options you provided along with some I looked upon, I think adding new "ID" must be a good idea.The names not being unique as we used the same name for the used product and the new one indicating it on other columns .
so, going to first (but difficult kind of option Think), how shall I add the "ID" to this table which is around 53k big.

>><<I saw something called "lookup" and "fuzzy*lookup" these as far as I know are some statistical tools to compare the similarity and provides the confidence interval as well as the level of similarity.
would it be good idea to use it, I wanted to use it seems difficult. would it do what I am looking for to do?

my organization most of the time changes its sources of products, and different programmer are used to get those products to the database, there are same products from different supplier with same name but we try to display the cheapest one. The only way we use to differentiate the products is its kid.
Another problem is that it is not documented, most of the time have to go and see all the consequences of the changes I make.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search