April 2, 2015 at 4:17 pm
Hi all,
I have tables in my database, tblNames1, tblNames2, tblNames3, and a main addresses table (currently empty). Once I've imported the address data I need to match the addressIDs in the names tables to the Primary ID in the address table based on the values of a field CompanyName (which is common to all the tables) My issue is that I have a huge CSV file with the master address information but obviously SQL server needs to assign foreign keys so the names tables can linked to corresponding rows in address table. It's a a many to 1 relationship as their will be one address with multiple name entries. All the names are normalized so everything can be matched up... Does anybody know how I could acheive this..?
April 3, 2015 at 5:18 am
Judging by the zero replies this isn't possible. So could someone tell me whether it would be more logical to merge the tables and just have a flat table...
April 3, 2015 at 6:00 am
For starters can you post the DDL (create table) for the tables and some consumable sample data in the form of an insert statement? This will increase your chances of getting an answer to your question.
😎
This problem is rather common, quite few ways of solving it and some recognized patterns exist on the internet. The problem is that there are few factors directly related to the data such as the format etc. which do affect the choice of method.
April 3, 2015 at 6:34 am
My understanding is that you have a Names table, and then you have an Address table. Actually you have multiple Names tables for some reason, but we'll put that aside. I'll assume all these Names tables have the same column structure. So, you need to join the Names table with the Address table on CompanyName and then update a foreign key in the Names table containing the Address ID. That is, you need to link Companies with Addresses.
Below is an example of a joined update.
update N
set N.AddressID = A.AddressID
from tblNames1 as N
join Address as A on A.CompanyName = N.CompanyName;
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 3, 2015 at 7:19 am
Did you mean from information schema columns?
1IDintNULLNO
2databasenamevarchar50NO
3sectorvarchar75YES
4datedateNULLYES
5influenceclassificationvarchar50YES
6firstnamevarchar50NO
7lastnamevarchar75NO
8companyvarchar100NO
9phonevarchar50NO
10companyrevenuevarchar100YES
11companysizevarchar100YES
12titlevarchar-1YES
13linkedinIDvarchar-1YES
14emailvarchar-1YES
April 3, 2015 at 7:23 am
OK, the names table was a generic way of listing for brevity, the tables are different industry sectors. Yes all of those tables are exactly the same. will your solution permanently match them...? And for example, if "company a" in names was already matched to the address table, and I added more records for company a in names, could I just run the command again to update the new entries, because that is essential for the db..
April 3, 2015 at 7:39 am
tomsharp85 (4/3/2015)
Did you mean from information schema columns?
No that is what they meant. You need to provide DDL (create table statements) and some sample data (insert statements). Without knowing what the tables are like and having some data to work with we are just guessing. It takes a little effort to put this together but you will be rewarded with tested and fast code almost every time. Please take a few minutes and read the first link in my signature for best practices when posting questions.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 3, 2015 at 7:58 am
tomsharp85 (4/3/2015)
OK, the names table was a generic way of listing for brevity, the tables are different industry sectors. Yes all of those tables are exactly the same. will your solution permanently match them...? And for example, if "company a" in names was already matched to the address table, and I added more records for company a in names, could I just run the command again to update the new entries, because that is essential for the db..
The following will update all records in the table. If Address.AddressID hasn't changed, or if there is no join to Address, then Names.AddressID retains current value.
update N
set N.AddressID = A.AddressID
from tblNames1 as N
join Address as A on A.CompanyName = N.CompanyName;
The following is same as above, but will preserve previous updates to Names.AddressID, by applying a filter:
update N
set N.AddressID = A.AddressID
from tblNames1 as N
join Address as A on A.CompanyName = N.CompanyName
where N.AddressID is null;
The following will update all records in the table, but will only look at records in Address that were inserted on or after a certain date. You would use this in a scenario where a change of address results in multiple records for same CompanyName in Address table that are keyed on something like InsertDate.
update N
set N.AddressID = A.AddressID
from tblNames1 as N
join Address as A on A.CompanyName = N.CompanyName
and A.InsertDate >= '2015-04-01';
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 3, 2015 at 1:00 pm
Create Table for CompanyAddress1
USE MasterAddress
Create Table CompanyAddress1
(
ID int IDENTITY(1,1) PRIMARY KEY
,CompanyName varchar (MAX)
,CompanyNumber varchar (MAX)
,RegAddress_CareOf varchar(MAX)
,RegAddress_POBox varchar (MAX)
,RegAddress_AddressLine1 varchar (MAX)
,RegAddress_AddressLine2 varchar(MAX)
,RegAddress_PostTown varchar (MAX)
,RegAddress_County varchar(MAX)
,RegAddress_Country varchar(MAX)
,RegAddress_PostCode varchar (15)
,CompanyCategory varchar (MAX)
,CompanyStatus varchar (MAX)
,CountryOfOrigin varchar
,DissolutionDate date
,IncorporationDate date
,Accounts_AccountRefDay tinyint
,Accounts_AccountRefMonth tinyint
,Accounts_AccountsNextDueDate date
,Accounts_AccountsLastMadeUpDate date
,Accounts_AccountCategory varchar (MAX)
,[Returns_NextDueDate] date
,[Returns_LastMadeUpDate] date
,Mortgages_NumMortCharges int
,Mortgages_NumMortOutstanding int
,Mortgages_NumMortPartSatisfied int
,Mortgages_NumMortSatisfied int
,SICCode_SicText_1 varchar (MAX)
,SICCode_SicText_2 varchar (MAX)
,SICCode_SicText_3 varchar (MAX)
,SICCode_SicText_4 varchar (MAX)
,LimitedPartnerships_NumGenPartners varchar(MAX)
,LimitedPartnerships_NumLimPartners varchar(MAX)
,URI varchar (MAX)
,PreviousName_1CONDATE varchar (MAX)
,PreviousName_1CompanyName varchar (MAX)
,PreviousName_2CONDATE varchar (MAX)
,PreviousName_2CompanyName varchar (MAX)
,PreviousName_3CONDATE varchar (MAX)
,PreviousName_3CompanyName varchar (MAX)
);
The insert:
INSERT INTO CompanyAddress1
VALUES (
'BARDOG NEW OPCO LIMITED OPCO LIMITED'
,'5843995'
,NULL
,NULL
,'SUITE 555 SECOND FLOOR'
,'DESIGN CENTRE EAST MILL'
,'YORK'
,NULL
,NULL
,'P05T K0DE'
,'Private Limited Company'
,'Active'
,'United Kingdom'
,NULL
,'2006-06-2006'
,31
,12
,'2015-09-30'
,'2013-12-31'
,'FULL'
,'2015-07-19'
,'2014-07-14'
,0
,0
,0
,0
,'74990 - Non-trading company'
,'86900 - Other human health activities'
,NULL
,NULL
,0
,0
,'http://biszness.data.gov/id/company/05843995'
,NULL
,NULL
,NULL
,NULL
,NULL
,NULL
);[/code]
The create for the names tables:
CREATE TABLE tblFinancialServices
(
ID int IDENTITY(1,1) PRIMARY KEY
,Sector varchar (100)
,InfluenceClassification varchar (100)
,FirstName varchar(255)
,LastName varchar(255)
,Title varchar (255)
,CompanyName varchar(MAX)
,Phone varchar (50)
,Email varchar (MAX)
,LinkedinID varchar (MAX)
,Website varchar (MAX)
,CompanyRevenue varchar (25)
,CompanySize varchar (25)
);
I don't think an insert into the names is relevant here.Also I thought of splitting the address tables into 3 as in total there are more than 10million records, or would this run ok if it was 1 table?
April 3, 2015 at 1:10 pm
Holy cow. Why so many varchar max columns? You need to use appropriately sized columns. You are never going to exceed 8,000 characters for ANY of those columns.
Here is where I am confused. You originally mentioned several names tables but in your last post it sounds like there is only one. It is extremely unclear what you are trying to do here. What do you mean by "Do I need a foreign key (from address1) in my names table"? We don't even see a names table and you originally said there were several. However suddenly there is now a tblFinancialServices table? BTW, I would get rid of that prefix as it does nothing but make it more difficult to code. I am willing and able to help but until there is a clear question there is little anybody can do.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 3, 2015 at 1:19 pm
tomsharp85 (4/3/2015)
Create Table for CompanyAddress1...
I don't think an insert into the names is relevant here.Also I thought of splitting the address tables into 3 as in total there are more than 10million records, or would this run ok if it was 1 table?
10 million rows is no problem, even for a low end server, however, you need to index CompanyName on both tables, if you want to join them efficiently on that column.
Also, change variable length columns to 80, 255, or even 8000, but not (MAX), because MAX datatypes are stored off-row. MAX is for storing things like blocks of text, XML, or images.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 3, 2015 at 1:19 pm
Further on Sean's answer, seriously suggest you look into some database design patterns, what you got there is just awful
😎
April 3, 2015 at 1:25 pm
the financialservices is names, they will all be set up the same but obviously the names will be different. I will rethink it and get back to you...
April 3, 2015 at 2:06 pm
Eirikur Eiriksson (4/3/2015)
Further on Sean's answer, seriously suggest you look into some database design patterns, what you got there is just awful😎
Oye!!! I didn't even actually look at the table when I saw the status of the post.
Wow, that "Address" table is a complete disaster!!! I put address in quotes because it contains data from a lot more things than just addresses.
You need to do a LOT of reading about normalization. This isn't even in the realm of normalized. You have all sorts of pieces of data all crammed into a single table. You have managed to get your design to first normal form (1NF) but are not yet approaching second normal form (2NF). You need to get to at least third normal form (3NF) and as close to fourth (4NF) as you can.
You also need to read up about datatypes and use the appropriate ones. Everything should NOT be varchar(max).
The following link has a number of articles that would be a good place to help you get started. http://www.sqlservercentral.com/stairway/72400/%5B/url%5D
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 3, 2015 at 3:08 pm
Ok, thanks for all your points- I will separate the other information that goes beyond addresses into another table and change the lengths. Then I will see where I am in relation to the problem of mathching things up...
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply