Hi,
Thanks in advance for your help. I’m trying to download some data from a website and import it into a SQL Database Here is the website:
Disclosure Table – The Takeover Panel
The website posts the data in multiple formats; however I think the most useful format they provide (for this use case) is the XML version (Found below):
https://www.thetakeoverpanel.org.uk/new/disclosureTable/v3/disclosuretable.xml
The XLM file has multiple sections which I’d like to parse into multiple datasets (Table variables or Temp tables):
[additions] [deletions], [amendments] and [maintable]
DECLARE @Additions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
DECLARE @Deletions AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
DECLARE @MainTable AS TABLE
(
[RowID] INT IDENTITY(1,1),
[CaseID] INT,
[Offeree_Name] NVARCHAR(MAX),
[Offer_Period_Commenced] DATETIME,
[Name] NVARCHAR(250),
[ISIN] NVARCHAR(150),
[NSI] NVARCHAR(150),
[Offeror_Name] NVARCHAR(MAX),
[Offeror_Rule_26_Deadline] NVARCHAR(100),
[offeror_identified] DATETIME,
[offeror_empty_2.10] NVARCHAR(MAX)
)
Each dataset has the same format.
I’ve been able to extract individual nodes (ISIN) in this example:
SELECT
T.C.value('.', 'varchar(100)') AS [maintable_ISIN]
FROM
@myDoc.nodes('(/disclosure_table/maintable/case/offeree/twoten_information/twoten_line/ISIN)') as T(C)
But I’d like to extract the whole thing directly into their relevant datasets so that they look like the attached screenshot (Example_Results.png).
Can anyone help me extract the data into separate datasets?
Thanks
Posted this question elsewhere and got a great answer:
SELECT
ROW_NUMBER()OVER(ORDER BY Caseid.value('@caseid[1]', 'int')) AS RowID,
Caseid.value('@caseid[1]', 'int') AS caseid,
offeree.value('@name[1]', 'NVARCHAR (200)') AS Offeree_Name,
offeree.value('@offer_period_commenced[1]', 'VARCHAR(20)') AS Offer_Period_Commenced,
twoten_information.value('name[1]', 'VARCHAR(20)') AS Name,
twoten_information.value('ISIN[1]', 'VARCHAR(20)') AS ISIN,
twoten_information.value('NSI[1]', 'VARCHAR(20)') AS NSI,
offeror.value('@name[1]', 'NVARCHAR (200)') AS Offeror_Name,
offeror.value('@rule_26_deadline[1]', 'NVARCHAR (200)') AS Offeror_rule_26_deadline,
offeror.value('@offeror_identified[1]', 'NVARCHAR (200)') AS Offeror_identified,
offeror.value('empty_2.10[1]', 'NVARCHAR (200)') AS [Offeror_empty_2.10]
FROM @myDoc.nodes('/disclosure_table/maintable/case') AS XMLtable1(Caseid)
CROSS APPLY Caseid.nodes('offeree') XMLtable2(offeree)
CROSS APPLY Caseid.nodes('offeror') XMLtable3(offeror)
CROSS APPLY offeree.nodes('twoten_information/twoten_line') XMLtable4(twoten_information)
"To improve your answer:
There is no need to use [1] for the attributes. Each attribute is unique in the context of its element.
While addressing elements, it is better to use the following XPath expression:
'(ISIN/text())[1]' instead of the 'ISIN[1]'
It will produce a dramatic performance improvement.£
January 6, 2023 at 5:54 pm
Aye... thanks for posting the link for that. Much appreciated.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy