I have a table with HTML
data Like below :
SELECT
[ID],
[title],
[authors_html],
[authors_text]
FROM
[wiley].[dbo].[library]
Sample of authors_html
column data :
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A" class="author-name accordion-tabbed__control" data-id="a1" data-db-target-for="a1" aria-controls="a1" aria-haspopup="true" id="a1_Ctrl" role="button"><span>M. A. PRICE</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a1" aria-labelledby="a1_Ctrl" role="region" id="a1">
<p>Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A">Search for more papers by this author</a>
</div>
</div>
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G" class="author-name accordion-tabbed__control" data-id="a2" data-db-target-for="a2" aria-controls="a2" aria-haspopup="true" id="a2_Ctrl" role="button"><span>C. G. ARMSTRONG</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a2" aria-labelledby="a2_Ctrl" role="region" id="a2">
<p>Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G">Search for more papers by this author</a>
</div>
</div>
I need to transfer this column data to Researcher
table :
ID | Full_Name | Tel | URL | Address | Country | Ostan | University | Madrak | Field | org | Fax | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1007 | M. A. PRICE | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. | U.K. | NULL | The Queen's University of Belfast | NULL | NULL | NULL | ||
1008 | C. G. ARMSTRONG | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. | U.K. | NULL | The Queen's University of Belfast | NULL | NULL | NULL | ||
1009 | B. BOROOMAND | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=BOROOMAND%2C+B | Institute for Numerical Methods in Engineering, University of Wales, Swansea, U.K. | U.K. | University of Wales | NULL | NULL | NULL | |||
1010 | O. C. ZIENKIEWICZ | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=ZIENKIEWICZ%2C+O+C | Institute for Numerical Methods in Engineering, University of Wales, Swansea, U.K. | U.K. | University of Wales | NULL | NULL | NULL | |||
1011 | ZHAO‐PING JIAO | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=JIAO%2C+ZHAO-PING | Department of Civil Engineering, South China Construction University, (West Campus), 510405 Guangzhou, China | China | NULL | South China Construction University | NULL | NULL | NULL | ||
1012 | THEODORE H. H. PIAN | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=PIAN%2C+THEODORE+H+H | Department of Aeronautics and Astronautics, Massachussets Institute of Technology, Cambridge, MA, U.S.A. | U.S.A. | NULL | NULL | NULL | ||||
1013 | SHENG YONG | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=YONG%2C+SHENG | Department of Modern Mechanics, University of Science and Technology of China, Hei Fei, China | China | University of Science and Technology of China | NULL | NULL | NULL | |||
1014 | MIN‐WEI HUANG | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=HUANG%2C+MIN-WEI | Optimal Design Laboratory, College of Engineering, The University of Iowa, Iowa City, IA 52242, U.S.A. | U.S.A. | Iowa | The University of Iowa | NULL | NULL | NULL | ||
1015 | JASBIR S. ARORA | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=ARORA%2C+JASBIR+S | Optimal Design Laboratory, College of Engineering, The University of Iowa, Iowa City, IA 52242, U.S.A. | U.S.A. | Iowa | The University of Iowa | NULL | NULL | NULL | ||
1016 | C. S. TSAI | NULL | http://185.141.105.238/action/doSearch?ContribAuthorStored=TSAI%2C+C+S | Department of Civil Engineering, Feng Chia University, Taichung, Taiwan, Republic of China | Republic of China | Feng Chia University | NULL | NULL | NULL |
I tried to use xml potential for separate part of data( Note: Add a single-cot in below cod manually):
DECLARE @HtmlTbl TABLE (ID INT IDENTITY, Html XML)
INSERT INTO @HtmlTbl(Html) VALUES('<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A" class="author-name accordion-tabbed__control" data-id="a1" data-db-target-for="a1" aria-controls="a1" aria-haspopup="true" id="a1_Ctrl" role="button"><span>M. A. PRICE</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a1" aria-labelledby="a1_Ctrl" role="region" id="a1">
<p>Department of Mechanical and Manufacturing Engineering, The Queen'+'s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A">Search for more papers by this author</a>
</div>
</div>
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G" class="author-name accordion-tabbed__control" data-id="a2" data-db-target-for="a2" aria-controls="a2" aria-haspopup="true" id="a2_Ctrl" role="button"><span>C. G. ARMSTRONG</span><i aria-hidden="true" class="icon-section_arrow_d"></i></a>
<div class="author-info accordion-tabbed__content" data-db-target-of="a2" aria-labelledby="a2_Ctrl" role="region" id="a2">
<p>Department of Mechanical and Manufacturing Engineering, The Queen'+'s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink" href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G">Search for more papers by this author</a>
</div>
</div>
COPY TO CLIPBOARD SELECT ALL')
-- SELECT
-- Html.query('//div')
--FROM @HtmlTbl
SELECT
C.value('(.)[1]', 'varchar(1000)')
FROM @HtmlTbl
CROSS APPLY Html.nodes('//div') AS T(C)
My destination table is Researcher and Company :
CREATE TABLE [dbo].[Researcher]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Full_Name] [nvarchar](50) NULL,
[Email] [nvarchar](100) NULL,
[Tel] [nvarchar](20) NULL,
[URL] [nvarchar](max) NULL,
[Address] [nvarchar](max) NULL,
[Country] [nvarchar](100) NULL,
[Ostan] [nvarchar](100) NULL,
[University] [nvarchar](100) NULL,
[Madrak] [nvarchar](100) NULL,
[Field] [nvarchar](100) NULL,
[org] [nvarchar](250) NULL,
[Fax] [nvarchar](20) NULL
)
CREATE TABLE [dbo].[Company]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](255) NULL,
[Type] [nvarchar](100) NULL,
[Country] [nvarchar](100) NULL,
[City] [nvarchar](100) NULL,
[Address] [nvarchar](max) NULL,
[Tel] [nvarchar](100) NULL,
[Fax] [nvarchar](100) NULL,
[PostCode] [nvarchar](20) NULL
)
I need transfer data from authors_html
column to Company and Researcher and cleaning data.
If you need connect to sample database use this connection :
IP: 185.141.105.232
user: wiley
pass: wiley
DB: wiley
Here is your starting point.
It is for SQL Server 2016 onwards.
SQL
DECLARE @HtmlTbl TABLE (ID INT IDENTITY, Html XML);
INSERT INTO @HtmlTbl(Html) VALUES('<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A"
class="author-name accordion-tabbed__control" data-id="a1"
data-db-target-for="a1" aria-controls="a1" aria-haspopup="true"
id="a1_Ctrl" role="button">
<span>M. A. PRICE</span>
<i aria-hidden="true" class="icon-section_arrow_d"></i>
</a>
<div class="author-info accordion-tabbed__content"
data-db-target-of="a1" aria-labelledby="a1_Ctrl" role="region"
id="a1">
<p>Department of Mechanical and Manufacturing Engineering, The Queen''s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink"
href="http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A">Search for more papers by this author</a>
</div>
</div>
<div class="accordion-tabbed__tab-mobile accordion__closed">
<a href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G"
class="author-name accordion-tabbed__control" data-id="a2"
data-db-target-for="a2" aria-controls="a2" aria-haspopup="true"
id="a2_Ctrl" role="button">
<span>C. G. ARMSTRONG</span>
<i aria-hidden="true" class="icon-section_arrow_d"></i>
</a>
<div class="author-info accordion-tabbed__content"
data-db-target-of="a2" aria-labelledby="a2_Ctrl" role="region"
id="a2">
<p>Department of Mechanical and Manufacturing Engineering, The Queen''s University of Belfast, Belfast BT95AH, U.K.</p>
<a class="moreInfoLink"
href="http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G">Search for more papers by this author</a>
</div>
</div>');
-- INSERT INTO dbo.Researcher (Full_Name, [URL], [Address], University, Country) -- uncommemnt when you are ready
SELECT ID
, c.value('(a/span/text())[1]', 'nvarchar(50)') AS Full_Name
, c.value('(div/a/@href)[1]', 'nvarchar(max)') AS [URL]
, c.value('(div/p/text())[1]', 'nvarchar(max)') AS [Address]
, JSON_VALUE(x,'$[1]') AS University
, JSON_VALUE(x,'$[3]') AS Country
-- continue with the rest
FROM @HtmlTbl
CROSS APPLY Html.nodes('/div') AS t(c)
CROSS APPLY (VALUES ('["' + REPLACE(c.value('(div/p/text())[1]', 'nvarchar(max)'),',','","') + '"]')) AS t2(x);
Output
+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+
| ID | Full_Name | URL | Address | University | Country |
+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+
| 1 | M. A. PRICE | http://185.141.105.238/action/doSearch?ContribAuthorStored=PRICE%2C+M+A | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. | The Queen's University of Belfast | U.K. |
| 1 | C. G. ARMSTRONG | http://185.141.105.238/action/doSearch?ContribAuthorStored=ARMSTRONG%2C+C+G | Department of Mechanical and Manufacturing Engineering, The Queen's University of Belfast, Belfast BT95AH, U.K. | The Queen's University of Belfast | U.K. |
+----+-----------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------+------------------------------------+---------+