Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Download content ie page like txt file


Download content ie page like txt file

Author
Message
William-462471
William-462471
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 61
Hi, I have problem, how can I download content ie page like txt file. Page is : http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?
I need this page save like txt file on drive, e.g. c:/denni_kurz.txt
Thanks for help.
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14956 Visits: 38953
much easier to do that from an application.

from sql server 2000, it's difficult to do in TSQL because you must use an object with sp_oacreate to get the text of the web page, and you are limited to 4000 bytes at a time...so if the page is 25K, you need to get 7 "slices" of the web page, one at a time, to get it into sql server.


I did that for fun in a different thread here on SSC, but it required a separate program i wrote to download the web page selected.

In my case, I was building a "scraper", so it got a web page and put it in a table in SQL server.

In order to do this, you'll need to grab the dll from this project and put it in your path or .bin directory for SQL server: the source code, vb6 project, and compiled dll are all in this zip file:
http://www.stormrage.com/blogpix/WebReader.zip
[
this procedure does all the work:

CREATE FUNCTION dbo.GetWebPage (@webpage varchar(1000)='' )
RETURNS @webcontents table
(
rowid smallint IDENTITY(1,1), --Array index
pagetext varchar(4000) --Array element contents
)
AS
BEGIN
DECLARE @hr int,
@webreader int,
@Num4K int,
@i int,
@contents varchar(4000)
IF LTRIM(RTRIM(ISNULL(@webpage,''))) =''
SET @webpage = 'http://www.yahoo.com'
EXEC @hr = sp_OACreate 'WebReader.Reader', @webreader OUT
EXEC @hr = sp_OAMethod @webreader, 'GetWebPageContentsAsString',null,@webpage
EXEC @hr = sp_OAGetProperty @webreader, 'Num4KLengths', @Num4K OUT
SET @i = 1
WHILE @i <=@Num4K
BEGIN
EXEC @hr = sp_OAGetProperty @webreader, 'PageSlice', @contents OUT, @i
INSERT INTO @webcontents(pagetext) VALUES(@contents)
SET @i=@i + 1
End
EXEC @hr = sp_OADestroy @webreader

Return
End
select * from dbo.GetWebPage(default)




Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

mrpolecat
mrpolecat
SSC-Addicted
SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)

Group: General Forum Members
Points: 452 Visits: 856
You can do it in an activex script in a DTS package like this:

Function Main()
Set socket = CreateObject("MSXML2.ServerXMLHTTP.4.0")
socket.setTimeouts 300000, 300000, 350000, 350000
socket.Open "GET", "http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?", False
socket.send
Set fso = CreateObject("scripting.filesystemobject")
Set myfile = fso.createtextfile("c:\denni_kurz.txt", True)
myfile.write socket.responsetext



Main = DTSTaskExecResult_Success
End Function






William-462471
William-462471
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 61
Thanks to all. I use version of mrpolecat, and that's works fine.
William-462471
William-462471
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 61
Please I have one problem yet. I save txt file onto disk, but when I try read this txt file I will get error.
When I try DTS package and bulk insert with row terminator comma and column terminator TAB, i get error Unexpected end of file.
When I try stored procedure and bulk insert :
declare @SQL varchar(8000)
SET @SQL= 'BULK INSERT #WRK FROM ''c:\denni_kurz.txt' + ''' WITH (CODEPAGE = ''1250'' , FIELDTERMINATOR = ''|'',ROWTERMINATOR ='''') '
CREATE TABLE #WRK (TEXT_STRING text)
EXEC(@SQL)
SELECT * FROM #WRK

I get only first few rows, but not all.

When I try vb.net project with StreamReader, then I read all text.

Where is the problem ?
I don't see anything wrong in this text.
Thanks for help.
Jacob Luebbers
Jacob Luebbers
Old Hand
Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)Old Hand (330 reputation)

Group: General Forum Members
Points: 330 Visits: 1215
Come on - we can't do all the work for you. You need to understand the format of your incoming data - first, the text file available currently at that URL is in UTF8 format, so you need to set your codepage to 65001. And I'm not sure why you would be trying a column seperator of TAB and rowterminator of comma when the data is pipe-delimited with standard CRLF row terminators...

Here's a really quick test using the text file available now at the URL - I've made some assumptions as to the column names as I can't read Czech:

-------------------------------------------------------
create table #Test
(Country nvarchar(256),
CurrencyName nvarchar(256),
Units int,
ISOCode char(3),
CurrentValue varchar(256))

bulk insert #Test
from 'c:\denni_kurz.txt'
with
(
FIELDTERMINATOR ='|',
ROWTERMINATOR ='',
CODEPAGE = '65001',
FIRSTROW = 3
)

select
*
from #Test

drop table #Test
-------------------------------------------------------


It imports fine for me. Note however I've used varchar for the CurrentValue (last column) as the amounts are presumably in European decimal style with commas as the decimal seperator. You'd need to do some further manipulation on that column to get it into decimal format - there may be a neat way of doing that with a proper format file.

Regards,

Jacob
sanji_kumar
sanji_kumar
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 51
The url I am accessing requires a user name and the password. How can I pass the user name and the password to down load the file from a url??
mrpolecat
mrpolecat
SSC-Addicted
SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)SSC-Addicted (452 reputation)

Group: General Forum Members
Points: 452 Visits: 856
socket.Open "GET", "http://www.website.com/page.html", False, "username", "password"






Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search