Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Download content ie page like txt file Expand / Collapse
Author
Message
Posted Monday, February 11, 2008 5:02 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, September 7, 2008 11:17 PM
Points: 8, Visits: 61
Hi, I have problem, how can I download content ie page like txt file. Page is : http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?
I need this page save like txt file on drive, e.g. c:/denni_kurz.txt
Thanks for help.
Post #453789
Posted Monday, February 11, 2008 11:17 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 2:39 PM
Points: 12,923, Visits: 32,313
much easier to do that from an application.

from sql server 2000, it's difficult to do in TSQL because you must use an object with sp_oacreate to get the text of the web page, and you are limited to 4000 bytes at a time...so if the page is 25K, you need to get 7 "slices" of the web page, one at a time, to get it into sql server.


I did that for fun in a different thread here on SSC, but it required a separate program i wrote to download the web page selected.

In my case, I was building a "scraper", so it got a web page and put it in a table in SQL server.

In order to do this, you'll need to grab the dll from this project and put it in your path or .bin directory for SQL server: the source code, vb6 project, and compiled dll are all in this zip file:
http://www.stormrage.com/blogpix/WebReader.zip
[
this procedure does all the work:
CREATE FUNCTION dbo.GetWebPage (@webpage varchar(1000)='' )
RETURNS @webcontents table
(
rowid smallint IDENTITY(1,1), --Array index
pagetext varchar(4000) --Array element contents
)
AS
BEGIN
DECLARE @hr int,
@webreader int,
@Num4K int,
@i int,
@contents varchar(4000)
IF LTRIM(RTRIM(ISNULL(@webpage,''))) =''
SET @webpage = 'http://www.yahoo.com'
EXEC @hr = sp_OACreate 'WebReader.Reader', @webreader OUT
EXEC @hr = sp_OAMethod @webreader, 'GetWebPageContentsAsString',null,@webpage
EXEC @hr = sp_OAGetProperty @webreader, 'Num4KLengths', @Num4K OUT
SET @i = 1
WHILE @i <=@Num4K
BEGIN
EXEC @hr = sp_OAGetProperty @webreader, 'PageSlice', @contents OUT, @i
INSERT INTO @webcontents(pagetext) VALUES(@contents)
SET @i=@i + 1
End
EXEC @hr = sp_OADestroy @webreader

Return
End
select * from dbo.GetWebPage(default)




Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #453984
Posted Tuesday, February 12, 2008 3:25 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Wednesday, May 1, 2013 8:10 AM
Points: 445, Visits: 840
You can do it in an activex script in a DTS package like this:

Function Main()
Set socket = CreateObject("MSXML2.ServerXMLHTTP.4.0")
socket.setTimeouts 300000, 300000, 350000, 350000
socket.Open "GET", "http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?", False
socket.send
Set fso = CreateObject("scripting.filesystemobject")
Set myfile = fso.createtextfile("c:\denni_kurz.txt", True)
myfile.write socket.responsetext



Main = DTSTaskExecResult_Success
End Function







Post #454770
Posted Wednesday, February 13, 2008 1:31 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, September 7, 2008 11:17 PM
Points: 8, Visits: 61
Thanks to all. I use version of mrpolecat, and that's works fine.
Post #454890
Posted Wednesday, February 13, 2008 3:29 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, September 7, 2008 11:17 PM
Points: 8, Visits: 61
Please I have one problem yet. I save txt file onto disk, but when I try read this txt file I will get error.
When I try DTS package and bulk insert with row terminator comma and column terminator TAB, i get error Unexpected end of file.
When I try stored procedure and bulk insert :
declare @SQL varchar(8000)
SET @SQL= 'BULK INSERT #WRK FROM ''c:\denni_kurz.txt' + ''' WITH (CODEPAGE = ''1250'' , FIELDTERMINATOR = ''|'',ROWTERMINATOR ='''') '
CREATE TABLE #WRK (TEXT_STRING text)
EXEC(@SQL)
SELECT * FROM #WRK

I get only first few rows, but not all.

When I try vb.net project with StreamReader, then I read all text.

Where is the problem ?
I don't see anything wrong in this text.
Thanks for help.
Post #454927
Posted Wednesday, February 13, 2008 4:30 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, March 6, 2014 6:13 PM
Points: 318, Visits: 1,183
Come on - we can't do all the work for you. You need to understand the format of your incoming data - first, the text file available currently at that URL is in UTF8 format, so you need to set your codepage to 65001. And I'm not sure why you would be trying a column seperator of TAB and rowterminator of comma when the data is pipe-delimited with standard CRLF row terminators...

Here's a really quick test using the text file available now at the URL - I've made some assumptions as to the column names as I can't read Czech:

-------------------------------------------------------
create table #Test
(Country nvarchar(256),
CurrencyName nvarchar(256),
Units int,
ISOCode char(3),
CurrentValue varchar(256))

bulk insert #Test
from 'c:\denni_kurz.txt'
with
(
FIELDTERMINATOR ='|',
ROWTERMINATOR ='',
CODEPAGE = '65001',
FIRSTROW = 3
)

select
*
from #Test

drop table #Test
-------------------------------------------------------


It imports fine for me. Note however I've used varchar for the CurrentValue (last column) as the amounts are presumably in European decimal style with commas as the decimal seperator. You'd need to do some further manipulation on that column to get it into decimal format - there may be a neat way of doing that with a proper format file.

Regards,

Jacob
Post #455454
Posted Sunday, June 22, 2008 8:05 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, June 24, 2013 9:00 PM
Points: 2, Visits: 51
The url I am accessing requires a user name and the password. How can I pass the user name and the password to down load the file from a url??
Post #521482
Posted Monday, June 23, 2008 7:08 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Wednesday, May 1, 2013 8:10 AM
Points: 445, Visits: 840
socket.Open "GET", "http://www.website.com/page.html", False, "username", "password"







Post #521735
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse