Please review my small SSIS project demands.

  • I have posted this project request on http://www.getafreelancer.com

    Here: https://www.getafreelancer.com/projects/546345.html

    Can I be confident that all this can be done in TSQL server 2005??

    What tips and tricks can your offer me??:-D

    Is there a better website/contractor service out there for a project like this (sorry dont have a big budget)..???

    This is a pure MS SQL server 2005 SSIS project. I want it all done at SQL server level in TSQL/VBSCRIPT, so I can use the scheduling tool of MS Sql server, and copy and past code in the future.This tool will be used behind a Asp.net(ajax) 2.0 vb.net website, on MS SQL 2003 webserver( I think).

    NOTE:

    1)Running this FTP process from a client side website code doesnt seam wise (I assume).

    2)Data type: Stock market OHLC daily data and related files.

    3)A group represents data for an exchange: ASX is the Australian stock exchange.

    DATATYPES: Each group(exchange) have several data types:

    0) This is the root directory: FTPDirectory001.jpg

    1) Daily End of day data(data every day) : DataSample_EOD.jpg (** see below)

    2) Names of symbols (data every day): DataSample_SymbolNames.jpg

    3) Symbol name changes (if any): DataSample_NameChanges.jpg

    4) Stock splits (if any): DataSample_Splits.jpg (This example is NSAD not ASX, example only).

    ** Data format is CSV: Symbol, Date(YYYYMMDD),open, high, low, close, volume

    I need to FTP flat text files from the above FTP website into a 2005 SQL server database.

    POINTS:

    1) Text files 'name format' changes each day (by date) : ASX_20091106.txt to ASX_20091107.txt

    2) I need the FTP process to run until txt file is captured and loaded (with say a max of 10 attempts

    with a time delay of 5 min between each attempt. Logging as you go see (6) below).

    3) I need step (2) to run at 5.10PM New York time, and again at 7.10pm New York time. The 7.10pm process is an delete and replace of the 5.10pm process as data changes between 5.10pm and 7.10pm.

    4) If the 7.10pm (NY time) run fails I need back up FTP run to happen at 9.00pm (NYTIME).

    5) In (2) to (4) above the ERROR checking and management is CRITICAL.

    6) If ANY PART OF the FTP download fails, redo the whole process (over 4 file types).

    7) In (2) to (4) a database table to hold a process log of success of fail of each FTP run.

    8) I need the name of the file in (1) to be allowed to be adjusted for local server time relative to New York time. So If my server is in Sydney and is GMT9+ your date calculations can be adjusted.

    9) In the image FTPDirectory001.jpg, I will only want to get files for ASX and INDEX for now. But have the ability to turn other file groups on or off (say in future I will want ASX,INDEX,CBOT,CME).

    10) In the image DataSample001.jpg, I wish to download ALL data into a SQL server table. So all ASX OHLC data would go into a table called ASX_ohlc, ASX_Splits, ASX_Names, ASX_NameChanges, and latter a CME data would gointo a table called INDEX_ohlc, INDEX_Splits, INDEX_Names, INDEX_NameChanges.

    11) All the DATATYPES must be FTP at each daily process, for each file GROUP ( ie ASX, INDEX, etc).

    12) You do not need to adjust data for SPLITS or NAME CHANGES I will do that.

    I am more concerned about securing a rigid and reliable FTP DOWNLOAD process that I can rely on 100%. So that the error rate is near ZERO over a 12 month period. I need some one who will develop a process that puts this first: Error checking, logging, speed, alerts on ultimate download failure after 9.00pm. Obviously some of this cant be programmed as settings would be set on server scheduler. But hows toos would be nice.

  • You may want to mention how many rows you expect per file and whether the content is zipped or not.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks..

    Number of rows is variable.

    Zipped, as far as I can tell. NOT Zipped !:-)

  • Posted several questions that may be relevant to anyone thinking about bidding on the project (I might even bid on the project).

    From what I saw in your pictures, the files may not be zipped. Hard to tell. I think it is relevant only if you expect the download process to unzip the files for you before you process the files.

    Didn't see anything to indicate that in your requirements.

  • Lynn,

    I have seen your work, and I know its first class. But I am sorry I can make you rich, but I would be pleased if you are keen. If you wanna do a deal, start PM me with your bid/fees.

    Here is the project: https://www.getafreelancer.com/projects/546345.html

    You can register via getAfreelancer as a supplier, and bid. But you want to work ouside this site I can pay via paypal and/or moneybookers.

    HERE IS how I would like the project executed/run:

    (same for all bidders)

    This is how would like it to go:

    1) You get it working at your end, run it for a week (which includes a weekend, so Wed to Wed)

    2) While (1) is running send me the DBase structure for review.

    3) I pay 75% when (1) is done, I will get It running On a third party web server, after a week of success I pay you the balance.

    4) There may be add ons, I will pay an hourly rate for that: So what is your hourly rate.

    5) If it all works out great you will be my preferred suppy for a few more Dbase projects I have.

    I am a 10+ payer ranking. You will get paid.

    I must have a rock solid FTP data process, are you sure you can deliver !

    NOTE: ALL the work to be TSQL and script SSIS. Use code as much as possible.

  • From what I saw in your pictures, the files may not be zipped. Hard to tell. I think it is relevant only if you expect the download process to unzip the files for you before you process the files.

    From what I can tell, files on FTP server are NOT ZIPPED.

    NO Zipping or unzipping required.

    Obviously if there is I will adjust project fee. BUT I am 99% sure there is no zipping.:-)

  • Digs (11/8/2009)


    Number of rows is variable.

    Heh... I know that... what is the order of magnitude?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Magnitude of what ??

    Sorry..dont understand...

  • Do I understand this correctly?

    1. In SSIS, but only using SQL tasks and .Net Script components?

    2. Reliable FTP?

    3. FTP needs to be secure?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • YES

    YES ..as much as it can be..

    YES FTP data is behind a username and password...

  • It sounds feasible and doable.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • UPDATES of thoughts on project..

    Additional information submitted:

    11/08/2009 at 14:40 EST:

    Payement made on completion. I am 10 star payer. So I will honor my debts, Thanks.

    11/08/2009 at 15:50 EST:

    Another point

    1) allow for weekends in daily data selection

    2) allow for several days testing LIVE at your end

    3) I will provide you a username and password to access ftp site.

    11/09/2009 at 16:10 EST:

    LOGIC ISSUE: How to determine if there is a new OHLC data file on the FTP directory.

    NOTE: Unless you have a better logic idea.

    Consider FTP files :

    11/05/2009 07:44p.m. 105,682 ASX_20091103.txt

    11/05/2009 07:44p.m. 106,215 ASX_20091104.txt

    11/05/2009 07:58p.m. 106,300 ASX_20091105.txt

    11/06/2009 02:45a.m. 106,525 ASX_20091106.txt

    11/09/2009 02:45a.m. 106,744 ASX_20091109.txt

    Step1: Scan Dbase table of ASX_ohlc for the last 5 posting days(if there is 5).

    Step2: Scan the FTP directory for the last 5(if there is 5) files names ( use names not posting times, as format of FTP directory can change).

    Step3: Compare scans of data already in Dbase tables to data held on FTP server.

    Step4: For files not in DBase do a FTP load as described in the main process. NOTE: Normally there will be only one file missing, that being the latest file, but as with holidays and data failures, if there a more files missing they must be able to be uploaded as well. So If I have 2 or 3 FTP files missing in the Dbase, then the upload would be just that.

    This is the type of secure logic I am hoping for.

  • Based on what you've asked for you can be confident the entire solution could be developed in MSSQL 2005/2008 SSIS (possibly some VB script may be required to run within ssis) and the package deployed to run via SQL Schedules any time you want as often as you want.

  • Thanks for all your interest.

    FTP is dead for pulling down OHLC stock data.

    I have found these new providers that are XML and SOAP

    http://www.xignite.com/Default.aspx

    phew..saved me $$$$

    Thanks to all those that helped and quoted.

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply