Huge amount of data

  • I have huge amout of data.so can any tell me that where i store that data. In sql server database or any other place in different format like xml.If other format then give me whole procedure so that i save the data.

    each day no of records:- 6012

    in a month:-6012*30

    in a year:- 6012*30*365

    so how can i store that record

  • You need to supply a lot more information before anyone will be able to advise you. For starters...

    What is this data?

    Where is this data coming from?

    What format is it in?

    What do you want to do with this data once you have it?

    Are you sure you have your maths right?

    in a year:- 6012*30*365

    Shouldn't that be 6012*365 or 6012*30*12

    Either way, that isn't a huge amount of data these days.

  • Where is this data coming from:-a website(amfiindia.com)

    What format is it in:-this is in text format

    What do you want to do with this data once you have it:- some analysis day,month,year wise

    yes this is :-6012*30*12

  • In that case, SQL Server does seem like a good choice for storing your data, but you are going to have to do a fair amount of work, for starters...

    Analyse the data to come up with the design of the schema and then create your database.

    Write something (possibly an SSIS package, BULK INSERT or BCP) to import the data into your database.

    Write some queries to analyse your data as required.

    We aren't here to do this work for you, but we are here to help if you have any specific problems.

  • Are you planning on storing the text, as text, or were you planning on dividing it up into some kind of relational or dimensional storage so that you can run queries against it? If you're just storing it as text, you'd be better off finding some kind of text manipulation & storage engine that allows for searches & aggregations rather than trying to force it into SQL Server.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • satishbhargav4u (8/7/2009)


    Where is this data coming from:-a website(amfiindia.com)

    What format is it in:-this is in text format

    What do you want to do with this data once you have it:- some analysis day,month,year wise

    yes this is :-6012*30*12

    That's only 2.2 million rows per year. Heh... I think I have a key fob with enough room for that.

    If the data is always in the same format insofar as the format of each row, just import it into a table and don't worry about it too much. Just make sure the datatypes are appropriate for the data being stored in the table and you shouldn't have much of a problem.

    Now that I've said that, what does a row of this data look like?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In that file data is

    Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date

    101912;BARODA PIONEER BALANCE FUND -Growth Plan;23.65;23.41;23.65;07-Aug-2009

  • Ok... just to set your mind at ease, here's what I see 1 year's worth of storage taking for the data as a worst case rough ESTIMATE...

    SchemeCode INT NOT NULL, -- 4 bytes

    SchemaName VARCHAR(100) NOT NULL, --104 bytes = 100 bytes + 4 bytes because variable size (worst case)

    NetAssetValue DECIMAL(9,2) NOT NULL, -- 5 bytes

    RepurchasePrice DECIMAL(9,2) NOT NULL, -- 5 bytes

    SalePrice DECIMAL(9,2) NOT NULL, -- 5 bytes

    Date SMALLDATETIME NOT NULL, -- 4 bytes

    ===========

    127 bytes Total

    Null_Bitmap = 2 + ((Num_Cols(6) + 7) / 8) = 3

    ===========

    130 bytes

    Misc addressing 6

    ===========

    136 bytes

    Number of rows in 1 year 2,200,000

    ===============

    Estimated max total storage 299,200,000 or .3 gig

    Obviously, the addition of even a clustered index will add to that but not significantly. Like I said, we all have key fobs with more memory than this. In fact, I keep my TempDB much larger than this on production boxes. This is simply not a storage problem.

    [font="Courier New"]_________________________________________________________________________________________________[/font]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply