SImple Design Question

  • Have built an application that captures specified desktop user activity. Application usage, use of copy/paste, start of call types (billing, upgrades, transfers), what apps are used, for how long, what buttons (on certain apps are pressed) etc., It will also log, event number, user id, transaction type (if it can be detected from app or CTI), machine name, IP address and CPU time between each event and much more).

    Anyway, the question is, for such a large amount of data, should I have one table and log everything. Or have one large table for all the events, as well as others for say transaction summary and another for applications and their use times. Using one table, and making sure the fields related to other events (transactions) are kept unique would make for easier design but now you have to work with views to get summary data on the reporting side.

    Initially, I thought using multiple tables and create the summary data on the fly, from the application itself (which is not hard either) which links to the main events tables to allow drilling into the "detailed" data. Both approaches would work but curious on CORRECTNESS 🙂

    I'm new to SQL and a SQL expert told me one table would be better/quicker/cleaner given the shear number of events and transactions (could be tens of millions on large customers (1000 users x 40 calls a day x 100 events per call could be 4 million in a single day. Not large records and I'll keep the fields tight (codes). the overhead to write the summary data (in the app) is quite small but I guess it all adds up.

    What would you do?

  • It's hard to tell you what to do here. There are pros and cons to one or multiple tables.

    I'd tend to split differing things into different tables, according to normaliztion rules, both for storage and to prevent hot spots on one large table. I might then "summarize" some data into separate tables at times to improve reporting.

    What data is going to be summarized and how? Can indexes in one table handle the load if it's 2-5 indexes only?

  • Not sure what you mean by handling the load, but thanks for the other input. I think I'll create the primary table, as raw as possible for speed and then allow users/me to turn on/off summary reporting into other tables.

    Thanks for the answer that confirms my dilemma.

    Cheers.

  • My first instinct would be to start out with one table, for simplicity, and then evolve the design from there as you use it.

    That way, you'll find how it actually gets used, and be able to optimize for that, instead of optimizing for how you think it will get used, and having to re-engineer if/when you find out that was based on incorrect assumptions.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply