MS-Jet Engine 4.x ODBC enabled databases to 1 Terabyte/5 billion rows (or) SQL SERVER?

  • I see information about speed and size. I don't see transaction through put, volume or speed, simultaneous users, high availability, disaster recovery, all the things that SQL Server provides.

    However, I'm with Steve. If you don't need or want SQL Server. Don't use it. There's Postgres, Oracle, MySQL, DB2, lots of ways to get relational data storage done. If you don't need relational storage, cool. Again, don't use SQL Server. Really. SQL Server is kind of bad at non-relational storage, especially since it's not built for that. Go for Hadoop or DocumentDB.

    We have choices and options and that's not a bad thing.

    SQL Server is an awesome and amazing hammer. Every problem is not a nail.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • erichansen1836 (9/10/2015)


    Thanks for the link to my post at Microsoft Developers Forum regarding Jet (Red) Engine Databases.

    Also, FYI, I have issued 510 concurrent independant SQL processes on Windows 7 Home Premium on a modest Laptop with 3 GIG Ram and 1 Processor, launching these as "Detached" Windows Background Processes from within a Win32 Perl script. I used Task Manager to monitor the processing, RAM Memory, and CPU usage.

    All SQL report output went to individual TEXT files. All SQL report output consistently worked beautifully as long as I increased the THREADS property within my ODBC FILE DSN from default of 3 to 512. I was also able to run 66 concurrent database update processes to the same *.MDB file, each of the 66 updating a separate logical collection of records in the database file so as not to STEP ON each others udpates.

    Now that is one Jet Engine working on 1 MDB database file system file.

    If you implement a Network database file system of 500 MDB files, each acting as a partial table within the 5 billion row database, each MDB file containing 10 million rows...

    And if you implement dozens or 100s of Jet Engines, i.e. 1 Jet Engine from each PC on the Network, accessing the Network Database file system...

    And if you only allow SQL access (i.e. NO SQL) from within a user-interface front-end which controls the ODBC connections, opening them only long enough to retrieve a row(s) or uddate/delete/insert row(s), and which dynamically builds and executes the SQL statements on the end-user's behalf (from keyboard input or selection criteria chosen from the database)...

    Then it may be possible for 1000s of end-users to randomly access the 500 *.MDB database system files on the Network without any concurrency issues arising.

    Confused, how is only allowing SQL access from within a user-interface front-end considered NO SQL?

    If you are using SQL (Structured Query Language) to access the data, that is NOT NO SQL.

  • It is NO SQL if the end-user does not use SQL.

    The ODBC enabled application (Win32 PERL, etc.) would build/execute the SQL statements.

    This is necessary for two reasons.

    (1) The end-user can query the database without knowing SQL

    (2) The integrity of the database is maintained by the user-interface controlling SQL syntax,

    controlling which users can update/insert/delete records within the database, controlling which

    user has access to a particular record at any given moment so that record changes are not STEPPED ON

    by other users and so that concurrency issues do not occur rewriting the database.

    This is my own definition of NO SQL, without having researched what NO SQL may mean to the IT community at large. I think my definition is just as good as any, and may just simply be a different use of the term than what most are familiar with.

    MS-ACCESS software is not used, nor any other front-end SQL Navigation tool (TOAD(tm), etc.).

    All database access is through the Win32 Perl/ODBC front-end which enforces Network group permission access to the database.

  • erichansen1836 (9/10/2015)


    It is NO SQL if the end-user does not use SQL.

    The ODBC enabled application (Win32 PERL, etc.) would build/execute the SQL statements.

    This is necessary for two reasons.

    (1) The end-user can query the database without knowing SQL

    (2) The integrity of the database is maintained by the user-interface controlling SQL syntax,

    controlling which users can update/insert/delete records within the database, controlling which

    user has access to a particular record at any given moment so that record changes are not STEPPED ON

    by other users and so that concurrency issues do not occur rewriting the database.

    This is my own definition of NO SQL, without having researched what NO SQL may mean to the IT community at large. I think my definition is just as good as any, and may just simply be a different use of the term than what most are familiar with.

    MS-ACCESS software is not used, nor any other front-end SQL Navigation tool (TOAD(tm), etc.).

    All database access is through the Win32 Perl/ODBC front-end which enforces Network group permission access to the database.

    Sorry, but your definition just doesn't work. If your application uses SQL to query the database, it isn't NO SQL. In most user facing applications, the user themselves don't write a single line of SQL code to access the data. That is done by the application or handled by stored procedures in the database.

    In fact, most users wouldn't know what a SQL query was if it was printed on paper and put in front of them.

  • erichansen1836 (9/10/2015)


    It is NO SQL if the end-user does not use SQL.

    The ODBC enabled application (Win32 PERL, etc.) would build/execute the SQL statements.

    Then, by absolute definition, it's not a NOSQL solution. Look at how to retrieve data from Hadoop through the Map/Reduce functions. That is a NOSQL solution. It does not use SQL at any level of the process. If I use Entity Framework or nHibernate against SQL Server, which will generate all the SQL, I don't suddenly have a NOSQL system. I think you might be missing the point here.

    This is necessary for two reasons.

    (1) The end-user can query the database without knowing SQL

    (2) The integrity of the database is maintained by the user-interface controlling SQL syntax,

    controlling which users can update/insert/delete records within the database, controlling which

    user has access to a particular record at any given moment so that record changes are not STEPPED ON

    by other users and so that concurrency issues do not occur rewriting the database.

    This is my own definition of NO SQL, without having researched what NO SQL may mean to the IT community at large. If think my definition is just as good as any, and may just simply be a different use of the term than what most are familiar with.

    Ah.

    Well. The issue is, especially in technology, words and terms do have meaning and in order for us to communicate well, we need to understand the shared meaning. NOSQL is a pretty well defined term in IT. I understand that you can come up with your own terminology, but unfortunately, everyone else may be using something else. That's just going to lead to miscommunication on both sides.

    MS-ACCESS software is not used, nor any other front-end SQL Navigation tool (TOAD(tm), etc.).

    All database access is through the Win32 Perl/ODBC front-end which enforces Network group permission access to the database.

    I'm just not sure I understand the issue though. If you don't need SQL Server, don't use it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • @erichansen1836

    It sounds to me like you already had your mind made up when you posted your question. If everything you're doing is FREE and you're happy with performance and don't mind not being able to get additional support if you ever need it, then why even ask the question? Keep doing what you're doing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/10/2015)


    @erichansen1836

    It sounds to me like you already had your mind made up when you posted your question. If everything you're doing is FREE and you're happy with performance and don't mind not being able to get additional support if you ever need it, then why even ask the question? Keep doing what you're doing.

    It was my opinion right from the start that he had his mind made up and was just looking for comments to backup/justify his opinion.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • erichansen1836 (9/10/2015)


    In a user-interface, I would write the record locking strategy myself. If a record is wished to be edited by a user, the user-interface only allows it if the record is not currently locked for edit by another user.

    We used to do that ... it was fine until a user went to lunch (or on vacation!!) with a record open on their screen ... (not that that debate has anything to do with Jet vs. SQL)

    We now have an RowVersion column in all tables, incremented (by a Trigger in our case, so nothing can bypass it - can't remember if Jet has triggers?) and the APP sens the user the form with the PKey ID, the EditCount and any other columns required. When the APP then gets the data back to Update/Delete the row if the RowVersion is now different then someone/something else has changed the row in the meantime.

    (SQL has a ROWVERSION datatype which will do that automatically without needing a trigger)

    A key issue I come up against is tuning databases. Some 3rd party installs a database on our server; the indexes etc. are defined in a very "generic" way, often not entirely suitable to our specific organisation, or (sadly, all too often) the quality of the APP, and its database, is abysmal ... again, I can't remember what was available in Jet but in SQL I have all sorts of performance monitoring tools that I can use which help me decide which vendor-supplied indexes are unnecessary (e.g. we don't use those columns at all), useful-but-unused-as-not-selective (black mark to the vendor!) and also suggestions as to which new indexes I should add, based on the actual i.e. real-world workload that the database is processing. I can, and have, actually forced a Plan Guide on a 3rd party APP which used a lousy query plan by default and, being a 3rd party APP, I could not change the APP itself. In this instance adding indexes didn't help (SQL didn't choose to use them). I imagine that there would be no possibility to do anything remotely like that in JET?

    Another area which I think makes a difference is if the app is Client/Server or just Server-based. The Client/Server apps we have (i.e. where there is an executable deployed onto the user's desktop) are a lot more "chatty" with the database, and across the network, than server-hosted apps (i.e. where the user is using a Browser and the Server has Database + Web Service). Our server-hosted applications do a lot of buffering / caching of information - its all in one APP, its easy to know if the cache is stale or not 🙂 - and also marshalling of requests to the database itself. So in a sense I can see that a server hosted APP could work efficiently with a less endowed database engine, and I can see that I could write server-hosted APPs optimised to work well with JET in lieu of SQL Server

    But, that said, our users increasingly require functionality that places more strain on the database engine itself. We have predictive-search dropdown boxes everywhere nowadays; these make an enormous hit on the database (query time can be tuned to be very quick, but the number of queries is ridiculously high!) and needs careful handling. Users also want free-form report writers; they pose questions which are rarely the same next time, so no possible optimisation thus we need rely on SQL Server making good Query Plans to then efficiently find the relevant data. The user queries may also be for any combination of columns ... again, the performance monitoring tools with SQL mean that I can detect which, new, queries ARE being repeated and DO perform badly and then, on an 80:20 basis, we can improve those and have Happy Users 🙂

  • erichansen1836 (9/8/2015)


    MS-Jet Engine 4.x ODBC enabled, NO SQL, database file systems to 1 Terabyte/5 billion rows

    (or) SQL SERVER?

    What reasons can SQLServerCentral give for downsizing to a MS-SQL Server database

    from a 1 Terabyte/5 Billion row, MS-Jet Engine 4.x format, ODBC enabled, NO SQL, database file system (*.MDB), which uses ActiveState Win32 Perl for Native Windows GUI & ODBC for user-interfacing?

    Perl Win32::ODBC module by David Roth is an extremely stable ODBC implementation.

    No MS-Access software or licensing is required. Database files are created with Windows 7 Home Premium ODBC Administrator, and the tables, views, indexes, constrants created with ODBC/SQL commands from Perl. File reorganization is accomplished by the JetComp.exe standalone batch utiilty for index recovery. I have never had to use it since Perl ODBC never corrupts the *.MDB files that I have ever experienced on a Network.

    Thanks,

    Eric in Dallas, Texas

    I can think of a few reasons that folks might consider moving to sql server.

    - Permissions are a good one. You can enforce rules to a finer degree on an SQL server, such as constraints, foreign keys, etc. I'm thinking that with Jet, if you can write to the database file, you can redefine things, whereas SQL server has a really wide range of security objects.

    = edit: Jet 4.0 does have some more granular security mechanisms and constraints than implied by the above bullet item. I'm am wondering if security applies to DDL functions?

    - I seem to recall that folks move to SQL server from Jet databases often for concurrency reasons. I'm not saying that Jet can't handle locking for a degree of concurrency, etc, heck I've handled record locking with flippin assembly language LOL, but SQL Server has processes in place that can escalate and promote locks in degrees suitable for those big honking loads that you see at big sites. With Jet, I'm thinking you get the winapi locks AND THATS ALL YOU GET, and you'd better darn well like'em 🙂

    - Deadlock detection, have you run into any situations that SQL server would have helped here?

    Interesting idea though, it would be interesting to run some comparisons. I'm a bit more partial to strawberry perl, but you've got my curiousity up and I might have a look at what you're doing here. Do you think you'd get the same results running from .net code, or do you think your results are particular to activestate's perl?

  • P.S. SQL Server's performance monitoring tools are also useful for our own in-house APPs. We do indeed discover improvements that we can make that we didn't even know needed making!! (by whatever means, including Wet-finger-in-air, that we have used over the years)

  • Kristen-173977 (9/11/2015)


    P.S. SQL Server's performance monitoring tools are also useful for our own in-house APPs. We do indeed discover improvements that we can make that we didn't even know needed making!! (by whatever means, including Wet-finger-in-air, that we have used over the years)

    Another good one! I'm going to hazard a guess that any monitoring with Jet is going to be homegrown, if even that.

    Also, I'm thinking the Jet database engine isn't going to have a horribly sophisticated query planner, with stats, parallelism, etc.

  • How do you do backup/restore/recovery? Does everyone have to be off? Have you actually had users coming in across a network to get to this beast or are the results you have just from the one-box simulation you ran?

    Not interested in trying to convince you to do anything differently, just curious.


    And then again, I might be wrong ...
    David Webb

  • I think it is important to ask questions such as whether it is more prudent to use:

    MS-Access, MS-Jet Engine(w/o MS-Access), SQL Server Express, or SQL Server

    for any particular database system design.

    At some point many of us will have to backup our decision to use any of these with facts.

    It may be costly to implement Microsoft software and end-user licensing for an entire Enterprise

    or Department, or even a small business. MS-Jet databases are FREE/NO COST, with no need for

    end-user licensing. Everything comes factory installed on WIndows 7, except for Win32 Perl.

    You probably already have a favorite software development language you would prefer to use.

    I use a FREE older version of ActiveState Win32 Perl from 2002 (and compatible compiler).

    You SHOULD be able to use a programming language of your choice, but I have heard that Visual Basic

    ODBC interface is not reliable??? Perhaps that is why MS-Access software regularly corrrupts *.MDB files?

    You might use OLEDB, etc.

    I like the feedback starting to be generated talking about the pros and cons, which was my original,

    then repeated request. Thanks!

    I can only provide PROS for MS Jet, showing some things I know can be done with it, either automatically or by manual programming. I have not used MS-SQL Server, but I have used other Server databases such as Oracle and Informix, but I was never the ADMIN. I can be my own ADMIN with MS-Jet. Below are a few things you all have asked about, and a couple things I am throwing in.

    Some MS Jet Engine 4.x PROS:

    ===================

    Update Statistics/Query Analyzer/Database Tuning:

    ----------------------------------------------------------

    When you run the the JetComp.EXE compress/repair utility on a *.MDB Jet 4.x file, the file is re-optimized for query speed. This may be important for READ/WRITE databases to periodically run, but not required for READONLY reporting databases. MS-Jet 4.x Reporting Only databases can likely be upwards of several Terabytes and 10's of Billions of Rows. But for PRACTICALITY purposes, it may be TOO MUCH administrative work to run JetComp.EXE regularly on a READ/WRITE database over 1 Terabyte/5 Billion Rows (That's 500+ *.MDB files).

    JetComp.EXE can be run in GUI or BATCH mode. in BATCH mode, you could run the utility overnight during non-business hours on all your 500 *.MDB files.

    BETTER YET !!! Do all of the INSERTS into a single *.MDB file, and mark rows as DELETED (delete flag column set to Y or N) in the 500 *.MDB file database instead of actually using the DELETE SQL statement. That way, you won't have to perform daily or weekly re-optimization of the 500 *.MDB files, but only on the single(1) *.MDB file receiving the INSERTS. Then, monthly or quarterly, run over the weekend a batch process that moves the NEW rows from the single(1) *.MDB file and inserts them into the appropriate (i.e. NAMING CONVENTION) *.MDB files in the set of 500. Then run a DELETE batch routine the removes the rows marked for deletion, then run a batch routine that re-optimizes the 500 *.MDB files calling JetComp.EXE utility. Running JetComp.exe also physically re-orders all rows by the primary key index (clustered indexing, I think the term is), and recovers the space from deleted records.

    500 *.MDB FILES ?????

    ---------------------------

    Yes, each acting as a partial table and not as a database in and of themselves. Some of these may be complete tables such as an ADMIN table used to hold user permissions/access levels, etc.

    How can this be done for instantaneous random access indexing?

    Example: 2010 U.S. Census Database file: US_Census_2010_TX_A.mdb

    This single file would store the census information only for individuals of the State of Texas whose LastName

    begins with "A". The Win32 Perl user-interface would dynamically build the SQL statement for opening the

    correct database file system FILE based upon ""TX" and "A".

    There are many possible ways to do this type of data segregation. This is just one example.

    In this example, you might have 50 subdirectories, one for each State.

    You would thus have 26 *.MDB files in each State subdirectory (e.g. *_TX_A.mdb through *_TX_Z.mdb).

    That is 50*26=1,320 *.MDB files each capable of holding 10 million rows (which includes a MEMO column).

    2 GIG x 1,320 = 2,640 Gigabytes or 2.6 Terabytes. 2 GIG storage per each *.MDB file.

    Client/Server vs. Server:

    ----------------------------

    I install my compiled(EXE) Win32 Perl database user-interface software on each PC on the Network.

    But I install the ODBC FILE DSN onto the Network where the database is instead of on each individual PC so

    that it cannot be jacked with. I think placing the database interface software on each PC instead of a single instance on the Network that each user launches prevents the Network Server from being overloaded (memory and CPU usage)? And if I am not mistaken, only one(1) Jet Engine would be used instead of 100s. Perhaps using the Jet Engines on each PC instead of the single Jet Engine on the Network Server is more efficient use of the Jet Engine architecture? Any ideas?

    Referential Integrity:

    ------------------------

    Can be enforced with MS-Jet 4.x by either manually performing cascading deletes and updates (parent-child relationships), or you can use MS-Access 2007 SQL Syntax (via ODBC) to place Constraints on tables in your database after you have created them with ODBC Administrator utility (Windows 7).

    Also, ODBC for MS-Jet (Access Driver), you can COMMIT and ROLLBACK transactions (Insert/Delete/Update), placing blocks of maintenance events within this commit/rollback logic so that either ALL or NONE of the maintenance events are performed.

    Reporting/Statistical Reporting:

    -----------------------------------

    I use COM Automation from within Win32 Perl to connect to MS-Excel in the background and build reports and statistical reports from my MS-Jet database. I make Report Selection Criteria Screens for my end-users to select from the database their own custom criteria for their reporting purposes, which is sent directly to formatted MS-Excel spreadsheets for their review/printing.

    Compression/Encryption:

    ----------------------------

    There is an Extended Ansi92 SQL syntax switch/property to set in your MS-Access ODBC Driver FILE DSN (I prefer FILE DSNs). That switch needs to be set ON if you wish to have your *.MDB file system TEXTUAL data columns compressed (COMP or COMPRESS keyword recognized in your SQL Create Table statement).

    Encryption can be turned on by a switch/property from within ODBC Administrator when you create your empty database files *.MDB.

    I also use MIMEbase64 Encryption (manually from Perl) to put further security on my data, so that the data is non-readable to humans from SQL queries. I convert the MIMEbase64 encryption back to readable TEXT on the fly from within the Win32 Perl database user-interface.

    [WARNING: if your rename an *.MDB file to *.RTF extension, it will open in Wordpad.exe for viewing. However, if you utilize the MS-Jet 4.x feature of Encryption, the data will not be readable form to humans. ]

  • David Webb, no everyone does not have to be off because ODBC database connections are not persistent.

    And although Win32 Perl ODBC module (by David Roth) allows UNLIMITED OPEN CONNECTIONS, I see no reason to leave them open, as they can quickly be reopened when necessary.

    Note: one thing I have not been able to find documentation for for Perl Win32 ODBC is how to use SQL dynasets, and the ability to use scrolling cursors. Using something like OLEDB or other database connection might provide for this ability with MS-Jet databases? I find I don't have to have this functionality, and use other means for the same or similar end results.

    They are only opened long enough to retrieve data (one or more rows), or insert/delete/update a single parent row (and any child rows, if applicable). Batch Update Operations would take place Server Side by Operations Staff during non-business hours. But for locking down the database for restore during business hours...

    I would design an ADMIN table (*.MDB file) that had a LOCKOUT switch that could be turned ON or OFF, which the database interface software (e.g. Win32 Perl/ODBC) checks each time a user makes a request to update the database or run a query. If the switch is ON, user(s) could not perform any database operations at all. They would only be able to navigate the Menus on the user-interface. A system-wide message would pop up on their display informing them the database is in current lockout until such-n-such time.

    I would keep daily and monthly backups of the entire database on inexpensive thumb drives

    perhaps. I would also have the Perl interface to the database write out each successful Insert/Update/Delete statement to a Restore Log file daily. It would be designed such that the RESTORE POINT application utility you design could bypass certain SQL updates in those SQL log files.

    Restore Log (Fixed-Length Record Flat File) records might contain:

    Date Time User PC_Node_ID SQL_Statement

    You could bypass updates during a certain time period, by a certain User, by multiple Users updating from the same PC NodeID, etc.

    Yes this is a manual programming process you need to initially develop for your MS-Jet database system to use, but which can then be cloned to other database applications.

    I HAVE not implemented the restore logs yet for anyone, but I have used MS-Jet databases with Win32 Perl/ODBC interfacing at several Companies successfully on Networks. Although they were not databases of the size extremes I'm discussing (1 Terabyte/5 Billion Rows).

    The reason I did this at one company was because their MS-Access Forms-Based application was causing the MDB file to become corrupted when multiple users accessed the database concurrently. Removing the MS-Access software from the equation and going to strictly a Win32 Perl/ODBC/GUI user-interface resolved the concurrency/data corruption problem.

  • erichansen1836 (9/11/2015)


    I think it is important to ask questions such as whether it is more prudent to use:

    MS-Access, MS-Jet Engine(w/o MS-Access), SQL Server Express, or SQL Server for any particular database system design. ...

    I think my main concern would be maintainability. YOU understand what you've done, and it seems to work well for your organization. I hate to use the Bus example, but if you're hit by a bus tomorrow, will someone be able to come in and maintain the system? How upgradable is it for someone who is not you to do it when new information has to be accommodated?

    I know that if I were hit by a bus tonight that someone could be found to come in and continue my project because of the tools I've written it in. I also know that I could walk in to most SQL Server installations (excluding VLDB) and very quickly confirm that they are or are not running clean (tested backups, clean DBCCs, etc.) Are there people who could walk in to your shop and do the same?

    With a SQL Server shop, one phone call could get Brent Ozar or whoever out tomorrow because it is SQL Server and a very well known product. As the SQL Server Central newsletter says "A community of more than 1,600,000 database professionals and growing". (I'd love to see an updated number of how many have posted something in the last year)

    To me, that's the advantage of using a well-designed database in SQL Server as your data repository along with .Net as your front end. They're incredibly well-known systems and it's usually easy to get new developers and database administrators and in an emergency to get help (maybe I'm being optimistic). It is not, in any stretch of the imagination, inexpensive (unless you're using Express). It is, however, very reliable when used properly.

    My $0.002 worth, spread over 25+ years in IT it isn't all that much. 😛

    (I personally hate the Bus example as it's been used against me when I wanted to improve the system at a job with a drop-in industry standard replacement (switching from dBase III+ to FoxBase), but it is a valid argument.)

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

Viewing 15 posts - 16 through 30 (of 245 total)

You must be logged in to reply to this topic. Login to reply