10 things never to do with a relational database

The data explosion demands new solutions, yet the hoary old RDBMS still rules. Here's where you really shouldn't use it.

I am a NoSQLer and a big data guy. That's a nice coincidence, because as you may have heard, data growth is out of control.

Old habits die hard. The relational DBMS still reigns supreme. But even if you're a dyed-in-the-wool, Oracle-loving, PL/SQL-slinging glutton for the medieval RAC, think twice, think many times, before using your beloved technology for the following tasks.


1. Search: Even the most dedicated Oracle shops tend not to use Oracle Text, the extension Oracle bought for its database but doesn't seem to develop very actively. Instead, you see a lot of people using complicated queries that are heavy on LIKE and OR operators. The results for these are ugly and the capabilities are weak -- and the processes for getting the data just the way Oracle needs it are tough. Outside of Oracle, many other RDBMS products don't have real search extensions.

Use the likes of Hibernate Search, Apache Solr, or even Autonomy. Do it for the performance of a better-fitting index. Do it for the capabilities of full-text search.
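To make "a better-fitting index" concrete, here is a minimal sketch of an inverted index, the kind of structure full-text engines are built around. It is an illustration only, not how Solr or Hibernate Search are implemented; the function names and the sample documents are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents, invented for illustration.
docs = {
    1: "Red wool socks",
    2: "Blue cotton socks",
    3: "Red cotton shirt",
}
index = build_index(docs)
print(search(index, "red socks"))
```

Each lookup touches only the posting sets for the query terms, whereas a chain of LIKE '%term%' predicates typically forces the database to scan every row. Real search engines add stemming, ranking, and phrase queries on top of this basic idea.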

2. Recommendations: This was the ugliest part of ATG Commerce and other commerce products I've worked with. They capture a lot of data about the user from which they try to make recommendations. Where I've worked, the recommendations capability was almost always turned off for scalability reasons.

Consider the social network. If I want to recommend socks to you because your friend or your friend's friend bought socks, that gets ugly in the RDBMS. We're talking self-joined tables and multiple levels of querying. This is like two lines of code in a graph database like Neo4j. You can work around the RDBMS by pre-flattening social networks and doing odd manipulations to the data, but you'll lose its real-time nature.
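The friend-of-a-friend idea can be sketched as a plain breadth-first traversal over an adjacency list, which is roughly the shape of query a graph database handles natively. This is not Neo4j code; the names `within_hops` and `recommend`, and all the sample people and purchases, are hypothetical.

```python
from collections import deque


def within_hops(friends, start, max_hops):
    """Breadth-first search: everyone reachable from start in at most max_hops steps."""
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                reached.add(friend)
                frontier.append((friend, depth + 1))
    return reached


def recommend(friends, purchases, user, max_hops=2):
    """Items bought by friends or friends of friends that the user has not bought."""
    own = purchases.get(user, set())
    recommended = set()
    for person in within_hops(friends, user, max_hops):
        recommended |= purchases.get(person, set()) - own
    return recommended


# Hypothetical social graph and purchase history, invented for illustration.
friends = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}
purchases = {
    "alice": {"hat"},
    "bob": {"hat"},
    "carol": {"socks"},
    "dave": {"scarf"},
}
print(recommend(friends, purchases, "alice"))
```

In SQL, each extra hop generally means another self-join on the friendship table. Here the hop limit is just a parameter, which is the flexibility the graph model buys you.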

3. High-frequency trading: You would think that trading systems would love the RDBMS because the data is at least in part transactional, right? Wrong. High-frequency traders were among the first people to adopt and, in some cases, create NoSQL approaches. Low latency is king for HFT. Sure, if you jump through serious hoops, you can achieve low latency with your RDBMS, but it really wasn't designed for it.

Oracle tried to answer this by buying TimesTen, which attempts to combine an in-memory database with an RDBMS, but if you staple a goose to a truck you don't get an airplane. Instead, we see the HFT crowd use key-value stores like Riak or more complex solutions like GemFire.

4. Product cataloguing: This isn't the exciting stuff you hear about, but one of the first nightmare SQL queries I ever wrote was for mapping product data. When I worked for a mobile telephone manufacturer, this was for cellphones -- except "Model XYZ" could mean several different actual phones, and any one of those went under different names in different markets. The same model could have completely different components. Managing these "classes" of devices didn't flatten very well. This was very much the kind of thing for which I could have used a graph database like Neo4j.

I had a very similar problem when I worked for a chemical company. We did some very dumb string mapping that was pretty laborious. Had we kept the product information in a graph database, it would have been simple to map. Even a document database like Couchbase 2.0 or MongoDB would have been nicer.
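As a rough sketch of why the phone catalog fits a graph, the example below models one model name that fans out to several physical variants, each with its own market names and components. The class, the relation labels (`HAS_VARIANT`, `SOLD_AS`, `USES`), and every product name are assumptions invented for illustration; a real graph database would supply the storage and query language.

```python
from collections import defaultdict


class CatalogGraph:
    """A tiny labeled directed graph: (source, relation) -> set of targets."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def targets(self, source, relation):
        # Return a copy so callers cannot mutate the graph by accident.
        return set(self.edges.get((source, relation), set()))


def market_names(graph, model):
    """All (market, name) pairs under which any variant of the model is sold."""
    names = set()
    for variant in graph.targets(model, "HAS_VARIANT"):
        names |= graph.targets(variant, "SOLD_AS")
    return names


def components_by_variant(graph, model):
    """Map each variant of the model to the set of components it uses."""
    return {
        variant: graph.targets(variant, "USES")
        for variant in graph.targets(model, "HAS_VARIANT")
    }


# Hypothetical catalog data, invented for illustration.
catalog = CatalogGraph()
catalog.add("Model XYZ", "HAS_VARIANT", "XYZ-a")
catalog.add("Model XYZ", "HAS_VARIANT", "XYZ-b")
catalog.add("XYZ-a", "SOLD_AS", ("EU", "Aurora"))
catalog.add("XYZ-a", "SOLD_AS", ("US", "Comet"))
catalog.add("XYZ-b", "SOLD_AS", ("JP", "Aurora"))
catalog.add("XYZ-a", "USES", "chipset-1")
catalog.add("XYZ-b", "USES", "chipset-2")

print(market_names(catalog, "Model XYZ"))
print(components_by_variant(catalog, "Model XYZ"))
```

Adding a new kind of relationship is one more edge label rather than another join table, which is the property that would have helped with the "classes" of devices described above.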
