SQLServerCentral Editorial

Graphing Performance

We have a lot of different database platforms to choose from when building software. Most of us reading this are SQL Server users, and likely relationally biased. However, key-value stores, document databases, graph databases, and more are out there. If you work with developers who embrace change and new options, you've likely been asked about implementing some sort of NoSQL database instead of SQL Server for some project. Maybe you've even been asked to migrate away from SQL Server to an open source (OSS) NoSQL platform, with the lack of software cost being a factor.

I do think that there are some problem domains that relational systems don't handle well. Certainly at large scale, whether in data volume or data rate, there are better ways to deal with some data sets in a less structured and less tightly coupled way. We see that in the large-scale web companies like Google, Twitter, Facebook, etc. If these companies had tried to build their entire systems on an RDBMS platform, they would have struggled to grow, and might not have reached the size they are today.

I've been reading about and playing with the new graph capabilities of SQL Server 2017, trying to determine what I think of the concepts. Certainly large-scale many-to-many relationships don't seem to be a strength of relational databases, and I've thought there are certain types of queries or data models that might be better handled by a graph database.

Then I ran across this report from a few researchers who examine how graph databases compare to relational ones. After all, we've grown accustomed to using RDBMSs in many environments and situations. What better way to evaluate the performance of a specialized database than to compare its performance in the problem domain it's designed to solve to that of a general database platform?

The results are a little surprising. Even with a sub-optimal query language, I would have expected the graph database to perform better. Instead, relational seems to handle the reference graph workload better. Raw performance isn't everything, though. Ease of development and the ability to scale are important, too. There may be other considerations in your system as well, but I did find this to be an interesting paper.

We will see how the world of specialized databases handles real-world workloads over time as more companies use them, but for now, I'd be skeptical of replacing an existing, working RDBMS with something unproven. I'd need to see a good proof of concept (POC) that shows quite a bit of improvement across a variety of metrics, not just scalability.
