The Challenges of Splitting a Table

,

I ran across a discussion on Reddit about splitting a table. In this case, the original post had to do with a vertical partition of data, which is a technique that can help you better manage data in your database. However, I haven't often seen this technique employed in the real world.

I wonder how many of you have considered a vertical partition when you are modeling data. Often we may not think about this early in the lifecycle of an entity, but as it grows, you might think about reducing the amount of data you often query in some way, and a vertical partition can help.

Is there some criteria that you might use in deciding this? Or how you can evaluate if there is a need? I once worked on a system with a very hot table, lots of queries, lots of updates against this table from our online system. In response to some requests, the developers wanted to add some columns to the table. This was important, and we needed to capture the data.

These were valid columns, but they were large in terms of data size, and not every one would always be used. This was before the option of sparse columns, so that wasn't an option. I had no interest in an EAV table, despite the fact that it might have worked well at this scale. Instead, this was a situation where I thought a vertical partition would work. In fact, I thought a few of the other columns in this entity could be moved as well, as they were rarely queried and contained significant data.

We split the table, and performance actually improved for the main table, as it had less data. Just like an index, we had more rows on every page and less IO for range queries, and even key lookups for data that wasn't already indexed.

There are lots of good techniques in database development for dealing with the challenges we face in data modelling and with performance. I'd urge you to learn about some of them and understand when they can be useful. I would also practice implementing them, making changes to existing tables, and learning how you can deploy them if the need arises.

Rate

Share

Share

Rate