SQLServerCentral Editorial

Put Your Data in a Box


A few years ago I attended a keynote talk from one of the scientists who works on CERN's Large Hadron Collider (LHC). The talk was about data and how they deal with the large volume of data from experiments run on the machine. If you're wondering what large is, the scientist talked about peak experimental data being over 1PB/s. I don't know how much data you capture, but that's a lot.

In fact, it's so much that if they tried to analyze all that data, they'd never get around to running experiments. Instead, they depend on some pre-processing of data in sensor hardware, as well as some early aggregation, to get the data down to a manageable level. It's a good idea, and they have spent a lot of time and effort learning how to do this and still capture meaningful data for their work.

The capability to remotely process data before sending it on is coming to all of us in a pre-packaged container. The Azure Data Box Edge was announced this week by Microsoft. This is a cloud-managed data processing device with FPGAs that you can program. It can run on batteries and is ruggedized for the field. There are more docs at Microsoft on the specifics.

I don't know how many companies want this, but I suspect that some with remote or portable operations might think about it. Certainly if this can gather some data and then upload it when a connection is available, it might be a good fit for places that don't have good network connections and need a device that can handle some adverse conditions. I know sourcing and putting together a system for field offices is a pain. Off-the-shelf systems aren't always reliable, and they can be finicky to manage remotely.

Recently I hosted a webinar with Abel Wang, and he talked about AI and ML being technologies whose use will grow dramatically in the next ten years. Perhaps a prediction without merit, but it does seem that more and more companies and vendors are putting effort into finding ways to deploy and operate ML systems. The Data Box Edge has capabilities in that area, which could be useful if you want to process data locally and quickly.

This doesn't appear to run SQL Server, or at least that's not mentioned, but it does seem useful. If you doubt this, I know this is exactly what I would have wanted over a decade ago. I had to install systems in a warehouse to visually inspect some products. We used crude AI-ish systems that were set up to watch evaluations by humans. Eventually, the computer took over part of the job, with humans randomly verifying its results. Keeping that system running was a pain. A Data Box Edge would have been a much better choice, and I'm sure we would have purchased one.

There are likely plenty of customers that might be able to use this. It will be interesting to see if Microsoft can find them and sell many of these devices.