Big Data

External Article

Why Would I Ever Need to Partition My Big ‘Raw’ Data?

  • Article

Whether you are running an RDBMS or a Big Data system, it is important to consider your data-partitioning strategy. As the volume of data grows, it becomes increasingly important to match the way you partition your data to the way it is queried, to allow 'pruning' optimisation. When you have huge imports of data to consider, it can get complicated. Bartosz explains how to get things right: not perfectly, but wisely.

2016-11-22

3,345 reads

External Article

How to Start Big Data with Apache Spark

  • Article

It is worth getting familiar with Apache Spark because it is a fast and general engine for large-scale data processing, and you can use your existing SQL skills to get going with analysis of the type and volume of semi-structured data that would be awkward for a relational database. With an IDE such as Databricks you can very quickly get hands-on experience with an interesting technology.

2016-11-18

3,131 reads

External Article

The End of Big Data

  • Article

What is next for big data? Some experts claim that data "volumes, velocity, variety and veracity" will only increase over time, requiring more data storage, faster machines and more sophisticated analysis tools. However, this is short-sighted, and does not take into account how data degrades over time. Analysis of historical data will always be with us, but generation of the most useful analyses will be done with data we already have. To adapt, most organizations must grow and mature their analytical environments. Lockwood Lyon shares the steps they must take to prepare for the transition.

2016-06-03

10,764 reads

External Article

Big Data Architecture

  • Article

The next few years will be critical for the information technology staff, as they attempt to integrate and manage multiple, diverse hardware and software platforms. In this article, Lockwood Lyon addresses how to meet this need, as users demand greater ability to analyze ever-growing mountains of data, and IT attempts to keep costs down.

2016-05-09

5,553 reads
