Drop-In Class #6: Data Gravity
The bigger your data gets, the harder it is to move.
Welcome to my newsletter, which I call Drop-In Class because each edition is like a short, fun Peloton class for technology concepts. Except unlike a fitness instructor, I'm not an expert yet: I'm learning everything at the same time you are. Thanks for following along with me as I "learn in public"!
Whew! I’ve spent the past few editions on LLMs. It was fun, but it was hard. It felt like marathon training. I feel like a break, don’t you? Today we’re going to slow things down and do some yoga.
Which is my metaphorical way of saying: Let’s talk about data gravity!
Data gravity is more of a theoretical concept. I bring it up a lot in conversations about the data stack, because it explains big trends happening in the data ecosystem. For instance:
Why did the data warehouse / lakehouse become so central to the data stack? Data gravity.
Why are people trying to consolidate all of their data activities into one place (and new solutions like Microsoft Fabric are packaging themselves as that place)? Data gravity.
Why do so many organizations have a hard time migrating from legacy servers to a modern cloud setup? Data gravity.
So let’s go to outer space and see what’s going on with data gravity! Cue this week’s music selection.
The law of data gravity, featuring my gross oversimplification of science
As you've seen stated over and over, the amount of data is growing. Blah blah, we know that, Alex! But it's important to mention! It SETS THE STAGE.
The amount of data is growing, and the bigger it grows, the more it pulls everything else into its orbit. It's like a planet whose pull gets stronger and stronger. Your company might start with a Pluto-sized amount of data, but over time it turns into Jupiter.
Newton's law of gravity goes something like this: an object's gravitational pull is proportional to its mass.
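For the curious, here is the slightly less oversimplified version. This is standard physics, not something from the data gravity literature: the pull between two objects grows with both of their masses and shrinks with the square of the distance between them.

```latex
F = G \frac{m_1 m_2}{r^2}
```

Here F is the force of attraction, G is the gravitational constant, m_1 and m_2 are the two masses, and r is the distance between their centers. For our purposes the takeaway is the same: more mass, more pull.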
So if we apply it to data…
The bigger your data gets, the more data and applications it attracts.
As your data grows into a Jupiter, it builds mass, as researcher Dave McCrory put it when he coined the term “data gravity.” And the more mass it builds, the more data and applications it sucks in.
Why do we care that this happens? Because once you have a lot of data somewhere, it’s really hard to move.
This is why data-rich systems, like databases and data warehouses, are becoming the center of the universe. Once the data is in there, it’s in there.
So what do you do about data gravity?
Take your processing, and put it where the data is!
Instead of moving your data around, which is the heaviest object, you should move all the other stuff: the applications and services. Like moving the stars to Jupiter, instead of trying to move Jupiter to each of the stars.
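To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption I made up for illustration: a 1 petabyte dataset, a 500 megabyte application, and a 1 gigabit-per-second link running at full speed with zero overhead. Real transfers are messier, so treat this as a feel for the scale, not a benchmark.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealized time to push size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


PETABYTE = 10**15            # bytes (decimal units)
MEGABYTE = 10**6             # bytes
GIGABIT_PER_SECOND = 10**9   # bits per second
SECONDS_PER_DAY = 86_400

# Hypothetical sizes, chosen only for illustration.
dataset_bytes = 1 * PETABYTE   # "Jupiter": the data
app_bytes = 500 * MEGABYTE     # one "star": a single application

data_days = transfer_seconds(dataset_bytes, GIGABIT_PER_SECOND) / SECONDS_PER_DAY
app_seconds = transfer_seconds(app_bytes, GIGABIT_PER_SECOND)

print(f"Moving the data: about {data_days:.1f} days")
print(f"Moving the app:  about {app_seconds:.1f} seconds")
```

Under those assumptions, the data takes roughly three months to move and the application takes a few seconds. That gap is the whole argument for bringing the processing to the data.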
This isn’t a big deal if you’re already in the cloud or you’re good with your existing data setup. You can keep your data there, and bring everything else to the data. A good data architect or engineer makes sure that:
Your data storage works well with applications like analytics and business intelligence tools, so you don’t have to constantly move the data to those other apps.
Your data storage can scale up as your data keeps expanding (much like the expanding universe! Scary! Don’t think about it too much).
As with anything in data, this is all easier said than done. And if you have all your data stuck in a legacy server and you need to migrate it to the cloud? Ooof. You’re moving Jupiter. It’s tough!
That’s why so many of us are still at companies using legacy data stacks, and why everyone needs to appreciate their data engineers. They’re the unsung heroes out there. If the data world were a dramatic space movie like Interstellar or Alien, they’d be the ones who have to come up with the heroic plan that’s “so crazy, it just might work.”
Sources for extra reading:
Because this is a 101-level drop-in, I oversimplify everything! Here are some useful articles to dive deeper.
As data gravity goes up, are clouds becoming black holes? (Security Intelligence)
Data gravity: What is it and how to manage it (ComputerWeekly)
See you in the next one!
-Alex





