GE and Pivotal Want You to Jump Into Their Data Lake

Come on in, the data’s fine.

Monkey Business Images / Shutterstock

Last year, when industrial giant General Electric made a $105 million investment in the EMC-owned software outfit Pivotal, it puzzled a lot of people. GE, the maker of jet engines, generators and railroad locomotives, didn’t quite fit the traditional mold of a player in tech circles.

This is the same GE that spent $1 billion to create a tech hub in San Ramon, Calif., only an hour away from the heart of Silicon Valley. And for the last several years it has been on a persistent push to become as well known for its expertise in handling and analyzing the constant flow of operational data as it is for the machines that generate it.

Today, GE hit a milestone in that campaign. Teaming with Pivotal, it announced that it had created what it calls a “data lake.” If you’re familiar with the concept of “big data,” then just think of a data lake as “bigger data.”

Here’s how it works: Let’s say you’re gathering data about the performance of engines on a fleet of jets — fuel consumption, performance, response time, operating temperatures. A common figure cited says all the sensors on GE-made jet engines and elsewhere in the plane will generate about 14 gigabytes of data on an average commercial flight.

Each bit of that data is assigned a unique ID number and then poured into a massive storage trove. It’s not put in folders or individual directories; it all resides in its raw format in one big “lake” of data. When you’re ready to conduct your analysis, each bit of raw data is in there waiting for you to scoop it up with other bits of data that fit whatever question you want to answer. GE did exactly this in a trial last year with data gathered from 15,000 flights on 25 airlines.
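To make the idea concrete, here is a toy sketch in Python of the "pour it in, scoop it out" pattern described above. It is not GE's or Pivotal's implementation: the `DataLake` class, its `pour` and `scoop` methods, and the sensor names and readings are all hypothetical, and a real lake would sit on distributed storage rather than an in-memory dictionary.

```python
import json
import uuid


class DataLake:
    """Toy in-memory 'lake': raw records keyed by a unique ID, no folders or schema."""

    def __init__(self):
        self._store = {}  # record ID -> raw bytes, exactly as received

    def pour(self, raw: bytes) -> str:
        """Store a raw record untouched and return the unique ID assigned to it."""
        record_id = str(uuid.uuid4())
        self._store[record_id] = raw
        return record_id

    def scoop(self, predicate):
        """Yield (ID, parsed record) for each raw record that fits the question asked."""
        for record_id, raw in self._store.items():
            record = json.loads(raw)  # parsing happens only at analysis time
            if predicate(record):
                yield record_id, record


lake = DataLake()
# Hypothetical sensor readings; the field names are illustrative only.
lake.pour(b'{"flight": "A1", "sensor": "egt_c", "value": 612.0}')
lake.pour(b'{"flight": "A1", "sensor": "fuel_kg_h", "value": 2450.0}')
lake.pour(b'{"flight": "B7", "sensor": "egt_c", "value": 655.5}')

# The question is decided at analysis time: which flights ran hot?
hot = [rec for _, rec in lake.scoop(lambda r: r["sensor"] == "egt_c" and r["value"] > 640)]
```

The point of the sketch is that nothing about the eventual question is decided when the data is stored; the filter is supplied only when the analysis runs.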

There are two tech forces colliding here. One is Hadoop, the open source data analytics platform that you hear so much about these days. Pivotal has built part of its platform-as-a-service business around it.

The second is Predix, GE’s internally built platform for connecting different kinds of industrial equipment. The platform is already in use across GE. And last month, GE CTO Mark Little talked about how he’d like to extend it to third parties across several industries.

The data lake approach streamlines a bunch of fundamental steps in the analytics process. Typically, when you want to perform these analytics actions, you have to spend a lot of time, effort and money on getting the data into the right format. Here it’s left in its original, pure format.
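Leaving data in its original format and interpreting it only when a question is asked is often called schema-on-read. The Python sketch below illustrates that idea under stated assumptions: the two raw records, their formats and the `read_fuel` helper are invented for illustration and do not come from GE or Pivotal.

```python
import csv
import io
import json

# Two raw records as they might arrive from different systems (hypothetical data).
RAW_RECORDS = [
    '{"flight": "A1", "fuel_kg": 5200}',  # JSON from one source
    'flight,fuel_kg\nB7,6100',            # CSV with a header row from another
]


def read_fuel(raw: str) -> dict:
    """Interpret a raw record at analysis time instead of converting it up front."""
    text = raw.lstrip()
    if text.startswith("{"):
        rec = json.loads(text)
    else:
        rec = next(csv.DictReader(io.StringIO(text)))
    return {"flight": rec["flight"], "fuel_kg": float(rec["fuel_kg"])}


fuel_by_flight = {r["flight"]: r["fuel_kg"] for r in map(read_fuel, RAW_RECORDS)}
```

The cost of understanding each format does not disappear. It moves from the moment of storage to the moment of analysis, and is paid only for the data a given question actually touches.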

The point of all this is, naturally, to learn more about whatever complex system you’re operating — jet engines, factories, oil platforms — so that it can perform better, faster, more efficiently and at lower cost.

In its trial with the planes, GE says it cut the amount of time required to do its analysis from months to days. One airline using the approach shaved its fuel costs by one percent. That may not sound like much, but when you consider that the major U.S. air carriers spent more than $46 billion on jet fuel last year, a savings of one percent turns out to be real money. By next year, GE says it hopes to be collecting data on 10 million flights a year.
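The figures quoted in this article can be checked with a few lines of Python. This is back-of-the-envelope arithmetic, not a forecast: it assumes the one-percent saving applied across all major U.S. carriers, that every flight produced the 14 gigabytes cited earlier, and decimal units (1 petabyte = 1,000,000 gigabytes).

```python
# Figures quoted in the article.
US_JET_FUEL_SPEND_USD = 46_000_000_000  # "more than $46 billion", used as a floor
SAVINGS_PERCENT = 1
GB_PER_FLIGHT = 14
FLIGHTS_PER_YEAR = 10_000_000

# Assumption: the one-percent saving is applied industry-wide.
savings_usd = US_JET_FUEL_SPEND_USD * SAVINGS_PERCENT // 100

# Assumption: every flight generates the full 14 GB; 1 PB = 1,000,000 GB.
yearly_data_gb = GB_PER_FLIGHT * FLIGHTS_PER_YEAR
yearly_data_pb = yearly_data_gb // 1_000_000

print(f"Fuel savings at 1%: ${savings_usd:,}")
print(f"Data per year: {yearly_data_pb} PB")
```

Under those assumptions, one percent comes to at least $460 million a year, and 10 million flights would add roughly 140 petabytes of raw data a year.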

And it doesn’t stop with airlines. Other industries could use the data-lake approach to sniff out new efficiencies of their own. Expect to hear more about data lakes in the coming months.

This article originally appeared on Recode.net.
