HBase Deprecation at Pinterest


111 points by cloudsql on 2024-05-14 | 58 comments

Automated Summary

Pinterest is deprecating HBase, its first NoSQL datastore introduced in 2013, due to several reasons. These include high maintenance cost, missing functionalities, high system complexity, high infrastructure cost, and waning industry usage and community support. The high maintenance cost was mainly due to years of tech debt, difficulty in finding HBase domain experts, and a slow and painful upgrade process. Missing functionalities like stronger consistency, distributed transactions, and global secondary index led to the building of new services on top of HBase, increasing maintenance load. High infrastructure cost was due to the primary-standby setup with six data replicas for fast disaster recovery. The path to a complete deprecation includes migrating online analytics workloads to Druid/StarRocks, time series data to Goku, and key value use cases to KVStore. The remaining HBase use cases will be accommodated by TiDB, a distributed NewSQL database.


pjmlp on 2024-05-15

As usual, with any technology adoption wave, remember these two key historical moments:

> Introduced in 2013, HBase was Pinterest’s first NoSQL datastore. Along with the rising popularity of NoSQL, HBase quickly became one of the most widely used storage backends at Pinterest.

Followed by,

> For the past few years, we have seen a seemingly steady decline in HBase usage and community activity in the industry, as many peer companies were looking for better alternatives to replace HBase in their production environments. This in turn has led to a shrinking talent pool, higher barrier to entry, and lower incentive for new engineers to become a subject matter expert of HBase.

Let's see how TiDB holds up in the same timeframe, although being boring SQL might keep it around.

rwultsch on 2024-05-15

“Introduced in 2013, HBase was Pinterest’s first NoSQL datastore.”

I don’t think this is correct. When I started in late 2013, Redis was being used as a persistent data store. And what a pain it was. I convinced leadership in late 2014 that this was a bad idea, and they had me keep it alive until it was replaced by MySQL in mid-2015.

HBase was nothing but pain at Facebook where it was supposed to replace MySQL and then Pinterest where… I think there was hope it would replace MySQL. Once I automated MySQL at Pinterest I think it wasn’t so bad, particularly given the absurdly limited staff they gave the problem.

softwaredoug on 2024-05-15

I had a (low traffic) app with Redis as the only data store, that we'd periodically dump to disk and copy the dump off for backup. It was... not great for anyone who was used to maintaining a normal web app. And it broke down because the underlying relationships we mostly cared about were, in fact, relational. So you got a lot of duplication / nesting, homegrown hierarchy, or pointers to other areas in Redis...

dehrmann on 2024-05-15

I worked for a place that used HBase heavily. They migrated from AWS to GCP ~just for BigTable since the data model is essentially the same. They're a Java shop, and the drivers are actually the same. The workload of managing HBase and HDFS was high, and it was unreliable enough that they always had a failover cluster set up. Interestingly, the migration surfaced degenerate cells/tables that might have been partially to blame for reliability issues. Even 5 years ago, HBase was slowly dying, and at the end of the day, they didn't want the company's core competency to be HBase management.

redwood on 2024-05-15

Shocking to see a company like Pinterest has built multiple in-house data stores. Talk about a maintenance burden...

zenbowman on 2024-05-15

This was the norm about a decade ago. When I was at Hulu, we built our own analytics platform on top of Hadoop; we almost certainly wouldn't have done that today with the amount of off-the-shelf stuff available.

Even things like protobuf or Avro weren't as broadly adopted (>80%) at the time, many many companies at the time persisted stuff using JSON or other text formats (which in retrospect was very dumb, but it was very normal for a while).

zerkten on 2024-05-15

If you rewind 10 years, to a time before the waves of big tech layoffs when the market was very much in engineers' favor, you'll find your answer. The explosive growth in hiring coupled with FOMO-driven risk taking resulted in many projects which were essentially resumeware for engineers. This environment permitted people to escape strategic thinking and any consequences.

Imagine having to deal with multiple data stores for your daily development but then go on-call for a week and have to deal with twice as many plus the inevitable lack of runbooks etc.? I personally have a big tech experience with the same data store proliferation in an organization with pretty solid RDBMS use. In the last 3-4 years we've been undoing the damage and migrating data. Toil for engineers is lower and uptimes are better.

llm_trw on 2024-05-15

At the time there were _no_ tools that scaled to what people needed so they had to build their own.

It's easy to look back and think we were idiots for doing it that way, but you need to remember that the average server in 2014 had the power of the average smartphone today.

vkazanov on 2024-05-15

My company's data function, established 14 years ago, has a bunch of custom stores, as well as data-related tools. Back then data use cases were not really well-known, we had to build things from scratch a lot.

It is a very different landscape these days!

lopkeny12ko on 2024-05-15

This is my first reaction as well. How in the world do you end up with close to 10 different databases in production? And furthermore, most of which are totally proprietary? Just use Postgres...

aeyes on 2024-05-15

Easy: These companies have a promotion process which requires you to generate "impact" to get promoted. Using existing but boring technology which solves the problem in an efficient manner apparently doesn't show off how smart you are.

This is not a snarky comment from an outsider, I very much work at such a company. It's ridiculous.

Edit: But I must admit that most of this was probably developed 10 years ago when the ecosystem was much more limited. At least they now came to the conclusion that it's time to move on. I would not have recommended Postgres 10 years ago and even today there is no way it would work at the scale of Pinterest.

ted_dunning on 2024-05-15

HBase is coming up on twenty years old.

In 2008, scaling was really hard. And the senior engineers at that point had been badly burned by trying to scale during the 2000-2005 period.

The comment about servers of the time being less powerful than smart phones of today is spot on.

openplatypus on 2024-05-15

So sad because so true.

Seeing working technology replaced because someone doesn't like it and wants accolades for replacing it, only to face the same limitations as the old tech, is just depressing.

aeyes on 2024-05-15

You forgot the next step: The people who developed it move on to other companies and nobody wants to touch it.

doctorpangloss on 2024-05-15

Your application wants to use Postgres.

Kubernetes, which it runs on, wants etcd.

Keycloak wants to use Postgres and Redis for sessions.

Plausible Analytics wants to use Clickhouse and Postgres. Matomo specifically wants to use MySQL.

Lago Billing wants to use Postgres, Redis and Clickhouse.

Prometheus uses TSDB, but okay, it’s “just” files and a WAL. Wait a minute, so is Postgres…

Jaeger wants to use Cassandra or Elastic.

Thanos wants S3 APIs. So maybe you use Minio.

Okay. That’s 2024: in production, you will wind up with 7 different databases. At least.

llm_trw on 2024-05-15

>Just use Postgres

Now try running Postgres on vintage hardware from the period and you quickly see why we did what we did.

lopkeny12ko on 2024-05-16

You would be surprised at how far a well-tuned and optimized Postgres instance will get you.

I've worked with so many teams who chase the newest and shiniest database, deploying hundreds of nodes, and spending hundreds of man-hours staffing a sysadmin team to maintain it, only to eventually replace it with an old fashioned Postgres box. You just need an experienced DBA.

Edit to add: many claims of "Postgres doesn't work at our scale" are frankly BS, and in large part due to an inexperienced ops team. Take a look at the top post on frontpage right now, for example: https://news.ycombinator.com/item?id=40372296

llm_trw on 2024-05-16

> You would be surprised at how far a well-tuned and optimized Postgres instance will get you.

I wouldn't be because I was hired after a team of very experienced DBAs saw their databases melt when trying to run analytics queries on the production database in 2012 for a telco.

We then moved to HBase so analysts could figure out which parts of the network needed extra towers installed for peak demand, at 15-second intervals. You can do this today because the database sits on a machine that is between 10 and 100 times as capable as what we had back then, and the number of people using cellphones hasn't increased substantially.

Again, don't look at this technology by what you can do today, but by what you could do back then.

dehrmann on 2024-05-15

Not sure if I can get you to 10, but different DBs work at different scales for different workloads. There's relational, key-value, caching, time series, data warehouse, and search index.

cortesoft on 2024-05-15

They are operating at a scale where you can’t “just use Postgres”

redwood on 2024-05-15

Great to see TiDB mentioned for use here, I have to say I love their demo at ossinsights. Anyone else using them successfully in prod?

std_reply on 2024-05-15

AirBnB, Databricks, Flipkart, 3 of the largest banks in the world, some of the largest logistics companies in the world; at least 2K seriously large installations.

redwood on 2024-05-17

Very cool. So many upstart SQL players: TiDB, PlanetScale, Crunchy, Cockroach, Neon, Supabase, Nile, Yugabyte, Aiven, let alone the cloud provider options (AlloyDB, Spanner, Aurora, Cosmos). Kind of mind boggling there's room for all these players

ddorian43 on 2024-05-15

It reminds me of the common optimization story (how we got 2x+ faster by no longer doing very inefficient things), in this case going from 6 to 3 replicas.

Example: TiDB at a certain time didn't write rows clustered by the primary key on disk (they had a separate index). This is very costly in distributed setups (less costly on single-node setups like PostgreSQL).

There are many such cases in many dbs. Another point lacking in most dbs is the "lsm compaction overhead" you need to do for all replicas when you're not using shared distributed storage.

This optimization can be seen in Quickwit (building/compacting an inverted index is even more expensive than LSM compaction).
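The cost difference between a clustered primary key and a separate primary-key index can be sketched as a toy model: with a clustered layout a point lookup is one fetch, while a non-clustered layout needs an index lookup plus a second fetch, which in a distributed store is often a second network round trip. The structures below are hypothetical illustrations, not TiDB's actual storage code:

```python
# Toy model of point-lookup cost: clustered vs. non-clustered primary key.
# All structures here are illustrative dicts, not a real storage engine.

# Clustered layout: the row is stored with its key -- one lookup.
clustered = {
    1: {"id": 1, "name": "alice"},
    2: {"id": 2, "name": "bob"},
}

# Non-clustered layout: a separate index maps the key to an internal
# row handle, and the row lives under that handle -- two lookups.
pk_index = {1: "rowid-17", 2: "rowid-42"}
heap = {
    "rowid-17": {"id": 1, "name": "alice"},
    "rowid-42": {"id": 2, "name": "bob"},
}

def lookup_clustered(key):
    return clustered[key], 1          # (row, lookups performed)

def lookup_nonclustered(key):
    handle = pk_index[key]            # lookup 1: index
    return heap[handle], 2            # lookup 2: row fetch

row, hops = lookup_nonclustered(1)
assert row["name"] == "alice" and hops == 2
# On one node both lookups are cheap; in a distributed setup the second
# lookup is often another network hop -- hence the extra cost.
```

On a single-node database like PostgreSQL the extra lookup is a local page read; in a distributed database it can be a cross-node round trip, which is the cost ddorian43 is pointing at.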

winrid on 2024-05-15

Wow, never realized Pinterest had 6 petabytes of data. I wonder if they're including images in that. Even billions of rows are usually only around 1-2 TB, so it makes you wonder what they're storing many billions of.

evanelias on 2024-05-15

They almost certainly have trillions of rows of data. The popular social media / user-generated content sites are just huge. You can easily get to that size just from the core product OLTP data/metadata, no need to include media.

For one comparison, Tumblr hit 100 billion unique rows of relational data in MySQL (on masters, not including replicas) back in October 2012. So they're easily in the trillions of rows today, and Tumblr is smaller than Pinterest!

winrid on 2024-05-16

Just surprising since I've never seen the social side or anyone using it. Guess I'm just not in that social bubble.

evanelias on 2024-05-16

Pinterest has over 500 million monthly active users. Their revenue was $3 billion in 2023.

Delmololo on 2024-05-16

Women use it a lot.

My wife has a Pinterest board.

ddorian43 on 2024-05-15

It makes no sense at their scale to store images in HBase. It's probably just user tracking/analytics.

badpun on 2024-05-15

In NoSQL world, you often store multiple duplicates of the same data, each of which is optimized for a different use case.

dehrmann on 2024-05-15

The SQL world calls this an "index."

winrid on 2024-05-15

No. Most NoSQL dbs support indexes and secondary indexes too.

In SQL world we call this "denormalization". It's called the same thing in NoSQL, too.

dehrmann on 2024-05-16

Indices are essentially managed denormalization.

winrid on 2024-05-16

Ah, never thought of it that way.

llm_trw on 2024-05-16

It's called materialized views.

winrid on 2024-05-16

Well, yes, and no. :)

badpun on 2024-05-15

Indices are no good if you constantly do large amounts of inserts or updates (say, the entire clickstream of a popular global site) - keeping them up to date will massively slow down the inserts/updates. Whereas, in NoSQL, you can do inserts as quickly as your disk can write data, and still query the dataset as if it had an index.
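The write pattern described here is a property of LSM-tree storage engines (which HBase uses) rather than of NoSQL per se: writes go to an in-memory table and are flushed as immutable sorted runs, so ingest is bounded by sequential write throughput instead of random B-tree page updates. A toy sketch of the idea, not HBase's actual implementation:

```python
import bisect

# Toy LSM store: puts are appends to a memtable, periodically flushed
# as sorted runs; gets check the memtable, then each run newest-first.
# Illustrative only -- real engines also have a WAL and compaction.
class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}        # recent writes, in memory
        self.runs = []            # flushed immutable sorted runs
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # Flush: one sequential write of a sorted run to "disk".
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest run wins; binary-search each sorted run.
        for run in reversed(self.runs):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None

db = ToyLSM()
for i in range(10):
    db.put(i, f"event-{i}")
assert db.get(3) == "event-3"
assert db.get(9) == "event-9"
```

Because every run is sorted, reads stay cheap (one binary search per run) even while writes are purely sequential; compaction later merges runs to keep read amplification bounded.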

winrid on 2024-05-16

That has nothing to do with whether the DB is relational or not.

badpun on 2024-05-16

Which relational databases can simultaneously do millions of updates and millions of selects per second, to the same table?

etc-hosts on 2024-05-15

I am attempting to reach enlightenment by contemplating your wisdom.

jeffbee on 2024-05-15

If a company has HBase and doesn't have anything else, then the developers will store everything in HBase due to lack of alternatives. They'll store logs in HBase, metrics in HBase, images in HBase.

eclark on 2024-05-15

> Production HBase clusters typically used a primary-standby setup with six data replicas for fast disaster recovery, which, however, came at an extremely high infra cost at our scale.

This is the real killer. HBase uses Hadoop for its replication. That replication does 3 copies in the same data center. If you're a company that requires online data of this scale, you probably also have several other data centers or clouds. Having to replicate entire datasets 3x for every new data center is cost-prohibitive.

That, along with the fact that HBase has issues at NVMe speeds and throughputs, are true issues.
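The arithmetic behind the article's "six data replicas" is straightforward: two clusters (primary + standby), each storing the default three HDFS copies. A back-of-envelope sketch with hypothetical numbers, not Pinterest's actual figures:

```python
# Back-of-envelope storage amplification for a primary-standby HBase
# setup, each cluster using 3x HDFS replication. The 1 PB figure is a
# hypothetical assumption for illustration.
logical_tb = 1000            # 1 PB of logical data
hdfs_replication = 3         # default HDFS replication factor
clusters = 2                 # primary + standby

physical_tb = logical_tb * hdfs_replication * clusters
assert physical_tb == 6000   # 6 bytes stored per logical byte

# Each additional data center repeats the full 3x-replicated dataset:
for extra in range(1, 4):
    total = logical_tb * hdfs_replication * (clusters + extra)
    print(f"{clusters + extra} sites -> {total} TB physical")
```

This is why eclark calls per-data-center 3x replication cost-prohibitive: the physical footprint grows linearly with sites, with a 3x multiplier baked into each one.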

riku_iki on 2024-05-15

> That replication does 3 copies in the same data center.

You can set replication factor 1 if you want. You just will have high chance to lose your data forever.

eclark on 2024-05-15

The HBase write-ahead log requires a full pipeline. It speculatively drops Hadoop nodes out of the write pipeline. Because of the complexity of the WAL, I don't know anyone who has tried with less. (There are a good number of ways to run HBase with no write-ahead log.)

There are people who run HBase with Reed Solomon encoding on the HFiles (the store files). That can get a replication factor below 3. I don't think that Pinterest ever ran an updated enough Hadoop for that to be available.

> You can set replication factor 1 if you want.

You can't, really. HBase will fail to open regions whose files can't be read. So set the replication factor to 1, have a single hard drive go out, and the table will forever have regions that can't be opened. The NameNode will have a file with lost blocks, and the HMaster will pass the region assignment around until it sticks in a failed-to-open state.

ram_rar on 2024-05-15

For companies like Pinterest, where data storage isn't the core business, should the focus be on building in-house data warehouses or leveraging managed service providers (MSPs)? While building in-house offers control and customization, MSPs can potentially address complexity and infrastructure costs.

Can someone from Pinterest comment on specific performance needs (SLOs) that influenced their choice between TiDB and other solutions, including managed services? Considering complexity and cost, could an MSP have addressed their needs effectively?

kristopherkane on 2024-05-15

HBase was a joy to use from an application developer's standpoint.

dehrmann on 2024-05-15

You should check out Bigtable.

pornel on 2024-05-15

> the HBase version upgrade is a slow and painful process due to a legacy build/deploy/provisioning pipeline and compatibility issues

Is that HBase's fault, or Pinterest's added complexity?

I'm baffled when databases don't support seamless in-place upgrades and require a full dump and restore instead. At a certain scale, a full rebuild is as complex as replacing the wheels of a moving car.

beeboobaa3 on 2024-05-15

HBase & Hadoop are painful to upgrade. Honestly doing anything with them is painful.

eclark on 2024-05-15

HBase supports in-place upgrades. Almost* all version upgrades have been relatively painless if you have automation to do the necessary operations across a cluster of nodes. Hadoop upgrades have been a similar story. That minimum automation necessary is high for all stateful data stores of this size.

* A few notable exceptions exist where Hadoop and HBase upgrades were necessary simultaneously. These have been awful experiences that required huge efforts to accomplish.

HermitX on 2024-05-16

“Specifically, online analytics workloads would be migrated to Druid/StarRocks”, I'm very interested in this part. Look forward to knowing more about it.

zuck_vs_musk on 2024-05-15

They switched to an SQL database. NoSQL

jerryjerryjerry on 2024-05-15

Emm, I'm very curious about the reasons why they finally chose TiDB.

std_reply on 2024-05-16

My guess: the benefits of SQL at scale and a reduced maintenance burden. That's what the article seems to hint at.

lopkeny12ko on 2024-05-15

This article has so many words yet has such little information, and is remarkably sparse in technical detail. What did they actually do? What did they build? How did they migrate? What is "SDS"?

Is every article written by ChatGPT now? My confusion was partially answered as soon as I saw the word "delve."

pxx on 2024-05-15

This blog post starts off saying it's part one of a three-part series so the lack of detail makes a lot of sense in context. Given that it's a corporate blog post, it's unlikely that we'll get a particularly deep technical dive but there is plenty of detail in what is stated to be an introduction.

Also this article doesn't feel like it's written by ChatGPT at all. "Delve" is not even a very uncommon word; just one use of it isn't necessarily indicative, and even if it was, it's used in the summary of the rest of the series (which you seem to have missed in your hunt for gotchas)! I think LLM bullshit is definitely making everything a lot worse but this isn't even an example of such.