Kill the Slop: announcing our investment in Taste

These days, when I’m not thinking about the latest approaches to meta-memory or dynamic benchmarking, I’m contemplating exhibitions at the Venice Biennale. Is a functional sewage treatment plant actually a sacred site? Does anthropomorphized excrement address shame as a social mechanism? Whatever the answer, the debates around these installations point to a broader truth: taste is not democratic.

Taste is curated. It emerges through years, often decades, of argument among critics, curators, artists, and collectors about what separates art from kitsch, provocation from spectacle, and beauty from the merely attention-grabbing. These judgments determine not only what ends up on display, but also what a culture decides is worth taking seriously.

Generative AI has made creation cheap. What it has not made cheap is judgment. And when judgment is scarce, slop fills the void.

The averaging problem, and why tasteful annotation is hard

To align models with human users' objectives and preferences, AI researchers developed reinforcement learning from human feedback (RLHF). This technique was critical in unlocking AI products like ChatGPT for widespread adoption. The idea behind RLHF is simple: humans compare model outputs, researchers train a reward model on those comparisons, and the model is then optimized to produce outputs that score highly according to that learned reward function.
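To make the reward-model step concrete, here is a minimal sketch of learning scores from pairwise comparisons. It assumes a Bradley-Terry model, a common choice for this kind of data, and fits one scalar reward per output rather than training a neural network. The function name, the toy comparisons, and the hyperparameters are all illustrative, not taken from any particular lab's pipeline.

```python
import math

def fit_rewards(comparisons, n_items, lr=0.1, steps=2000):
    """Fit one scalar reward per item from (winner, loser) pairs.

    Assumes the Bradley-Terry model: P(i beats j) = sigmoid(r_i - r_j).
    Plain gradient ascent on the average log-likelihood.
    """
    rewards = [0.0] * n_items
    for _ in range(steps):
        grads = [0.0] * n_items
        for winner, loser in comparisons:
            # Probability the current rewards assign to the observed outcome.
            p_win = 1.0 / (1.0 + math.exp(-(rewards[winner] - rewards[loser])))
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        rewards = [r + lr * g / len(comparisons) for r, g in zip(rewards, grads)]
    return rewards

# Hypothetical comparisons: each tuple is (preferred output, rejected output).
comparisons = [(0, 1), (0, 1), (1, 0), (1, 2), (1, 2), (2, 1), (0, 2), (0, 2)]
rewards = fit_rewards(comparisons, n_items=3)

# Rank outputs from highest to lowest learned reward.
ranking = sorted(range(3), key=lambda i: rewards[i], reverse=True)
print(ranking)
```

In a real system the per-item scores would be replaced by a model that predicts reward for unseen outputs, and that model would then drive the optimization step. The point of the sketch is only that the learned reward reflects whoever supplied the comparisons.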

When preference data is collected from a broad population of users, the resulting reward model learns to approximate the preferences represented in that population. For many tasks, this works remarkably well. But teams building generative media products have increasingly discovered that optimizing against these signals can push models away from the teams' own internal notions of quality.

I've heard versions of this story repeatedly in conversations with teams at companies like Runway, Figma, Character AI, and Adobe. Several teams found that incorporating large amounts of preference data from general users improved alignment with the collected feedback while degrading performance on expert-driven evaluations. The model became better at satisfying the reward function, but not necessarily better according to the standards that mattered most to the product.

What these teams ultimately wanted was not more feedback from more users. They wanted better feedback from the right users. Specifically, they wanted the tastemakers: people with unusually strong judgment, aesthetic sensitivity, and the ability to explain why something works or does not.

This is a fundamentally different problem from the one most annotation platforms were designed to solve. Traditional annotation systems optimize for scale and agreement. They recruit large numbers of annotators to complete tasks with relatively objective answers. Subjective evaluation is different. The goal is not to estimate what the average person thinks. The goal is to identify the people whose judgments are most predictive of quality.

That turns out to be surprisingly difficult. How do you identify someone with exceptional taste? How do you find them at scale? How do you distinguish genuine judgment from idiosyncratic preference? These questions sit at the intersection of aesthetics, expertise, and measurement, and they do not admit clean solutions.
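To see why the measurement question is hard, consider the most naive approach: score each annotator by how often they agree with a small reference set and keep the ones above a cutoff. The sketch below is purely illustrative. The names, data, and the 0.75 threshold are assumptions, and it presumes a trusted reference already exists, which is exactly the part that is difficult to obtain.

```python
def annotator_agreement(votes, reference):
    """Return each annotator's agreement rate with a reference set.

    votes:     {annotator: {pair_id: chosen_option}}
    reference: {pair_id: chosen_option} from a trusted source (assumed to exist).
    """
    scores = {}
    for annotator, choices in votes.items():
        shared = [pair for pair in choices if pair in reference]
        if not shared:
            continue  # no overlap with the reference, so no score
        hits = sum(choices[pair] == reference[pair] for pair in shared)
        scores[annotator] = hits / len(shared)
    return scores

# Hypothetical reference judgments and annotator votes.
reference = {"p1": "A", "p2": "B", "p3": "A", "p4": "B"}
votes = {
    "ana": {"p1": "A", "p2": "B", "p3": "A", "p4": "B"},
    "ben": {"p1": "A", "p2": "A", "p3": "B", "p4": "B"},
    "cy": {"p1": "B", "p2": "A", "p3": "B", "p4": "B"},
}

scores = annotator_agreement(votes, reference)
# Illustrative cutoff: keep annotators who match the reference at least 75% of the time.
selected = [name for name, score in scores.items() if score >= 0.75]
print(scores, selected)
```

Agreement with a reference cannot tell genuine judgment from shared idiosyncrasy, and it rewards conformity with whoever wrote the reference. That limitation is the reason the questions above resist clean answers.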

Some companies have tried to solve the problem internally. Figma, for example, reportedly relied heavily on designers to evaluate outputs in the early stages of model development. Others attempted to recruit expert annotators through more conventional channels. Neither approach has proven entirely satisfactory. As I argued a few years ago, annotation becomes substantially harder when the target is subjective judgment rather than objective correctness.

With this challenge in mind, I have long held two beliefs that I assumed were related but distinct.

The first is that annotation and data curation represent one of the highest-leverage investment opportunities in AI infrastructure. Model quality is a function of data quality. Unfortunately, industry has consistently underinvested in making annotation rigorous, scalable, and sensitive to the difference between preference and judgment. That is starting to change. Fast.

The second is that generative media has matured from a niche into a real industry. We have seen this firsthand at Amplify through the growth of companies like Runway, the evolution of our Creative Technologies Dinner series, and the success of companies like Fal, which showed that a substantial business can be built serving the generative media market alone.

What I had not fully appreciated was the point at which these two beliefs converge. As generative media matures, the scarce resource is no longer generation. It is judgment. Models can produce an effectively infinite supply of images, videos, characters, and experiences. What remains difficult is determining which outputs are actually good, and then turning that judgment into a signal that models can learn from.

This is the need that Taste is addressing: annotation infrastructure built for scale and taste; tools to collect judgment, not preference; systems designed for subjective tasks, not objective ones.

All of this is useful context to explain why, when I met Thais Castello Branco, the founder and CEO of Taste, a soul bond ensued.

Introducing Taste

Taste’s mission is to end AI slop. They are bringing the data and infrastructure together to help model providers build AI that delivers exceptional results in subjective domains.

The company's thesis begins with what I think of as the slop problem. Before you can personalize model outputs to the tastes of individuals or communities, you first need a model capable of exercising basic judgment. A generative UI that produces clashing color schemes, a writing model that defaults to corporate boilerplate, or a video model that generates temporally incoherent motion are not failures of personalization. They are failures of discrimination.

Taste's first objective is therefore not to help models produce exceptional outputs. It is to help them reliably avoid bad ones. Before a model can learn what is beautiful, distinctive, or culturally resonant, it must first learn what is crude, incoherent, derivative, or otherwise not worth selecting. The initial challenge is raising the floor: ensuring that outputs in subjective domains clear a minimum standard of quality.

Only once that floor has been established does the more interesting challenge emerge. Good is not a universal category. The aesthetic preferences of a graphic designer differ from those of a gamer, a luxury brand, or a teenage creator on TikTok. After helping models learn to avoid bad outputs, Taste can help companies optimize for outputs that are not merely good in the abstract, but good for a particular community, audience, or individual user.

The way they help foundation model labs and application layer companies achieve this is through a curated community of tastemakers. Rather than recruiting annotators from general labor platforms, Taste builds its annotator network through a trust-propagation model. They seed the community with people vetted for strong aesthetic judgment, then allow those tastemakers to nominate others. The result is a community where taste is socially validated — which is, if you think about it, how taste has always propagated. The first domain they’re starting with is design.
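One way to picture a trust-propagation network is as a graph of nominations in which trust decays with each hop away from the vetted seeds. The sketch below is a hypothetical illustration of that general idea, not a description of how Taste actually scores or admits members. The decay factor, the admission threshold, and every name in it are assumptions.

```python
from collections import deque

def propagate_trust(seeds, nominations, decay=0.8):
    """Spread trust from vetted seeds through a nomination graph.

    seeds:       people vetted directly (trust 1.0).
    nominations: {nominator: [nominees]}.
    decay:       assumed factor by which trust shrinks per nomination hop.
    Each person keeps the highest trust reachable through any nominator.
    """
    trust = {seed: 1.0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        for nominee in nominations.get(person, []):
            candidate = trust[person] * decay
            # Only update (and revisit) when a nomination raises the nominee's trust.
            if candidate > trust.get(nominee, 0.0):
                trust[nominee] = candidate
                queue.append(nominee)
    return trust

# Hypothetical community: one vetted seed and a chain of nominations.
nominations = {
    "ava": ["bo", "cam"],
    "bo": ["dee"],
    "cam": ["dee"],
    "dee": ["eli"],
}
trust = propagate_trust(["ava"], nominations)

# Illustrative admission rule: require trust of at least 0.6.
admitted = sorted(name for name, score in trust.items() if score >= 0.6)
print(trust, admitted)
```

The design choice worth noticing is the decay: the further a nominee sits from a vetted seed, the less the network vouches for them, which keeps growth from diluting the standard set at the center.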

More practically, Taste sells preference datasets, reasoning data, rubrics, and evaluation environments to frontier AI labs. It also sells software to generative AI application companies. At first glance, these may appear to be separate businesses. In reality, they are deeply intertwined.

Most annotation companies scale by adding more annotators. Taste starts from the opposite premise. In subjective domains, quality depends on maintaining a highly curated network of evaluators with exceptional judgment. While others focus on growing the crowd, Taste focuses on increasing the crowd's leverage without diluting its taste.

This is why Taste invests so heavily in technology. The software it builds helps a small group of tastemakers work more efficiently, make more consistent judgments, and generate higher-quality signals.

In this sense, Taste is not simply an annotation company. It is building infrastructure for operationalizing judgment.

As a last note, many think of slop solely as an aesthetic problem. I actually believe that there’s much more at stake here. It’s increasingly clear that generative media is going to completely dominate our landscape. Whether it is images, copy, or even entire films, creatives will be using AI to create most of the frames and text we see. We cannot accept an outcome where the majority of that generation is lame. If we want to live in a beautiful world full of originality and soul, we must figure out how to imbue our models with personalized taste.

Announcing Taste’s $18.5M seed round

Behind Taste is Thais Castello Branco, a force of nature whom I’m incredibly excited to partner with. Thais is building at the intersection of data infrastructure and aesthetic judgment, which, for those who know me, is about as Sarah-coded a company as one could imagine.

If you spend time with Thais, it quickly becomes obvious why she is the right person for this challenge. She cares deeply about how things look and feel, not as an affectation but as an operating principle. Before founding Taste, she led growth at Exa, where she became known for unusually thoughtful and distinctive marketing, including the now-famous billboards that managed to feel genuinely tasteful in a medium not generally associated with taste. She has a rare ability to combine strong aesthetic instincts with an obsession for systems and measurement.

That combination has never been more important. The annotation industry has historically been dominated by companies that are operationally excellent and domain indifferent. The things they “sold” were speed and efficiency, not depth of knowledge and quality. But if the future of AI depends on teaching models judgment, then the people building that infrastructure should probably have some judgment of their own.

Today, I’m excited to announce that Amplify is co-leading Taste’s $18.5 million seed round. I have been thinking about this problem for years. At one point, I even considered incubating a version of the company myself, but ultimately concluded I had not met anyone obsessed enough with the question to build it well. Thais is that person.
