type: Post
status: Published
date: Feb 20, 2024
slug: warmraccoon-ai-project-phased-summary
summary: A behind-the-scenes look at WarmRaccoon — an AI video platform built to make knowledge more accessible. What worked, what didn't, and the pivot that changed everything.
tags: AI, Business, Content Creation, Startup
category: Entrepreneurial Journey
icon:
password:
📖 Reading time: ~9 minutes
The Problem That Started Everything
Late 2022. ChatGPT had just launched and everyone in the industry was either excited or nervous — sometimes both at the same time.
Our team was in a different conversation. We kept coming back to a question that felt more specific and, to us, more interesting than the general excitement about generative AI: what actually happens to good knowledge when it's locked in the wrong format?
Books are the obvious example. There is an enormous amount of genuinely useful knowledge sitting in long-form text that most people will never access — not because they don't want it, but because the format doesn't work for their life. Time, language, attention, context. The barriers are real and they're not evenly distributed. Someone running a small business in a second language, or learning a new field without the time for a 300-page read, or operating in a market where the best resources exist only in English — they have less access to the same knowledge than someone who doesn't share those constraints.
We thought AI could close that gap in a meaningful way. That's what WarmRaccoon started as: an attempt to use AI video generation to translate long-form knowledge into formats that could actually reach the people who needed it.
That was the vision. The reality of building it was considerably more complicated.
Year One: The Technical Problem Nobody Warned Us About
From November 2022 through to late 2023, we were essentially in full engineering mode.
The core thing we were trying to build sounds straightforward in theory: a pipeline that takes a long piece of text — a book summary, an article, a research paper — and turns it into a properly produced video. Narration, visuals, pacing, multiple language versions. End to end, with minimal human intervention at each step.
In practice, it touched almost every hard problem in AI content production simultaneously. Natural language processing to actually understand and condense the source material. Speech synthesis that sounded like a person rather than a system. Cross-modal matching — making sure the visuals and the narration were actually connected rather than just running in parallel. And then engineering all of it into something that could run at scale without falling over.
The moment I remember most clearly from this period: we mapped out the traditional workflow for producing a single book summary video — the kind of thing that exists on YouTube, where a team takes a non-fiction book and produces a 10-15 minute explainer. Eight distinct roles. Book summariser, content editor, translator, voice artist, video editor, renderer, timeline proofreader, platform manager. Weeks of work per video.
Our benchmark was to compress that into minutes.
We got there, eventually. By the time we'd built out the full pipeline, what had previously required eight people working in sequence could be handled end-to-end with a fraction of that input. We built a distributed rendering architecture that improved high-definition video generation speed by about 300% compared to the tools we'd started with. We cracked the multilingual problem — 30-plus language versions from a single source input — which was the piece that had felt most ambitious at the start.
But getting there was slow, expensive, and full of problems we hadn't anticipated. Speech-to-visual synchronisation across languages was the one that nearly broke us, consuming several months. Languages have different rhythms, different average word lengths, different pacing norms. A solution that worked beautifully in English would fall apart in Mandarin or Arabic. We ended up developing a dynamic frame rate adjustment algorithm specifically to handle this, which felt like a genuine breakthrough when it finally worked and a significant detour while we were in the middle of building it.
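To make the timing problem concrete: if a scene sequence is authored against one language's narration, the display times have to stretch or compress to fit each target language's measured audio length. The function below is the most naive possible form of that adjustment — a single uniform scaling factor — and is purely a sketch of the idea; the dynamic version worked segment by segment rather than uniformly:

```python
def rescale_scene_timings(scene_durations: list[float],
                          narration_duration: float) -> list[float]:
    """Scale authored scene durations so total visual time matches narration.

    scene_durations: per-scene display times (seconds) authored against the
    reference-language narration.
    narration_duration: measured audio length (seconds) for the target language.
    """
    baseline_total = sum(scene_durations)
    # Uniform scaling: every scene stretches or compresses by the same factor.
    factor = narration_duration / baseline_total
    return [d * factor for d in scene_durations]
```

For example, scenes authored at 4, 6, and 10 seconds against 20 seconds of English narration become 6, 9, and 15 seconds when the same script takes 30 seconds in another language. The uniform version breaks down when pacing differences are uneven across the script — which is why per-segment adjustment was needed.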
The Pivot Nobody Plans For
By early 2024, we had something that worked. We'd produced video versions of 52 books through the pipeline. The technology had proven itself. The team had learned an enormous amount.
And then we made a decision that surprised some people who'd been watching the project: we voluntarily stopped producing our own content.
Not because the product had failed. Because of what we'd learned from users.
When you build something and then watch real people use it, you find out very quickly which assumptions you made were wrong. The assumption we'd made — that the primary value was in the library of content we'd produce — turned out to be only part of the picture. What creators and smaller teams actually wanted was the capability itself. Access to the pipeline, not just its outputs.
The requests kept coming: can we use this for our own content? Can external creators access the tools? Can we run our own material through the same workflow?
We'd built something we thought was a content platform. It turned out it also wanted to be an infrastructure play — a set of tools that other creators could build on top of.
So we paused our own content production and shifted to building what became an open AI collaborative creation tool — something that allowed external content creators to access and use the same pipeline we'd developed. It was a significant strategic change. It required rebuilding parts of the product architecture. It meant the timeline for everything moved.
I won't pretend the decision was easy or that everyone on the team was immediately on board. Pivoting away from something you've spent over a year building requires a particular kind of willingness to let go of the original story and follow where the evidence points.
What the Project Has Actually Taught Me
Running WarmRaccoon alongside my consultancy work has been one of the more educational experiences of my career, specifically because building a product is a different kind of problem-solving than building campaigns or strategy.
In client work, the feedback loop is relatively fast. You try something, you see results, you adjust. In product development, especially technical product development, you can spend months on something before you find out whether your core assumption was right.
A few things I carry forward from this:
The gap between "technically impressive" and "actually useful" is larger than it looks. The pipeline we built genuinely did remarkable things from an engineering standpoint. But remarkable engineering is only valuable if it's solving the right problem in a way people can actually use. We had to keep asking — sometimes uncomfortably — whether what we were building matched what people needed, not just what we'd set out to create.
Pivots that feel like failures often aren't. Stopping our own content production felt, in the moment, like admitting something hadn't worked. In retrospect it was one of the better decisions we made, because it came from honest observation of what the product actually wanted to be rather than attachment to the original plan.
Small teams move fast and break things, including themselves. We were not a large operation. Every problem that needed solving competed for the same limited attention and capacity. The discipline required to prioritise ruthlessly — to decide what we were not going to work on this month — was harder than any of the technical challenges.
Where Things Stand and What's Next
As of early 2024, the focus is on two things.
The first is a decentralized content verification system — addressing one of the genuinely thorny problems in AI-generated content, which is copyright and attribution. When an AI pipeline produces a video that condenses a book, questions about rights and ownership get complicated fast. We think there's a meaningful solution to be built here, and it's one of the areas where we're investing significant thinking.
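To make the verification idea concrete, here is a minimal, hypothetical sketch of what an attribution record might look like: the source and the record itself are identified by content hashes, so a shared, append-only ledger could anchor who derived what from what. None of the names or the schema reflect an actual system we've built:

```python
import hashlib
import json

def fingerprint(data: bytes) -> str:
    """Content fingerprint: SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()

def attribution_record(source_text: str, derivative_id: str,
                       rights_holder: str, timestamp: float) -> dict:
    """Link a derivative work to its source via content hashes."""
    record = {
        "source_hash": fingerprint(source_text.encode()),
        "derivative": derivative_id,
        "rights_holder": rights_holder,
        "timestamp": timestamp,
    }
    # Hashing the record itself makes it tamper-evident: any later edit to
    # the fields changes record_hash, which a ledger entry would expose.
    record["record_hash"] = fingerprint(
        json.dumps(record, sort_keys=True).encode()
    )
    return record
```

The real questions — who maintains the ledger, how rights holders register, how disputes resolve — are exactly the thorny parts; the hashing itself is the easy bit.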
The second is the continued development of the creator tools — building out the open platform so that more external creators can access and use what we've built, with a proper product experience rather than the somewhat rough early version we launched.
WarmRaccoon is still early. There's more to figure out than there is figured. But the core belief that got us started — that AI can make good knowledge more accessible to more people, across more languages and formats and contexts — still feels true, and still feels worth pursuing.
The technology is less the point than I thought it was at the start. The point is the person on the other end who gets access to something they couldn't access before.
We're still working toward that.