If You’re Adding a Chatbot, You’ve Missed the Point

Ludwig Wendzich

February 18, 2026

In my last post, I introduced marshmallowiness: the soft, squishy, language-first quality that LLMs bring to software. The thing that lets systems understand intent, fill in gaps, and handle ambiguity the way people do.

That post was about where marshmallowiness fits: giving humans flexibility while keeping agents honest. This one is about what it actually is—and why, once you get it, the product opportunities look completely different from what most teams are building.

The Chatbot Trap

Here's the pattern I keep seeing: a product team decides to "add AI". They look at what they've got. They look at LLMs. And they build a chatbot.

Conversational AI. A friendly little text box. "Ask me anything about your data!"

This is—almost always—a failure of imagination.

Maybe it feels intuitive? LLMs are good at language, chat is language, so chat must be the right interface. But that logic is like saying “hammers are good with nails, so let's add more nails.” You’ve taken a tool with extraordinary capabilities and constrained it to the one shape everyone already thought of.

Worse: chatbots are often hostile to users. Maybe you think they look friendly, and that empty text box feels inviting, but they put the entire burden on the user to figure out what to ask, how to ask it, and whether the answer is any good. At least a GUI has hints. A dashboard has structure. A chatbot says “go ahead, you figure it out” and calls that an experience.

So why does every product team's first instinct land here? I think it's because they're starting from the technology and working backward to the product. “We have an LLM. LLMs chat. Let's add chat.” Instead of starting from the problem: “Our users struggle to get insights from their data. We now have a technology that understands nuance and makes judgement calls. How does that change what we can build?”

Those are very different starting points. And they lead to very different products.

The Reporting Example

Let's make this concrete. Almost everyone has some kind of reporting product: dashboards, analytics, business intelligence. The “add AI” play that everyone reaches for is a chat interface over the data. Talk to your data! Ask questions in natural language!

And sure, maybe that works for the rare power user who knows exactly what question to ask. But for everyone else, it’s a worse version of the dashboard they already had. At least the dashboard showed them something. The chatbot shows them a blinking cursor.

Here's what the old world looks like: you have a dashboard. There's some kind of templating language drawing graphs. A Product Manager figured out which data points are likely to be insightful. A Designer figured out how to visualise them so the insight is clear. An Engineer wrote a template to render it. Ship it. Hope the PM guessed right about what matters this month.

That's the constraint: the insights are static. Someone had to predict what would be interesting, design it in advance, and bake it into the template. If the interesting thing this month is different from last month? Tough. You're looking at the same graphs.

Now here's the new world. You have an LLM. And an LLM is a Product Manager, Engineer, and Designer hybrid that can:

  • Read the actual data for this user, this month
  • Identify trends, anomalies, and variances that are genuinely interesting right now
  • Make a judgement call about how best to illustrate each insight
  • Output a structured format—JSON, a template spec, whatever—that your rendering layer can draw

That doesn’t sound or look like a chatbot, does it? That's an LLM doing the work that three people used to do, but doing it per-user, per-cycle. Your users get a dashboard that's actually about what matters to them today. And they didn't have to ask a single question.
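To make that last step concrete, here is a minimal sketch of the boundary between the LLM and the rendering layer. Everything in it is an assumption for illustration: the field names (title, chart, metric), the allowed chart types, and the sample output are invented, and the model call itself is left out. The point is only that the model's output gets checked against a known shape before anything is drawn.

```python
import json
from dataclasses import dataclass

# Hypothetical set of chart types the rendering layer knows how to draw.
ALLOWED_CHARTS = {"line", "bar", "stat"}


@dataclass(frozen=True)
class Insight:
    title: str
    chart: str
    metric: str


def parse_insights(raw: str) -> list[Insight]:
    """Turn the model's JSON text into validated Insight records.

    Raises ValueError if the output does not match the expected shape.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc

    if not isinstance(payload, dict) or not isinstance(payload.get("insights"), list):
        raise ValueError("expected an object with an 'insights' list")

    insights = []
    for item in payload["insights"]:
        if not isinstance(item, dict):
            raise ValueError("each insight must be an object")
        try:
            insight = Insight(
                title=str(item["title"]),
                chart=str(item["chart"]),
                metric=str(item["metric"]),
            )
        except KeyError as exc:
            raise ValueError(f"insight is missing field {exc}") from exc
        if insight.chart not in ALLOWED_CHARTS:
            raise ValueError(f"unsupported chart type: {insight.chart!r}")
        insights.append(insight)
    return insights


# Example model output (invented for illustration).
raw_output = (
    '{"insights": [{"title": "Refunds doubled this month", '
    '"chart": "line", "metric": "refund_count"}]}'
)
insights = parse_insights(raw_output)
```

The model is free to choose which insights to surface and how to illustrate them, but it can only pick from chart types the renderer already supports. Anything else is rejected before it reaches a user.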

You could take it further. You have an LLM now, so why not? Have it write commentary about why these insights matter. Let it use that commentary editorially within the dashboard itself. Now you have a dashboard generating its own content—explanations, context, recommendations—tailored to what the data is actually showing.

And once you have that content? Pipe it through your React rendering engine for the in-app dashboard. Pipe it through your email templating engine as a daily digest. Same insights, same commentary, multiple surfaces. Your users wake up to an email that says “here's what changed overnight and why it matters.”
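As a rough sketch of "same insights, multiple surfaces": the two functions below stand in for the React renderer and the email templating engine, which are not shown here. The data shape and sample content are invented for illustration.

```python
from html import escape

# Invented sample content; in practice this would come from the model,
# already validated into a known shape.
insights = [
    {"title": "Refunds doubled this month", "commentary": "Most came from one product line."},
    {"title": "New signups are flat", "commentary": "Traffic rose, so conversion dipped."},
]


def render_html(items: list[dict[str, str]]) -> str:
    """Render insights as an HTML fragment for the in-app dashboard."""
    cards = [
        f"<section><h2>{escape(i['title'])}</h2><p>{escape(i['commentary'])}</p></section>"
        for i in items
    ]
    return "\n".join(cards)


def render_email(items: list[dict[str, str]]) -> str:
    """Render the same insights as a plain-text daily digest."""
    lines = ["Here's what changed overnight and why it matters:", ""]
    for i in items:
        lines.append(f"* {i['title']}")
        lines.append(f"  {i['commentary']}")
    return "\n".join(lines)


dashboard_html = render_html(insights)
digest_text = render_email(insights)
```

The content is generated once and each surface only decides how to present it, so the dashboard and the digest never disagree about what happened.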

People might not even know an LLM was involved. It just looks like a really, really good product.

Rigid, Marshmallowy, Rigid

This is the mental model that unlocks everything: rigid inputs → marshmallowy middle → rigid outputs.

You have structured data coming in. You need structured output going out. And in between, there's a gap that requires nuance, judgement, interpretation—the stuff that deterministic code can't do and people are too slow and expensive to do at scale.

That's the marshmallow. It's not a chat interface. It's the in-between. The thing in between the hard edges that makes the whole system work on inputs it's never seen before.

Deterministic code can't bridge that gap. It can try to emulate it—with enough rules, enough edge cases, enough switch statements—but it's brittle. It breaks the moment reality deviates from what the developer anticipated.

People can bridge it beautifully. But they're slow, and they're expensive, and they don't scale.

LLMs? This is their jam. Understanding the thing on the left, making a judgement call, and producing the thing the right side needs. Every time. Even when yesterday it was a different thing on the left.
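In code, the pattern might look something like the sketch below. It is an illustration under assumptions, not a description of any particular system: the function names, the prompt wording, the retry count, and the "priority" label are all hypothetical, and the model is a stub standing in for a real LLM call.

```python
import json
from typing import Any, Callable


def rigid_marshmallow_rigid(
    record: dict[str, Any],
    model: Callable[[str], str],
    validate: Callable[[Any], Any],
    max_attempts: int = 2,
) -> Any:
    """Structured input in, model judgement in the middle, validated output out."""
    # Rigid input: serialise the structured record into the prompt.
    prompt = "Respond with JSON only.\nInput:\n" + json.dumps(record, sort_keys=True)
    last_error = None
    for _ in range(max_attempts):
        # Marshmallowy middle: the model makes the judgement call.
        raw = model(prompt)
        try:
            # Rigid output: only a value that passes validation leaves this function.
            return validate(json.loads(raw))
        except ValueError as exc:  # json.JSONDecodeError is a ValueError too
            last_error = exc
    raise ValueError(f"no valid output after {max_attempts} attempts") from last_error


def validate_priority(payload: Any) -> str:
    """Accept only one of a fixed set of labels (hypothetical example)."""
    if not isinstance(payload, dict) or payload.get("priority") not in {"low", "high"}:
        raise ValueError(f"unexpected output: {payload!r}")
    return payload["priority"]


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call, which is not shown here."""
    return '{"priority": "high"}'


result = rigid_marshmallow_rigid({"subject": "Invoice overdue"}, stub_model, validate_priority)
```

The code on either side of the model stays deterministic. The model can be as squishy as it likes in the middle, because nothing it says reaches the rest of the system without passing the validator.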

So What Does This Actually Look Like?

Once you see the pattern—rigid, marshmallowy, rigid—you start seeing it everywhere. Here are some of the shapes it takes at Sterling:

  • Invoice Understanding. You get an unknown document in an unknown format from an unknown vendor. On the other side, we need a structured record in our system. The marshmallow in the middle reads the document, understands what it's looking at, and maps it to our schema. No templates or per-vendor configuration. It just reads it.
  • Formatting as Meaning. You create an Excel spreadsheet where the formatting carries meaning—yellow highlights mean "disputed", bold means "finalised", merged cells are category headers. The marshmallow reads that visual language and encodes it in a machine-readable format our system can act on.
  • Intent Classification. You leave feedback at the end of a job Sterling completed. Is that feedback "thanks, all good" or is it "this isn't right, I need someone to fix it"? The marshmallow reads the intent and decides whether the job gets reopened.
  • Planning. We match a set of tasks to the input, and those tasks need to be sequenced into a plan for that specific input (say, an email with an invoice that's a prepayment and should be filed in Xero). The marshmallow weaves the through-line: what depends on what, what matters most, what the plan should actually look like given this particular input. Then we execute that plan deterministically.
  • Fuzzy Matching. You have "Jim Smith" in one system and "James Smith" in another. Same address but one says "Apt 4B" and the other says "Apartment 4B, Level 4". The marshmallow matches them up and tells us how confident it is so we can link and sync. We're not using regex (we joke: if you are using regex you have failed. OK, it’s really only half a joke. Don’t be regexing), we're using LLM magic.
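To take the Planning shape as one example, here is a hedged sketch of the rigid side that sits after the marshmallow. It assumes the model proposes a plan as a mapping from each task to the tasks it depends on. The task names and the plan format are invented for illustration and are not Sterling's actual schema. Plain code then checks the plan and fixes the execution order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical catalogue of tasks the deterministic side can execute.
KNOWN_TASKS = {"extract_invoice", "classify_prepayment", "file_in_xero"}


def order_plan(plan: dict[str, list[str]]) -> list[str]:
    """Validate a model-proposed plan and return a deterministic execution order.

    `plan` maps each task name to the tasks it depends on.
    """
    mentioned = set(plan) | {dep for deps in plan.values() for dep in deps}
    unknown = mentioned - KNOWN_TASKS
    if unknown:
        raise ValueError(f"plan references unknown tasks: {sorted(unknown)}")
    try:
        # Dependencies always come before the tasks that need them.
        return list(TopologicalSorter(plan).static_order())
    except CycleError as exc:
        raise ValueError("plan contains a dependency cycle") from exc


# A plan as the model might propose it (invented for illustration).
proposed = {
    "file_in_xero": ["classify_prepayment"],
    "classify_prepayment": ["extract_invoice"],
    "extract_invoice": [],
}
steps = order_plan(proposed)
# steps == ["extract_invoice", "classify_prepayment", "file_in_xero"]
```

The model decides what depends on what. Ordinary code decides whether that plan is acceptable and in which order it runs, so a plan that names a task we don't have, or that loops back on itself, never gets executed.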

The Paradigm Shift

The shift isn't “just add chat to your product.” The shift is: you now have something that can do the nuanced, judgemental, interpretive work that used to require either a person or a thousand lines of brittle rules.

That changes what's possible, not at the surface where a chat window would sit, but at the core of what your product can do.

If your first move is “let's add a chatbot”, you’re bolting the technology onto the surface of your product. You’re trying to squish a solution in rather than rethinking the problems you're already solving.

The better question is: where in your product are humans doing interpretive work that doesn't scale? Where are you maintaining brittle rule systems that break every time reality shifts? Where did you compromise on the product because the gap between input and output was too hard to bridge with code?

That's where the marshmallow goes. And it doesn't look like a chatbot. It looks like a product that just... works better.
