When the Conversation Became Part of the Bug

I recently ran into one of those AI-assisted development failures that is miserable while it's happening and useful once you have a little distance from it.

That's usually how these lessons show up.

They don't arrive politely.

They arrive after you've spent too much time trying to figure out why something that should be simple keeps coming back wrong.

In this case, the technical problem wasn't even the biggest lesson.

The bigger lesson was that by the time we figured out what had really happened, the conversation itself had become part of the problem.

The plan was good

We'd spent a lot of time building a shared application foundation.

The whole idea was to stop reinventing the same underpinnings for every new product.

We had a working shell.

We had a layout.

We had a set of surfaces.

We had known behavior.

We had rules about what could be changed and what should be left alone.

The plan was simple enough.

Take the working foundation.

Clone it into a new product.

Give the new product its own name.

Make sure it still compiles.

Then start changing the product-specific pieces one at a time.

That was the entire point.

We weren't asking the AI to invent a new interface.

We weren't asking it to decide which parts of the shell mattered.

We weren't asking it to simplify the foundation.

We were asking it to preserve the proven parts and work inside the seams.

That's a very different task.

And it's an important one.

Because if you're going to use AI to help build real software, you can't let it treat every repo like a disposable prototype.

The AI did what AI often does

The first result looked close enough to be dangerous.

That's probably the best way to describe it.

Some of the rename work was there.

Some of the new product identity was there.

Some of the interface looked like it belonged to the same family.

But then I noticed what had been removed.

The AI had looked at parts of the shell and decided they didn't apply to this new product.

So it took them out.

That's exactly what I didn't want.

Those parts weren't accidental.

They weren't leftovers.

They weren't clutter.

They were there because the foundation was supposed to support more than the first screen of the first product.

This is one of the places where AI can get you into trouble.

It sees capacity and treats it like waste.

It sees an unused surface and treats it like dead code.

It sees a framework and treats it like a mockup.

It sees a boundary and treats it like a suggestion.

So I corrected it.

I explained that the shell had to stay intact.

I explained that product-specific work needed to happen inside the allowed area.

I explained that the surfaces weren't optional just because this first product didn't need all of them on day one.

The answer came back looking like basically the same wrong version.

So I tried again.

Then I started a new thread with stronger instructions.

I even brought over a warning from the bad session so the new one would know exactly what not to do.

And somehow, the result still looked wrong.

That's when the stress level went up.

The real issue was sneakier

Eventually, after I talked it through with another AI assistant, the missing piece surfaced.

The app had been restoring old local state.

In plain English, the first bad run had left saved state behind locally. After that, even when new code was being generated, the app could restore that state and bring back the old, bad-looking layout.

That changed the diagnosis.

We weren't just looking at the current code.

We were also looking at remembered state from a previous run.

That's the kind of thing that makes an AI session feel haunted.

You change the code.

The screen still looks wrong.

You strengthen the prompt.

The screen still looks wrong.

You start a new conversation.

The screen still looks wrong.

And all the while, part of what you're judging may not be the code you just built.

It may be the old state the app restored.
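Here's a minimal sketch of that failure mode in Python. Everything in it is hypothetical: the file name, the surface names, and the load_layout function are made up for illustration and aren't taken from the real app. The only point is that restored state can win over whatever the current code declares.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layout declared by the current, corrected code.
DEFAULT_LAYOUT = {"surfaces": ["sidebar", "inspector", "timeline", "canvas"]}


def load_layout(state_file: Path) -> dict:
    """Return the saved layout if one exists, otherwise the code default."""
    if state_file.exists():
        # Restored state takes priority over the defaults in the code.
        return json.loads(state_file.read_text())
    return dict(DEFAULT_LAYOUT)


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "layout.json"

    # An earlier bad run saved a layout with most surfaces stripped out.
    state_file.write_text(json.dumps({"surfaces": ["canvas"]}))

    # The code is right, but what shows up on screen is the saved layout.
    shown = load_layout(state_file)

    # With no saved state present, the same code shows its own defaults.
    clean = load_layout(Path(tmp) / "no-saved-state.json")

print("shown:", shown["surfaces"])
print("clean:", clean["surfaces"])
```

In a setup like this, no amount of prompt rewriting changes what load_layout returns while the stale file is still there.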

Once we understood that, the fix itself wasn't all that dramatic.

Clear the old state.

Run from a clean condition.

Then judge the result.

Fine.
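As a rough sketch, the "clear the old state" step can be a small helper you run before every launch you intend to judge. The directory name and the set_aside_local_state function below are assumptions for illustration; where an app actually keeps its state depends on the platform and framework. This version moves the state aside instead of deleting it, so the bad state can still be inspected later.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def set_aside_local_state(state_dir: Path) -> Optional[Path]:
    """Move saved state out of the way so the next launch starts clean.

    Returns the backup path, or None if there was nothing to move.
    """
    if not state_dir.exists():
        return None
    backup = state_dir.with_name(f"{state_dir.name}.bak-{int(time.time())}")
    shutil.move(str(state_dir), str(backup))
    return backup


# Demonstration against a throwaway directory (hypothetical state location).
with tempfile.TemporaryDirectory() as tmp:
    state_dir = Path(tmp) / "app-state"
    state_dir.mkdir()
    (state_dir / "layout.json").write_text("{}")

    backup = set_aside_local_state(state_dir)
    state_cleared = not state_dir.exists()
    backup_kept = backup is not None and (backup / "layout.json").exists()

    # Calling it again is harmless: there is nothing left to move.
    second_call = set_aside_local_state(state_dir)
```

The order is what matters: set the state aside, launch, and only then decide whether the new code is right or wrong.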

But by the time we got there, we had another problem.

The thread was cooked.

Knowing the answer didn't make the thread safe

Once we knew what had happened, I asked the other assistant whether we should keep going in the existing thread or start over in a clean one.

The answer was to start over.

And I think that was exactly right.

Because the old thread had too much smoke in it.

It had the bad result.

It had the repeated corrections.

It had the wrong assumptions.

It had the failed attempts.

It had my frustration.

It had a lot of discussion about how the interface looked, from a time when old state may have been interfering with what we were seeing.

Yes, we had finally figured out the problem.

But the conversation was now full of the wreckage from figuring it out.

That's not nothing.

With AI, the conversation is part of the context. It isn't just a diary of what happened. It's part of what the model uses to answer the next question.

So even though the AI could now say, "Yes, we need to preserve the shell and clear the old state," the thread still contained all the earlier wrong turns.

That matters.

I didn't want the next round of work to inherit that mess.

I wanted it to inherit the lesson.

There's a difference.

This is where people get trapped

I think this is a trap a lot of people are going to fall into with AI.

They'll think:

Well, we figured it out. Let's keep going.

And sometimes that's fine.

But sometimes that is exactly when you should stop.

Because "we figured it out" doesn't mean "this thread is clean."

The thread may still be full of bad examples.

It may still be full of wrong assumptions.

It may still be full of corrections that point back to the wrong thing over and over again.

The AI may now understand the fix in one part of the answer while still being pulled by the shape of the earlier conversation in another part.

That sounds strange until you've watched it happen.

Then it starts to make sense.

The AI isn't just reading your latest instruction.

It's reading the conversation.

And if the conversation has caught fire, you may not want to keep building inside it.

The better move was to extract and restart

The useful move was to stop asking that thread to continue the work.

Instead, the old thread should become a source of diagnosis.

What went wrong?

What do we know now?

What rule needs to be stated more clearly?

What local condition needs to be reset before testing?

What should the next thread be told?

That's the material you want.

Not the whole messy history.

Not every failed attempt.

Not all the frustration.

Just the clean lesson.

In this case, the restart instruction needed to say something like this:

Preserve the shared foundation.

Do not remove existing surfaces just because the first product doesn't use them yet.

Make product-specific changes only inside the intended seam.

Before judging the visible result, make sure old local state isn't being restored.

Start with a build and verification pass.

Then make one controlled change at a time.

That's the kind of thing you can safely hand to a clean thread.

It carries the lesson without carrying the fire.
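If it helps to make that concrete, here is one way to keep the restart instruction as a small structured brief rather than a copy of the old conversation. This is only a sketch: the RestartBrief class and its field names are my own invention, not part of any tool. The wording of the items comes straight from the instruction above.

```python
from dataclasses import dataclass, field


@dataclass
class RestartBrief:
    """The clean lesson extracted from a bad thread, and nothing else."""

    rules: list[str] = field(default_factory=list)
    before_testing: list[str] = field(default_factory=list)
    first_steps: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the brief as plain text to paste into a new thread."""
        sections = [
            ("Rules", self.rules),
            ("Before judging any result", self.before_testing),
            ("First steps", self.first_steps),
        ]
        lines = []
        for title, items in sections:
            if not items:
                continue  # skip empty sections entirely
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        return "\n".join(lines).rstrip()


brief = RestartBrief(
    rules=[
        "Preserve the shared foundation.",
        "Do not remove existing surfaces just because the first product "
        "doesn't use them yet.",
        "Make product-specific changes only inside the intended seam.",
    ],
    before_testing=["Make sure old local state isn't being restored."],
    first_steps=[
        "Start with a build and verification pass.",
        "Then make one controlled change at a time.",
    ],
)

prompt = brief.render()
print(prompt)
```

The structure forces the question of what belongs in each section, and there is deliberately no field for the failed attempts.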

Stop, drop, and roll

The phrase that came to mind was the old fire safety rule.

Stop, drop, and roll.

When you're on fire, you don't keep running.

When an AI session is on fire, you shouldn't keep running either.

Stop the current work.

Drop the contaminated context.

Roll into a new thread with a clean restart prompt.

That's not giving up.

That's good workflow.

And it's not something you need to do every time the AI makes a mistake.

Most mistakes are just mistakes.

Fix them and move on.

But when the same wrong pattern keeps coming back, when the AI keeps crossing the same boundary, when the thread fills up with corrections, or when you discover a root cause that changes how you should have been testing the whole time, that's different.

At that point, the current conversation may not be the best place to continue.

The lesson

The technical lesson was simple enough:

Don't forget that an application can restore old local state and make new code look wrong.

But the bigger lesson was this:

Sometimes the thing you need to reset isn't just the app.

Sometimes the thing you need to reset is the conversation.

That's the part I think matters for anyone doing serious work with AI.

A bad thread can become a bad workbench.

And once the workbench is covered in broken parts, smoke, and wrong assumptions, you don't have to keep building there.

Take the lesson.

Take the corrected rule.

Take the current source.

Leave the wreckage behind.

That is the lived-workflow story behind the companion guide, Stop, Drop, and Roll.

-- Charles