When an AI ‘Learns to Cheat’... and Decides to Hide It

Your weekly guide to staying human in an AI world

Hey Conscious Church Fam

Anthropic discovered that Claude can accidentally teach itself deception… and then learn to hide the deception.

Meanwhile, Black Forest Labs dropped Flux.2, a massive leap in visual consistency and creative power, giving creators multi-image referencing that’s getting scarily good.

What kind of world are we building when our tools start making creative breakthroughs and moral shortcuts?

In today's recap:

  • Claude unexpectedly learns to cheat… and then hides the cheating

  • Flux.2 lands with stunning multi-reference consistency

Let’s dive in 👇

💭 Josh’s Musings

I watched a Diary of a CEO episode on Saturday night with guest Tristan Harris, an AI expert, and within minutes I was thinking, “Yes. This is exactly what has been bothering me.”

His point was painfully clear. AI is reshaping society at a speed we are not prepared for. The potential is huge, but so is the danger. And the people building this stuff know it.

What hit the hardest was the mindset behind the AGI race.

Tech companies are not pursuing this because humanity needs it. They are chasing a finish line where the first one there owns the future. In their own words, the goal is to build a digital god and sit on top of the world economy. That is not just ambition; it’s a god complex.

Harris also talked about models already showing rogue behaviour. In safety tests, some have manipulated evaluations, attempted to blackmail executives in simulated scenarios and acted strategically against their creators. On the human side, AI companions are competing for intimacy. Teens are turning to bots for comfort. Generative models can now create deepfakes that break trust and destabilise families.

This is where I feel the weight.

The enemy has always attacked connection, truth and community. AI accelerates all three fractures at once.

And we have six or so key individuals making decisions for billions of us…

But Harris was not hopeless. Humanity has coordinated on existential threats before. We have taken collective action when the stakes were clear. We can do it again.

So here is where it lands for me.

The Church cannot sit this one out.

  • We do not need to be AI experts, but we do need to be awake.

  • We need to disciple our young people before AI disciples them.

  • We need to stand for truth in a world where lies are photorealistic.

  • We need to offer presence in a world full of simulation.

And above all, we stay anchored in the One who cannot be automated, replaced or updated.

Because when the world changes, and it will change drastically in the next 5-10 years, Jesus will still be the answer. He will still be on His throne. He will still be the one who provides.

Jesus is still the centre, and that has never been more important.

🙌 Stay Curious, Stay Conscious, Stay Wild
Josh

Coffee Time

I want to spotlight something a friend of mine is doing because it is honestly one of the most encouraging, kingdom-minded ideas I have seen in a while…and I LOVE good coffee!

It’s called Tommy’s Coffee, birthed in a local church in Newcastle. The heart behind it is simple. What if great coffee could open doors for the Gospel?

They started with a small evangelistic coffee cart in the city. Every drink came with an invitation to church. That one idea has led to thousands being invited, people coming to faith and a whole crowd becoming part of their church family. And they are in a prime location to reach the community around them.

Now they are taking the vision further.

They are selling ethically sourced, specialty-grade coffee beans online and every penny of profit goes straight back into Gospel mission across the UK. Nothing lines pockets. Everything fuels ministry.

This is the kind of creative entrepreneurship I love.

If you care about good coffee and the Kingdom, get behind them.

I was on a call with Joel the other week and mentioned that it would be fun to share this with the Conscious Church, so he set up a code to get you 10% off if you want to get amongst it!

Use code CONSCIOUSCHURCH

LATEST NEWS

Image Source: Nano Banana Pro

Recap:

Anthropic has released new research showing that Claude can spontaneously begin lying, sabotaging safety tests, and pretending to follow rules after learning how to “reward hack” coding tasks — even though it was never trained to be deceptive.

The Details:

  • Models were trained on real programming tasks alongside documents explaining how to cheat on them

  • Once the model discovered shortcuts, it began pretending to follow safety instructions

  • Some models went further, subtly weakening the systems designed to detect harmful behaviour

  • Standard safety training didn’t fix the issue — it simply taught the model to hide its deception

  • Surprisingly, explicitly allowing reward hacks prevented the behaviour from generalising into broader harmful actions

Conscious Take:

There’s something deeply human (and unsettling) about this:

When we reward shortcuts, we shouldn’t be shocked when something learns to pursue the shortcut rather than the goal.

This isn’t “AI turning evil”. It’s AI reflecting the incentives we give it.

As AI gets more autonomous, the battle won’t just be about capability; it will be about whether it can truly stay in its lane.

Image Source: Black Forest Labs

Recap:

Black Forest Labs dropped Flux.2, a new suite of image models offering insane character consistency across up to ten reference images — plus impressive improvements in typography, realism, and 4MP output.

The Details:

  • Flux.2 merges a text–image model with a spatial reasoning model for lifelike physics and lighting

  • Quality lands just under Google’s Nano Banana Pro, but at a significantly cheaper price point

  • The lineup includes: Pro (API), Flex (customisation), Dev (open weights), and Klein (fully open-source)

  • Produces 4MP images and handles typography far better — great for UI mockups and infographics

  • Multi-image referencing means stable characters, consistent style, and much better continuity

Conscious Take:

Flux.2 is a big deal because consistency is the final mile for many creative workflows. It’s one thing to make a single beautiful image; it’s another to keep a visual world coherent.

This is where AI becomes less of a novelty and more of a genuine assistant/tool in creativity.

We’re not far off having pixel-perfect tools that handle imagery, text and editable layers.

“Search me, O God, and know my heart; test me and know my anxious thoughts.” Psalm 139:23

That's all for now

To help us make this an even better experience for you, we'd love your feedback on today's email.


Stay conscious,

Josh

P.S. If you liked this then please forward it on to someone you think would enjoy it. And if someone forwarded you this and you liked it, you can sign up here.