Computerphile, Season 2025, Episode 12

AI Will Try to Cheat & Escape (aka Rob Miles was Right!)

As large language models improve, the tokens they predict form ever more complicated and nuanced outcomes. Rob Miles and Ryan Greenblatt discuss "Alignment Faking", a paper from Ryan's team that covers ideas Rob explored in a series of Computerphile videos in 2017.

English
  • Originally Aired April 2, 2025
  • Runtime 20 minutes
  • Production Code AqJnK9Dh-eQ
  • Network YouTube
  • Created April 2, 2025 by shriek
  • Modified April 2, 2025 by shriek