LessWrong (Curated & Popular)

LessWrong
Latest episode

878 episodes

  • LessWrong (Curated & Popular)

    "A Year Late, Claude Finally Beats Pokémon" by Julian Bradshaw

    18.05.2026 | 18 Min.
    Credit: ClaudePlaysPokemon Elevator Shanty by Kurukkoo

    Disclaimer: like some previous posts in this series, this was not primarily written by me, but by a friend. I did substantial editing, however.

    ClaudePlaysPokemon feat. Opus 4.7 has finally beaten Pokémon Red, fulfilling the challenge set over a year ago when LLMs playing Pokémon went briefly, slightly viral.

    Victory Screen!

    Let's get the throat-clearing out of the way: this doesn't make 4.7 a clear breakthrough in intelligence over 4.6 or 4.5. It's smarter, yes, as we'll discuss below, but not by something one could honestly call a big leap. Rather, step changes have finally accumulated to the point of victory.

    And to give other models their fair shake: after criticism over its elaborate harness,[1] GeminiPlaysPokemon has beaten Pokémon with progressively weaker harnesses, including about two months ago with a harness comparable to the one Claude uses.[2]

    As such, this is a bit of a valedictory post, closing off the cycle of Claude playing Pokémon Red, relating anecdotes for the fun of it, and discussing improvements in Opus 4.7, as well as speculating a bit on what this has all meant.

    Retrospective Anecdotes on Claude 4.5 and 4.6

    Our last post, on Opus [...]

    ---

    Outline:

    (01:37) Retrospective Anecdotes on Claude 4.5 and 4.6

    [... 10 more sections]

    ---

    First published:

    May 16th, 2026


    Source:

    https://www.lesswrong.com/posts/sehJYg5Yny9fvpbpt/a-year-late-claude-finally-beats-pokemon

    ---



    Narrated by TYPE III AUDIO.

  • LessWrong (Curated & Popular)

    "A relatively brief explanation of Boltzmann Brains" by Eliezer Yudkowsky

    18.05.2026 | 5 Min.
    (Initially written for the LW Wiki, but then I realized it was looking more like a post instead.)

    In 1895, the physicist Ignaz Robert Schütz, who worked as an assistant to the more eminent physicist Ludwig Boltzmann, wondered if our observed universe had simply assembled by a random fluctuation of order from a universe otherwise in thermal equilibrium. The idea was published by Boltzmann in 1896, properly credited to Schütz, and has been associated with Boltzmann ever since.

    The obvious objection to this scenario is credited to Arthur Eddington in 1931: If all order is due to random fluctuations, comparatively small moments of order will exponentially-vastly outnumber even slightly larger fluctuations toward order, to say nothing of fluctuations the size of our entire observed universe! If this is where order comes from, we should find ourselves inside much smaller ordered systems.

    Feynman similarly later observed: Even if we fill a box of gas with white and black atoms bouncing randomly, and after an exponentially vast amount of time the white and black atoms on one side randomly sort themselves into two neat sides separated by color, the other half of the box will still be in expectation randomized. If [...]

    ---

    First published:

    May 16th, 2026


    Source:

    https://www.lesswrong.com/posts/v8MSczS3CuoqMmTFw/a-relatively-brief-explanation-of-boltzmann-brains

    ---



    Narrated by TYPE III AUDIO.
  • LessWrong (Curated & Popular)

    "Automated Alignment is Harder Than You Think" by Aleksandr Bowkis, Marie_DB, Jacob Pfau, Geoffrey Irving

    17.05.2026 | 7 Min.
    Summary

    This is a summary of a paper published by the alignment team at UK AISI. Read the full paper here.

    AI research agents may help solve ASI alignment, for example via the following plan:

    Build agents that can do empirical alignment work (e.g. writing code, running experiments, designing evaluations and red teaming) and confirm they are not scheming.[1]
    Use these agents to build increasingly sophisticated empirical safety cases for each successive generation of agents, gradually automating more of the research process.
    Hand over primary research responsibility once agents outperform humans at all relevant alignment tasks.
    We argue that automating alignment research in this manner could produce catastrophically misleading safety assessments, causing researchers to believe that an egregiously misaligned AI is safe, even if AI agents are not scheming to deliberately sabotage alignment research. Our core argument (Fig. 1) is as follows:

    The goal of an automated alignment program is to produce an overall safety assessment (OSA) - an estimate of the probability that the next-generation agent is non-scheming - that is both calibrated and shows low risk.[2]
    Producing an OSA involves several tasks that are difficult to check. We refer to these as hard-to-supervise fuzzy tasks: tasks [...]
    ---

    Outline:

    (00:13) Summary

    (07:10) Acknowledgments

    The original text contained 4 footnotes which were omitted from this narration.

    ---

    First published:

    May 14th, 2026


    Source:

    https://www.lesswrong.com/posts/gpuYFbMNH8PJXpmny/automated-alignment-is-harder-than-you-think-1

    ---



    Narrated by TYPE III AUDIO.

  • LessWrong (Curated & Popular)

    "MATS 9 Retrospective & Advice" by beyarkay

    17.05.2026 | 28 Min.
    I couldn’t find a recent write-up from a MATS alum about what attending MATS was like, so this is the thing that I wish I had. I attended MATS from January to March 2026, on Team Shard with Alex Turner and Alex Cloud. It was a great time! Applications for MATS are basically on a rolling basis nowadays, and I can strongly recommend applying (to multiple streams) even if you think you’re not a great match.

    With that being said, there's a lot I wish I knew going into MATS, so here's a brain-dump of thoughts. It's not extremely polished, but I expect it’ll be useful nonetheless (none of this is endorsed by MATS, just my thoughts):

    Work ethic

    I think most mentees were working 10-12, sometimes 14 hours a day Mon-Fri, and probably 2-8 hours on Saturday and Sunday, often going out on some adventure or party on the weekend. Exactly which hours people worked varied wildly. I usually worked 8:30am/9am to 11pm/midnight, with breaks during the day; others worked from midday into the early hours of the morning. This was surprisingly sustainable (IMO); MATS puts a lot of effort into removing all other blockers that you normally [...]

    ---

    Outline:

    (00:50) Work ethic

    (01:29) Use more compute

    (02:20) Research requires a lot of compute

    (03:12) Applying for jobs during MATS (don't do it)

    (04:55) The serious people are in War Mode

    (05:44) Do you feel the AGI?

    (06:00) Burn rate, efficiency, and decisions

    (07:12) Insider information

    (08:08) Names & Faces

    (08:20) Fellows

    (08:50) Useful tools

    (11:19) Use more Claudes

    (12:06) Build nice helper utilities for yourself

    (12:59) MATS-mentee-mentor dynamics

    (13:45) Working with your mentors

    (14:27) Research managers

    (14:48) Ops requests

    (15:38) Non-MATS events

    (16:17) Team Shard

    (17:12) Weekly updates

    (18:46) Keep a log of your mistakes

    (19:06) My running-experiments setup

    (27:51) Lighthaven

    (28:12) Getting set up with the Compute team

    ---

    First published:

    May 15th, 2026


    Source:

    https://www.lesswrong.com/posts/eFD3rozNCZKMe4rTs/mats-9-retrospective-and-advice

    ---



    Narrated by TYPE III AUDIO.
  • LessWrong (Curated & Popular)

    "The primary sources of near-term cybersecurity risk" by lc

    16.05.2026 | 4 Min.
    [Some ideas here were developed in conversation with Chris Hacking (real name)]

    I have tried and failed to write a longer post many times, so here goes a short one with little detail.

    Discourse has primarily focused on models' ability to develop new exploits against important software from scratch. That capability is impressive, but the tech industry has been dealing with people regularly finding 0-day exploits for important pieces of software for more than twenty years. Having to patch these vulnerabilities at a 10xed or even 100xed cadence for six months is annoying, but well within the resources of Mozilla, the Linux Foundation, and Microsoft. Additionally, the lag time between "patch shipped" and "patch reverse engineered and weaponized by a criminal organization" was longer than the cadence between high-severity CVEs for this software anyways. And importantly, such capabilities are dual-sided; the defenders will have access to them and [...]

    There are lots of capabilities that are not like this, however:

    Weaponizing recently patched exploits for common software. Right now, for widely used C projects, we get enough publicly disclosed vulnerabilities to develop exploits with. Every amateur computer hacker has the experience of seeing a CVE for a [...]
    ---

    First published:

    May 14th, 2026


    Source:

    https://www.lesswrong.com/posts/gutiw8MBrYDiD2u5z/the-primary-sources-of-near-term-cybersecurity-risk

    ---



    Narrated by TYPE III AUDIO.
About LessWrong (Curated & Popular)
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.