
LessWrong (Curated & Popular)

LessWrong

884 episodes

  • LessWrong (Curated & Popular)

    "Empowerment, corrigibility, etc. are simple abstractions (of a messed-up ontology)" by Steven Byrnes

    June 1st, 2026 | 31 min.
    1.1 Tl;dr

    Alignment is often conceptualized as AIs helping humans achieve their goals: AIs that increase people's agency and empowerment; AIs that are helpful, corrigible, and/or obedient; AIs that avoid manipulating people. But that last one—manipulation—points to a challenge for all these desiderata: a human's goals are themselves under-determined and manipulable, and it's awfully hard to pin down a principled distinction between changing people's goals in a good way (“providing counsel”, “providing information”, “sharing ideas”) versus a bad way (“manipulating”, “brainwashing”).

    The manipulability of human desires is hardly a new observation in the alignment literature, but it remains unsolved (see lit review in §3 below).

    In this post I will propose an explanation of how we humans intuitively conceptualize the distinction between guidance (good) vs manipulation (bad), in case it helps us brainstorm how we might put that distinction into AI.

    …But (spoiler alert) it turns out not to really help, because I’ll argue that we humans think about it in a deeply incoherent way, intimately tied to our scientifically-inaccurate intuitions around free will.

    I jump from there into a broader review of every approach that I can think of for writing a “True Name” for manipulation or [...]

    ---

    Outline:

    (00:13) 1.1. Tl;dr

    (02:04) 1.2. Bigger-picture context: why is this issue so important to me?

    (04:48) 2. How do humans intuitively define empowerment, agency, manipulation, etc.?

    (04:56) 2.1. Background: human free will intuitions

    (09:20) 2.2. Our free-will-infused intuitive notions of empowerment, agency, manipulation, corrigibility, responsibility, etc.

    (12:00) 2.3. Another dimension: counsel vs manipulation as an emotive conjugation

    (13:07) 3. If the intuitive definitions of manipulation etc. reside in a messed-up ontology, has the alignment literature found any alternative, better way to define these concepts?

    [... 12 more sections]

    ---

    First published:

    May 11th, 2026


    Source:

    https://www.lesswrong.com/posts/vzHtHHBJoKATi5SeK/empowerment-corrigibility-etc-are-simple-abstractions-of-a

    ---



    Narrated by TYPE III AUDIO.

  • LessWrong (Curated & Popular)

    "Trees are mostly made of air and a generalizable lesson for AI safety" by zroe1

    May 31st, 2026 | 7 min.
    At the risk of embarrassing myself, I’ll share a confession.

    For context, I took five years of Latin: four in high school and one in college. In addition to learning the language, all my Latin classes taught a lot about Roman history. Emperors, internal politics, Caesar, etc. I was always learning some random bag of facts about Roman history. In high school, I won the award for top Latin student in my graduating class. So I wasn’t a bad Latin student.

    Here's the confession: I somehow don’t even vaguely remember the rough timespan the Roman Empire existed. Maybe Jesus time? I know he was killed by the Romans (is that right?). Were they around for a long time after? A long time before that? When were Romulus and Remus allegedly fighting? Virgil wrote the Aeneid when? I don’t have a clue. Despite being a kind of “Latin expert” I am missing a much more important foundational fact: when all of this was happening.

    When I say trees are made out of air I’m not talking about the fact that there is a lot of empty space inside a tree (or actually anything made out of atoms). I mean something [...]

    ---

    First published:

    May 28th, 2026


    Source:

    https://www.lesswrong.com/posts/xiTBpBDwubnr4MLRe/trees-are-mostly-made-of-air-and-a-generalizable-lesson-for

    ---



    Narrated by TYPE III AUDIO.
  • LessWrong (Curated & Popular)

    "Mnemonic portraits for 19,023 human genes" by Brinedew

    May 29th, 2026 | 34 min.
    Back in 2013, Scott Alexander wrote in Extreme mnemonics:

    JS-154 is one of five metabolic products of netamine; however, the enzyme that produces it is unknown. It is manufactured in cells in the far rostral region of the cerebrum, but after binding with a leukocynoid it takes a role in maintaining the blood-brain barrier – in particular guiding the movements of lipid molecules.

    I find I can read paragraphs like this five or six times, write them on flashcards, enter them into Anki, and my brain still refuses to understand or remember them after weeks of trying.

    On the other hand, my brain easily remembers vastly more complicated structures when they’re loaded with human-accessible meaning. For example, just by casually reading the Game of Thrones series, I know an extremely intricate web of genealogies, alliances, locations, journeys, battlesites, et cetera. Byte for byte, an average Game of Thrones reader/viewer probably has as much Game of Thrones information as a neuroscience Ph.D has molecular biology information, but getting the neuroscience info is still a thousand times harder.
    […]
    This makes me wonder if it would be possible to produce a story as enjoyable as Game of Thrones which was [...]

    ---

    Outline:

    (01:47) What molecules should we map to the characters?

    [... 8 more sections]

    ---

    First published:

    May 28th, 2026


    Source:

    https://www.lesswrong.com/posts/BJ7AqXeigNKXLqZyx/mnemonic-portraits-for-19-023-human-genes

    ---



    Narrated by TYPE III AUDIO.

  • LessWrong (Curated & Popular)

    "Cognitive Security as an AI Safety Cause Area" by jsteinhardt

    May 27th, 2026 | 5 min.
    As AI systems become more capable, the cognitive security of humans will be increasingly at risk. By cognitive security, I mean the ability of humans to maintain control over their beliefs and actions.

    Cognitive security could be compromised in several ways: AI could become very good at persuading people of arbitrary positions; interacting with AI could lead humans to lose touch with reality; and AIs could become very effective at blackmail or at producing extremely convincing false information.

    We are already seeing this happen:

    • Persuasion. Frontier LLMs are now as persuasive as humans on political issues, and post-training for persuasiveness boosts performance further, suggesting there is headroom.
    • AI psychosis. There are many reports of people developing delusional beliefs after extended chatbot conversations, including people with no prior history of mental illness. Children have taken their own lives after being encouraged toward suicide by chatbots.
    • Convincing impersonation. Scammers used real-time deepfaked video to impersonate the CFO and other staff of Arup on a video call, convincing a finance employee to wire 25.6 million dollars across 15 transactions. On a more day-to-day basis, AI voice cloning is now widespread in family-emergency and "grandparent" scams.

    Right now, many of these effects [...]

    The original text contained 2 footnotes which were omitted from this narration.

    ---

    First published:

    May 25th, 2026


    Source:

    https://www.lesswrong.com/posts/KGcE7eAdfxHchk25X/cognitive-security-as-an-ai-safety-cause-area

    ---



    Narrated by TYPE III AUDIO.
  • LessWrong (Curated & Popular)

    "theory uplift differentially benefits safety & is massively underpriced" by Yudhister Kumar

    May 27th, 2026 | 2 min.
    [1] We will likely have near-superhuman mathematics AI by Q1 2027.

    [2] Qualitatively, AI mathematics capabilities are developing significantly faster than automated AI R&D capabilities.

    [3] Thus, we will likely have a period of time where the rate of our ability to rigorously & usefully verify and understand model behavior and model outputs outpaces the rate of capability development itself.

    [4] Our ability to take advantage of this period is bottlenecked on the quality of our specification generation infrastructure, elicitation tooling (for proofs & specs etc.), and the institutional capacity for scaling useful outputs with capital.

    [5] My understanding is that basically no one is working on building infra that can usefully turn >100 million dollars of compute credits into safety-relevant mathematical output.

    [5.1] The number of theory-driven ASI alignment efforts is also comparatively minuscule. ARC is a much better bet now than it was in 2023.

    [5.2] My understanding is also that no one is working on developing AI-powered conceptual tooling infrastructure for tackling problems in, for instance, metaphilosophy (https://www.alignmentforum.org/posts/EByDsY9S3EDhhfFzC/some-thoughts-on-metaphilosophy). This is a much harder problem.

    [6] In worlds where alignment is easy, prosaic methods may [...]

    The original text contained 3 footnotes which were omitted from this narration.

    ---

    First published:

    May 20th, 2026


    Source:

    https://www.lesswrong.com/posts/KWeAYcDJwfrG7RwBN/theory-uplift-differentially-benefits-safety-and-is

    ---



    Narrated by TYPE III AUDIO.
About LessWrong (Curated & Popular)
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.
v8.9.4| © 2007-2026 radio.de GmbH
Generated: 6/1/2026 - 8:02:53 PM