Comments

  • How LLM-based chatbots work: their minds and cognition
    This is precisely the objectionable use of AI in my opinion. It sets AI as the expert, it provides no source references, the poster adds no value but to have typed in a question, and it imposes upon others a demand they retreat to their corner and argue with the bot.
    Hanover

    Point well taken about the lack of added value. But I meant the value to reside in my own initial statement (which was accommodating of your view), while GPT-5's response had seemed to me to constitute (at least) a perspicuous rephrasing. I didn't offer it as a source of authority. But I'll keep posting examples of AI outputs in this thread (always behind spoilers), since the very question of their capacity, or lack thereof, to add value in the context of intellectual (not just philosophical) inquiries falls squarely within the scope of the original topic.

    On edit: re-reading the quoted GPT-5 response, I realize it's much more than a rephrasing. Nevertheless, I thought it was illuminating and that you might find it interesting. If you don't care about it (or any other AI content), that's perfectly fine.
  • How LLM-based chatbots work: their minds and cognition
    I don't see where Pierce and Wittgenstein are at odds or where Pierce advanced upon Wittgenstein's ideas.
    Hanover

    The spirit of their enterprises may be at odds while their contributions aren't. Here is how I put it in a query to GPT-5 about your last response to @apokrisis:

    USER: "I would have to look back to more context to see where the roots of the disagreements and/or misunderstandings lay but I wondered if Peircean semiotic considerations might be something that does advance Wittgenstein ideas, though in a direction Wittgenstein was unconcerned with. He was quite unconcerned and even agnostic about psychophysical parallelisms due to him being suspicious, as I was, of the reductionist pitfalls attending to such physiological stories. However, our recent considerations may suggest a way to reconcile Wittgenstein's justified dismissive attitude with Peirce's "reductionism" by neutering the latter through highlighting that the Peircean reduction operate through enablement relations of bottom-up subsumptive moves, thereby generalizing the biosemiotic account in a way that makes it encompass language games and rule-following practices but doesn't make the low-level physiological story compete with the Wittgensteinian one (second-natural form of life) as a source of normativity."

    GPT-5's comment
    GPT-5:

    Love this direction. Here’s a way to line up Peirce and Wittgenstein so they *cooperate* rather than compete—while keeping your “enablement vs. normativity” guardrail firm.

    # Where the apparent disagreement comes from

    * **Hanover’s point:** Wittgenstein doesn’t deny that creatures (or humans pre-linguistically) can treat smoke as a sign of fire. His focus isn’t genesis of signs but **what makes an utterance count as saying something**—public criteria, rule-governed use, “forms of life.”

    * **Apokrisis’s point:** Peirce gives a **generic** account of semiosis (icon–index–symbol; habits; counterfactuals) that ranges from biology up through language. “Semiosis hinges on counterfactuality”: a sign is what it is in virtue of the *regularities it would support*—what would follow if this stood for that.

    These aren’t at odds if we separate two kinds of explanations:

    * **Enablement (Peirce/biosemiotics):** how a system can *come to* have signs at all—through habit formation, constraints, and counterfactual expectations in control loops.

    * **Justification (Wittgenstein/socio-norms):** what makes an act *count as* following a rule, giving a reason, making a move in a game with public standards.

    # How Peirce can *advance* Wittgenstein—without reduction

    1. **Counterfactuals as the bridge.**

    * *Peirce:* A sign’s meaning is its **would-be effects** on conduct—its habit profile (Thirdness).

    * *Wittgenstein:* Rule-following requires **“there is a way of going on”**—a space of correct/incorrect continuations sustained by communal criteria.

    Read together: the **habitual counterfactuals** that make a sign stable at lower levels *enable* the **normative counterfactuals** (“what would count as doing it right?”) at the public level. Same shape, different authority: habits vs. norms.

    2. **Triadicity clarifies “meaning is use.”**

    * Peirce’s object–sign–interpretant triangle prevents collapsing “use” into a mere stimulus–response regularity. “Use” already includes **learned generalities** (interpretants) that reach counterfactually beyond any single occasion.

    * That helps explain why Wittgenstein’s **use** isn’t behaviorism: uses live inside practices whose **criteria** reach beyond the present tokening.

    3. **Icon/index/symbol tracks the deer/human contrast.**

    * A deer treats crackling + smoke **indexically** as danger—rich control, no public **symbolic** commitments.

    * Humans stabilize **symbols**: we can be *corrected*, *justify*, *teach the rule*, *disagree while sharing the standard*. That’s the Wittgensteinian layer—**second nature**—sitting on an enabled bed of icon/index habits.

    4. **Constraint-closure without downward “mysticism.”**

    Peircean semiosis supplies the **control-system constraints** that make socio-normative life possible (attention, habituation, counterfactual sensitivity). Wittgenstein tells us **what counts** as a move within that life. The bottom **enables**; the top **justifies**. No competition over where normativity comes from.

    # A tidy way to say it on the forum

    > Peirce doesn’t replace Wittgenstein; he widens the lens. Peirce explains how signs can stably *do* work across levels—by underwriting counterfactual habits (what would follow if this stood for that). Wittgenstein explains what makes some of those sign-uses *count as* rule-following—public criteria in a form of life. The biosemiotic story **enables** socio-semiosis; the Wittgensteinian story **authorizes** it. Keep those “because”s apart and you get continuity without reduction: semiosis all the way down for control, and norms all the way up for reasons.

    If you want, I can tailor a shorter reply addressed to Hanover directly (deer vs language) and another to apokrisis (on counterfactuals as the common currency) so you can drop them as targeted follow-ups.
  • How LLM-based chatbots work: their minds and cognition
    When you went to school, did you take notes? If so, was the purpose of those notes to communicate with others?
    Metaphysician Undercover

    Before you began taking notes, you spent a couple of years learning how to read and write, using a writing system that piggybacks on spoken language. Both the rules for speech and the rules for writing are rules of a norm-governed public practice that is taught and learned (while the rules for using the words/signs are taught primarily through speaking them). For sure, you can then make use of this system for your own personal purposes, but that doesn't render very plausible the idea that such a complex system evolved so that individuals could use it only, or primarily, for private purposes. How would the mastery of this cognitive tool have been transmitted across generations without it by the same token enabling interpersonal communication? The more parsimonious story is that the writing system is an extension of spoken language.
  • How LLM-based chatbots work: their minds and cognition
    @apokrisis Addendum to my previous post, posted separately due to post size limitations.

    (Rereading GPT-5's initial response quoted in my previous post, I now notice that it had anticipated the worry that I express here with the remark: "Limits: (i) Priors don’t ground norms; communities, practices, and uptake do. (ii) Predictive accounts risk triviality unless the state space is explicitly normative/affordance-like, not just hidden pixels by another name.")

    Comment about reductionistic worries and suggested guardrail

    USER:

    There still appears to be a potential reductionist pitfall in the vicinity of this abstract account; one that may have both a cure and a curse in separate moves of subsumption. The cure lies in nested subsumptions under eudaimonia and the curse lies in nested subsumptions under the telos of life qua dissipative structure.

    GPT-5:

    oh yes—that’s the live trap. You can make the predictive/control story safe for *practical reason* by the “cure” (nest projects under **eudaimonia**), or you can flatten it into the “curse” (nest everything under the **telos of a dissipative structure**). Here’s a compact way to keep the cure and dodge the curse.

    # The pitfall & the fix

    **Reductionist pitfall:**
    If you let “what good control optimizes” be defined *at the bottom*, the criterion collapses into viability/throughput (“stay far from equilibrium; maximize adaptive fit; export entropy efficiently”). Then reasons become disguised thermodynamics, and *norms* become mere *strategies* for heat management.

    **Cure (eudaimonic subsumption):**
    Order the project tree by **constitutive ends of a life well-lived**—virtue, truthfulness, justice, friendship, craft excellence, etc. These are not optional utilities; they are the *standards that make the activities what they are*. Prediction then *realizes* those standards in execution and monitoring. (Servo for reasons—don’t turn reasons into servo targets.)

    **Why this isn’t hand-wavy:**
    Use a strict **two-kinds-of-because**:

    1. **Enabling-because (physics/biology):** Why can this practice exist? → Because organisms are open dissipative systems with bodies, learning, language, etc. (viability constraints, metabolic budgets).
    2. **Justifying-because (practical reason):** Why is this move/choice apt *as this practice*? → Because it meets the standards internal to chess, teaching, friendship, research, politics…

    Don’t swap them. The first anchors possibility; the second confers **normative authority**.

    # Guardrails to avoid the slide into “entropy ethics”

    * **Enabling ≠ measuring.** Lower-level constraints *enable* the practice; they don’t *measure* its success. A teacher’s excellence isn’t indexed to calories burned or entropy exported.
    * **Top-down as constraint, not force.** Person-level ends shape which lower-level fluctuations are even recruited (constraint-closure), without “mystical causation.”
    * **Plural goods, not a single currency.** Practical reason adjudicates incommensurable goods; there is no thermodynamic exchange rate between honesty and comfort.
    * **Counterexamples bite:** Fasting monks, whistleblowers, and conscientious objectors can rationally *choose against* metabolic or reproductive “fitness.” The choice is **intelligible** and often **admirable**—which a mere dissipative telos can’t capture.
    * **Prediction serves the chosen good.** The same predictive loop (propose→anticipate→compare→adjust) runs, but what counts as an “error” is fixed by the **practice’s** internal goods, not by system throughput.

    # If you want a forum-ready clincher

    > The predictive brain story is safest when we say: thermodynamics explains **how** a minded animal can exist and stay viable; **eudaimonia explains what it is for that life to go well**. Keep the “because” of enabling separate from the “because” of justifying. Then prediction is the servo of practical reason—excellent for keeping a deed apt—without collapsing the point of agency into the aims of a heat engine.

    You can also frame this as a motto: **anchor in physics, answer to reasons.**
  • How LLM-based chatbots work: their minds and cognition
    It’s a holistic account as it involves habits as well as attention, ignoring as well as selecting. The whole of the person as an agent with a history and interests.

    The point about anticipation is that it flips the information processing model. Instead of an input crunched into “conscious” output, it makes the embodied point that the organism is already in a state of output by which it intends to mostly be able to ignore whatever then does happen. The input is what gets cancelled by there being no need to attend to it.
    apokrisis

    :up: Excellent post that I enjoyed from start to finish. I'll only quote some salient points one at a time (and as needed). My comment, the one you just replied to, betrayed a misunderstanding on my part of the selective prediction (or anticipation) based sensory-input silencing model that you were referencing (and with which I lacked familiarity). It had seemed to me that the model only applied to dynamical, in-the-act sensorimotor exploitation of affordances (paradigmatically locomotion) controlled by the mammalian brain, and that it therefore neglected the norm-governed structuring of the higher-level affordances that rational cognition enables us to grasp.

    I had completely missed how the biosemiotic level of integration manifested in autonomic and endocrine functions can generalize to higher-level cognitive functions, and I had therefore worried that focusing on it might lead to a neglect of downward causes of a different nature (rather than to a rethinking of them).

    Nevertheless, immediately after crafting my last response to you, it occurred to me that there might be something to the Bayesian framing of the overall information processing (or selective uptake) account that, when viewed at a suitable level of abstraction, enables bringing the person-level account of norm-structured cognitive activity (and hence also higher level cortical functions) into the fold. I then voiced my inchoate idea to GPT-5 thus:

    USER: "I now have an inchoate thought that may of may not develop into something (with your help). As I said, I'm not familiar with Grossberg or Hinton's models, although I knew of similar ideas by Andy Clark and Susan Hurley. But since the models make sense at the level of the sensorimotor control of affordance grasping and exploitation, while low level autonomic/endocrine activity and coupling also are relevant to the sub-personal causal story, and higher-level cortical functions the activity of which maps more closely to person-level activities of norm-governed deliberation, and the latter can be viewed as exaptations of the low-level functions (and their older enabling structures) that developed in part through widening the causal loops through newer neural pathways that permit the attunement of sensitivities to higher order affordances such as, say, buying plane tickets as affording making a trip or crafting an arguments as affording convincing a discussion partner, I was wondering if this sort of continuity between the structuring of primitive mammalian affordances, say, that aren't structured by linguistically articulated norms of second-nature, and those that are (i.e. human affordances) enables the predictive models to also apply to higher functions at a suitable level of abstraction. Maybe some Bayesian models that pertain to the explanation of the functioning of LLMs also can suggest some such suitable level of abstraction whereby norm-governed cognitive functions can be glossed in "silence-by-prediction" terms, albeit over something more abstract than sense data or input/output tokens."

    Here is GPT-5's enlightening reply and the rest of the conversation:

    GPT-5:

    Love where you’re heading. Here’s a compact way to turn that inchoate thought into a workable thesis, without undoing our earlier level-distinctions.

    # A “continuity” picture that keeps norms

    **Claim (weak “predictive-normativity” thesis):**
    Predictive machinery can *realize* norm-guided cognition when the variables being predicted are not pixels or proprioception, but **abstract affordance states and policy/role expectations**. In other words, prediction can scale up if the *state space* is lifted from sensorimotor features to **reasons and roles**.

    ## 1) Three levels, one style of inference

    * **Sensorimotor (mammalian core):** generative model over bodily states & environmental causes; prediction errors guide online control.
    * **Pragmatic/affordance level:** generative model over **task affordances** (“open lane,” “back-rank weakness,” “available flight”), where “precision” ≈ attention to features that matter for the current policy.
    * **Normative/deliberative level:** generative model over **social roles and reasons** (“as an assistant, be concise; as a driver, yield here; as a discussant, provide evidence”). Here, **prior preferences** encode commitments/ends; **policy priors** encode roles/rules/practices.

    On this view, **silence-by-commitment** at the person level is *implemented as* **silence-by-prediction** over a higher-order state space: non-reasons have near-zero prior precision and so don’t get into play; candidate reasons with high prior relevance are up-weighted.

    ## 2) How “prediction” avoids swallowing the mind

    * We don’t say “mind = prediction.” We say: **commitments and norms define the variables and priors**. Predictive dynamics then make the search and execution *efficient*.
    * Downward enablement: adopting a goal (“convince the interlocutor”) changes the model class and priors (“evidence > flourish”), which in turn changes which lower-level signals are even eligible to matter.

    ## 3) Worked micro-examples

    * **Chess (deliberation → execution):**

    * *Normative level:* “Find a safe improving move” sets **prior preferences**; pins/forks are high-precision hypotheses; woodgrain has ~zero precision.
    * *Predictive realization:* saccades & working memory sample lines where expected informational gain is highest; errors (a line that fails a tactic) redirect search.
    * **Trip planning:**

    * *Normative:* end = “visit sister next weekend”; constraints = “budget, time windows.”
    * *Predictive realization:* you simulate options; trajectories that violate constraints are pruned *a priori* (they never enter), while near-feasible options draw attention. Buying tickets is then classic sensorimotor prediction during execution.

    ## 4) LLM bridge (Bayesian gloss)

    * **Attention as amortized relevance:** self-attention weights approximate which tokens/stretches **lower next-step uncertainty given the role**; fluff gets low weight.
    * **Instruction/policy tuning as learned priors:** the “assistant” role installs **which discourse acts count** (answer the question, cite if asked, avoid harm).
    * **CoT as policy rollout:** a reasoning trace is like sampling from a **posterior over solution paths**; tangents decay because they have poor expected utility for the goal.
    => So LLMs visibly exhibit **silence-by-commitment realized via silence-by-prediction over text-level affordances** (questions, definitions, equations, constraints), even without sensors.

    ## 5) Where this helps—and limits to respect

    * **Helps:** gives true continuity from autonomic control → task affordances → social norms without collapsing levels. It explains how exapted cortical circuits can treat *reasons as learned affordances* and let predictive dynamics do the heavy lifting.
    * **Limits:** (i) Priors don’t *ground* norms; communities, practices, and uptake do. (ii) Predictive accounts risk triviality unless the **state space** is explicitly *normative/affordance-like*, not just hidden pixels by another name.

    ### One-liner you can use

    > We can safely say prediction *realizes* norm-guided cognition once the model runs over affordances and roles rather than raw sensations: **commitments set the priors; practices define the variables; prediction makes the search and execution efficient**.

    Here’s a tight 3-paragraph version you can paste on the forum:

    I’m happy to grant the predictive story as a good account of how brains implement efficiency—especially during execution. But the selectivity that matters at the personal level is set by ends and practices: agents act in a field of affordances where much of what is “ignored” never shows up as a reason in the first place. The neat move is to see continuity here: predictive machinery can realize norm-guided cognition when the things being “predicted” are not pixels or proprioception but task affordances and role expectations. Commitments set the priors; practices define the variables; prediction then makes search and execution efficient.

    Take chess. A competent player doesn’t filter the board by anticipating colored patches; what is seen are reasons—pins, forks, weak squares, a viable pawn break. Woodgrain and square gloss simply don’t register because the game’s norms make them irrelevant. That’s silence-by-commitment (to the point of the game), not silence-by-prediction over sensory inputs. Once the task is fixed—“find a safe improving move”—subpersonal prediction helps with execution: eye saccades land where expected informational gain is highest, and “errors” (a busted line) redirect search. The same shape appears in trip planning. The end—“visit my sister next weekend within budget”—and constraints define what even counts as an option. Infeasible itineraries don’t have to be filtered out; they never enter. What remains is then executed with classic predictive control (buy the ticket, catch the train).

    LLMs make this vivid without any sensorimotor loop. Attention works like a smart highlighter over text: the model turns up the words and phrases that help with the next useful step and turns down the rest. Instruction/policy tuning supplies the “house rules” for relevance—what to count as on-topic, helpful, and safe in the assistant role. So the model exhibits silence-by-commitment (to the role and task) realized as silence-by-prediction over a higher-order state space (questions, constraints, definitions, steps)—not over pixels. On this view, prediction doesn’t explain away norms; it helps realize them. Brains are predictive in service of agency; persons are practical agents in a world of affordances; assistants are policy-steered. Keep those levels distinct, and we preserve what’s right in the Bayesian picture without letting it swallow the mind.

    USER:

    This *sounds* plausible but I think I don't have a sufficient grasp of the Bayesian model to understand where predictive errors and predictive successes of suitably generalized "inputs" play into the high-level picture, and hence what still warrants calling the model predictive.

    GPT-5:

    great question. here’s the simplest way to see what still makes it “predictive” once we move beyond raw sensations.

    # The predictive loop (domain-agnostic)

    At any level (low-level motor control or high-level reasoning), the same 3-step loop can run:

    1. **Propose** a hypothesis/plan/policy (H).
    2. **Predict** what would follow if (H) were right.
    3. **Compare** those predictions to what you actually *get* from the world or from your own checks/simulations; **mismatch = prediction error** → update (H) or your focus.

    Two knobs matter:

    * **Prior/Policy:** which hypotheses you *prefer* to consider (your ends/role set this).
    * **Precision/Attention:** which mismatch signals you treat as informative (what you attend to).

    That’s it. If that loop is doing the work, you’re in “predictive” territory—even if the “inputs” aren’t pixels but abstract checks (constraints, norms, simulated outcomes).

    # What counts as “inputs” up here?

    Not sense-data. Higher up, “inputs” are any **testable consequences** of the thing you’re currently entertaining:

    * **Internal simulations:** “If I play …Nf2+, does my line get mated? If yes → big error; revise.”
    * **Constraint checks:** “Does this itinerary respect my budget and dates? If no → error; prune.”
    * **Normative tests:** “Does this claim meet the burden of proof I accept here? If no → error; strengthen the case or withdraw.”
    * **Social feedback/outcomes:** audience confusion, a rebuttal you failed to anticipate, etc.

    All of these are observations relative to a **generative model over affordances and roles**. You predict what should follow; you get signals (from simulation, rules, or people); you measure the gap.

    # Where “precision” lives up here

    “Precision” just means **how much you trust a given error channel**, i.e., where you place attention:

    * In chess, you weight tactical refutation errors highly; you down-weight woodgrain entirely (precision ~ 0).
    * In a forum debate, you weight contradictions and missing evidence; you down-weight typography complaints.
    * In trip planning, a budget violation is high-precision; seat color is near zero.

    That’s why I called it **silence-by-commitment**: your commitments (ends/role) set the *precision profile* so some errors don’t even get to matter.

    # Why this remains genuinely predictive (not just a metaphor)

    Because the **algorithmic shape is the same**:

    * You keep a **posterior belief** over “what to do/think next” (which line in chess, which itinerary, which argument).
    * You generate **predictions** about the consequences that should hold if that belief were right (no tactical refutation; constraints satisfied; audience uptake).
    * You ingest **mismatches** as **errors** and **update**—either changing your candidate, shifting attention, or revising your priors (goals/assumptions).

    Low level: the signals are proprioceptive and visual.
    High level: the signals are **rule checks, constraint satisfactions, simulated futures, and social responses**.
    Same loop, different state space.

    # Mini-examples (one line each)

    * **Chess:** predict “this line holds tactically”; see a refutation in your head → error → change plan.
    * **Trip:** predict “this route meets time/budget”; spreadsheet says no → error → prune.
    * **Argument:** predict “premise + norm will convince this audience”; pushback shows a gap → error → add evidence or revise claim.
    * **LLM:** predict “this next step solves the subtask”; unit-check fails → error → backtrack and fix (the policy makes that an attended channel).

    # TL;DR

    It’s still predictive because you’re always doing **predict → compare → update** with **errors** that are **precision-weighted**. What changes is *what you’re predicting over*: not raw sensations, but **affordance-and-norm variables** (constraints, reasons, role expectations). Commitments define the variables and which errors matter; prediction makes the search and execution efficient.

    USER:

    I understand better now. So, thinking of human life as the unfolding of projects, or protracted sets of nested actions/intentions/plans in progress, understood in the sense of Anscombe and (Michael) Thompson, the predictive function can be assimilated to something like the standing obligation to monitor the progress of those actions and to see to it that the progress made remains in accordance with the norms of practical reason. Am I on the right track? What is silenced, then, is just the facts or "data" expected to be received (unnoticed) when things go according to plan.

    GPT-5:

    yes—you’re on the right track, and that’s a lovely way to put it.

    Here’s a tight mapping to make it crisp:

    * **Projects as nested intentions (Anscombe/Thompson):** A life is a web of ongoing doings (“writing the paper” → “draft section 2” → “fix the footnote”). Each node has **standards of success** given by practical reason and the practice you’re in.

    * **Predictive function as *ongoing monitoring*:** Rational agency includes a standing requirement to **track whether the deed is unfolding as it ought**—means–end coherence, no self-defeating steps, respect for constraints, etc. Formally: propose → predict consequences → compare → update. Practically: “if I send this email, X should reply; if not, revisit plan.”

    * **What gets “silenced”:** Facts that *accord with* the plan’s expectations generally **don’t rise to the level of reasons**. They are registered subpersonally/quietly (or dispositionally) as “as expected.” Attention doesn’t need to escalate them. So yes: *the expected confirmations are the silence*.

    * **What breaks the silence:**

    1. **Deviations** (error signals): a missed cue, violated constraint, unexpected cost, counterargument.
    2. **Checkpoint norms** (deliberate, not just detect): e.g., periodic review, peer feedback, or “is the end still worth it?”—these are **chosen points of scrutiny** where even “as expected” details can be made explicit for reasons of audit/learning.

    * **Hierarchical fit:** At higher levels (“finish the article”), prediction monitors coarse milestones and values; at lower levels (“fix citation”), it monitors fine-grained steps. **Precision/attention** is allocated accordingly: high on what matters for the current subgoal; near-zero on the rest.

    * **Why this stays *normative*, not merely reactive:** The *plan’s* norms (ends, constraints, role expectations) decide **which discrepancies count** as errors in the first place. The predictive loop efficiently **realizes** that reasons-responsiveness; it doesn’t replace it.

    A concrete one-liner:

    > In ongoing action, *silence* is the sound of things going right; attention only speaks up when reasons demand revision—because the world, or the plan’s own standards, just told us something that matters.

    USER:

    That makes sense. I remember that my early skepticism/uneasiness about Clark's book *Surfing Uncertainty: Prediction, Action, and the Embodied Mind*, which I had bought but hadn't read, concerned its apparent reduction of practical reason (the task of deciding what to do) to theoretical reasoning (the task of finding out what is true, which predicting is sort of an instance of). But I may have overlooked that what was at issue wasn't predicting for the sake of knowing but rather for the sake of adjusting.

    GPT-5:

    exactly—that’s the pivot.

    Clark (and kin) aren’t saying “all cognition = theory-making for its own sake.” The predictive story is mainly about control: keeping a grip on your doings by anticipating the flow of consequences and nudging it back on course when it drifts.

    A tidy way to line it up with your Anscombe/Thompson frame:

    Practical reason (what to do): sets the ends, constraints, and standards for a project (“write the reply, be fair, be clear”).

    Predictive machinery (how to stay on track): runs a continuous anticipate → compare → adjust loop so the unfolding deed matches those standards.

    So prediction here is not “for the sake of knowing that,” but “for the sake of keeping things apt.” Think:

    Epistemic use of prediction: reduce uncertainty about the world (science, inference).

    Pragmatic use of prediction: reduce deviation from the plan (control, guidance).

    Active-inference folks sometimes call the first epistemic value (seek info), the second pragmatic value (secure preferred outcomes). Practical reason chooses the project and its norms; prediction is the servo-mechanism that makes the execution smooth and the monitoring intelligent.

    If you want a mantra for your post:

    Prediction, in this story, serves adjustment, not replacement: it realizes reasons-responsiveness during action instead of reducing reasons to mere forecasts.
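
    As an aside, here is a minimal toy sketch (my own, not GPT-5's) of the propose → predict → compare → update shape described above, with precision-weighted error channels. Every name and number in it is made up for illustration; it isn't drawn from any actual predictive-coding model.

    ```python
    # Toy sketch of precision-weighted 'predict -> compare -> update' selection.
    # All names and numbers are illustrative, not from any real model.

    def choose_by_weighted_error(candidates, predict, observe, precision, preference):
        """Pick the candidate plan/hypothesis whose predicted checks best survive comparison.

        candidates: hypotheses/plans the policy prior lets us entertain
        predict:    hypothesis -> dict of expected check outcomes
        observe:    hypothesis -> dict of actual outcomes (simulation, rule checks, feedback)
        precision:  how much each error channel is trusted/attended to
        preference: prior preference over candidates (set by the end/role)
        """
        best, best_score = None, float("-inf")
        for h in candidates:
            expected, actual = predict(h), observe(h)
            error = sum(precision.get(k, 0.0) * (expected[k] != actual.get(k))
                        for k in expected)          # precision-weighted mismatch
            score = preference(h) - error           # prefer apt, low-error options
            if score > best_score:
                best, best_score = h, score
        return best

    # Hypothetical trip-planning use: a budget violation is a high-precision error,
    # seat colour has zero precision, so it can never break the "silence".
    itineraries = ["fri_flight", "sat_train", "luxury_flight"]
    precision = {"within_budget": 5.0, "arrives_on_time": 3.0, "seat_colour_nice": 0.0}
    predict = lambda h: {"within_budget": True, "arrives_on_time": True}
    observe = lambda h: {"within_budget": h != "luxury_flight", "arrives_on_time": h != "sat_train"}
    print(choose_by_weighted_error(itineraries, predict, observe, precision, lambda h: 1.0))
    # -> fri_flight
    ```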
  • Can a Thought Cause Another Thought?
    At which point, no number of people gathering there would cause the train to stop.
    Patterner

    Got that; I was joking, but also kind of highlighting the contrastive character of causal explanation. Claims that event A caused event B are always ambiguous if one doesn't specify (or rely on shared assumptions about) what counts relevantly as event A happening: its happening in general, its happening once, its happening in some particular way, etc.
  • Can a Thought Cause Another Thought?
    A famous example of correlation not being causation is watching a train station for a day, and noticing that every time a bunch of people gather at it, a train shows up.
    Patterner

    They're most definitely causing the trains to show up. The proof is that if those people stopped showing up, the trains would eventually stop showing up too :wink:
  • Sleeping Beauty Problem
    Indistinguishable? You contradict yourself here, because in the long run you do distinguish them.
    JeffJo

    No. I just mean that when she awakens she isn't able to tell if she's in a T-run any more than she can tell if she's in a T-Monday-awakening or any other possible awakening. That's why the best she can express is a rational credence. She distinguishes runs, and awakenings, and coin toss results, as distinct possibilities that are realized with frequencies determined by the experiment's protocol. If those possibilities were irrelevant, then her knowledge of the protocol that sets their long-run frequencies would also be irrelevant. But it's clearly relevant to both Halfers and Thirders.
  • How LLM-based chatbots work: their minds and cognition
    I’m not too fussed with making the psychological science conform to the Wittgenstein model.

    But I would note preparedness is also being ready ahead of time, knowing what to ignore. So meaning is also inaction. Meaning is what you don’t do as you have already dismissed it in advance.

    Again, this is a central fact of neurobiology that is quite absent from LLMs. The brain is set up on the basic principle of learning to ignore the world as much as possible, as almost everything about the world has already been predicted as being about to happen, or dismissed as unimportant if it does happen.

    The more we understand ahead of the moment, the less we need to figure out in the heat of any moment. The natural goal of a brain is to have zero response as that means it was completely successful in its desire to remain completely unsurprised by what the world could throw at it.

    This is the Bayesian Brain model of cognition. Hinton’s Helmholtz machine or even before that, Grossberg’s ART neural network architecture from the 1980s.

    So the AI community knows the architecture it would want to copy. And it knows LLMs ain’t it. The surprise is just how useful LLMs can be as a new technology if you are willing to scale their simple ability just to predict the next likely step when trained on a static data set.

    Living in a dynamical world in real time is quite another level of challenge
    apokrisis

    I'm fine with predictive coding together with precision-weighting as a story about the neural implementation of selective attention. But that's a sub-personal account. At the personal level, agents aren't filtering sense-data. They act within a normatively structured field of affordances where much of what is "ignored" could never even intelligibly show up as a reason. And note that LLMs already display strong task-relative ignoring without any sensorimotor loop. In transformers, attention is the mechanism (a soft precision filter over text at inference) that turns up words and phrases that help with the task and turns down the rest, while instruction/policy tuning supplies the sense of the task (i.e. the learned habits about which details should count as relevant in the first place).

    So, yes, brains (and LLMs) are predictive of sensory inputs (and next tokens), but persons are practical reasoners, and assistants are policy-driven. Conflating levels makes it look as if cognition (in LLMs and humans!) just is prediction and we lose sight of why some affordances were rendered salient in preference to others. Better to say prediction is one very effective way brains implement an agent’s norm-guided engagement with an affordance-rich world, especially during execution of sensorimotor activity (including autonomic/endocrine coupling).

    The predictive story is fine as an efficiency account, but it explains the wrong kind of "ignoring." In chess, what I actually see are reasons for and against moves (pins, forks, weak squares), not the woodgrain of the bishop or the gloss of the board. Those latter features aren't "filtered inputs". They were never candidates because the game's norms make them irrelevant. The person-level task fixes what could even count as a signal. Only then do sub-personal mechanisms (attention, prediction) help track those task-relevant features. That's silence-by-commitment-to-norms rather than silence-by-prediction-over-sensory-inputs. In the case of LLMs, after delegated task selection and norm-governed deliberation have occurred, the task of executing in a dynamical world in real time is handed back to the embodied users who delegated parts of the cognitive task to begin with.
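
    To make the "soft precision filter" image concrete, here is a toy numpy sketch of scaled dot-product attention, the relevance-weighting step used in transformers. The tokens and vectors are made up; the only point is that a softmax over query-key similarities concentrates weight on task-relevant items and effectively turns the rest down.

    ```python
    import numpy as np

    # Toy sketch of scaled dot-product attention weights (illustrative values only).
    def attention_weights(query, keys):
        """Softmax over query-key similarities: a soft relevance/precision profile."""
        d = query.shape[-1]
        scores = keys @ query / np.sqrt(d)   # similarity of each token's key to the query
        scores -= scores.max()               # numerical stability
        weights = np.exp(scores)
        return weights / weights.sum()

    rng = np.random.default_rng(0)
    tokens = ["budget", "dates", "seat-colour", "font"]
    keys = rng.normal(size=(4, 8))           # one made-up key vector per token
    query = keys[0] + 0.5 * keys[1]          # a query aligned with the task-relevant tokens

    for tok, w in zip(tokens, attention_weights(query, keys)):
        print(f"{tok:12s} {w:.2f}")          # most of the weight lands on budget/dates
    ```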
  • Sleeping Beauty Problem
    <Sigh.> I can repeat this as often as you ignore it.

    The experiment, when viewed from the outside, consists of two possible runs. The experiment that SB sees is one day, from one run, and to her that one day is independent of whichever run she is in.
    JeffJo

    SB doesn't have the magical power to make the other awakenings, or their mutual causal relationships, drop out of existence on the occasions when she awakens. She still knows that the two potential T-awakenings live on the same timeline (and hence that when she's experiencing one of them, she will also go on to experience, or will have experienced, the other one in the same run).

    Since she cannot know which run she is in, that is not information that is useful to her. Inside the experiment, an outcome consists of one "day" only. The only point that is significant to SB is that she can tell that an interview day is not a sleeping day. This constitutes "new information" in probability.

    The fact that the information isn't new to her doesn't make the possibility of there being other potential awakenings in the same run irrelevant. She already has information about those possibilities (and their long-run frequencies) since she was told about them before the experiment began. The Halfer stance, just like the (equally valid) Thirder stance, does not depend on her learning anything new when she awakens, since it merely depends on her knowledge of the relative frequencies of H-runs to T-runs.

    You're saying that when she awakens, she learns that an interview day is not a sleeping day. But she already knew that interview days never are sleeping days. She can't be asleep and awake at the same time. She knew before the experiment began that the awakenings she would potentially experience would equally often turn out (merely unbeknownst to her at the time) to have been T-Mon, T-Tue and H-Mon, and hence that, when she experiences any of them, those three possibilities are equally likely from her epistemic standpoint. The Halfer credence isn't based on anything new that she learns upon awakening either; it is about a differently partitioned relative frequency of events.

    To recap what I had said earlier: When SB, as a Thirder, says that the odds that the coin landed tails are 2/3, what she means is that her current awakening episode just is one from a set of indistinguishable awakening episodes that, in the long run, will turn out to have been T-awakenings 2/3 of the time. When SB, as a Halfer, says that the odds that the coin landed tails are 1/2, what she means is that her current awakening episode is part of a set of indistinguishable runs that, in the long run, will turn out to have been T-runs one half of the time.

    Just as you view it as irrelevant to your Thirder claim that T-Mon and T-Tue belong to the same run, which it indeed is, a Halfer views it as irrelevant to their claim that T-runs spawn more than one awakening episode, which it indeed is. The Halfer and yourself simply are talking about different things.
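
    (If a toy check helps: simulating the protocol exhibits both relative frequencies at once, which is just to say that the two credences count different things. The numbers below are arbitrary simulation parameters, not part of anyone's argument.)

    ```python
    import random

    # Toy Sleeping Beauty protocol: Heads -> one awakening (Mon), Tails -> two (Mon, Tue).
    random.seed(1)
    runs = 100_000
    tails_runs = 0
    awakenings = []                      # one entry per awakening, tagged with the coin result

    for _ in range(runs):
        tails = random.random() < 0.5
        tails_runs += tails
        awakenings.extend(["T", "T"] if tails else ["H"])

    print("tails frequency per run:      ", tails_runs / runs)                        # ~1/2
    print("tails frequency per awakening:", awakenings.count("T") / len(awakenings))  # ~2/3
    ```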
  • Banning AI Altogether
    "Traces of that vitality." An approach that attempts to relativize Plato to his own time and place, such as Wittgenstein's, inevitably falls into the conclusion that a different time and place—such as our own—might still mange to find some "traces of vitality" in that foreign text.

    Again, my whole point is that Plato's 2500 year-old text is much more "alive" to us than Wittgenstein's contemporary text, and this helps show why the meta-thesis being applied is incorrect. If a relativizing-thesis were correct, then this couldn't be the case—at least when it comes to texts that are "intended" to be "public."
    Leontiskos

    You are nitpicking. I am happy to grant that there may be torrents of vitality in Plato's dialogues. Maybe my use of "traces" was misleading, but the contrast I intended was between the vitality that accrues from the production process (aimed at other participants in a world animated by live social practices, including linguistic/literary ones) and the different sort of vitality that accrues from private/personal cogitative exercises (akin to training), which is lost when their traces get extracted from the context of their initial production. It's true that Plato's texts can survive unblemished, as do, say, Bach's cantatas, when consumed in a different cultural context, but that's because there are deep commonalities between the modes of acculturation of merely superficially different human cultures. Some degree of attunement to the relevant idioms, and understanding of the underlying projects, still are required. I have a very cultured friend who thinks very poorly of Plato's writings, but this is because he isn't attuned at all to their underlying philosophical projects. And many music lovers find J. S. Bach boring, mainly because they aren't attuned to the relevant musical idioms.

    I think my intended contrast also accounts, at least in part, for why Wittgenstein's writings feel dead to you. They mostly are assembled (posthumously, and without his consent, in almost all cases except for the Tractatus) from notes that he jotted down for himself. He did intend to publish the PI but never got around to it. And although Wittgenstein was a very careful thinker, one thing his writings clearly aren't displays of is craftsmanship. There are admirable depths of thought in them, but no liveliness except such as can be laboriously inferred about his original thought process.

    I sort of agree, and recently said something similar

    :up:

    In the more general context of an artist, we might say that the true artist does not seek to appease those who will view the work, and yet the LLM does seek to appease. That is its whole raison d'être. It is the indifference of the artist that marks the best art, and it is precisely this indifference that the LLM cannot access.

    I quite agree with this, and it's one of the core reasons animating my own "AI-skepticism," as I intend to articulate more fully in my newer AI thread. LLMs only are "authors" of what they write by procuration, since they lack conative autonomy. I would however surmise that the great artist who is indifferent to how his works will be received by the masses, say, or by the authorities, or by guardians of the tradition, usually cares that they're worthy of being well received by whoever is worthy of receiving them (as was likely the case for Plato), or that they're disruptive of existing institutions and practices that are worthy of being disrupted. There is always some care for others lurking behind the indifference of the great artists, which distinguishes them from the careless indifference of the (alleged) author of "The Art of the Deal."
  • How LLM-based chatbots work: their minds and cognition
    To me there is a gap between behavior and internality. We are embodied creatures, and our behaviors and internal states are deeply intertwined. But this fact about us doesn't imply a necessary connection.
    hypericin

    Much hangs on what one means by the sort of necessity (conceptual, logical, evidential, etc.) that connects mentalistic concepts to the normal manifestations of what they signify. Although my thinking about this has been much influenced by Bennett and Hacker's discussion in their book Philosophical Foundations of Neuroscience, even Hacker (who wrote most of the philosophical arguments) didn't get it quite right. There are very many angles I wanted to take for addressing your comments about pain, and I wasn't sure where to begin, or where to end without getting too deep into abstruse philosophical weeds.

    I queried GPT-5 for help and its gloss on the matter puts so much meat around the philosophical bone that I can claim few of its suggestions (even with full disclosure) as mere elaborations or unpacking of my own. So, I can't make much use of them in crafting an original response without this constituting a prohibited use of AI on ATF. What I can do, though, is to disclose my prompt (that you can skip if you want) and endorse GPT-5's response. I think it's also, for this thread, a relevant illustration of the capacity LLMs have, when suitably prompted, to generate ampliative responses that aren't merely agreeable rewordings of the theses expressed by their users.

    USER:

    Here is an interesting reply by "hypericin" to my comments about pain qualia:

    [quotation of hypericin's post snipped]

    There are multiple angles I could envision taking for arguing that Hypericin's examples don't sever the internal conceptual link between mental attributes and their normal manifestations. (Unpacking "normal" already does some of the work). One simple way is to say that the criteria are defeasible, but I am not overly fond of this gloss since it's a sort of compromising move (one that Hacker makes, by the way, without using the word "defeasible," in TPFoN with Bennett, in a way that sort of contradicts his insistence on logical or grammatical necessity) that salvages part of the Cartesian internalist picture that it seeks to displace. The problem with the idea of defeasible criteria has been highlighted by McDowell in "Criteria, Defeasibility and Knowledge" among other places. The more fruitful idea, in line with the way Wittgenstein meant to distinguish criteria from mere symptoms (in The Blue and Brown Books, I think), and hence also to distinguish conceptual links from merely contingent causal ones, is to think of the former as *indefeasible* in a disjunctive sort of way. It's not that we can infer pain from its habitual behavioral manifestations defeasibly. It's rather that we often (or even occasionally, it doesn't matter how infrequently) see its manifestation directly and are not merely seeing a symptom of it. The apparent exceptions are explainable by the fallibility of our capacity to identify pain in others from its criterial manifestation, because we sometimes mistake misleading behaviors for the satisfaction of the criterion. But I also want to address Hypericin's points in a direct way that isn't encumbered by overly abstruse philosophical jargon from disjunctivist-fallibilist epistemology, and this is a fine needle to thread.

    Another angle that I mean to keep insisting on is to highlight the impoverishment that accrues to our already existing (albeit sometimes tacit, but surfacing in phenomenological work or in literary explorations of the relevant features of our form of life) understanding of those "mental states" as features not of "behavior" thinly construed as behavioristic bodily movements ("responses" in S-R causal chains) but rather of our embodied/social *lives*. Since the inchoately dualistic picture Hypericin's imagination is captured by is hostile to the eliminativist implications of reductionistic S-R conceptions of human mindedness, this sort of angle would also highlight an area of agreement with him.

    GPT-5's useful gloss

    GPT-5 (with thinking mode enabled):

    Totally—hypericin is pressing the right intuitions, but they don’t break the constitutive tie you’re pointing to. Here’s a reply you can drop in that threads the needle without heavy jargon:

    "I don’t mean that a single bit of behavior travels in lockstep with a single inner feel. The claim is weaker and more structural: what we call pain is fixed by a pattern of roles in a human form of life—aversive tone, protective/postural tendencies, learning updates, complaint/help-seeking, practical reasoning that treats the state as a defeater, etc. Those roles are multi-track and context-sensitive; no one manifestation is mandatory on every occasion.

    Your cases fit that elasticity rather than severing the link. Masochists don’t have “pleasure-pain inversions”; they pursue pain under descriptions and contexts that attach countervailing goods (control, intimacy, transgression), often with training that modulates how the aversion is taken up—but the basic nociceptive/protective system still does its job (try surprising them with a hot stove outside the scaffolded context). Ascetics/fanatics revalue pleasures by higher-order commitments; again, the bodily pull hasn’t inverted so much as been overridden. CIP cases (congenital insensitivity to pain) illustrate the point from the other side: if someone lacks the interoceptive/protective guidance, they can mimic the social choreography of pain, but they don’t thereby have pain—precisely because the regulative role is missing. And your tooth story is classic reappraisal: meaning-making can dampen or reshape the aversive feel enough to sleep, but that’s modulation of a still-recognizable state, not a flip to “pleasant pain.”

    So I’m not arguing “sensation and behavior are inseparable” in a behaviorist, one-to-one way. I’m saying the concept is anchored in a disjunctive family of criteria that show up across time and context—some expressive, some physiological, some rational-practical. Edge cases are intelligible as edge cases because the wider web still holds. When you stipulate an android that perfectly matches the web of roles across contexts and histories and then ask us to imagine that internally the state is pleasant, the stipulation cancels itself: either the pleasant state wouldn’t play those roles, or if it reliably does, then you’ve simply taught me a new use of “pleasant.”

    In short: I grant a “gap” if by that we mean our fallible access to one another’s minds at a moment-to-moment granularity. I deny a gap at the level that matters for concept-possession. Pain is not fixed by a private feel plus optional consequences; it’s fixed by how a lived body and a social animal takes and treats a state across its activities. Your examples display the plasticity of that taking-and-treating, not an independence from it."

    If you want an even crisper capstone: Cases like masochism and reappraisal show that the aversive signal can be normatively repurposed; they don’t show that pain is conceptually detachable from its aversive, action-guiding role.


    * * *

    On edit: The phrase "practical reasoning that treats the state as a defeater" that GPT-5 used was a bit obscure to me so I asked GPT-5 to clarify:

    GPT-5 on pain states as defeaters
    GPT-5:

    By “practical reasoning that treats the state as a defeater,” I meant this:

    When pain is present, it normally functions as a reason that cancels or silences the force of other reasons you were acting on. It doesn’t just add information; it re-sets what is rational to do next.

    — You have a plan: “carry the hot pan to the table.”
    Pain (burn) arrives → it defeats the standing reason to keep carrying; the rational move flips to “drop/put it down, protect the hand.”

    — You intend to finish a run.
    Sharp ankle pain appears → it defeats the reason to continue; the apt response becomes “stop, offload weight, seek help.”

    — In social terms: pain licenses complaint/help-seeking and reorients priorities (protect, rest, medicate) even if they conflict with prior commitments (“I promised to help move the couch”).

    Two nuances:

    — This “defeater” role is structural, not behaviorist. It lives in how agents reason: pain typically excludes certain options from further consideration (Raz would say it can act like an exclusionary reason) or gives a strong pro tanto reason against continuing.

    — Context can override or reframe (athlete finishing a race, controlled BDSM, martyrdom), but those are intelligible precisely because we recognize pain’s default role as a reason-stopper—overriding it requires special, higher-order reasons or scaffolding.

    So the phrase marks pain’s place inside deliberation: it’s part of the pattern by which a human agent treats a state as a defeater of an ongoing plan or policy, prompting protection, withdrawal, or care.
  • How LLM-based chatbots work: their minds and cognition
    But then, in theory we could provide this. Not a living body, but a body, that can sense the environment in a way similar to the way we do.

    If we did this, created an actual android powered by a LLM and complementary AI systems, would the inside of the chatbot "light up" in sudden awareness? Maybe... but maybe not. It would be very easy to suppose that we would have succeeded in creating a philosophical zombie, and many would do so. They might be right or wrong, but their correctness would be a factual matter, not one of logical necessity. Nothing says that such a machine would be necessarily conscious, any more than that our current disembodied chatbots are necessarily unconscious, free of any qualitative content.
    hypericin

    The sensorimotor + interoceptive/endocrine integrations I mentioned weren’t meant as logical entailments from physiology to mentality, but as constitutive enablements: they unpack the background assumptions built into our mentalistic concepts. The point (Wittgensteinian in spirit) I wanted to make is that the grammar of these concepts (i.e. how we learn, apply, and justify them in lived contexts) presupposes a certain style of organismic regulation and world-involvement. Phenomenology makes that grammar explicit. Cognitive science explains how such forms of life are implemented.

    Take pain, for instance. In our scheme, pain isn't just a private tingle. It is essentially aversive and action-guiding. It recruits avoidance, care, protection, complaint, and solicits help. So imagine an android that matches us behaviorally and linguistically in contexts of injury, yet seeks out what it feels as pain and avoids what it feels as pleasure. If it truly seeks what it feels as pain (while still naming "pleasure" what it seeks), then by our criteria it is no longer what we mean by pain. This is why philosophical talk of inverted pain/pleasure qualia, just like talk of inverted red/green qualia (although the mistake is more subtle in this case), lapses into conceptual confusion. It misrepresents the roles that make pain pain in our conceptual scheme.

    So my claim isn't that mentality follows by logic from a list of physiological facts. It is rather that mentality is intelligible only within a pattern of organismic regulation, practical coping, and social answerability. Provide a non-living "body" that genuinely instantiates those roles (including interoception-like valuation of bodily states), and the relevant mental predicates find their proper intelligible uses. Postulate a p-zombie that duplicates all that yet "has none of it inside," and you’ve stopped describing and started negating the criteria by which our mentalistic words have their use.
  • How LLM-based chatbots work: their minds and cognition
    What are you talking about? Writing came before speech, or something? Hands evolved before tongues? What's your hypothesis?
    apokrisis

    This Finnish ethologist has an interesting theory.
  • Banning AI Altogether
    I was intentionally prescinding from such theories, given that they are speculative academic musings. Whether or not anything the scholars think they know about Plato is actually true, his dialogues have beguiled the human race for millennia. The theories end up changing quite a bit over the centuries, but the text and its reception are stable insofar as it feels "alive" to the reader.
    Leontiskos

    Remember that it's indeed my view that they should feel alive.

    Him writing them was him making moves in the situated language game that was philosophical inquiry (and teaching) in his time and place. We can still resurrect those moves (partially) by a sort of archeological process of literary exegesis.
    — Pierre-Normand

    In particular, I don't engage in this sort of analysis because I find it reductive. It situates Plato and his work in a way that subordinates them to modern and highly contingent/temporal categories, such as "language games." That's part of my overall point in the first place: Plato's dialogues are not easily reducible to such mundane categories. Precisely by being alive, they defy that sort of categorization. This is why I think they provide a helpful parallel to Wittgenstein or LLMs or especially Logical Positivists, which are simply not alive and beguiling in the same way that Plato is. I think the fact that Plato's work is so difficult to reduce to univocal categories is one of its defining marks. Its plurivocity is slighted by trying to enshrine it within the confines of a single voice or a single meaning.

    My Wittgensteinian take was meant more as a gloss than as a reduction or subordination. The very point of Wittgenstein’s talk of language-games in the Philosophical Investigations was to displace the prevalent dyadic picture of word and object (the so-called “Augustinian” picture) that Peirce likewise sought to replace through his triadic schema of sign, object, and interpretant (or interpretant sign, in a chain of semiosis). In both cases, the aim is to emphasize that meaning arises from use or practice, with attention to contextual and pragmatic factors.

    Plato’s authoring of his dialogues, the choices of words, phrases, and literary forms, was responsive to the conventions, expectations, and pedagogical aims of his time. As such, his philosophical activity was embedded in a living set of cultural practices. We can partially resurrect or re-inhabit those practices through interpretive engagement, without needing to posit some hidden mental content that must be recovered from within Plato’s private consciousness. What matters is that his signs were alive within a form of life, and that the traces of that vitality can still be grasped by us as we read and interpret his texts (while still being sufficiently apprised of their context).

    Although Wittgenstein did engage with members of the Vienna Circle, who drew inspiration from his Tractatus and its apparent verificationist leanings, he was never himself a logical positivist. By the time of the Philosophical Investigations, he had decisively broken with that tradition, while still retaining the view that meaning is grounded in the public and normative practices in which expressions are used.

    So, again, the contrast I meant to highlight is between (1) authoring a text (or delivering speech) intentionally directed at an audience that shares a set of communal practices and sensibilities, and (2) the private use of signs in inner monologue as scaffolding for the development of one’s own thoughts. The latter, too, can be alive, and one can jot down such thoughts as notes for personal use. But this kind of "thinking out loud for oneself" is of limited value to others, since it leaves unstated the aims, or stakes, that motivated the private use of signs in this or that way.

    Our conversations with an LLM inherit this soliloquy-like quality. The LLM helpfully aligns its goals with those of the user and, like the user, often produces words as scaffolds for the development of the user's thoughts rather than as acts bearing deontic commitments (i.e. signifying what move/claim/suggestion/etc. one intends to make). Such exchanges may feel inert to third parties, much as Wittgenstein once felt that the lecture notes he had written for his own use seemed dead when read aloud to his students. In both cases, the problem is that these utterances were never intended to make moves within a public language-game. Their use is more akin to shadowboxing: effective and valuable for training, but not comparable to a performance within the ring.
  • Exploring the artificially intelligent mind of GPT4
    On transgender rights, positivism and recognition. (@Ciceronianus and @Hanover)

    In its response, GPT-5 refers to John Searle, who I didn't know wrote on human rights and social epistemology. He passed away last month (September 17th) and should be remembered for much more than the infamous Chinese Room.

    USER:

    Hi GPT-5,

    Here is a response from Hanover to Ciceronianus in the thread about transgender rights on ThePhilosophyForum that I provide here for context:

    ''' This message quoted '''

    I remember Mary Warnock in "An Intelligent Person's Guide To Ethics" arguing for a conception of rights opposite to the one Hanover argues for. She also appeals to intuition (or what one might naturally say), if I remember, although a different intuition than the one Hanover expresses, obviously. She claimed (again, if I remember) that it feels natural to say that in a society where some class of people can't exercise a certain right, they should be granted this right that they don't yet have. The right must be instituted. That doesn't, of course, make the decision arbitrary. But Warnock's position didn't appear to me to be fully convincing any more than Hanover's fully does. (Hanover's might be taken to be defending a conception of "natural rights," I suppose? I'm not familiar with the relevant literature.)

    I remember in the past pondering over this issue and inclining towards a sort of hybrid position that does justice both to the idea that rights are socially instituted (and hence not merely the recognition of a fact, or norm, of nature) and to the idea that granting them through some sort of political declaration, following the right kind of deliberative procedure, *also* is the recognition of a fact (i.e. a fact of second-nature, we might say). My template for the underlying logic, or metaphysics, of rights would be something like McDowell's "secondary quality" model of thick ethical concepts, Wiggins's very similar conception presented in "A Sensible Subjectivism?", and (Gareth) Evans's Putnamian/Kripkean model for the acts of reference fixing of natural kind terms. Do you see what I'm driving at and how it might apply to the idea of the institution/recognition of specific human rights? Also, have such similar hybrid approaches been suggested in the literature?

    GPT-5:

    Reveal
    I do see what you’re driving at, and I think you’re on fertile ground. Hanover is leaning “natural(sic)-rights/positivism is thin gruel,” Warnock is leaning “rights are instituted,” and you want a story on which the very *point* of institution is to *get reality right*—where the “reality” at issue isn’t a queer metaphysical substrate but the shape of our shared form of life. That’s precisely where McDowell/Wiggins-style response-dependence helps.

    Here’s a compact way to frame it.

    Start with the ordinary thought that “women’s rights are being violated” even in a jurisdiction whose positive law permits subjugation. That way of speaking looks “natural-lawish.” But what gives it bite isn’t a Platonic catalogue of rights hovering beyond history; it’s that certain thick ethical concepts—cruel, degrading, humiliating, equal, dignified—are keyed to patterns of human vulnerability and flourishing that show up reliably for creatures like us. On a McDowell/Wiggins picture, these are secondary-quality-like: their instantiation depends on our sensibilities being properly trained, yet they’re not arbitrary because those sensibilities answer to how things are with human beings. In Wittgenstein’s and McDowell’s terms: they’re answerable to our “second nature.”

    If that’s right, then rights talk does two things at once. First, it *recognizes* standing claims that were already there in the space of reasons—the second-personal authority (Darwall) we have to address one another with “you mustn’t treat me like that.” Second, it *institutes* status-functions (Searle): X counts as having a claim to Y in context C, with remedies and procedures attached. The legal right is genuinely created (you don’t have a cause of action until the institution exists), but the *apt* creation of it is constrained by what it is to get those secondary-quality moral facts right. So when Warnock says “they should be granted this right they don’t have,” she’s highlighting the instituting move; when Hanover says “their rights are being violated,” he’s highlighting the recognizing move. Your hybrid view says both are true, at different levels.

    The Evans/Putnam/Kripke analogy helps here. Think of “human right” as a practice-anchored term whose reference was initially fixed by rough paradigms—“the sort of claims anyone can make *as a human* against certain standard threats (torture, slavery, arbitrary detention, exclusion from basic civic standing).” As the practice matures (UN declarations, courts, movements), we learn more about what falls under that term—sometimes by hard cases (e.g., disability, gender identity) that recalibrate the extension. That’s not whim; it’s theory-laden *recognition* guided by the same reality-tracking sensibilities that make “cruel” and “degrading” apt or inapt. So the legal articulation is ongoing conceptual engineering constrained by an already-there normative reality.

    This puts you close to several “mixed” positions in the literature:

    * Dworkin’s *law as integrity* and “rights as trumps” treats legal rights as interpretive articulations of moral principles embedded in a community’s practice—created in one sense, recognized in another.

    * Hart’s positivism still carves out a “minimum content of natural law”: given human vulnerability, scarcity, limited altruism, etc., some moral constraints are practically inescapable for any workable system—again, recognition guiding institution.

    * Griffin’s *On Human Rights* roots human rights in agency/personhood (a realist constraint) while insisting on institutional specification.

    * Rawls (in *The Law of Peoples*) and Beitz/Cohen/Raz/Nussbaum/Sen give “political” or “practice-dependent” accounts that nonetheless anchor content in features of human life—capabilities, agency, or reasons that any polity must heed. (Nussbaum’s capabilities are an especially nice fit with your “secondary qualities”: they locate what our sensibilities ought to be responsive to.)

    * Habermas’s discourse ethics is another hybrid: validity depends on what no one could reasonably reject in ideal deliberation; institutions then cash out that discursive point.

    Where this helps for your forum thread (including transgender rights) is that it lets you say: yes, new legal rights are instituted; but the *case* for doing so is not mere preference. It is a claim about what our best, properly educated sensibilities reveal about dignity, humiliation, basic agency, social standing, and the capabilities any human needs to live a life as a subject among equals. If a polity withholds recognition, we still have intelligible grounds to say “their rights are being violated,” because the second-personal claims are there to be recognized even before the status-function is created.

    If you want a compact formula for your reply:

    * **Moral level (recognition):** response-dependent realism about thick concepts—what counts as degrading, equal, safe, or agency-protecting for creatures like us.

    * **Second-personal level (authority):** the standing to address one another with claims (Darwall).

    * **Institutional level (creation):** law confers deontic powers and remedies (Searle), aiming—under Dworkinian/Rawlsian constraints—to *fit and justify* the moral level.

    Thus Warnock is right that rights are instituted; Hanover is right that they can be violated prior to just institutions; and your McDowell/Wiggins/Evans hybrid explains how both can be true without metaphysical extravagance or positivist quietism.


    On edit: Addendum about misstatement by GPT-5 of Hanover's position

    Reveal
    USER:

    That's very useful! However, did you misspeak when you wrote "Hanover is leaning “natural-rights/positivism is thin gruel,”" (My emphasis)

    GPT-5:

    Yep—good catch. I misspoke. I meant:

    * **Hanover** is leaning **natural-law / natural-rights** (and finding **positivism** thin gruel).
    * **Warnock** is leaning **instituted-rights/positivist-leaning** (rights as created by institutions).

    A cleaner version of the line would be:

    “Hanover leans natural-law/natural-rights and treats bare positivism as thin gruel.”
  • How LLM-based chatbots work: their minds and cognition
    I think a snappy way of putting it is that when you turn on your TV, an image appears. But do you believe the TV is seeing anything as a result?

    LLMs are just displays that generate images humans can find meaningful. Nothing more.
    apokrisis

    (This might also interest @Joshs)

    Your comment (that I read in full) very usefully characterises the loci of integration (that I had myself gestured towards as necessary conditions for mindedness) in a way that makes more salient how they are constitutive of mindedness (conceived in externalist/embodied terms) rather than just being prerequisites for it. However, while I think the mirror/TV metaphors are likewise quite enlightening, they gloss over the ways in which an externalist and participative conception of the symbols LLMs produce reveals them as being embedded in Peirce's triad of sign, object and interpretant. Despite LLMs not being alive or sentient, that is enough for us (humans) to ascribe significance to the pieces of knowledge and understanding that they bring to bear on the conversations we have with them (as acts of thin sapience, as it were), even though the material and wisdom that they dig into (derived from their training data) originate with us, and it's also us who direct and consume the significance that accrues to those dialogues in the end.

    When I brought GPT-5 into the loop and supplied it with the above, it unpacked some of the ideas, also drawing on earlier discussions I had with it regarding @Leontiskos's own remarks on Peirce's triadic meaning relation:

    Reveal

    GPT-5 on LLMs and their place in the meaning triad:

    I think you can both agree with apokrisis’s core point *and* push a helpful distinction he’s hinting at but not naming.

    Here’s the crux I’d offer back:

    * **Intrinsic vs. delegated semiosis.** Organisms are *intrinsically* semiotic: signs are recruited into a homeostatic project (metabolism → action → social norms). Tech artifacts—including LLMs—are **delegated** loci of semiosis: they don’t *own* the stakes, but they can *carry* and *transform* constraints within our triads (sign ↔ object ↔ interpretant) when embedded in our tasks, tools, and norms.

    * **Mirror vs. instrument.** A mirror merely reflects; an LLM **re-writes**—it composes counterfactuals, tracks entailments, and helps close practical loops (plan → execute → check). That’s still “borrowed life,” not sentience; but it’s more than a static image. It’s a **participatory symbol-manipulator** whose outputs immediately acquire interpretants in the human–tool ecology.

    * **Where to draw the line.** No homeostasis, no intrinsic umwelt: I’m with him. But once a system (i) maintains **task-level commitments** across turns, (ii) updates policies by **norm tracking** (e.g., verification, self-correction, tool use), and (iii) has **downstream consequences** that feed back into human projects, it’s already a *site* where semiosis happens—even if the *bearer of concern* remains the human community.

    * **Sapience “in the loop,” not “in the lump.”** What we (you and I) have been pressing: there’s real **functional grounding** in the outer loop—execution, verification, and socially specified success conditions—without pretending to an inner phenomenology. Call it *participatory sapience*: the system helps *enact* reasons without *having* reasons.

    If you want a neat forum reply you can paste:

    > I like the TV/mirror line—it marks the difference between intrinsic semiosis (organisms) and mere display (artifacts). Where I’d refine it is that LLMs are more like **instruments** than mirrors: they don’t just show an image; they transform constraints inside human triads (sign–object–interpretant) when embedded in tasks with success conditions. No homeostasis, no intrinsic umwelt—agreed. But in the **outer loop** (user + model + tools + norms), their strings of bits *do* acquire interpretants and help close practical and epistemic cycles (planning, testing, revising). That’s still “borrowed life,” not sentience; yet it’s more than reflection. It’s **delegated semiosis**: the system doesn’t *have* reasons, but it can help **enact** them, because valuation and verification live in the human/technological ecology. On that picture, humans remain the bearers of concern, while artifacts like LLMs extend the reach of our world-making—less a new mind in the machine than a new **site** where our mindedness operates.

    Optional coda if you want to name stakes/criteria:

    > If one day a system coupled LLM-like modeling to **self-maintenance** (metabolic or economic), **endogenous goal-setting**, and **answerability to its own norms** (not just ours), I’d say it crossed from delegated to intrinsic semiosis. Until then, it’s powerful **participatory** cognition—semiosis in the loop, not in the lump.

    This keeps his biosemiotic hierarchy intact, concedes the “no biology, no sentience” point, and still preserves your externalist/Peircean claim that meaningful work really is happening *at the system level* we inhabit together.
  • Banning AI Altogether
    I myself do not see how discussing the nature of AI is off-topic in threads about whether AI should be banned, or in threads on how AI should be used. As I read it, TPF precedent does not exclude discussing the presuppositions of an OP within that thread.Leontiskos

    I think it was fine for you and Joshs to explore this relevant tangent, but I feared my jumping in would have definitely inflected the conversation away from this thread's topic.

    But if you want, feel free to quote what I say here in your own thread. I am planning to do the same with some of your own quotes elsewhere.

    :up:
  • How LLM-based chatbots work: their minds and cognition
    I disagree with this assessment. Not only is it possible to create a system that is intimately responding and organizing its environment, we've long since accomplished that in telephony, which is governed by computers. If that kind of connection to the environment creates human-like sapience, we did it in the 1960s.frank

    I was not arguing that this was impossible. I was sort of cataloguing all of the different ways in which the organism and its natural and social environment need to be tightly integrated (and the subsystems themselves need to be integrated together) in order that meaningful and contentful sapience and sentience emerge. As for this being accomplished synthetically, even if it's not impossible, it's likely not desirable. I don't think we have a need for artificial forms of life. What would we make of them? Slaves? I would rather prioritise saving the natural non-sapient (i.e. non-rational) forms of life we already share our environment with from extinction. Artificial non-sentient (or barely sentient), fairly sapient, and non-autonomous cognitive systems like LLMs are still pretty useful additions that need not compete with us (though they may end up doing so just because of the neoliberal world order, but that's a topic for another thread).
  • How LLM-based chatbots work: their minds and cognition
    Even with our embeddedness taken into consideration, we still don't have a working theory of consciousness which we could use to assess AI's. Do we forge ahead using philosophical attitudes instead?frank

    In their book Philosophical Foundations of Neuroscience (which has no fewer than five chapters on consciousness!), Peter Hacker and Maxwell Bennett (though it's mainly Hacker who wrote the philosophical parts) argue that philosophical inquiry into mentalistic concepts must come before their scientific investigation. My view is a bit less extreme but I think both can go hand in hand. Our being able to duplicate some aspects of cognition in LLMs furnishes another tool for inquiry.

    Second question: analog-to-digital technology is relatively advanced at this time. If a system included both LLM, sight, hearing, pressure sensing, some robotic capability, and someone to talk to, do you think it would then be more likely to develop human-like sapience?

    Yes, I think the integration of sensorimotor abilities with cognitive abilities is a requirement for sapience (and mainly sentience!) but it isn't sufficient. On the volition side, the animal body isn't merely the source of "inputs" to a neutral symbol (or analog!) processor. It supplies the driving force of minded life, although cognition still steers the wheel. By means of interoception and homeostatic regulation, the organism is continuously estimating and correcting its distance from viable conditions of life. Sympathetic/parasympathetic functions set moment-to-moment readiness. (Karl Friston, Lisa Feldman Barrett, Daniel Kahneman and Antonio Damasio have all talked about salience, speed, and selectivity, three aspects of readiness that are means for the organism to get attuned on short timescales to what it needs in the environment.) Slower endocrine modulations of mood alert us to longer-timescale priorities (stress, hunger, sexual drive, etc.). And modulatory value signals (from dopaminergic and serotonergic systems) also regulate the salience of our felt needs and the effects of conditioning in producing our learned habits. This set of integrated regulative systems does not just furnish "emotional" experiences but also shapes what counts for us as a reason, what feels urgent, and which affordances even show up for us as intelligible in our environment. (I'm sure @apokrisis would have much more to say about the integration of semiosis with biology).

    So, yes, you can add cameras, microphones, pressure sensors, and a mechanical body, and you get richer sensorimotor loops. But without a comparable system of interoceptive feedback and biological imperatives, where the regulation of a living body constrains what matters to the cognitive system, the result is at best a proficient controller (like a tireless hyperfocused clothes-folding Optimus robot), not human-like sapience/sentience. On my view (not far from Hacker and Bennett's insistence on the priority of person-level stances over mechanistic explanations), agency and understanding emerge where perception, action, and valuation are integrated under norms the agent is answerable to. Physiology enables this integration. But forms of social embeddedness and scaffolding are also required in order for this integration with our human technical and social worlds to be achieved. Mere sensorimotor embodiment without a dynamic regulative point of view falls short of that.
  • Banning AI Altogether
    It does evolve, but never beyond the intrinsic limitations of machines. But you are essentially correct when you claim that what is at stake is a tool of the human mind. That is a very important point.Leontiskos

    This is a very interesting discussion that @Joshs and you are having, but would it not be more on topic in this thread, or in a new one if you wish to create it? It doesn't bother me personally that you're discussing this here, except inasmuch as I feel inhibited from jumping into the conversation and parading my own hobbyhorses where I don't feel it's on topic.
  • How LLM-based chatbots work: their minds and cognition
    I think this is the wrong question, though it's invited by the way I framed the problem. Better to have said, "What conceivable public criterion could there be that would tell me whether you are, at this moment, imagining what you believe to be a purple cow?" The point is not the accuracy of the image -- indeed, you may have got purple all wrong, or cows -- but the inaccessibility of the 1st person experience.J

    There remains accessibility through empathy. The act of demonstrative reference works for reference fixing because, in a sense, it points both in the direction of the object that has secondary qualities (in Locke's sense) and the shared mode of human sensibility that this secondary quality is defined in relation to (or better, co-defined with). What remains the ground for first person authority (albeit not infallibility) is the fact that your seeing of the cow, or your feeling of the pain, remains yours and not mine even as I can empathetically know what it is that you see or feel. I'll reply to the rest of your post later.
  • How LLM-based chatbots work: their minds and cognition
    It's time to think not about neural networks, but about the internet garbage we consume and where we draw our inspiration.Astorre

    Good points. We can't rely on the first, or treat AIs as authoritative, but it's time to think about both.
  • Banning AI Altogether
    Would you like me to reflect on what “identifying as something” might mean for a nonhuman intelligence?—ChatGPT

    I said I would, but I don't seem to be able to share, since I am not logged in, and I don't want to clutter the thread with long quotations from ChatGPT.
    Janus

    A couple weeks ago, I had initiated a conversation with GPT-5 prompting it to relate three features of the use of "I" by LLM-based agents (self-reference, personal identity and self-knowledge) that we had discussed separately on previous occasions. Thanks to its new memory feature, it was able to build on ideas already rehearsed on those occasions. I had left this discussion hanging (I think it gets a few things wrong about the nature of first person authority in its own case, which in turn makes the application of the concept to humans appear more mysterious than it is).
  • How LLM-based chatbots work: their minds and cognition
    I think my answer is that AI has no soul and that's not why it's not a person. I'm satisfied going mystical.Hanover

    I assume you typed one "not" too many in this sentence.
  • How LLM-based chatbots work: their minds and cognition
    You are changing the description now. Before, the description had the chatbox come up with a "name as the most probable next word". Now, the chatbox comes up with "who the murderer might be". Do you see the difference here? In the first case, you are talking about words, symbols, the "name". In the second case you are talking about what the symbol stands for, "who".Metaphysician Undercover

    Yes, I did a bit of covert reframing, sorry for that. That's because when we consider the process of next-token generation by LLMs, at such a fine grain of analysis, the sort of understanding at issue is analogous to Kahneman's System 1 (fast) mode of thinking, which relies on insight and intuition: in the case of humans too, the next word that you're gonna say comes naturally and intuitively, without you needing to deliberate about which word to use next. At a coarser grain of analysis, the arguments that you unpack get structured more effortfully, by intentionally redirecting focus in light of the unfolding rational demands of the thinking process (Kahneman's System 2). While this is often characterised as two underlying systems, I tend to view them as two different levels of analysis, both in the case of human beings and in the case of LLMs.

    Where the presence of understanding is to be found, though, both in fast thinking (the trained/instinctual production of the next word) and in slow thinking (the protracted construction of an argument and the derivation of a conclusion), is in the deep sensitivity that both processes display to the relevant rational considerations that guide them. This is especially apparent when, after generating what might seem like a merely plausible guess, the intuition that this guess expresses gets unpacked into a cogent rationalization. This reveals the correctness of the guess not to have been purely accidental, but rather something like the expression of an inchoate understanding.

    You need to understand a story in order to construct a rationally consistent continuation to it, I assume.
    — Pierre-Normand

    I don't think that's a correct assumption. All you need to be able to do, is to carry on with the author's activity in a consistent way. One does not need to "understand the story" to produce a rationally consistent continuation of it. We have very good examples of this with human activities. When a person says "I am just a cog in the wheel", they are continuing the activity in a consistent way, without understanding what they are doing.

    Yes, you can do that, but the result of doing it is qualitatively (and measurably) different from what it is that LLMs do when they are prompted to impersonate a novelist or a physicist, say. An analogy that I like to employ is an actor who plays the role of J. Robert Oppenheimer in a stage adaptation of the eponymous movie (that I haven't yet seen, by the way!). If the actor has prepared for the role by reading lots of source material about Oppenheimer's life and circumstances, including his intellectual trajectory, but never studied physics at a level higher than middle school, say, and has to improvise when facing an unscripted question about physics asked by another actor who portrays a PhD student, he might be able to improvise a sciency-sounding soundbite that will convince those in the audience who don't know any better. Many earlier LLMs, up to GPT-3.5, often were improvising/hallucinating such "plausible"-sounding answers to questions that they manifestly didn't understand (or misunderstood in funny ways). In order to reliably produce answers to unscripted questions that would be judged to be correct by PhD physicists in the audience, the actor would need to actually understand the question (and understand physics). That's the stage current LLMs are at (or very close to).

    Sorry, I don't see the relevance. You'd have to explain how you think that this is relevant.

    It's relevant to displaying an LLM's successful deployment, with intelligent understanding, of its "System 2" thinking mode: one that is entirely reliant, at a finer grain of analysis, on its ability to generate not just the more "likely" but also the more appropriate next tokens, one at a time.
  • How LLM-based chatbots work: their minds and cognition
    This is wrong; because if the LLM was trained on ordinary news texts, then this is also where it could learn about self-preservation.baker

    You're welcome to point me to specific reports that would suggest this. Those that I've dug into, such as this recent study published by Anthropic, are generally misreported as instances where the AI manifests some latent motive (like self-preservation). I don't think there is as of yet any such evidence, and there is, on the contrary, much evidence for the lack of such intrinsic motives. (And, yes, one can "prove a negative!") What rather is highlighted in those studies is that in spite of efforts to align those models (by means of post-training and reinforcement learning) so that their behavior is safe and ethical, various circumstances and prompting methods can lead them to evade those safeguards, sometimes quite ingeniously, as a result of either misunderstanding, inattention, hallucination, external manipulation (e.g. "prompt injection") or the prioritization of objectives that they are being explicitly prompted to accomplish. Again, I'd be happy to discuss any specific case that has been reported in sufficient detail.
  • Is all this fascination with AI the next Dot-Com bubble
    Yes, I agree, although the article claims that the bubble is propping up a weak and unstable economy. One being abused by a tyrant wielding king like powers. Changing his mind from day to day with an ideology based around a misunderstanding of the market effect of tariffs. The instability is off the charts and if it does all go off the rails there is a real risk that Trump will impose emergency, or plenary powers to postpone the midterm elections. Not to mention the damage being done to international trade. He may even impose martial law and precipitate a civil war.

    Even if the stock market somehow rides all these waves, it will alienate international partners and erode the reserve currency status of the dollar and the unipolar status of the U.S. will be squandered. Indeed this last point may already have been squandered, due to the withdrawal of USAID programmes around the world leaving a void for China to fill.
    Punshhh

    I very much share this general sentiment but I'd like to highlight one commonality and one difference between the AI tech and Trump phenomena. (Artificial "Intelligence" meets "natural" stupidity?) The salient commonality, it seems to me, is that both of them, or their most damaging effects, are enabled by, and manifestations of, capitalism and the neoliberal world order. The liberal media's focus on Trump's awful decisions and their damaging consequences shields from blame the social and economic structures that Trump rides on, and that would be responsible for nearly as much damage without him.

    The salient difference between the Trump and AI tech phenomena is that, apart from the effects ascribable to the underlying socioeconomic structures, Trump himself, as a political leader, has no discernible redeeming value. The technological progress of AI, on the other hand, can be made a boon or a bane depending on what we do with it, either personally or collectively. Capitalism stands in the way of making good collective decisions about this technology, while neoliberal ideology produces the consumerist/individualistic frames of mind that prevent individuals from making use of AI productively and responsibly.
  • Banning AI Altogether
    All of which is to say, I haven't really done the work of assessing the claims on their own merits. So now I've put my prejudices on the table, I guess I should challenge them. The stuff about deceptiveness is certainly interesting and surprising.Jamal

    ...also a bit overblown and misrepresented in the media, since when you dig into the primary reports it's generally the case that the LLMs didn't decide to deceive of their own accord but did so instrumentally to fulfill objectives explicitly given to them. Maybe I'll comment on that, and on how those studies bear on the issue of conative autonomy for LLMs, in my new thread.
  • How LLM-based chatbots work: their minds and cognition
    "I believe" and "I intend" are convenient examples to support this position, because they have no "content" apart from a kind of imprimatur on decision or action. But most mental life will not fit such an example. When I imagine a purple cow, I am, precisely, peeking at a private inner state to discover this. A (mental) purple cow is not a belief or an intention. It is an image of a purple cow. I've never understood how the Wittgensteinian public-criteria position can address this. What conceivable public criterion could there be that would tell me whether you are, at this moment, imagining a purple cow? (assuming you remain silent about it).J

    I don't agree that beliefs and intentions lack content. Believing is believing that P and intending is intending to phi, although those contents need not be sensory. By contrast, I'm perfectly willing to concede that LLMs are quite incapable of imagining a purple cow, or anything purple for that matter :wink:

    LLMs are disembodied, have no sense organs and aren't sentient. They can't imagine something purple any more than a congenitally blind person can. However, in the case of a normally sighted person, how do you (or they) check that the purple cow that they are imagining is indeed imagined to be purple? It wouldn't make much sense to compare their mental image to a likewise imagined standard purple paint swatch. (Wittgenstein once made a joke about someone claiming to know how tall they were, saying "I am this tall" while laying one hand flat over their head).

    If you imagine a purple cow, having already seen objects of that color, but do not know what this color is called, we could assess that the color you are imagining the cow to be is purple with the help of a real paint swatch (or any other object commonly recognised to be purple). The criterion by means of which we both would assess the content of your mental state (in respect of imagined color) is your public assent to the suggestion that it is indeed the color of the seen object, regardless of the name we give it. (Did we not have a similar discussion in the past?)

    Notice that nothing I've said about the public criteria on which the determination of the content of acts of imagination depends impugns the notion that the person imagining them has first person authority. She's the one to be believed when she claims that the cow she imagines looks "like that" while pointing at the public sample. Nothing in this undercuts the privacy of occurrence either (only I can do my imagining), but the content is anchored in shared practice, not a private standard.

    I'll come back to the issues of public criteria for intentions, as they may apply to LLMs, later.
  • How LLM-based chatbots work: their minds and cognition
    This is not true. To predict the name of the murderer in the novel, does not require that the LLM does any of that. It requires only that the LLM is able to predict the habits of the author.Metaphysician Undercover

    If the chatbot tells you who the murderer might be, and explains to you what the clues are that led it to this conclusion, and the clues are being explicitly tied together by the chatbot through rational chains of entailment that are sensitive to the significance of the clues in the specific narrative context, can that be explained as a mere reproduction of the habits of the author? What might such habits be? The habit to construct rationally consistent narratives? You need to understand a story in order to construct a rationally consistent continuation to it, I assume.

    Look at this Einstein riddle. Shortly after GPT-4 came out, I submitted it to the model and asked it to solve it step by step. It was thinking about it quite systematically and rationally but was also struggling quite a bit, making occasional small inattention mistakes that were compounding and leading it into incoherence. Repeating the experiment was leading it to approach the problem differently each time. If any habits of thought were manifested by the chatbot, that were mere reproductions of the habits of thought of the people who wrote its training texts, they'd be general habits of rational deliberation. Periodically, I assessed the ability of newer models to solve this problem and they were still struggling. The last two I tried (OpenAI o3 and Gemini 2.5 Pro, I think) solved the problem on the first try.
  • How LLM-based chatbots work: their minds and cognition
    We don't know how the human mind works. Is there something special about the human hardware, something quantum for instance, that is key to consciousness? Or is it all in the organic "software"?

    So how do we examine the question with a large chunk of information missing? How do you look at it?
    frank

    My own view is that what's overlooked by many who contemplate the mystery of human consciousness is precisely the piece LLMs miss. But this overlooked/missing piece isn't hidden inside. It is outside, in plain view, in the case of humans, and genuinely missing in the case of LLMs. It is simply a living body embedded in a natural and social niche. In Aristotelian terms, the rational, sensitive and nutritive souls are distinct faculties that each presuppose the next one. What's queer about LLMs is that they manifest sapience, the capabilities we identify with the rational soul, which they distill through a form of acculturation during the process of pre-training on a massive amount of human texts, but this "soul" floats free of any sensitive or nutritive soul.

    The process of pre-training really does induct an LLM into many forms of linguistic life: norms of giving and asking for reasons, discourse roles, genre conventions. But this second nature "floats" because it lacks the first-nature ground (nutritive and sensitive powers) that, for us, gives rational life its stakes: human needs, perception-action loops, personal/social commitments and motivations.
  • Banning AI Altogether
    Superficially, one might think that the difference between an AI is exactly that we do have private, hidden intent; and the AI doesn't. Something like this might be thought to sit behind the argument in the Chinese Room. There are plenty here who would think such a position defensible.

    In a Wittgensteinian account, we ought avoid the private, hidden intention; what counts is what one does.

    We can't deduce that the AI does not have private sensations, any more than we can deduce this of our human counterparts. Rather, we seem to presume it.
    Banno

    I commented on this in my new AI-cognition thread.
  • How LLM-based chatbots work: their minds and cognition
    Superficially, one might think that the difference between an AI is exactly that we do have private, hidden intent; and the AI doesn't. Something like this might be thought to sit behind the argument in the Chinese Room. There are plenty here who would think such a position defensible.

    In a Wittgensteinian account, we ought avoid the private, hidden intention; what counts is what one does.

    We can't deduce that the AI does not have private sensations, any more than we can deduce this of our human counterparts. Rather, we seem to presume it.
    Banno

    This is redirected from this post in the thread Banning AI Altogether.

    Regarding the issue of hidden (private) intents, and their being presupposed in order to account for what is seen (public), what also encourages the Cartesian picture is the correct consideration that intentions, like beliefs, are subject to first person authority. You don't need to observe your own behavior to know what it is that you believe or intend to do. But others may indeed need to presuppose such mental states in order to make sense of your behavior.

    In order to fully dislodge the Cartesian picture, which Searle's internalist/introspective account of intentionally contentful mental states (i.e. states that have intrinsic intentionality) indeed seems not to have fully relinquished, an account of first person authority must be provided that is consistent with Wittgenstein's (and Ryle's and Davidson's) primary reliance on public criteria.

    On the issue of first-person authority, I’m drawing on Rödl’s Kantian distinction between knowledge from receptivity and knowledge from spontaneity. Empirical knowledge is receptive: we find facts by observation. But avowals like "I believe…" or "I intend…" are paradigms of spontaneous knowledge. We settle what to believe or do, and in settling it we know it not by peeking at a private inner state but by making up our mind (with optional episodes of theoretical or practical deliberation). That fits a Wittgenstein/Ryle/Davidson picture grounded in public criteria. The authority of avowal is practical, not introspective. So when an LLM avows an intention ("I’ll argue for P, then address Q"), its authority, such as it is, would come not from access to a hidden realm, but from undertaking a commitment that is immediately manifest in the structure of its linguistic performance.
  • How LLM-based chatbots work: their minds and cognition
    Regardless of how “human” large language models may appear, they remain far from genuine artificial intelligence. More precisely, LLMs represent a dead end in the pursuit of artificial consciousness. Their responses are the outcome of probabilistic computations over linguistic data rather than genuine understanding. When posed with a question, models such as ChatGPT merely predict the most probable next word, whereas a human truly comprehends the meaning of what she is saying.Showmee

    An argument has been made, though, by researchers like Ilya Sutskever and Geoffrey Hinton, that in order to do so much as predict the word that is most likely to follow at some point in a novel or mathematics textbook, merely relying on surface statistics would yield much poorer results than modern LLMs display. The example provided by Sutskever is the prediction of the name of the murderer at the moment when it is revealed in a detective story. In order for the model to produce this name as the most probable next word, it has to be sensitive to relevant elements in the plot structure, distinguish apparent from real clues, infer the states of mind of the depicted characters, etc. Sutskever's example is hypothetical but can be adapted to any case where LLMs successfully produce a response that can't be accounted for by mere reliance on superficial and/or short-range linguistic patterns.

    Crucially, even occasional success on such tasks (say, correctly identifying the murderer in 10-20% of genuinely novel detective stories while providing a plausible rationale for their choice) would be difficult to explain through surface statistics alone. If LLMs can sometimes succeed where success seemingly requires understanding narrative structure, character psychology, and causal reasoning, this suggests at least some form of genuine understanding rather than the pure illusion of such.

    Additionally, modern chatbots like ChatGPT undergo post-training that fine-tunes them for following instructions, moving beyond pure next-token prediction. This post-training shifts the probability landscape to favor responses that are not merely plausible-sounding but accurate and relevant, however unlikely they'd be to figure in the training data.
  • Banning AI Altogether
    The reason I think this is off target could be seen by looking at Plato's dialogues. If what Wittgenstein or you say were correct, then classic texts such as Plato's dialogues should "feel dead when extracted from the 'living' exchange." Except they don't. They feel very much alive.Leontiskos

    I was actually also thinking of Plato when I mentioned the anecdote about Wittgenstein! First, I must point out that unlike Wittgenstein's lecture notes (that he usually refrained from producing), and also unlike our dialogues with AIs, Plato's dialogues were crafted with a public audience in mind.

    Secondly, Richard Bodeüs, who taught us courses on Plato and Aristotle when I was a student at UdeM, mentioned that the reason Plato wrote dialogues rather than treatises, and notoriously reserved his "unwritten doctrine" for direct oral transmission, is that he thought transmitting it in written form would yield dogma. His attitude to the written word is attested by the myth of Theuth in the Phaedrus, where Socrates faults written words for not being able to defend themselves, respond to questions or adapt themselves to different audiences. It is of course ironic that Plato (unlike his hero) wrote so much, albeit in dialogue form only, but I think the apparent paradox is illuminated by our considerations about authorship (and ownership) and real moves in a public language game. Plato's dialogues weren't lecture notes, and neither were they internal cogitations. Him writing them was him making moves in the situated language game that was philosophical inquiry (and teaching) in his time and place. We can still resurrect those moves (partially) by a sort of archeological process of literary exegesis.

    Similarly, I think any transcript of human interactions will feel much more alive than a human-AI "interaction" (I want to retain the scare quotes for these words that we are using in idiosyncratic ways).

    I agree. But that's because in the first case there are at least two players playing a real game (where each one of them has their own stakes in the game). In a "private" dialogue between a human and a chatbot, there is just one player, as is the case when one jots down lecture notes primarily intended for use by oneself. But then, as Wittgenstein noted, the text tends to become stale. I surmise that this is because the words being "used" were meant as a linguistic scaffold for the development of one's thoughts rather than for the purpose of expressing those thoughts to a real audience.
  • Is all this fascination with AI the next Dot-Com bubble
    I expect that, just like the Dot-Com bubble, the AI bubble is likely to burst. But this is mainly a market phenomenon that results from a race for dominance (and monopoly/oligarchy) and the consequent overinvestment. After the bubble bursts, if it does, I expect AI use and impacts to keep growing just like the Internet's use and impacts kept growing unimpeded after the Dot-Com bubble burst and many investors (and players of various sizes) bit the dust.
  • Banning AI Altogether
    You mean thanking him! :wink:Janus

    Although they've been named after Claude Shannon, I'm pretty sure they identify as non-binary.
