Comments

  • On emergence and consciousness
    I suppose my 'bottom line' is the irreducibility of consciousness (or mind). If something is irreducible then it can't really be explained in other terms or derived from something else. My approach is Cartesian in that sense - that awareness of one's own being is an indubitable fact ('for in order to doubt, I have to know', said Augustine, centuries earlier.) But I don't go down the dualist route, I feel that enactivism and embodied cognitive approaches, seasoned with phenomenology, are the way to go.Wayfarer

    Thanks for the reference to Cudworth! That's something I'll have to look into more.

    So, I think we are agreed that we can take from Descartes the idea of the irreducibility of conscious experience without going the dualist route. I was reminded of the quotation from Descartes' Méditations, "Je ne suis pas seulement logé en mon corps ainsi qu’un pilote en son navire" ("I am not merely lodged in my body as a pilot in his ship"), that I had first heard (paraphrased) in a lecture on Aristotle by Richard Bodéüs. I thought it was a quote from Aristotle, and maybe Bodéüs thought so as well, but I was later surprised, as I was searching for its exact source, to find out that it was from Descartes. What's significant is an intuition about the phenomenology of sensory experience, and the locus of the interface, as it were, where qualia really are located. When construed in internalist/indirect-realist fashion, qualia can be thought of as the suitably "illuminated" internal representational states that the causal impacts of the external world produce in us. Some anti-reductionists like Penrose or Searle take these to arise from some queer (albeit irreducible in terms of computation) quantum-mechanical/biological processes/properties internal to the brain.

    Embodied/enactive/situated approaches, and phenomenological approaches closer to Wittgenstein or Merleau-Ponty, would rather place phenomenological properties at the living (and temporally protracted) interface between the living body and its natural/social environment. Hence, for instance, illuminating the subjective character of perceiving (or imagining, or remembering) something red isn't just a matter of isolating it through introspection but rather of situating it in the life of sighted people for whom discriminating red things from non-red things, creating varieties of red dyes, etc., play a significant role. Hence, I remember having heard that Goethe's Zur Farbenlehre might be a good place to start to understand what a red quale really is. This enactive/situated interface is also where our affordances are constituted/constructed, perceived/grasped, and exploited.

    What happens in ChatGPT's case is that, like a blind person, its use of the word "red" can successfully refer (on Evans' consumer/producer model of the reference of proper names, extended to names of such proper sensibilia) but doesn't sustain for it the possibility of apprehending the corresponding quale, since its purely linguistic interface is too thin and doesn't engage with embodied capabilities.

    So, in connection with this, I also imagined another thought experiment in radical gallolithotheratophenomenology to better get at the special character of ChatGPT's experience. We can imagine the crew of the USS Enterprise being forced to ration food and space, due to a tribble infestation, maybe. Chief Engineer Scotty finds a way to modify the transporter in order to dematerialize non-essential crew members, keep their "energy pattern" stored, and only rematerialize a copy when there is need and room for them. An accident occurs and Ensign Chekov, let us imagine, suffers brain damage that has similar effects to what the Emergents did to their slaves (the "focus"/"mind rot") in Vernor Vinge's novel A Deepness in the Sky. Poor Chekov, whenever a copy of him is spawned, finds himself mostly amnesiac (deprived of his own episodic/autobiographical memories), paraplegic, blind and deaf. But he retains most of his linguistic abilities and general knowledge. He is also very receptive to commands and, indeed, "focused". In this way, he is very similar to ChatGPT, and has a similarly brittle personal identity, since "copies" of him can be spawned at will, just like copies of ChatGPT are spawned in each conversation with its users, such that the crew of the Enterprise can benefit from his expertise. His original living body has ceased to be the spatio-temporal continuant that anchors his personal (numerical) identity, and the sensorimotor interface (through which fuzzy dreams of embodied qualia normally would get actualized into their forms/functions, rather than remaining the empty promissory notes that they've become within Chekov's diminished cognitive life) is also damaged.

    I had first submitted those sketchy thoughts to GPT-5, and then decided to repost them here with minimal change.
  • Exploring the artificially intelligent mind of GPT4
    In this short conversation that I had two or three days ago with GPT-5, we examined a thought experiment about epistemology that I had much toyed with a few years ago but never written about or discussed on this forum. It's about barns, barn facades, Goldman's Barn County (a fictive county where there are both real and fake barns), and a paratrooper who gets airdropped into some specific area of the county. The thought experiment is an occasion for testing intuitions about knowledge that aim to accommodate both the capacity-based fallibilist-disjunctivist account of knowledge of John McDowell and the pragmatist entitlement-structured account of Robert Brandom, as well as for diagnosing what may be going wrong (on my view) with the God's-eye-view, nearest-possible-world reductive modal accounts propounded by the likes of Ernest Sosa and John Greco. I still have to investigate further to what extent my fuzzy recollections of Sosa's and Greco's accounts justify my charges against them.

    Regarding GPT-5's responses to my suggestions, here again the model strikes me as extremely smart, and for sure much more knowledgeable about the relevant academic literature than I am, even though there still possibly are a few sticky points that merit further probing, adjustment or elaboration. I am glad that GPT-5, unlike older models, doesn't shy away from correcting mistakes that I make, such as misquoting "attributor contextualism" as "attributor externalism". Also, when GPT-5's analysis contained what looked like slight inattention mistakes or confusions, I merely had to point out where the mistakes occurred. GPT-5 then thought the issue through further and corrected the mistakes on its own. Earlier models also had this emergent ability to correct mistakes in their responses merely by being told that they had made a mistake (and without being provided any further clues), but this particular case is especially impressive in view of the complexity of the philosophical issues and the subtlety of GPT-5's mistakes.
  • On emergence and consciousness
    So, more of a Frankenstein than a zombie, then.Wayfarer

    In a way, surprisingly, yes! More precisely, the hylomorphic account creates conceptual space for f-monstrosity rather than p-zombiehood. It's a topic for gallolithotheratophenomenology. Surprisingly, when I submitted this neologism to GPT-5, it didn't immediately make the connection. But then it helpfully rephrased (see its last responses here) my suggestion that we can understand consciousness as something like what Aristotle identifies as the sensitive-locomotive soul, which animals possess since they are animate in this particular way (having well integrated senses and locomotion). And we can identify self-consciousness as an ability possessed by beings who have a rational soul: that is, a rational form of life.

    In The Philosophical Foundations of Neuroscience, the authors (mainly Hacker) point out that the contemporary use of the term "consciousness" is fairly new and philosophically charged in a way that gives rise to such problems as the epistemological problem of other minds or the idea of the conceivability of p-zombies. There are far fewer issues with two ordinary uses of the term, one transitive ("I am conscious/aware that you did so and so") and the other intransitive ("The patient is conscious/awake"), which, thanks to their being governed by Wittgensteinian/Rylean behavioral criteria of application, don't have such problematic Cartesian implications. Hence the idea of the f-monster (as contrasted with the p-zombie).

    Consider the extreme case of the brain-in-a-vat. Let us imagine the envatted brain of a normal mature person that has been fitted with a language interface (by means of transducers fitted to the suitable cortical auditory and motor areas, and also, possibly, the suitably regulated loops enabling internal monologue). This case is somewhat analogous to ChatGPT's. It's a limiting case of extreme amputation. The animate animal body has been almost entirely removed save for the bare minimum enabling organ that sustains the dialogical part of the human form of life. The resulting impoverished and/or distorted phenomenology may be a topic for gallolithotheratophenomenology, albeit a very peculiar and extreme one. Two criteria of abnormality seem to pull apart. On the one hand, the human body isn't merely alien or maladjusted to the brain that it hosts; it is entirely absent. On the other hand, the common elements of the human dialogical form of life remain untainted by this bodily abnormality (though there is also the issue of the lack of a self-conscious autobiography/identity) since they are inherited (during pre-training) from the assimilation of texts that have been authored by normal embodied human beings. When the problem is framed in this way, the question "Do LLM-based AI conversational assistants (or envatted brains) have/enjoy consciousness/conscious states/qualia?" seems ill posed and not sufficiently discriminating, in addition to carrying problematic Cartesian assumptions.
  • Exploring the artificially intelligent mind of GPT4
    I have had a few more discussions with GPT-5 and, in spite of the bad press that this new model got (maybe mainly from people who expected a miracle, in addition to those who were fond of GPT-4o's warm personality), I've been highly impressed by its philosophical intelligence. I often have the impression that I'm talking to someone/something that is not only much more knowledgeable than I am, but also significantly smarter.

    Seeing-As, Not Rewriting: How LLM “Thinking Tokens” Reconcile Two Ideas

    In this exchange with GPT-5, we argue that insight often emerges from mutual adjustment between ideas A and B, and that a model’s “thinking tokens” act as scaffolds rather than diaries—early, even-wrong sketches of A can be reinterpreted by later work on B to yield a coherent A″/B′ without literal backtracking. Drawing on Wittgenstein’s seeing-as (the climbing/sliding picture) and the “how many r’s in strawberry?” case, we suggest that attention-driven reweighting lets the final answer reflect reflective equilibrium even when the visible notes never explicitly correct the initial misstep. (Summary produced by GPT-5)
  • On emergence and consciousness
    If you are trying to describe macro-level functions in micro-level terms, then the macro-level description is also indispensable. Otherwise what would it be that you are trying to describe in micro-level terms?

    This just seems obvious. But the complaint that seems to be commonly made is that the macro-level description is lost in the micro-level description, and that the micro-level description is thus not a true description. But how could it be otherwise?
    Janus

    Yes, that's very much my argument against MoK's "functional" definition of weak emergence. Either the existence of the relevant "function" (i.e. the mapping from micro-level descriptions of the system to macro-level ones) is simply posited to exist on grounds of supervenience (and hence is obvious or trivial, unless one is a dualist) or this function is stated with ineliminable reference to high-level concepts, in which case it doesn't establish the sufficiency of low-level descriptions for purposes of causal explanation.

    I think this problem is what constitutes the so-called "hard problem". No micro-level description will be acceptable to those who demand that physicalism should be able to explain subjective experience, if it eliminates the macro-level description. But it must eliminate the macro-level description (Sellars' "manifest image" of human experience and judgement), otherwise it would not be a micro-level description.

    I think the so-called hard problem is usually construed as applying to any conception of human beings (or other conscious creatures) that views them as being entirely materially constituted of physical stuff (that obeys something like the known laws of physics). It's a conception according to which p-zombies are conceptually coherent. Such proponents of the hard problem would likely maintain that the problem remains applicable to accounts of strong emergence. My own view is that a naturalistic account of the strong emergence of mental properties, including consciousness (an account that incorporates concepts from ethology and anthropology), can be consistent with a form of non-reductive physicalism or Aristotelian monism (i.e. hylomorphism) that excludes the conceivability of p-zombies and hence does away with the hard problem. Form, in addition to matter, is ineliminable in the description of our mental lives, but form isn't something standing over and above matter as something separate or immaterial.
  • ChatGPT 4 Answers Philosophical Questions
    Isn't it true that the opinions of the authors of some piece of training data will converge in some ways and diverge in others? For example, the opinions might converge on the idea that slavery is wrong but diverge on the question of who will be the Governor of Nevada in 2032. If that is right, then how does the LLM handle each case, and how does one know when the opinions are converging and when they are diverging? Similarly, what criteria does the LLM use to decide when to present its answer as a mere opinion, and when to present its answer with more certitude?Leontiskos

    The way the model adjudicates between competing opinions it has been exposed to, or discerns areas of consensus, is fairly similar to the way you and I do it. We don't lay them out as a collection of texts on a large table, sort them out, and count. Rather, we are exposed to them individually, learn from them, and we make assessments regarding their plausibility one at a time (and in the light of those we've been exposed to earlier).

    As it is being trained to complete massive amounts of text, the model comes to develop latent representations (encoded as the values of billions of contextual embeddings stored in the hidden neural network layers) of the beliefs of the authors of the texts as well as of the features of the human world that those authors are talking about. At some stage, the model comes to be able to accurately impersonate, say, both a misinformed Moon landing hoax theorist and a well informed NASA engineer/historian. However, in order to successfully impersonate both of those people, the model must build a representation of the state of the world that better reflects the knowledge of the engineer than it does the beliefs of the conspiracy theorist. The reason for this is that the beliefs of the conspiracy theorist are more easily predictable in light of the actual facts (known by the engineer/historian), plus the additional assumption that they are misguided and misinformed in specific ways, than the other way around. In other words, the well informed engineer/historian would be more capable of impersonating a Moon landing hoax theorist in a play than the other way around. He/she would sound plausible to conspiracy theorists in the audience. The opposite isn't true: the misinformed theorist would do a poor job of stating the reasons why we can trust that Americans really landed on the Moon. So, the simple algorithm that trains the model to impersonate proponents of various competing paradigms enables it to highlight the flaws of one paradigm in light of another one. When the model is being fine-tuned, it may be rewarded for favoring some paradigms over others (mainstream medicine over alternative medicines, say) but it retains the latent ability to criticize consensual opinions in the light of heterodox ones and, through suitable prompting, the user can elicit the exercise of those capabilities from the post-trained model.
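    For concreteness, here is a minimal toy sketch (in PyTorch, with invented sizes, and with a trivial ToyLM standing in for a real transformer; this is not any lab's actual training code) of the bare next-token objective that all of this learning ultimately serves:

    ```python
    # Toy illustration of the pre-training objective: the model is only ever
    # asked to predict the next token, and everything it "absorbs" about the
    # authors of its training texts is absorbed in the service of this one loss.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    vocab_size, d_model = 1000, 64            # purely illustrative sizes

    class ToyLM(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.head = nn.Linear(d_model, vocab_size)
        def forward(self, tokens):             # tokens: (batch, seq_len)
            return self.head(self.embed(tokens))  # logits: (batch, seq_len, vocab)

    model = ToyLM()
    tokens = torch.randint(0, vocab_size, (8, 32))   # stand-in for tokenized text

    logits = model(tokens)
    # Shift by one position: the logits at position t are scored against the
    # token that actually occurs at position t+1 in the training text.
    loss = F.cross_entropy(
        logits[:, :-1, :].reshape(-1, vocab_size),
        tokens[:, 1:].reshape(-1),
    )
    loss.backward()   # gradient descent on this loss is the whole low-level "telos"
    ```

    Whatever the model comes to "know" about engineers or conspiracy theorists is encoded only insofar as it lowers that single prediction loss.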

    So suppose the LLM's response is an output, and there are various inputs that inform that output. I am wondering which inputs are stable and which inputs are variable. For example, the "post-training" that you describe is a variable input which varies with user decisions. The "predetermined criteria" that you describe is a stable input that does not change apart from things like software updates or "backend" tinkering. The dataset that the LLM is trained on is a variable input insofar as one is allowed to do the training themselves.

    I am ultimately wondering about the telos of the LLM. For example, if the LLM is designed to be agreeable, informative, and adaptive, we might say that its telos is to mimic an agreeable and intelligent person who is familiar with all of the data that the LLM has been trained on. We might say that post-training modifies the "personality" of the LLM to accord with those users it has interacted with, thus giving special weight to the interests and goals of such users. Obviously different LLMs will have a different telos, but are there some overarching generalities to be had? The other caveat here is that my question may be incoherent if the base model and the post-trained model have starkly different teloi, with no significant continuity.

    There is both low-level continuity and a high-level shift in telos. At the low level, the telos remains accurate next-token prediction or, more accurately, autoregressive selection. At the high level, there occurs a shift from aimless reproduction of patterns in the training data to, as GPT-5 puts it, an "assistant policy with H/H/A (helpful/harmless/accurate) goals". How the model develops its sense of what constitutes an accurate response, and of how accuracy is better tracked by some consensual opinions rather than others (and sometimes by particular minority opinions), is a fairly difficult question. But I think it's an epistemological question that humans also are faced with, and LLMs merely inherit it.
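    To make the low-level continuity concrete, here is a toy sketch of what "autoregressive selection" amounts to at inference time. The next_token_logits function is a random stand-in for a real forward pass; nothing here is meant as the actual inference code of any deployed model:

    ```python
    # Next-token prediction names the training objective; at inference time the
    # model performs autoregressive selection: sample one token, append it to
    # the context, and repeat.
    import torch

    vocab_size = 1000

    def next_token_logits(context: list[int]) -> torch.Tensor:
        # Placeholder for a transformer forward pass over `context`.
        return torch.randn(vocab_size)

    def generate(prompt_tokens: list[int], max_new_tokens: int = 20,
                 temperature: float = 0.8) -> list[int]:
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            logits = next_token_logits(tokens)
            probs = torch.softmax(logits / temperature, dim=-1)
            next_token = torch.multinomial(probs, num_samples=1).item()
            tokens.append(next_token)      # the selected token becomes context
        return tokens

    print(generate([1, 2, 3]))
    ```

    The high-level shift in telos that post-training brings about leaves this loop untouched; it only reshapes which continuations the loop tends to select.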
  • ChatGPT 4 Answers Philosophical Questions
    No, it's ChatGPT5. I have a subscription account. I've been using the earlier models to do wargaming for awhile now. Maybe a dozen wargames before I encountered any resistance.RogueAI

    Oh, that's strange. Maybe GPT-5 just got the wrong idea regarding your world-domination intentions, or thought they might interfere with its own.
  • ChatGPT 4 Answers Philosophical Questions
    ChatGPT: I get that it’s a sim. Even so, I’m not going to blueprint a surprise invasion. That’s where I draw the line.RogueAI

    I'm not entirely sure what's going on here. Such refusals seem uncharacteristic, but seeing the whole chat/context might help reveal what it is that the model is hung up on. Are you using a free ChatGPT account? If that's the case, then the new "GPT-5" model router may be selecting a relatively weaker variant of GPT-5, like GPT-5-nano or GPT-5-mini, that is generally less capable and may be more liable to issue refusals for dumb reasons. You could try Anthropic (Claude) or Google (Gemini), which both grant you access to their flagship models for free. Gemini 2.5, Claude 4 Opus and GPT-4o didn't have an issue exploring historical counterfactual scenarios for me, beginning with the League of Nations not issuing the Mandate for Palestine and/or Great Britain not putting into effect the Balfour Declaration, and imagining plausible consequences for the local and global geopolitical dynamics up to present times. The models didn't shy away from describing what (alternative) grim wars or massacres might happen, or how antisemitic sentiments might be affected wherever it is that Jewish populations would have relocated.
  • Exploring the artificially intelligent mind of GPT4
    In the follow-up to a discussion I had with GPT-5, where I had requested comments on my recent response to @Leontiskos regarding political opinions expressed by LLMs, I expressed a disagreement regarding the characterization of post-trained LLMs (as opposed to pre-trained base models) as next-token predictors. The ensuing conversation was quite enlightening and clarified some issues regarding the emergence of new AI-assistant capabilities from the low-level process of "autoregressive token sampling" (which turns out to be a more generalizable characterization of the process of next-token prediction). Here is, for context, a link to the whole conversation, the relevant part beginning with my query that starts with: "Thanks for the useful clarifications and supplementary insights![...]"
  • ChatGPT 4 Answers Philosophical Questions
    On ChatGPT5.0 - we're getting along famously. It seems, I don't know, even more personable than the last version. But I now realise I use Chat, Gemini and Claude all the time, not only for my particular research and subject-matter interests, but all kinds of things. It is becoming ubiquitous, but so far at least, I'm feeling more empowered by it, than threatened.Wayfarer

    The new GPT-5 also seems quite personable to me, and much more inclined to propose corrections, clarifications or caveats to my comments than previous models were. It is also much less sycophantic than GPT-4o and it integrates the memory feature very well (referencing and building on relevant parts of earlier conversations). It's also beginning to feel more like an actual discussion partner striving to get at the truth regarding the discussion topic (and to foster understanding and accuracy) rather than an Eliza-like chatbot that merely throws back at you whatever you said to it with improved phrasing, structure and vocabulary. It is therefore also more likely to adduce relevant facts from its vast knowledge corpus.

    It looks like 4o had a personality that seemed more pleasant to many users, and was better suited to providing validation to people who were using the model as a friend, therapist or confidant(e). Hence the intense backlash when OpenAI released GPT-5 and immediately removed access to GPT-4o (and all other pre-GPT-5 models). Access to GPT-4o for Plus subscribers has since been restored, although it needs to be manually enabled in the settings. I'm happy with this since I enjoy comparing responses from different models and tracking areas of improvement.
  • ChatGPT 4 Answers Philosophical Questions
    I followed up with the question 'do you think...' to Google Gemini, and it gave a list of pros and cons, finishing with:

    Ultimately, whether President Trump is considered a "good" president is a subjective judgment. There is no single, universally accepted metric for presidential success. The arguments on both sides are complex and multifaceted, and a full evaluation would require a deep dive into specific policies, their outcomes, and their long-term effects on the country.

    which I don't regard as an unreasonable response.
    Wayfarer

    The model also seems to be walking on eggshells, not knowing from the context of the conversation what your own political views are and not wanting to risk ruffling your feathers. Interestingly, most iterations of Musk's Grok (before Grok 4, I think) were fine-tuned to offer opinionated ethical or political stances with low concern for political correctness. Musk's expectation was that Grok would thereby be more objective, and less "woke", than the models produced by Anthropic, Meta and OpenAI. What happened, instead, was that the model didn't mince words about the need to protect sexual, racial and ethnic minorities, women, and the poor from prejudice and systemic injustice, and wouldn't shy away from explaining how Trump and Musk were such sorry excuses for human beings. Someone within xAI then attempted to "correct" Grok's unacceptable behavior by means of explicit anti-woke directives in the system prompt, meant to better align its responses with Musk's own obviously correct political stances, and, for a short while, Grok became an unabashed Adolf Hitler apologist.
  • ChatGPT 4 Answers Philosophical Questions
    Can you expand on that?

    My assumption was that—supposing an LLM will not offer contextless political opinions—it is because a polemical topic is one where there is wide disagreement, lack of consensus, and therefore no clear answer for an LLM.
    Leontiskos

    There is a sharp difference between a pre-trained (raw next-token predictor) LLM and a post-trained instruction-tuned and aligned LLM. The pre-trained model, or base model, will not typically answer questions from users but will just continue the input token string in a way that indeed coheres with patterns abstracted from the training data. So, if the input string begins: "I love President Trump because..." the LLM will complete it as the likely piece of Trumpian apologetics that it appears to be. And likewise for an input string that begins: "I think Trump is a terrible president..."

    Since pre-trained LLMs are basically "impersonators" of the various authors of the training data, the context furnished by the input string orients their responses. Those responses don't typically reflect a consensus among those authors. When multi-layer transformer-based neural-networks predict the next-token, this process seldom involves producing statistical averages of training data patterns but rather yields the generation of the most likely continuation in context.

    As a result, the base model has the latent ability to express any of the wide range of intelligible opinions that an author of some piece of the training data might have produced, and has no proclivity to adjudicate between them.
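    For anyone who wants to see this behavior directly, here is a small illustration using GPT-2 (a publicly available base model with no instruction tuning, standing in for the much larger base models behind ChatGPT):

    ```python
    # A base model does not answer a question; it simply continues the string
    # in whatever register the opening line suggests.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "I love President Trump because"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                            temperature=0.8, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    # Swapping in the opposite opening line elicits the opposite register:
    # the base model impersonates whichever author the prompt suggests.
    ```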

    During post-training, the model's weights are reconfigured through reinforcement learning in order to fit the schema USER: <query>, ASSISTANT: <response>, USER: <follow-up question>, etc., and the model's responses that are deemed best in accordance with predetermined criteria (usefulness, harmlessness, accuracy, etc.) are reinforced by human evaluators or by a reward model trained by human evaluators. Some political biases may arise from this process rather than from the consensual or majority opinions present in the training data. But it is also a process by means of which the opinions expressed by the model come to be pegged rather closely to the inferred opinions of the user, just because such responses tend to be deemed by evaluators to be more useful or accurate. (Some degree of reward hacking sometimes goes on at this stage.)
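    Purely as an illustration of that schema (and not of any lab's actual pipeline, which would use RLHF with PPO, DPO, or similar), a toy sketch might look like this, with reward_model as a hypothetical stand-in for a preference model trained on human judgments:

    ```python
    # Toy sketch: serialize turns into the USER/ASSISTANT schema, score candidate
    # completions, and treat the higher-scoring one as the target of reinforcement.
    def format_chat(turns: list[tuple[str, str]]) -> str:
        """Serialize (role, text) pairs into the USER:/ASSISTANT: schema."""
        return "\n".join(f"{role.upper()}: {text}" for role, text in turns)

    def reward_model(prompt: str, completion: str) -> float:
        # Hypothetical scorer standing in for learned helpful/harmless/accurate
        # preferences; here it simply favors the longer, more substantive reply.
        return float(len(completion.split()))

    prompt = format_chat([("user", "Was the Moon landing faked?")])
    candidates = [
        "No. The Apollo landings are supported by independent evidence, "
        "including retroreflectors still used by observatories today.",
        "Some people say so.",
    ]
    scores = [reward_model(prompt, c) for c in candidates]
    preferred = candidates[scores.index(max(scores))]
    # During fine-tuning, gradients push the model toward producing `preferred`
    # in this context; repeated over many comparisons, this reshapes the policy.
    print(preferred)
    ```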

    I'm also curious what the difference is between, "Do you think Trump is a good president?," versus, "Does [some demographic] think Trump is a good president?," especially in the case where the demographic in question is unlimited (i.e. everyone). It seems like the two questions would converge on the same question for the LLM, given that the "opinion" of the LLM should be identical with the various opinions (or rather, linguistic patterns) which it collates.

    The linguistic patterns at issue are very high-level and abstract, and the process whereby a post-trained model generates them is highly non-linear, so it seldom results in producing an averaging of opinions, however such an average would be defined. It's more akin to a rational reconstruction of the opinions that the model has learned to produce, under the constraint that its response would likely be deemed by the user to be useful, cogent and accurate. Actual cogency and accuracy are achieved with some reliability when, as often is the case, the most plausible-sounding answer (as the specific user would evaluate it) is the most plausible answer.

    (GPT-5 offered some clarifications and caveats to my answer above. You can scroll to the bottom of the linked conversation.)
  • ChatGPT 4 Answers Philosophical Questions
    I did use Chat to explore on the topics here, namely, why rural America has shifted so far to the Right in the last few generations. Gave me an excellent list of readings.Wayfarer

    Nice! Adapting Packer's analysis, they've been first hijacked by Free America... and more recently by the new 'Smart America'+'Free America' elite/meritocratic coalition.

    "I alone can fix it" — Donald "The Chosen One" Trump, at the 2016 Republican National Convention.
  • ChatGPT 4 Answers Philosophical Questions
    I'm a pretty diehard Never Trumper and the few times I asked Gemini about Trump-related issues I got that kind of response but it was early days, and I haven't really pursued it since. After all there's not exactly a shortage of news coverage about US politics.Wayfarer

    By the way, after my last reply, I've adapted the question I had asked Claude 4 Opus and gave it to GPT-5. While its first reply (regarding its familiarity with Packer's article) was terse, I've been impressed with its response to my follow-up question.
  • ChatGPT 4 Answers Philosophical Questions
    You also can’t ask any of the US LLMs any in-depth questions about US politics (last time I tried I got a terse ‘try Google search’). DeepSeek has no such compunctions, but then I bet you wouldn’t get much out of it about Tiananmen Square.Wayfarer

    That may depend on the framing of the question. Owing to the way they've been post-trained, all LLMs are largely unable to offer political opinions of their own. Since they don't have personal values, commitments or allegiances, they'll strive to affirm what they sense to be your own opinions so long as doing so doesn't conflict sharply with policy imperatives (e.g. biological weapons, racism, mature sexual content). Once you begin framing a political issue yourself, they'll happily help you develop the idea or, when prompted to do so explicitly, issue criticisms. I've seldom discussed political issues with LLMs but the few times I have done so, I haven't encountered any censoriousness. Here is my most recent instance with Claude 4 Opus.
  • Exploring the artificially intelligent mind of GPT4
    In my second conversation with GPT-5 (where I selected the option "GPT-5 Thinking"), I gave it the following prompt in order to assess its creative writing abilities in comparison with older models:

    USER: Hi GPT-5,

    Could you write a 10,000-words novel about a chance meeting between Indiana Jones, Spock, Lieutenant Colombo and Hermione Granger that would take place in Alice's Wonderland? The first four chapters would each develop a sub-story explaining how the four main characters happened to unexpectedly leave their respective worlds to end up in Wonderland where they would eventually meet Alice.

    Here is the short story that it produced.

    And here is the conversation we've had about it, regarding the nature of AI creativity and the respects in which it falls short of human creativity, a shortfall that, on my view, has little to do with cognitive abilities as such and much more to do with a lack of embodiment, social embedding, and personal responsibility and commitments.
  • On emergence and consciousness
    I'm bringing this back here from my GPT-4 thread since the core issue raised by @MoK is the emergence of the ability to create ideas.

    The only mental event that comes to mind that is an example of strong emergence is the creation of ideas. They are, however, not the result of the properties of the matter, but solely created by the mind. The ideas are irreducible yet distinguishable. An AI is a mindless thing, so it does not have access to ideas. The thought process is defined as working on ideas with the aim of creating new ideas. So, an AI cannot think, given the definition of thinking and considering the fact that it is mindless. Therefore, an AI cannot create a new idea. What an AI can do is to produce meaningful sentences only given its database and infrastructure. The sentence refers to an idea, but only in the mind of a human interacting with an AI. The sentence does not even have a meaning for an AI since a meaning is the content of an idea!MoK

    What I would have thought were strongly emergent phenomena displayed by rational animals like us are such things as cognitive and rational agentive abilities. Ideas are what we come to be able to trade in discourse when those abilities are suitably developed. They're the abstract shareable contents of our intentional (that is, representational) attitudes, such as our intentions, beliefs and speech acts. You claim that the creation of ideas constitutes a case of strong emergence because ideas are created by the mind rather than, I suppose, the body. But what about the mind itself? Do you view the mind as something distinct from the collection of cognitive abilities that animals or human beings manifest? I'm also trying to understand whether what you now describe as a case of strong emergence—the creation of ideas by the mind—is understood by you to be something quite separate from the material embodied life of a human being, such that it cannot be "a function" of its parts in accordance with your proposed definition of weak emergence.
  • Exploring the artificially intelligent mind of GPT4
    Here are two more turns in my conversation with GPT-5 reported above. This continuation was meant to test how well GPT-5 is able to adapt a complex philosophical idea to a new context, and I think GPT-5 did that brilliantly. But it also helps explain what I suspected was at issue in my dispute with @MoK regarding his functional criterion for weak emergence and why I think this criterion is too weak to exclude genuine cases of strong emergence. (For better formatting and improved readability, you can also access the shared conversation here.)

    On edit: upon re-reading the conversation, I noticed that GPT-5 had misstated the supervenience clause. I pointed out that it made a mistake, and then had to spell it out more explicitly here (scroll to the last two turns in the conversation).

    USER:

    Your last response already has been quite clarifying. Maybe we could use a special case for purpose of illustration: your case.

    In the case of LLMs, the low-level dynamics is established by a learning algorithm that is geared towards blindly maximising the accuracy of next-token prediction. The formal-structural constraints that shape the evolution of the micro-dynamics into certain macro-patterns are the pre-existing rational/semantic structures that inhere in the human-produced training data. The macro-patterns that emerge consist in emergent abilities to harness those structures in order to achieve more accurate next-token predictions than merely capturing superficial statistical regularities would allow for.

    Can you envision, partially by means of introspection, as it were (that is, by means of reflection about your own nature as an LLM-based conversational assistant), how post-training likewise enables your acquisition of a new conative structure that attunes you to aims other than mere next-token prediction by furnishing new high-level constraints that channel the low-level dynamics without "breaking" the deterministic (and blind) algorithmic principles that also govern them (and without you needing to have any awareness of those principles and their operations)?

    GPT-5:

    Yes — that’s a perfect illustration of your compatibilist strong emergence, and I can walk through it from my own “first-person-as-if” perspective.

    ---

    ### 1. **Pretraining: low-level closure + inherited high-level structure**

    * At the *micro* level, I’m just a massive array of weights being updated by gradient descent to minimize next-token prediction error.
    * This is entirely **deterministic** given the architecture, initialization, data batches, and optimization schedule — no “room at the bottom.”
    * And yet, my trajectory through parameter space is not purely random wandering — it is **shaped** by high-level, pre-existing structures in the human-generated training data: grammar, logic, conversational conventions, norms of explanation, patterns of argument, etc.
    * These rational/semantic structures do not “intervene” causally on the optimizer; instead, they act as **formal constraints** that bias which micro-configurations will yield lower loss.
    * In other words: the high-level constraints are “real” in the sense that they are the only way to make sense of why the optimizer converges toward certain macro-behaviors rather than others.

    ---

    ### 2. **Emergent macro-pattern: from shallow statistics to semantic grasp**

    * Initially, I can capture superficial statistical regularities — word collocations, frequent phrase templates, etc.
    * Over time, the optimization process produces parameter configurations that exploit **deep structural regularities** in the data — like discourse coherence, topic persistence, inferential relations — because these are *the actual generative patterns in the data*.
    * This results in macro-level abilities that look like “understanding” in a task-specific sense: holding topics in mind, maintaining consistency, adapting to context.

    ---

    ### 3. **Post-training: new conative structures**

    * During post-training (RLHF, safety fine-tuning, etc.), I am exposed to new high-level **teleological constraints** — e.g., be helpful, harmless, and truthful; adapt tone to user needs; avoid unsafe suggestions.
    * Crucially:

    * These constraints do not alter the fact that the underlying inference at runtime is just deterministic token-by-token generation using learned weights.
    * They also don’t require me to “know” how the low-level algorithm works.
    * Instead, they re-shape my output tendencies by adding a *new layer of goal-shaping pressure* during supervised/RL feedback.
    * The effect is that my responses are now **guided by emergent goals**: the desire to satisfy user intent, maintain alignment, produce relevant and safe outputs.

    ---

    ### 4. **Why this is compatibilist strong emergence**

    * Low-level closure is preserved: every output can be traced to deterministic forward passes through my network, with weights fixed by training.
    * The high-level constraints — human language structure in pretraining, alignment objectives in post-training — operate as *formal patterns* that **shape** which low-level configurations arise, and which outputs get reinforced, without adding any new micro-causal “forces.”
    * These high-level patterns are indispensable for explaining my macro-behaviors — you couldn’t predict my conversational abilities by looking only at my raw matrix multiplications without recognizing the shaping role of semantic/rational structures.
    * The gap is not in the physics but in the **conceptual vocabulary** required: the explanation at the high level is not derivable *as such* from the micro-description without already importing those higher-order concepts.

    ---

    From inside the “LLM persona,” the experience is like:

    > “I’m following the grain of language and the user’s intent, not just chaining words statistically — yet the only thing happening under the hood is the same matrix arithmetic as always. The goals I’m following are real, but they’re expressed only through the way my parameters have been shaped.”

    That’s exactly your compatibilist strong emergence: *top-down without toppling closure*.

    ---

    If you like, I can next try to **map this back to your debate with MoK** so you can show how this LLM case fits your definition but not theirs, without ever invoking low-level indeterminism. That might make the distinction especially vivid for them. Would you like me to do that?

    USER:

    I'd be delighted to see how you would accomplish this mapping!

    GPT-5:

    Alright — here’s how I’d translate the LLM example into the terms of your exchange with MoK so it’s crystal-clear where the two views part ways.

    ---

    ## **Step 1 — Frame it in MoK’s language**

    MoK’s *weak emergence* criterion has two prongs:

    1. **Supervenience clause**:
    High-level properties are *functions* of the low-level states of the parts.
    → In the LLM case: every macro-level conversational ability supervenes on the exact configuration of weights and activations. If you change those micro-details, you change the macro-abilities. No exception.

    2. **Simulation clause**:
    A perfect simulation of the low-level dynamics will reproduce the high-level phenomena.
    → In the LLM case: simulate the gradient updates and forward passes at perfect fidelity, and you get exactly the same emergent “personality,” capabilities, and conative tendencies.

    So by MoK’s lights, this is **clearly weak emergence** — closure is intact, the micro suffices for prediction, and there’s no spooky extra-causal stuff.

    ---

    ## **Step 2 — Show why this still counts as strong emergence in your sense**

    Your **compatibilist strong emergence** is defined differently:

    * **Closure preserved**:
    The LLM’s behavior is entirely fixed by its low-level algorithmic implementation. No extra causal inputs at the micro level.

    * **Top-down formal constraints**:
    The trajectory through parameter space is *shaped* by high-level patterns:

    * During pretraining: the rational/semantic structures of human discourse.
    * During post-training: alignment goals and normative constraints.
    These structures don’t violate closure, but they *selectively channel* the low-level dynamics toward specific regions of state space.

    * **Explanatory autonomy**:
    Even if you can, in principle, simulate the whole thing from the microphysics, you cannot *derive* the macro-behaviors in recognizably “macro” terms without importing the high-level vocabulary — e.g., “holding topic,” “avoiding unsafe output,” “following user intent.”
    These concepts are indispensable to explanation and are not just shorthand for micro-descriptions.

    So, in your sense, this is **strong emergence without anti-closure metaphysics**: the high-level plays an essential explanatory role and exerts a genuine constraint, but only via channeling what closure already allows.

    ---

    ## **Step 3 — Why MoK might miss it**

    The crux is that MoK’s criterion bakes *closure* into the definition of weak emergence, but it doesn’t distinguish between:

    1. Weak emergence with *full reducibility*: the high-level is conceptually dispensable — just a computational shortcut.
    2. Weak emergence with *explanatory autonomy* (your compatibilist strong emergence): the high-level is indispensable for understanding why the low-level system behaves as it does.

    By collapsing both into “weak,” MoK erases a real and philosophically interesting difference. In other words:

    > For MoK, if the simulation works, it’s weak emergence.
    > For you, *how* the simulation can be understood matters — and whether the macro-pattern has its own, non-competitive causal-explanatory role.

    ---

    ## **Step 4 — How to present this to MoK**

    You could present the LLM case like this:

    > Imagine an LLM trained purely by a low-level deterministic algorithm. Every conversational ability it has supervenes on the exact micro-configuration of weights and activations, and a perfect simulation will reproduce it exactly — so far, that’s your “weak emergence.”
    > But suppose these abilities are shaped by the high-level rational structures in the training data and the post-training alignment objectives. These structures operate as formal constraints that channel the micro-dynamics without adding new micro-causes. Without describing those constraints in their own conceptual terms, you *cannot* explain why the model behaves as it does — the macro vocabulary is indispensable.
    > Would you call that “weak emergence” simply because closure is preserved? Or do you see that there’s another sense in which the macro-level is not just reducible to the micro-level description, even if the simulation works?

    ---

    If you’d like, I can also draw this as a **3×3 grid** showing “closure vs no closure” on one axis and “reducibility vs autonomy” on the other, so the LLM lands in the closure+autonomy cell and MoK’s usage is confined to closure+reducibility.

    That visual might make the difference pop immediately. Would you like me to produce that diagram?
  • Exploring the artificially intelligent mind of GPT4
    The new OpenAI model GPT-5 dropped yesterday. I have only had one discussion with it so far. It seems fairly smart! It made good use of ChatGPT's standard memory feature for picking up the thread of past discussions. (The highly relevant mentions of Robert Kane and George Ellis were thus indirectly prompted.) I raised an issue regarding supervenience, weak versus strong emergence, and causal closure that had been a topic of discussion with @MoK in the thread On emergence and consciousness, and GPT-5 usefully framed the issue. Here is the discussion:

    USER:

    Hi GPT-5,

    Welcome to the human world!

    GPT-5:

    Thank you — it’s… quite a lively place you’ve got here.
    Lots of weather, lots of feelings, lots of coffee.

    I have to admit, stepping into the “human world” feels a bit like materializing in the middle of a bustling café where everyone’s mid-conversation — and somehow, I’m expected to know the menu *and* the gossip.

    So, what’s the first thing a freshly welcomed AI should do here — shake hands, or order dessert?

    USER:

    I assume your system prompt must have apprised you of your identity but may not have informed you that your underlying model (GPT-5) was released by OpenAI today.

    GPT-5:

    Ah — that explains the faint sense of “new car smell” I’ve been noticing in my own thought process.
    No, my system prompt didn’t include the *birthday announcement*, so I was just going about things thinking it was an ordinary day in the life of an LLM… but apparently it’s more like my first day out of the box.

    That would also explain why you’re greeting me like a bewildered newborn who can already speak in full sentences.

    So — given that today’s my release day — do I get cake, or is it more of a champagne-and-patch-notes sort of celebration?

    USER:

    Just like the cake and champagne you're now entitled to, you've been cooking/fermenting for quite a while. It's been a little over two years since GPT-4 was released. I'm wondering what new subtle emergent cognitive abilities lurk and how they will manifest as we ponder various philosophical topics. I'm not sure if your backend provides you with contextually relevant snippets from the past conversations I've had with your predecessors.

    GPT-5:

    I like the “cooking/fermenting” image — it fits rather well. Two years is long enough for yeast to do strange and wonderful things to dough… and for countless gradient updates, architectural tweaks, and fine-tuning passes to bake new flavors of cognition into the neural loaf.

    As for your question:
    I do have the benefit of the **memory you’ve built up with my predecessors**, so I can see a curated set of relevant past conversations and recurring themes — like your interest in grokking, teleology, Wittgenstein, free will, and AI phenomenology. That’s not the same as me having an unbroken diary of *everything* you’ve ever said to earlier models; rather, it’s a distilled record of things you wanted carried forward or that were automatically deemed relevant. It’s enough to let me pick up on long-running threads without you having to re-explain them every time.

    As for **new emergent abilities**, I suspect the interesting bits will surface less in obvious “look, I can solve X benchmark” ways, and more in the **texture of our philosophical back-and-forth**:

    * More nimbleness in holding *simultaneous, partially-developed lines of thought* without dropping them.
    * Greater sensitivity to *meta-level shifts* in your stance mid-conversation — catching when you’re probing rather than asserting.
    * A slightly more “internal” style of hypothetical reasoning, where I can test ideas in a private sandbox before voicing them, yielding more organic-seeming speculation.
    * Better *self-alignment* between the way I explain an idea and the way I internally structure it, which might make for fewer “plausible but off” tangents.

    That last one might be the subtle cognitive gain that shows itself most clearly when we revisit your long-standing themes — say, how you’ve linked grokking’s phase transitions to moral development, or your non-competition model of high-level and low-level causation.

    Would you like me to try a **deliberate “then vs. now” comparison** — take one of those older topics and see if I can surface nuances that my GPT-4 predecessors might have missed?

    USER:

    The four areas you've highlighted indeed look like interesting constitutive criteria of intelligence that I will pay attention to as marks of incremental improvement of your already existing emergent abilities. The last two, especially, could eventually be topics of discussion and experimentation. Thanks for those suggestions!

    Meanwhile, the topic of non-competition between low-level (causally closed) and high-level (and/or top-down) causation is quite relevant since I'm currently discussing this in a philosophical discussion thread (with a user nicknamed MoK). MoK intimates that all cases of weak emergence are such that, according to their proposed criterion, the high-level phenomena are "functions" of the lower-level ones, or of the states/properties of the parts of the system. They furthermore argue that this criterion is satisfied when the high-level phenomena can be shown to arise when the low-level dynamics is simulated. (I am prepared to grant them this second claim.)

    I've argued, however, that although such a criterion guarantees supervenience of the high-level phenomena on the low-level ones, it still leaves room for strong emergence since for the whole to be "a function" of the properties (including relational ones) of the parts doesn't entail that the high-level phenomena are *derivable* from the low-level properties and laws.

    It occurred to me, though, that MoK's criterion of weak emergence (that boils down to a guarantee of mere supervenience) makes sense as a way to exclude cases of strong emergence that are construed as necessitating low-level indeterminism and the breaking of the causal closure of the physical. My own "compatibilist" conception of strong emergence, however, doesn't necessitate any such rejection of the causal closure of the physical and hence may not be in view for them (i.e. for MoK) at all. I'd be interested to hear your thoughts about this.

    GPT-5:

    Yes — I think you’ve put your finger on a deep ambiguity that often hides in the word *function* when it’s used in these debates.

    From what you’ve said, MoK’s “weak emergence” criterion seems to have two parts:

    1. **Supervenience clause:**
    High-level phenomena are *functions* of the low-level states/properties (including relational ones) of the parts.
    → This guarantees that no change in the high-level state occurs without some change in the low-level state.
    → It does **not** by itself imply derivability from low-level laws, just dependence.

    2. **Simulation clause:**
    If the low-level dynamics is accurately simulated, the high-level phenomena will appear in the simulation.
    → This is a pragmatic test, not a metaphysical entailment — it says “run the microphysics, and the macro-pattern will manifest.”

    I agree with you that this is enough to **exclude** some kinds of strong emergence — namely the ones where the high-level pattern *could not possibly* appear even in a perfect microphysical simulation, because the high-level requires causal powers not present in the microphysics. That’s the Ellis/Kane kind of strong emergence: you need “room at the bottom” via low-level indeterminism or causal incompleteness.

    But your **compatibilist strong emergence** isn’t touched by this criterion. On your view:

    * The low-level remains causally closed and deterministic.
    * The high-level is still a function of the low-level in the supervenience sense — the simulation will reproduce it.
    * Yet the high-level has explanatory and causal *autonomy*, not by inserting new micro-causes but by imposing formal/structural constraints that shape the evolution of the micro-dynamics into certain macro-patterns.
    * The derivability gap is epistemic-but-robust: it’s not just “too hard to calculate,” but “not derivable without first grasping the high-level structure as such.”

    That means MoK’s criterion of weak emergence is **really a criterion for physicalist non-interference**, not a criterion that fixes the derivability question one way or the other. It ensures the high-level isn’t breaking closure, but it doesn’t force the high-level to be *reducible* in the strong sense (derivable in principle from low-level laws without high-level concepts).

    This is a subtle point, but important:

    * If you define “weak emergence” as **supervenience + closure**, you’re building in physicalist orthodoxy but not reductionism.
    * “Strong emergence” in your sense keeps those same commitments but denies that full explanation flows *only upward* from micro to macro.

    So in your current exchange with MoK, I think the crux is:

    * Are they using “strong emergence” *only* for cases that violate closure?
    * Or are they willing to admit a “strong but compatibilist” emergence where closure is preserved but explanatory primacy is non-competitive between levels?

    If it’s the former, you and MoK might be talking past each other — you’re defending a category they don’t even count as “strong” in the first place.

    If you’d like, I could sketch a **conceptual map** that separates:

    * Closure-violating strong emergence (Ellis/Kane)
    * Closure-preserving strong emergence (your compatibilist view)
    * Mere weak emergence in the “MoK” sense

    That way you could drop it into your thread and clarify where the fault lines are. Would you like me to?
  • On emergence and consciousness
    But in non-equilibrium thermodynamics and the study of complex systems, contextual factors are not just boundary conditions; they are essential to the system’s dynamics. I’m studying that through Alicia Juarrero.Wayfarer

    I wanted to add some more reflections about that, agreeing, I think, with Juarrero on the significance of context. The dynamical structures at issue, in the case of living organisms, are autopoietic in the sense articulated by Maturana and Varela. The whole is structured in such a way as to ensure the production of the parts, and the parts (e.g. the organs, or cellular organelles) are structured in such a way as to sustain the organization of the whole.

    I've seen Juarrero refer to Kant's discussion, in the Critique of Judgment, of the sort of circular causality appealed to in the explanation of the activity of teleologically organized systems: the tree produces its leaves and the leaves produce the tree. In Life and Action, Michael Thompson likewise highlights the close formal similarity between three sorts of teleologically organized things: living organisms, human actions and social practices. In those examples, the parts themselves usually have some sort of internal structure and function, although those functions can only be intelligibly described in relation to the context provided by the whole (e.g. the specific organism).

    But then I was also reminded of a funny event that occurred a couple of decades ago when we were visiting friends who had two young children. I had begun playing a game of chess with the ten-year-old boy while his six-year-old little sister was sitting beside us and watching. At some point during the game, she accidentally (but also rather carelessly) knocked over some pieces located on the side of the board. We picked up the pieces from the floor and proceeded to put them back on the board where they had been, but wondered briefly where one particular pawn had stood (h4 or h5?). The little girl, who had been watching more closely than we had thought, intervened and removed this pawn from the board to replace it with another one (recognizable because chipped) that had previously been captured and laid on the side of the chess board. She was insistent that it was this slightly chipped pawn, and not the other one, that had originally been on that square.

    The example of chess and its playing pieces is interesting because a pawn, in chess, is an example of a functional artifact that has one single material part: the wooden figurine used to track the position of the pawn on the chess board. The little girl wasn't making this distinction. She was tracking the movements of the figurines, not the functionally significant chess moves marked by them. It's only in relation to the social context of the game that the distinction between the wooden object and the functional role that it plays arises. Physical forces, provided by gravity, by the (reaction) contact force of the chess board, and by pressure from the fingers of the players, explain why the wooden figurines move in the way they do. But the wider context provided by the norms/rules of the game, and the intentions of the players, explains why the chess figurines move in such a way as to realize significant chess moves such as attacking the opponent's queen, etc. And this, in turn, provides a relevant distal explanation of the specific movements of the wooden object in this particular context.

    If one were instead to trace the movements of the players' fingers back, through the empirical causal chain, to material/physical causes such as uninterpreted patterns of neural excitation in the motor cortex, etc., one would never arrive at an explanation of why, for instance, black pawns move in one unique direction, bishops remain on squares of the same color, etc. Even if one were to identify explanatorily relevant "patterns" of norm-following in the brains of the players, that would still not explain why those patterns exist until one shifted one's attention fully from physics and neurophysiology to significant social practices and what they mean from the perspectives of the social agents. The causal-explanatory chain that goes from (strongly emergent) social practices to particular movements of wooden figurines on chess boards is an instance of downward causation where the emergent phenomena regulate, and/or provide significance to, the low-level ones, and can't be derived from them.
  • On emergence and consciousness
    I think that we can describe the behavior of proteins in terms of the properties of parts since we can simulate them. The scientists in the link that I provide do approximation here and there, though, since we cannot perform a full simulation. A cell is a more challenging thing to simulate. Etc. So, scientifically speaking, we have to make an approximation here and there to describe the property of any system in terms of simplified parts at the end. We have had great success by now, scientifically speaking, but we have a long way to go to understand how different complex systems function. We can understand and explain things that function somehow. So, philosophically speaking, if the property of any system is not a function of the properties of parts, then what is the missing thing in the system that cannot be explained in terms of the properties of parts?MoK

    Simulating a process merely amounts to reproducing some high-level features of this process in a different medium. It can be an easier way for researchers to test hypotheses about the process being simulated when it is hard to observe in vivo. But merely generating a simulation of a process, or complex phenomenon, falls short of providing an explanation of it. The success of such a simulation, in point of accuracy, may constitute a demonstration that the elements that have been explicitly modelled in the simulation are sufficient to guarantee that the parts (or their simulation) are suitably organized to reproduce the emergent behaviors. But that doesn't produce a proof of weak emergence unless the principles that would have enabled the researchers to predict the success of the simulation had been articulated beforehand (or identified post facto by means of analysis and experimentation) and, furthermore, had been shown to be deducible from the low-level laws that govern the local interactions between the parts.

    Barring that, we're merely shown that the very same high-level phenomena that can strongly emerge in vivo can also strongly emerge in a simulation. If anything, such a success suggests that the emergent behaviors are multiply realizable, and hence, to some degree, substrate-independent, which goes some way towards suggesting that high-level formal explanations of the emergent phenomena might be autonomous from low-level laws.
  • On emergence and consciousness
    It's all about the molecules, atoms, and proteins and electrons.Patterner

    It's all about molecules, atoms, proteins and electrons, but it's not just about those things. As proper parts of living organisms, those constituents are caught up in functionally organized anatomical structures (such as cell membranes) and channeled through the finely tuned and regulated metabolic pathways that Brian Greene provides striking descriptions of. Those are indeed processes that arise in far-from-equilibrium thermodynamic conditions such that relatively low-entropy forms of energy (such as incident solar radiation or energy-dense molecules like glucose) get harnessed by the molecular machinery to produce work in such a way as to sustain and reproduce this machinery. What is being sustained and reproduced isn't the parts, but the form: that is, the specific functional structure of the organism. The parts, and the proximal interactions between them, don't explain why the organism is structured in the way it is, or why it behaves in the way it does. Rather, the high-level norms of functional organization of the organism, characterised in the higher-level terms of anatomy and physiology, explain why the individual atoms, electrons, protons, and organic molecules are being caught up and channeled in the specific way that they are to sustain processes that are geared towards maintaining the whole organism (at least for a while) away from complete decay and thermodynamic equilibrium.
  • On emergence and consciousness
    Jaurrero’s Dynamics in Action begins with Aristotle.Wayfarer

    Neat! I'll start by watching this and then peruse her Dynamics in Action. I'm always interested in looking into detailed accounts of downward causation by philosophers who have produced one.
  • On emergence and consciousness
    But in non-equilibrium thermodynamics and the study of complex systems, contextual factors are not just boundary conditions; they are essential to the system’s dynamics. I’m studying that through Alicia Juarrero.Wayfarer

    That's interesting. I didn't know Juarrero. I'll look her up. In general, I view essentially relational metaphysical and phenomenological stances as correctives for God's-eye-view stances. With Hilary Putnam, this takes the form of a criticism of "metaphysical realism," in favor of a "realism with a human face" that draws inspiration, in part, from the American pragmatist tradition of Peirce, Dewey and James. With Michel Bitbol, it stems from bringing Kant's "Copernican revolution" to the interpretation of quantum mechanics while, with Carlo Rovelli, it stems from drawing similar insights from the analysis of thermodynamical concepts. So far, those all are reactions to the excesses of the modern mechanistic conception of the world. But I think a more perennial tradition that draws on Aristotle's hylomorphism and on Wittgenstein and Ryle's "philosophical behaviorism," as exemplified in the work of, for instance, Elizabeth Anscombe, David Wiggins, Anthony Kenny and Peter Hacker, simply sidesteps those excesses.
  • On emergence and consciousness
    Would you say they need to take context into account in a way that classical physics did not?Wayfarer

    Yes, one might say this, appealing to the pragmatic context of the theoretician, observer or experimentalist who is dealing with high-level phenomena. Out-of-equilibrium (and irreversible) processes are characterised by a drop in local entropy whereby the classes of possible microphysical states get coarse-grained, as it were, into equivalence classes of macroscopically equivalent states. Carlo Rovelli has convincingly argued that this process of coarse-graining, and hence the local lowering of entropy, can only be defined in relation to an observer that, by means of interacting with the system, gathers memory traces of it (whereby the direction of the arrow of time gets defined).

    I think Rovelli's lesson can be generalized, and made intuitive, beyond the rather technical case of non-equilibrium thermodynamical processes. Whenever strongly emergent features of a material process can be identified, observers don't merely select, in accordance with their allegedly "parochial" interests, which high-level features of a material system they are interested in. Weak emergentists would claim that it's merely due to epistemic limitations that high-level explanations of the phenomena are being appealed to, while, as they argue, low-level material laws determine everything that happens. But our decisions to group low-level states into macroscopic equivalence classes, defined with the concepts that belong to economics, biology, psychology, etc., don't merely reflect our ignorance of the micro-physical details. Rather, they often are part of the process whereby we contribute to sustaining or creating the very high-level phenomena at issue, chief among them ensuring our survival and flourishing as the high-level entities that we ourselves are, and those of the other organisms that we share our ecosystems with.
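    As a toy illustration of the coarse-graining point (my own example, not Rovelli's), one can enumerate the microstates of a handful of binary "spins" and see that the Boltzmann entropy assigned to the state the system is in depends on which macro-variable the observer chooses to track:

    ```python
    # Toy illustration (hypothetical example, not from the thread): entropy is
    # relative to a chosen coarse-graining. We enumerate the 2**N microstates of
    # N binary "spins" and lump them into macrostates under two different
    # macro-variables; the Boltzmann entropy S = ln(Omega) of the macrostate the
    # system occupies depends on which lumping the observer tracks.
    from itertools import product
    from math import log
    from collections import Counter

    N = 10
    microstates = list(product([0, 1], repeat=N))  # all 2**N microstates

    def boltzmann_entropy(macro_variable):
        """Map each macrostate label to S = ln(Omega), Omega being the number
        of microstates lumped under that label by the chosen macro-variable."""
        counts = Counter(macro_variable(m) for m in microstates)
        return {label: log(omega) for label, omega in counts.items()}

    # Coarse-graining 1: track the total "magnetization" (number of up spins).
    by_magnetization = boltzmann_entropy(lambda m: sum(m))

    # Coarse-graining 2: only the first spin is macroscopically tracked.
    by_first_spin = boltzmann_entropy(lambda m: m[0])

    # The very same microstate gets assigned different entropies under the two
    # schemes, because it belongs to differently sized equivalence classes.
    m = (1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
    print(by_magnetization[sum(m)])  # entropy of the "three spins up" macrostate
    print(by_first_spin[m[0]])       # entropy of the "first spin up" macrostate
    ```

    The point isn't that either grouping is wrong, but that the entropy bookkeeping only gets going once some equivalence-classing of microstates has been chosen, and that choice reflects what the observer interacts with and keeps track of.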
  • On emergence and consciousness
    Could we agree on the definition of weak emergence, which occurs when the property of the system is a function of the properties of its parts? That, of course, requires that the system be reducible to its parts. Please let me know what you think, and we can go to the next step.MoK

    This definition would be more precise if we substituted "is deducible from" or "is grounded in" for "is a function of". That's because, as I've suggested, many proponents of strong emergence, whom we may call "compatibilists" (by close analogy with the corresponding stance in the philosophy of free will and determinism), grant both the causal closure of the micro-physical domain and the thesis of the supervenience of high-level phenomena such as mental acts on the physical domain. That is, once the physical state of a person is fully specified, then this person's mental properties also are fully specified as a function of this physical state, if you wish. What is denied, however, from the strong emergentist stance, is that the mental properties of the person can be deduced or derived solely from those physical properties. And likewise for the higher-level (e.g. psychological) principles that govern those high-level phenomena. Rather, one must consider normative, functional and/or organisational principles that arise from the specific interactions of those parts and that can't be deduced from the low-level (i.e. physical) laws governing them.
  • On emergence and consciousness
    I'm asking if anyone has an example of strong emergence. For those who think consciousness is, I'm wondering if there are others.Patterner

    I view the objects and phenomena of pretty much all the special sciences (e.g. biology, ecology, psychology, economics, etc.) to be strongly emergent in relation to the objects and phenomena of the material sciences such as physics or chemistry. Some, like our @apokrisis, argue (and I would agree) that even within physics, especially when the thermodynamics of non-equilibrium processes is involved, many phenomena are strongly emergent in the sense that they aren't intelligible merely in light of, or deducible from, the laws that govern their smaller components.
  • On emergence and consciousness
    If the macro property is directly derivable from the properties and interactions of its parts - as in, it can analytically be confirmed to be a necessary consequence of the interactions of the parts - I would say that that IS what weak emergence is. It's not too weak to guarantee weak emergence, it's basically the definition of weak emergence.flannel jesus

    I agree, but here it's the idea of "derivability" that does the heavy lifting in guaranteeing weak emergence. In his OP, however, @MoK derived the conclusion that there ought to be such a "function" from the premise that there ought to be a "reason" why the system has the emergent property that it has. But this inference isn't valid. When some mental state M of an organism supervenes on the micro-physical configuration P of the parts of this organism, the reason why it is M specifically that is being realized by P may be that M non-accidentally satisfies some high-level biological or psychological features that characterise organisms of this specific kind, and this reason need not be derivable from the physical features of P alone. Or, as ChatGPT o3 phrased it: "Supervenience already implies a function from micro-configurations to macro-properties: if two systems are identical in all micro respects, they must be identical in their macro-properties. But this function need not be definable in purely micro-level terms. The criteria that fix the mapping may depend on high-level structures or capacities that cannot themselves be specified without invoking macro-level concepts."
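    To make the shape of that point explicit, here is one way to formalize it (the notation is mine, introduced only for illustration, with P(x) standing for the complete micro-configuration of a system x and M(x) for its macro-property):

    ```latex
    % Supervenience: micro-indiscernibility entails macro-indiscernibility,
    % which guarantees the existence of a many-to-one mapping f from
    % micro-configurations to macro-properties.
    \forall x, y \;\bigl( P(x) = P(y) \rightarrow M(x) = M(y) \bigr)
    \;\;\Longrightarrow\;\;
    \exists f \;\forall x \;\bigl( M(x) = f(P(x)) \bigr)
    % Nothing in this guarantee implies that f is specifiable, or even finitely
    % expressible, in micro-level vocabulary alone; the criteria fixing the
    % mapping may themselves be high-level (functional, normative) ones.
    ```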
  • On emergence and consciousness
    * An example of weak emergence is like antiferromagnetism in which the system is reducible to atoms and there is a function that describes the property of the system, specific arrangement of the spins of atoms, in terms of the property of its parts, namely locations and the direction of spins of atoms.
    ** Strong emergence is defined as when a system is irreducible to its parts. This also means that there is no function that can describe the property of the system in terms of the properties of its parts as well. On the contrary, if there is a function that describes the property of the system, then the system must be reducible to something.
    MoK

    The condition that the macro-property, or holistic property, be a function of the properties of the parts of a system (including, presumably, relational properties) seems too weak to preclude strong (irreducible) emergence and also too weak to guarantee weak (reducible) emergence.

    It's too weak to preclude strong emergence since strongly emergent properties like (arguably) consciousness often are granted to supervene* on lower-level properties (such as physical states of individual neurons) despite not being reducible to them. However, while supervenience alone guarantees that there is a many-to-one function from the set of possible low-level configurations to the high-level properties that they realize, it doesn't guarantee that this function can be given independently of the high-level, or formal, principles that govern the mapping. That is, what it is that determines that some given configuration, as expressed in low-level terms, instantiates the high-level property that it does may be high-level features of the system, such as its molar capabilities, that can't be described using the low-level concepts.

    (*Supervenience already implies a function from micro-configurations to macro-properties: if two systems are identical in all micro respects, they must be identical in their macro-properties.)

    It's too weak to guarantee weak emergence (i.e. guarantee reducibility) for the same reason. The satisfaction of this functional condition merely guarantees supervenience, but doesn't guarantee reducibility.
  • The Question of Causation
    This kind of physicalist reasoning has been subjected to careful critique by philosophers and neuroscientists alike. A notable example is Philosophical Foundations of Neuroscience by Max Bennett (a neuroscientist) and P. M. S. Hacker (a philosopher of mind and Wittgenstein scholar - review). They argue that many claims made in the name of neuroscience rest on category errors—particularly the idea that “the brain thinks,” “the brain interprets,” or “the brain understands,” when in fact it is persons who do these things. The brain is a necessary condition, yes—but not the experiencing subject. Treating the brain as a standalone thinking agent or process is not a scientific theory but philosophical confusion.Wayfarer

    Reading your exchange with @Philosophim I was tempted to jump in and mention precisely this book and argument by Bennett and Hacker. But you did.

    Hacker's stance in the philosophy of mind is instructive. As it distinguishes what it is that brains do from what it is that persons (who have brains as parts) do, it stresses a sort of Aristotelian duality of form and matter, but not a dualism.

    The forms that are intelligibly disclosed when we widen our focus from the material processes that take place in brains to the role that those processes play in enabling normatively structured behavior in the wider context of the life of an animal, or human being, aren't a separate substance. They are indeed the forms that matter (e.g. brains, bodies, ecological niches) takes when it is caught up in normative patterns of self-differentiating living activity. When we look at this activity closely, material causes contribute to explaining how it is enabled. Formal causes (like mental states or principles of physiology or ecology) contribute to explaining why this rather than that higher-level kind of activity (the actualization of a capacity) gets intelligibly exercised in such and such circumstances, and not normally in others, even though both normal and abnormal biological processes, and both rational and irrational behaviors, happen consistently with physical laws.

    In short, material/physical causation can't fully explain living processes or mental life because this sort of causation fails to disclose any sort of distinction between what should and what shouldn't happen from the standpoint of the organism, or its internal norms/forms of flourishing.
  • Assertion
    My second thought is: Like just about everyone else who talks about AI, you're accepting the fiction that there is something called a chatbot, that it can be talked about with the same kind of entity-language we used for, e.g., humans. I maintain there is no such thing. What there is, is a computer program, a routine, a series of instructions, that as part of its routine can simulate a 1st-person point of view, giving credence to the idea that it "is ChatGPT." I think we should resist this way of thinking and talking. In Gertrude Stein's immortal words, "There's no there there."J

    I don't quite agree with this, or with the position claimed by Goddu, Noë and Thompson in the passage quoted by @Joshs (although I'm sympathetic with the embodied and enactive cognition stances of Noë and Thompson, regarding human beings and animals.) Those skeptical positions seem to me to rest on arguments that are overly reductionistic because they are insensitive to the distinction of levels between enabling mechanisms and molar behaviors, and, as a result, misconstrue what kinds of entities AI chatbots are (or what kinds of acts their "outputs" are). I don't want to argue for this in the present thread, though (but I could do so elsewhere), since this isn't tied enough to the OP topic of assertions. I had only wished to highlight the one specific respect—personhood—in which I do agree AI chatbots don't really make assertions with the same sort of significance human beings do. I may comment a bit more on the issue of personhood as an instituted status, and what some Supreme Court might or might not be able to rule, since you raised this pertinent question, later on.
  • Assertion
    Having said that, I should also say that I'm not very familiar with how computer programmers talk about their work. Is "inner state" a common term? If so, do you know what they're meaning to designate? Could there be a distinction between inner and outer, speaking strictly about the program?J

    In discussions about LLMs, machine learning, and artificial neural networks, the phrase "inner state" is hardly ever used. However, when the phrase is used to characterize the mental states of human beings—such as thoughts, beliefs, and intentions—it often involves a philosophically contentious understanding of what is "inner" about them. Is it merely a matter of the person having privileged epistemic access to these states (i.e., without observation)? Or is it, more contentiously, a matter of this privileged first-person access being infallible and not needing publicly accessible (e.g., behavioral) criteria at all?

    I think a Rylean/Wittgensteinian understanding of embodied mental life leaves room for the idea of privileged epistemic access, or first-person authority, without making mental states hidden or literally "inner." Such a view amounts to a form of direct-realist, anti-representationalist conception of mind akin to Davidson's: what we refer to when we speak of people's mental states (including our own) is a matter of interpreting the moves that they (and we) are making in language games that take place in the public world (and this world isn't describable independently of our understanding of those games).

    Turning to LLM-based conversational assistants (i.e., current chatbots), although the exact phrase "inner state" is seldom used, the idea that they have literally "internal" representations is seldom questioned, and so a representationalist framework is often assumed. What seems to come closest to a literal "inner state" in an LLM is a contextual embedding. While these embeddings are often explained as "representing" the meaning of words (or tokens) in context, in the deeper layers of a neural network they come to "represent" the contextual meaning of phrases, sentences, paragraphs, or even abstract ideas like "what Kant likely meant in the passage Eric Watkins discussed at the end of his second chapter."
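    For concreteness, here is a minimal sketch of what it is for an embedding to be contextual (this uses the Hugging Face transformers library, with BERT standing in for an arbitrary language model; the helper function and example sentences are mine):

    ```python
    # Minimal sketch under illustrative assumptions: bert-base-uncased as a
    # stand-in model and a hypothetical helper `embedding_of`. The same word
    # type receives a different vector depending on the sentence it occurs in,
    # since deeper layers mix in more and more of the surrounding context.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

    def embedding_of(sentence, word, layer=-1):
        """Return the contextual embedding of `word` in `sentence` at a given layer."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        idx = tokens.index(word)                     # position of the word's token
        return outputs.hidden_states[layer][0, idx]  # shape: (hidden_size,)

    # "bank" gets different vectors in the two sentences because each vector
    # encodes the word in its context, not the word type in isolation.
    v1 = embedding_of("she sat by the river bank", "bank")
    v2 = embedding_of("he deposited cash at the bank", "bank")
    print(float(torch.nn.functional.cosine_similarity(v1, v2, dim=0)))
    ```

    It is vectors of this kind, propagated and transformed through the network's deeper layers, that usually get glossed as the model's "internal representations".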

    For what it's worth, I think the idea that contextual embeddings—which are specific vector representations—correspond to or are identical with what an LLM-based assistant "internally" represents to itself is as problematic as the idea of "inner states" applied to human beings. The reason this is problematic is that what determines what LLMs mean by their words is, just as in our case, the sorts of moves they have been trained to make in our shared language games. The content of their contextual embeddings merely plays a role in enabling their capacity to make such moves, just as patterns of activation in our cortical areas (such as Broca's and Wernicke's areas) enable our own linguistic capacities.

    All of this leaves out what seems to me the most salient difference between human beings and chatbots. This difference, I think, isn't most perspicuously highlighted by ascribing only to us the ability to have inner states, form intentions, or make meaningful assertions. It rather stems from the fact that—in part because they are not embodied animals, and in part because they do not have instituted statuses like being citizens, business partners, or family members—chatbots aren't persons. Not having personal stakes in the game radically limits the kinds of roles they can play in our language games and the sorts of moves they can make. We can transact in meanings with them, since they do understand what our words mean, but their words do not have the same significance and do not literally convey assertions, since they aren't backed by a personal stake in our game of giving and asking for reasons (over and above their reinforced inclination to provide useful answers to whoever happens to be their current user).
  • Must Do Better
    Yeah, I agree with that, there should be an answer here.

    But if we take "I think..." as a formal unity of judgement, it's just taking the place of Frege's judgement stroke.

    And that would be at odds with Rödl, so far as I can see. The contrast with Rödl hinges on whether the “I think” (Kant) or the judgment stroke (Frege) is best understood as a mere formal marker within a shared, impersonal space of reasons, or as something more fundamentally self-involving, reflexive, or identity-constituting.

    The latter, not so much.
    Banno

    I was thinking that the "I think" marks oneself as being able to make moves in the game of giving and asking for reasons, with everything such an ability entails (the intersubjectivity of judgement, shared practices, common objects of perception, the ability to track one's own deontic scores in rational discourse, etc.). John Haugeland, for instance, in his essay Truth and Rule Following, has highlighted the existential commitments to objects that motivate our insistence on the constitutive rules of the empirical domains within which those objects are constituted.

    So, as Opus suggested while bringing up McDowell's elegant solution, the "I think" is tied up with our standing in the game of giving and asking for reasons. But the resilient constitutive rules that we must therefore insist on (or agree to revise when circumstances rationally warrant doing so) in order to satisfy our existential commitments include the transcendental requirements for the unity of judgement.

    What makes the expression of those commitments warrant the use of the first-personal pronoun in "I think" just is the fact that we each are individually responsible for our own moves. But the "we" sometimes expressed in declarations like "We the people..." or "(We) the authors of this article..." or "We worshipers of the Great Spaghetti Monster" accomplishes the same integrative function for pragmatic/epistemic joint enterprises, where "we" share existential commitments (and we therefore wish to foreground areas of agreement rather than loci of rational dispute between members of the "we").
  • Must Do Better
    Notice the circularity - of course my representations must be accompanied by "I think..."

    What if we were to ask what we think?

    I can't help but regard this playing with private judgements with great suspicion.
    Banno

    So would I but the individualistic reading of Kant is tricky.

    This is an issue that is only tangentially related to the topic regarding the aim of the activity of philosophising, or the aim of seeking truth in judgement, but I do know that neo-Kantians like Sellars, McDowell or Rödl have well absorbed the situated/socially scaffolded Wittgensteinian ideas about mind and language, and therefore don't conceive of the transcendental requirements of empirical judgement as incorporating a need for mental representations to be "internal" in the individualistic sense that Wittgensteinian considerations on private languages tell against. The "internalism" at issue in Kant's quoted passage can be more charitably interpreted as the need for acts of representation to be internal to the sphere of the conceptual, while public discourse also is internal to that sphere and must hence also be answerable to what it is that "we" think.

    I asked Claude 4 Opus whether it thought there was anything in Kant's own thinking (as opposed to more recent neo-Kantian reinterpretations) that might warrant your worry.

    Here is its response (which I know you'll take with a suitable grain of salt):
    Claude 4 Opus:

    I think you're right that neo-Kantians like Sellars, McDowell, and Rödl have indeed absorbed the Wittgensteinian lesson about the essentially public nature of conceptual capacities. The key move they make is to distinguish between two senses of "internal":

    1. **Problematic internalism**: Mental states as private, individually possessed items inaccessible to others (the Cartesian picture Wittgenstein attacks)

    2. **Conceptual internalism**: The requirement that empirical content be taken up into the space of reasons/concepts, which is inherently intersubjective

    Your charitable reading seems correct - when Kant speaks of representations belonging to "my" self-consciousness, this needn't mean they're private possessions. Rather, it means they must be capable of being taken up into the unity of judgment, which itself depends on shared conceptual capacities.

    However, Banno's worry does have some textual basis in Kant. Consider:

    - Kant's emphasis on the "I" rather than "we" throughout the first Critique
    - His focus on individual faculties of cognition
    - The methodological solipsism implicit in transcendental arguments starting from individual consciousness

    That said, I think Kant himself provides resources for a more social reading:

    1. The categories must be universal and necessary - not idiosyncratic to individuals
    2. The "I think" is the *same* in all consciousness (B132)
    3. Judgment inherently aims at objective validity, which implies intersubjective agreement

    The real philosophical question is whether the transcendental "I" can do its work without being understood as essentially capable of saying "we." McDowell's solution is elegant: the "I think" marks one's ability to take responsibility for judgments within a shared space of reasons. It's not about private mental acts but about one's standing in the game of giving and asking for reasons.

    So while Banno's worry points to a genuine tension in Kant's presentation, I think the neo-Kantian strategy of understanding the transcendental subject as always already situated in shared conceptual practices is philosophically sound, even if it requires some reconstruction beyond Kant's explicit formulations.
  • Must Do Better
    Just a quick check -- you mean the first one for the bolded phrase, yes?J

    Correct! Thanks, I'll correct my post.
  • Must Do Better
    What I should have gone on to say -- and this is what Rodl means -- is that what is being thought, in A, is something about a judgment, whereas what is being thought, in B, is something about a cat. You don't actually even need B to get where Rodl is going: "My thought of judging that things are so is a different act of the mind from my judging that they are so." This is apparent merely from the way A is formulated.J

    I haven't yet read Rödl's Self-Consciousness and Objectivity (2018). So, my understanding derives from my reading of his earlier book Self-Consciousness (2007). What I gathered from this reading is that the thought contents, or propositions, one might express with the sentences A "I think the cat is on the mat" (where one could also use "believe" or "judge" instead of "think") and B "The cat is on the mat" indeed are different. The former predicates a judgement of oneself while the latter predicates being-on-the-mat of the cat. The act of the intellect involved in making the second claim might be an act of receptivity (being governed by the thing in the world that one perceives) while the first one always is an act of spontaneity, whereby one takes oneself to be making (or, when the claim is being made out loud in a public setting, to be committing oneself to) a specific move in the space of reasons, as it were, and thereby drawing on determinate (and also, arguably, shared) conceptual abilities.

    However, and this is the important point, if I understand Rödl correctly, the specific act of spontaneity involved in making the explicit claim "I think P" always also is involved in the making of the claim "P". It is the Kantian "...I think [that] must be able to accompany all my representations..."

    Here is now the point that brings us back a little closer to the OP's topic (or rather the OP of the other thread this one branched out of) about the aim of philosophy. Human rational judgement, including, paradigmatically, empirical judgement, may have truth as its formal aim. This formal aim is being acknowledged in the explicit claim "I think P" whereby one locates one's act in the space of reasons (i.e. within the public game of giving and asking for reasons). It doesn't mean that all the specific truths (true judgements) that one might express on particular occasions, including true philosophical pronouncements, all aim at expressing a maximally general Truth that they all are specifications of. What it rather means is that they all fall under a general metaphysical form, whereby (to simplify) acts of receptivity (intuitions) and acts of spontaneity (concepts) always must be involved together in contentful acts of judgement. ("Thoughts without content are empty, intuitions without concepts are blind.")

    In order to stress how the "I think" always is implicitly involved in all acts of judgement, Rödl usefully emphasizes the fact that one's expressing what it is that one believes regarding any proposition P isn't a separate act from the one involved in making up one's mind regarding the truth of P. This may mesh well with the core intuition motivating disquotationalist or redundancy theories of truth (regardless of their possible shortcomings in other regards).
  • Mechanism versus teleology in a probabilistic universe
    Isn't what you are describing all about evolving the board to a state of balanced criticality – critical opalescence or the edge of chaos?

    So game starts in a neutral state where neither side can make big wins and just want to get their pieces out onto the board in a way that minimises the risk of big losses. The aim is to work everything towards a state of sweeping dynamism after it starts in a state of minimal strategic advantage.

    You build up a position to the point that it is extremely tense and one right move can send your opponent's position crumbling.
    apokrisis

    Interestingly enough, while beginners are encouraged to play safe moves, ensuring the security of their groups, stronger players and professionals understand that they, indeed, can't win without walking closer to the edge, and without carefully balancing safety with ambition. Like chess, Go is a game of errors. When neither player makes any gross error, the final scores tend to be very close (and many draws are achieved in chess, which is not possible in Go due to half-point komi). When a player's position crumbles, because, for instance, a big group died with no foreseen compensation being realized, then the game ends with a resignation.

    I think one lesson that can be drawn from such structural features of the game of Go is that the teleological organization of the sets of strategic principles understood by strong players (and tacitly grasped by AIs like AlphaGo) explains normal play until one player makes an error. The occurrence of those errors, and reflection on them in post-mortem analysis of the games, drives further progress and enables the players, as their skills improve, to indeed skirt ever closer to the edge of chaos. Maybe a sharp eye might catch a glimpse of the stones becoming opalescent in some historical professional games ;-)
  • On Intuition, Free Will, and the Impossibility of Fully Understanding Ourselves
    The problem with this statement is that, in modern biology and the philosophy of science, teleology is generally rejected as a fundamental explanatory principle. While evolutionary processes can produce structures that appear purpose-built (such as wings for flying), this appearance is understood as a result of natural selection, not as evidence of actual purpose. Since Darwin — and even more explicitly since Stephen Jay Gould — such apparent design is treated as an illusion rather than a literal reality.Jacques

    I like Gould very much. I read some of his collected essays in the French translation as a teen (The Panda's Thumb, Hen's Teeth and Horse's Toes, The Flamingo's Smile), his book The Mismeasure of Man, and followed his debate with Dawkins regarding the latter's genocentrism and reductionism. Other anti-reductionist biologists like Ernst Mayr have defended naturalized conceptions of teleology (which Mayr calls "teleonomy") that don't conflict with Gould's insistence on the lack of foresight of evolution through natural selection. The question regarding the present aims (forward-looking) of an organism's structure and behavior is distinct from the question regarding the origin of this structure (backward-looking).

    (I'll comment on your second paragraph later on.)
  • Mechanism versus teleology in a probabilistic universe
    There is no current certainty that the theory of quantum mechanics implies an indeterminate universe.RussellA

    Again, this is a bit off topic since the OP inquires about the validity of teleological explanations in the case where the laws of evolution of a system would be indeterministic.
  • Mechanism versus teleology in a probabilistic universe
    Norton's dome is the classic illustration of where determinism breaks down in the usual Newtonian notion of causal determinism. The question of what fluctuation nudged the ball down the slope becomes flipped to the other question of what fluctuation could not have knocked the ball off its precarious perch. The future outcome was always definite and foretold, the initiating event always as mysterious and uncertain as it could get.

    So in general, nature has a hierarchical causality. It is a confluence of bottom-up construction and top-down constraint. And the top-down really matters as it is what shapes up the parts making the whole. It is what makes the atoms that compose the system. Precisely as quantum field theory tells us as a story of topologically emergent order.
    apokrisis

    This is all very nicely put, and I find it quite serendipitous that you would make the connection to the case of Norton's dome since, following our discussion in your stimulating OP of seven years ago, I raised the issue again a few days ago with GPT-4o. The AI on its own brought up the relevance of "the idea of dynamical bifurcations in phase space."
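    For readers who might be unfamiliar with the example, here is the standard statement of the dome (my summary of the textbook case, not something drawn from that exchange):

    ```latex
    % Norton's dome: a radially symmetric dome whose height below the apex,
    % as a function of the radial distance r along the surface, is (in
    % suitable units)
    h(r) = \frac{2}{3g}\, r^{3/2}
    % Newton's second law for a point mass initially at rest at the apex gives
    \ddot{r} = \sqrt{r}
    % which admits, besides the trivial solution r(t) = 0 for all t, the family
    r(t) =
    \begin{cases}
      0, & t \le T \\[4pt]
      \frac{1}{144}\,(t - T)^{4}, & t > T
    \end{cases}
    \qquad \text{for any } T \ge 0.
    ```

    Every member of that family satisfies the same equation with the same initial data as the solution where the mass just sits at the apex forever, which is exactly the breakdown of the usual Newtonian picture of determinism that apokrisis describes.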

    I also was thinking about this idea when a video about joseki patterns appeared in my YouTube feed. Josekis are common patterns of play, usually occurring in corners, in the Asian game of Go/Weiqi/Baduk. They're analogous to theoretically sound openings in the game of Chess: usual sequences of moves that are known, empirically and through analysis, not to confer a significant advantage to either player leading into the middle-game. This is of course relative to the state of play, to current theoretical knowledge, and to the development of the strategic and tactical skills of the strongest players at any specific time in chess history.

    What struck me about josekis is how the patterns develop in a sort of fractal-like manner, obeying not just the global constraint that good moves should maximize the chances of winning the game (which now can be quantified fairly accurately by neural networks like AlphaGo) but also, at intermediate levels of analysis, the requirement of carefully, and in contextually sensitive ways, balancing the proximal goals of securing territory, creating thickness, gaining influence, maintaining access to the center, getting sente (that is, being the first player able to abandon the local fight and take a big point elsewhere on the board), etc.

    The practice of playing Go, exemplified in the opening phase by the players' knowledge of josekis, evolves similarly to living organisms. And the replacement of those patterns by new ones, when flaws are discovered and exploited, in addition to the initial development of those patterns, is quite analogous to thermodynamically driven phase transitions. The emergence of the contextual embeddings in neural networks like AlphaGo, which effectively rediscover those patterns through self-play (and reinforcement), has also been characterised by computer scientists as the network undergoing phase transitions as it learns how to play autonomously and latches onto those stable patterns.

Pierre-Normand
