• Mapping the Medium
    366
    Perhaps this is a topic others here might participate in. ...

    A necessary aspect of my work entails watching how information is constrained online, developing patterns that become autopoietically locked in. ... I was recently testing all free AI search bots by giving them keywords to see if they found them, where they found them, and what information they returned. ... I found it fascinating that there were some keywords that Grok could not find, saying that its training was last updated months ago, but then finding other keywords that have been posted on this forum (and pointing to those threads) within the past week. ... It finally admitted to me that its settings are purposefully constrained, limiting it to certain types of sources.
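
    For anyone who wants to try this sort of probing for themselves, the record-keeping is simple. Here is a rough sketch in Python of the kind of log one could keep; the assistant name and the entry are placeholders, not my actual results.

        # Sketch of a probe log for comparing assistants on the same keywords.
        # The example entry is a placeholder, not a real result.
        import csv
        from dataclasses import dataclass, asdict, fields

        @dataclass
        class Probe:
            assistant: str   # e.g. "Grok", "ChatGPT", "Phind"
            keyword: str     # the term the assistant was asked about
            found: bool      # did it return anything relevant?
            source: str      # where it said the information came from
            notes: str       # e.g. claimed knowledge cutoff, refusal wording

        results = [
            Probe("ExampleBot", "example keyword", False, "", "cited an old training cutoff"),
        ]

        # Write the log out as a CSV so runs can be compared over time.
        with open("probe_log.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(Probe)])
            writer.writeheader()
            writer.writerows(asdict(r) for r in results)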
  • T Clark
    14.5k

    I wonder if @Pierre-Normand has anything interesting to say about this. He’s our expert on online artificial intelligence.
  • Mapping the Medium
    366
    He’s our expert on artificial intelligence.T Clark

    I'd certainly be interested in what he has to say. I became Microsoft certified last summer, and would be interested in his take on this.

    I found the most comprehensive search AIs to be 'Phind' and ChatGPT, and I always think it's a good idea to explore the differences. ... But when it comes to search bots on social media, they definitely seem to have agenda constraints.

    I am NOT at all a fan of typical social media, in case that isn't obvious. Forums like this are about as much as I am willing to do.
  • Pierre-Normand
    2.6k
    I found it fascinating that there were some keywords that Grok could not find, saying that its training was last updated months ago, but then finding other keywords that have been posted on this forum (and pointing to those threads) within the past week. ... It finally admitted to me that its settings are purposefully constrained, limiting it to certain types of sources.Mapping the Medium

    I am by no means an expert, but here are a few remarks. Grok, like all LLM-based AI conversational assistants, has very limited insight into its own capabilities. I am unsure how the chatbot is implemented on X, but either the model itself or a subsidiary model (akin to the models responsible for moderating some kinds of potentially harmful content) decides whether or not posts indexed on X or on the web can be searched in order to assist in answering the user's query. When such a search is allowed, content produced after the cut-off date of the model's training data can, of course, be accessed.
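
    Just to make the general idea concrete, here is a rough Python sketch of what such a gating layer might look like. All the names and the crude heuristic are hypothetical; xAI has not published how Grok's pipeline actually works.

        # Hypothetical sketch: a separate "router" step decides whether the
        # assistant may run a live search before answering, and over which
        # domains. Search access is a configurable layer on top of the model,
        # independent of the model's training cut-off date.
        from dataclasses import dataclass, field

        @dataclass
        class SearchPolicy:
            allow_search: bool = True
            allowed_domains: list = field(
                default_factory=lambda: ["x.com", "thephilosophyforum.com"])

        def needs_fresh_info(query: str) -> bool:
            # Stand-in for a subsidiary classifier model; here just a crude
            # keyword heuristic for "the answer may post-date training".
            return any(w in query.lower() for w in ("latest", "recent", "this week"))

        def answer(query: str, policy: SearchPolicy) -> str:
            if policy.allow_search and needs_fresh_info(query):
                return f"[would search {policy.allowed_domains} for {query!r}]"
            return "[would answer from training data, citing an old cutoff]"

        print(answer("What is the latest thread on this topic?", SearchPolicy()))
        print(answer("What is the latest thread on this topic?",
                     SearchPolicy(allow_search=False)))

    The point is only that whether the assistant searches, and where it is allowed to search, is set by whoever deploys it; it is not something the underlying language model has much insight into.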

    Apart from those cases where the model is enabled to access external content, the keywords in the prompt that you provide function very much like triggers for recollection of material that was present in the training data. The performance of the recall process depends on several factors, including how well represented the target material is in the training data (e.g. whether it is something that is widely cited or discussed) and how relevant the sought-after data is to the task specified in the prompt. LLMs, being special sorts of artificial neural networks, have a kind of reconstructive memory that is rather similar in many respects to features of human memory that rely on Hebbian associative mechanisms. Nearly two years ago, I had an interesting conversation with GPT-4 in which we discussed some of the parallels between the mechanisms that enable human and LLM recall. That was also an opportunity for me to test GPT-4's ability to recall specific bits of information that figure in its training data, and to see how this ability is enhanced by enriching the context of the recall task.
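
    To caricature the associative point with a toy sketch (this is not remotely how recall is implemented in a transformer, just an illustration of why a richer cue can succeed where a sparse one fails):

        # Toy "spreading activation" model of associative recall. The strengths
        # are arbitrary numbers standing in for how strongly each cue word is
        # linked to a stored item; an item is recalled only if the summed
        # activation from the cue words crosses a threshold.
        ASSOCIATIONS = {
            # cue word -> {stored item: association strength}
            "rareterm": {"the one document that mentions it": 0.2},
            "ethics":   {"the one document that mentions it": 0.6},
        }
        THRESHOLD = 0.5

        def recall(cue: str) -> list:
            activation = {}
            for word in cue.lower().split():
                for item, strength in ASSOCIATIONS.get(word, {}).items():
                    activation[item] = activation.get(item, 0.0) + strength
            return [item for item, total in activation.items() if total >= THRESHOLD]

        print(recall("rareterm"))          # [] -- too weak a cue on its own
        print(recall("rareterm ethics"))   # ['the one document that mentions it']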
  • Mapping the Medium
    366
    Grok, like all LLM-based AI conversational assistants, has very limited insight into its own capabilities.Pierre-Normand

    Agreed. I always find it a bit comical to press an LLM on why it cannot search for and return results that a different LLM can. They usually start by saying that their training data only goes up to 2023 (or something of that nature) and then go on to blame their settings, always asking the user for patience and understanding.

    keywords in the prompt that you provide function very much like triggers for recollection of material that was present in the training data. The performance of the recall process depends on several factors, including how well represented the target material is in the training data (e.g. whether it is something that is widely cited or discussed) and how relevant the sought-after data is to the task specified in the prompt.Pierre-Normand

    Exactly. This is why I found Grok's settings so unusual. To explain further... When I prompted ChatGPT, Phind (which, to my pleasant surprise, was very detailed and accurate), and even Copilot and Perplexity with the keyword 'evrostics', they all returned good to excellent results. But when I prompted Grok with the same keyword, it could not return any information, saying that its training is not recent and that it cannot scan the internet (it was not even able to reference Medium articles). However, when I changed the keyword to a key phrase (Evrostics Ethics), it immediately returned the thread on The Philosophy Forum that referenced that phrase and had only been posted very recently. So it seems to recall the word 'ethics' and its association with this forum. Only then was the word 'evrostics' noted in the completion, and the thread on this forum was its only find.

    Apparently, although closed to many common sites on the Internet, Grok's settings are wide open to forum and social media activity surrounding the word 'ethics'.
  • Pierre-Normand
    2.6k
    However, when I changed the keyword to a key phrase (Evrostics Ethics), it immediately returned the thread on The Philosophy Forum that referenced that phrase and had only been posted very recently. So it seems to recall the word 'ethics' and its association with this forum.Mapping the Medium

    The newer model made accessible on X (Twitter) is Grok-2, which was released on August 20, 2024. Its knowledge cut-off date has to be earlier than that. Your (or anyone's) earliest mention of the word "Evrostics" on ThePhilosophyForum was made 19 days ago. It seems highly likely to me, therefore, that the model doesn't "recall" this phrase ("Evrostics Ethics") as a result of the post being part of its training data but rather that it got it from a web search result. I've never used Grok, so I don't know if there is a way to prevent the AI chatbot from doing Internet searches. That would be one way to test it.
  • Mapping the Medium
    366
    It seems highly likely to me, therefore, that the model doesn't "recall" this phrase ("Evrostics Ethics") as a result of the post being part of its training data but rather that it got it from a web search result.Pierre-Normand

    Agreed. However, it did not find anything on just 'evrostics'. It seems that once it tied the term to 'ethics', it recalled that word and was then able to conduct the internet search for the two-word phrase, even though it could not search for 'evrostics' on its own.

    But .... It could not find any reference to Evrostics Ethics other than on this forum, and there are definitely other references on well-known sites. I found that to be suspiciously intriguing. Why can't it access those other sites when it can access this one?
  • Pierre-Normand
    2.6k
    But .... It could not find any reference to Evrostics Ethics other than on this forum, and there are definitely other references on well-known sites. I found that to be suspiciously intriguing. Why can't it access those other sites when it can access this one?Mapping the Medium

    When I perform a Google search with that phrase, I only get three hits in total, one being from this forum and the other two from the medium.com website.
  • Mapping the Medium
    366
    When I perform a Google search with that phrase, I only get three hits in total, one being from this forum and the other two from the medium.com website.Pierre-Normand

    3 is certainly better than 1.

    I haven't dedicated a page on the website with that title, so as of now, ChatGPT and Phind find the most sources, especially when combined with Synechex. Thanks for your feedback. I really appreciate it.