
Theory of Mind in ChatGPT

Does ChatGPT have a “theory of mind”?

The concept of Theory of Mind (ToM), long studied in humans and animals, has also been applied to Large Language Models (LLMs) such as ChatGPT. The extent to which these artificial systems replicate ToM has become a critical area of exploration. Recent studies have begun to probe this question, shedding light on how LLMs simulate mentalistic reasoning and how their performance compares with that of humans.


In their 2023 study, Marchetti and colleagues explored the fascinating question of whether ChatGPT, OpenAI’s conversational AI, can exhibit a ToM. The study highlights that ChatGPT demonstrates an impressive capacity to navigate classic ToM tasks. It successfully completed tests like the Sally-Anne task (Wimmer & Perner 1983) and second- and third-order false-belief tasks (Perner & Wimmer 1985; Valle et al. 2015). These challenges often require recognising that others can hold beliefs that differ from reality, a fundamental ToM milestone. Furthermore, ChatGPT performed well on more complex tasks, such as interpreting ambiguous scenarios in Strange Stories (Brunet-Gouet et al. 2023) and detecting faux pas[1] (Gregory et al. 2002), where subtle conversational cues reveal hidden meanings.


Despite these achievements, ChatGPT’s reasoning often strays into hypermentalisation—over-interpreting or ascribing undue intent to characters' actions. For instance, in retelling the Sally-Anne story, ChatGPT added an unnecessary and unsubstantiated accusation: Sally suspects Anne of stealing her marble. This error underscores the model’s occasional detachment from realistic human interpretations.


One particularly intriguing aspect of the study is ChatGPT’s unpredictability. Like humans, it sometimes changes its “mind” when questions are rephrased, but unlike humans, these shifts often feel arbitrary. This behaviour reflects its reliance on surface-level linguistic cues rather than deep, consistent conceptual understanding. While unpredictability is a known issue in LLMs (Mitchell et al. 2022), it becomes especially significant in tasks requiring nuanced reasoning about mental states.


Ultimately, while ChatGPT demonstrates a surprising capacity to simulate ToM, it remains a fundamentally different phenomenon from natural human ToM. Its limitations remind us that ToM in humans is shaped by embodied experiences, cultural exchanges, and developmental processes that AI cannot yet replicate. As Marchetti et al. (2023) conclude, understanding and refining the interaction between natural and artificial ToM is vital for fostering transparent, reliable, and meaningful communication between humans and AI systems.


‘Does ChatGPT have a typical or atypical theory of mind?’  

Attanasio et al. (2024) examined the abilities of ChatGPT-3.5 and ChatGPT-4 in performing affective and cognitive ToM tasks. These included the Advanced ToM Test and the Emotion Attribution Task, with results compared with those of individuals with typical development (TD) and high-functioning autism spectrum disorder (ASD). The findings revealed that ChatGPT-3.5 and ChatGPT-4 could infer mental states with remarkable accuracy, though ChatGPT-3.5 struggled with tasks requiring more complex reasoning, such as understanding persuasion or third-order ToM scenarios. For instance, in the “Persuasion” story, ChatGPT-3.5 provided a verbose and ambiguous response that failed to recognise deception, a challenge also faced by many individuals with ASD (Mazza et al. 2022).


In contrast, ChatGPT-4 achieved higher overall accuracy and performed comparably to TD individuals on cognitive ToM tasks. Yet, it still faced difficulties with recognising negative emotions like sadness and anger, paralleling challenges observed in ASD populations (De Marchena & Eigsti 2016). Despite their high accuracy, both LLMs exhibited a verbose and mechanical conversational style, violating Grice’s maxims of cooperative communication (Grice 1975; Marchetti et al. 2023). This style often resembled that of individuals with high-functioning ASD, characterised by redundant or overly detailed responses. The study highlights the potential of LLMs as a “bridge” for understanding and possibly supporting individuals with ASD, though their ability to simulate nuanced human ToM remains limited.


Strachan et al. (2024) extended this exploration by comparing the performance of GPT models and LLaMA2 with that of 1,907 human participants on a comprehensive battery of ToM tests. These assessments included tasks such as understanding false beliefs, interpreting indirect requests, and detecting irony and faux pas. The results demonstrated that GPT-4 matched or exceeded human performance on most tasks, particularly in identifying false beliefs and indirect requests. However, GPT-4 struggled with detecting faux pas, a test on which LLaMA2 outperformed both the GPT models and humans.


Interestingly, follow-up manipulations revealed that LLaMA2’s success in faux pas detection may reflect a bias toward attributing ignorance rather than genuine inferential superiority. GPT-4’s errors, in turn, were attributed to a hyperconservative reluctance to commit to conclusions rather than to an inability to reason about mental states. This nuanced pattern of performance underscores the importance of systematic testing to avoid superficial comparisons between human and artificial intelligence, as well as the potential for LLMs to mimic human-like reasoning in specific contexts.


Implications and Limitations

The broader implications of these findings were contextualised by Kosinski (2023), who noted the potential influence of training data on the ToM-like behaviour of LLMs. Exposure to false-belief tasks during training may enhance the models’ performance in specific scenarios, raising questions about the distinction between learned patterns and genuine inference capabilities. Additionally, the difficulty LLMs face in attributing emotions, especially negative ones, highlights the complexity of recognising and understanding emotions, which requires the integration of verbal, non-verbal, and socio-cultural cues. As Attanasio et al. (2024) observed, the tendency of LLMs to confuse emotions with physical states or to provide overly verbose explanations mirrors certain patterns seen in ASD populations.

Although LLMs demonstrate capabilities akin to human reasoning in cognitive ToM tasks, their inability to fully replicate the communicative clarity and emotional nuance of human interaction remains a significant limitation.


The exploration of ToM in LLMs reveals a fascinating intersection between artificial intelligence and human cognition. As researchers continue to refine these models and test their capabilities, the role of LLMs as tools for supporting people who face social and cognitive challenges, especially in clinical contexts, will likely grow. However, these advances must be grounded in ethical and rigorous testing to ensure their utility and reliability.

___________________________

[1] ‘faux pas’ – an awkward social mistake; a social blunder or indiscretion
