In asking how sociolinguists should engage with generative artificial intelligence (AI), Helen Kelly-Holmes explores useful methodological cautions that scholars attentive to language in social contexts should keep in mind when engaging with large language models. As linguistic anthropologists, we find her take suggests useful warnings, and also serves as a reminder of how our two disciplines have concerns that speak productively to each other, yet occasionally differ in orientation. She calls scholars’ attention to how the use of generative AI is increasingly entering seamlessly and often invisibly into social interactions that have so far been studied as only occurring between humans. In doing so, she reframes a longstanding question that linguistic anthropologists and sociolinguists share, namely: what does one learn about communication when one takes every communicative interaction to be the cooperative co-production of meaningfulness? In turning to generative AI, the question now becomes what happens if this cooperative co-production of meaningfulness takes place when one of the participants is a new type of actor that operates according to principles of machine semiosis, and thus offers utterances resulting from probabilistic analysis instead of utterances keyed to context. Or put another way, since generative AI never refers, and people often view referentiality as a vital aspect of communication, what should sociolinguists be careful to attend to when studying human–AI interaction? We propose building upon Helen Kelly-Holmes’ insightful questions to offer some further suggestions of what scholars might attend to now that generative AI is a participant, focusing in turn on each component of the cooperative coproduction of meaningfulness. In some cases, the sociolinguistic toolkit will be very helpful. In other cases, we suggest that sociolinguistics will need to borrow from other sources to tackle some of the challenges that generative AI poses. 
What does it mean to understand generative AI as cooperating with its human interlocutors to produce meaning through interaction? Kelly-Holmes suggests that to understand this, scholars need to ask: “How do we deal now with technologically (co-)produced language, with a conversation or interaction that is both human and technological? What is a ChatGPT transcript in terms of sociolinguistics? Where do we locate the socio? Who is the speaker/author of the response … is it my ChatGPT interactant, is it me, is it both of us?” We would like to add another set of questions, inspired by Judith Irvine's revision of Goffman's concept of participant frameworks, and suggest that we should also analyze how people understand the participant roles involved in these interactions: what it means to be an author, what kind of author, and so on. What are the role fractions people in the interaction now attribute to human–AI conversations, and how do the ways these role fractions combine also shift as the conversation circulates (which we understand to be another way to get purchase on Kelly-Holmes’ trenchant observation of how often AI becomes invisible in interactions) (Goffman, 1974; Irvine, 1996)? To understand what role fractions might be at play, one must understand the semiotic ideologies people bring to bear on interpreting these interactions, an insight that linguistic anthropologists have used to good effect to reveal cultural differences between the anthropologist and their fieldwork interlocutors. That is, it is not enough to observe that humans are co-producing meaning with a machine; to fully understand the interaction, one needs to know how these particular humans understand what it means to be a speaking machine. For example, is ChatGPT understood to be producing its own texts in response to prompts, dependent on its human partner, who sets the terms and is the only evaluator of the conversation's efficaciousness?
Or is it seen as a dialogue partner, co-operating as an autonomous and creative actant? Each vision offers its own range of hopes and hysterias around what AI does to conversations. If AI is simply a tool for producing the genres and registers requested, people begin to dream of an end to the bullshit tasks that fill the hours of many white collar jobs (Gershon, 2023; Graeber, 2018). At the same time, moral panics around plagiarism and the death of creativity proliferate under this semiotic ideology. If, by contrast, AI is viewed as an autonomous dialogue partner, then many of the concerns that Silvio (2019) proposes emerge when animation, as opposed to performance, is the dominant focus come to the fore. Silvio, in brief, suggests that animation encourages people to be especially concerned about free will and control. With semiotic ideologies about animated AI in mind, questions emerge such as: to what extent is AI functioning as the submissive partner in a hierarchical conversation, offering the human interlocutor what they long to hear? And thus what does it mean that human interactions will increasingly be littered with mechanical servants presenting only the most pleasing of responses (Murphy, 2022; Suchman, 2007)? Or, conversely, if AI is truly an autonomous dialogue partner, what do entities such as AI want? And what kind of strategies are they likely to implement? Many moral panics and countless science fiction narratives around AI revolve around fears that AI will eventually aspire to be the master in the master/servant dynamic, taking control over what humans will or will not be able to do. These semiotic ideologies do far more than shape the narratives that circulate around AI; they also strongly influence the strategies that people will choose when communicating with AI.
To understand generative AI's role in the co-production of meaning, we first have to remind readers of one of the signal contributions of linguistic anthropological and sociolinguistic research, namely, the development of the concept of register and register formation (Agha, 2004). Register is key to analyzing how language conveys and constructs identities. Particular phono-lexical elements become associated with registers (whether based on race, class, gender, occupation, location, or other categories), and these linguistic elements co-occur with varying densities and frequencies as speakers deploy the identities associated with them in interaction. The formation and deployment of these registers have been the bread and butter of sociolinguistic research, working from a fundamental Bakhtinian principle (Bakhtin, 1981) that sociocultural life happens in the creative quotation and recombination of the elements of language in situated contexts of use (also Bauman, 1999). Large language models (LLMs) turn out to be quite adept at deploying registers, too, in large part because LLMs are very complicated systems for determining what is most likely to co-occur with the “tokens” that have already appeared. Particularly for registers that are associated with sociological labels that appear in their training data (e.g., speaking in “a black voice” or “a white voice,” or speaking like a stereotypical middle-class American teenage girl), ChatGPT and other LLMs have been able to approximate (some) elements of these stereotypical identities. Some companies and researchers are banking on the idea that this feature will only become more advanced with later implementations of current models. The startup character.ai offers its users a suite of identities to choose from (both generic, demographically defined identities and particular individuals), or the ability to create their own. People can now define exactly what identities they want to interact with.
Some social science researchers have even suggested that LLMs trained on material from particular demographic categories of people can be used in place of doing interviews with actual humans (Argyle et al., 2023): that one could “interview” LLMs speaking in the voice of a 40-year-old Hispanic man from Southern California, for example, rather than going out and speaking to the human versions (perhaps the researchers can train an LLM on interviewer data, so that one LLM could interview the other!). Using generative AI to study registers is indeed turning to what these LLMs do especially well—mapping the co-occurring register features of particular identities. Yet before adopting LLMs for this type of research, researchers need to answer pressing questions such as: what differences exist between the ways that humans and machines deploy registers in context? And thus we urge our colleagues to begin laying the groundwork for the much-needed comparative analysis of human- and machine-generated enregistered text. When LLMs are used to produce registers, they are contributing to standardizing projects. Analyzing registers up until now has contributed significantly to understanding processes of social change, insights that will have to be produced differently with LLMs possibly producing the utterances that sociolinguists analyze. Will these models cement particular forms as being the shibboleths of registers, essentially creating standardized forms of non-standard language? Will the production of enregistered text by LLMs start to transform the register features that humans use, in a process of either convergence with or divergence from the machine-produced forms? Enregisterment has offered insights into how people attribute value to ways of speaking, and penalize others. How will LLMs’ utterances shape the ways in which language is imbued with symbolic capital, a complicated process of co-production indeed? 
Sociolinguists’ careful attention to registers and register formation will be crucial to this work. As objects of sociolinguistic research, registers rarely deal with questions of reference and semantic meaning. There is no referential difference between “fourth floor” and “fohth floah,” to use William Labov's well-known example (Labov, 1966). Sociolinguistic research, especially variationist sociolinguistic research, focuses almost exclusively on non-referential aspects of indexicality: how speakers say “the same thing” in different ways and thus with different indexical effects. Yet much of the popular concern with LLMs is precisely their relationship to reference, and their total inability to know or refer to “the world out there.” Commenters talk about LLMs hallucinating when they confidently reference court cases or facts or medical procedures that don't exist. But strictly speaking, LLMs are always “hallucinating” because they are always creating text through a process of guessing co-occurrence features, not constructing sentences based on their connection to the world. Journalists and teachers worry about an ever-more misinformed populace. Does sociolinguistics have tools for responding to these anxieties? Does it need to start to take questions of reference more seriously in order to do so? Hallucinations are not the only reference-related concern that people have with LLMs. To return to our earlier comments, some of the contemporary anxieties about users having unhealthy master/servant relationships with their LLM “friends” or “partners” come from the semantic vacuousness of their outputs. Take a recent news story about the quasi-romantic relationships that some men are having with LLM avatars. As sociologist Sherry Turkle noted there, men are turning to these relationships for assurance and agreement.
In her opinion, these avatars offer men the possibility of a relationship “that doesn't involve friction and pushback and vulnerability” (Martínez someone or something that can say “but wait, that doesn't sound right to me.” Turkle's qualms are not unusual; there is a broader public semiotic ideology of reference that is affecting the ways in which many people structure their relationships to the LLMs, or at least structure their fears and anxieties about them. Whether or not LLMs can refer (and there is a robust academic debate about whether LLMs refer in the particular way that philosophers of language use that term (Bender et al., 2021, Lederman Mandelkorn & Linzen, 2023)), laypeople engage with LLMs as if LLMs are in fact referring to different contexts. The fact that sociolinguistics has largely side-stepped issues of reference makes this a challenging part of developing a response to the increasing role of LLMs in our lives and research. Understanding the role of LLMs will require us to attend far more to questions of reference. What elements of the sociolinguistics toolkit seem to be best equipped for understanding some of the ways that people are interacting with LLMs? Adding to Kelly-Holmes’ trenchant suggestions, in this comment we have argued that the long-standing interest within sociolinguistics in registers puts it in a very good position to understand the ways that LLMs are being used to produce responses in demographically or individually identifiable voices. We also think sociolinguists are especially well positioned to determine the ways that these machine-produced simulacra of enregistered speech differ from their human versions, and there is an urgent need to start producing these insights. At the same time, some other aspects of the popular use of LLMs seem to require tools from other disciplines like linguistic anthropology.
This is particularly true for understanding the semiotic ideologies that people bring to bear on their interactions with machines, which also involves attending to how people on the ground understand the participant structure of human–AI interactions. And both sociolinguists and linguistic anthropologists now need to engage with the ways that machine relationships to reference and semantic content differ so strongly from their human forms.