Linguistic Justice and GenAI

Language and Linguistic Justice

Linguistic justice is an anti-racist approach to language and communication (Baker-Bell, 2020; CCCC, 2020). Over the last several years, scholars whose work focuses on linguistic justice, raciolinguistics, and translingualism in literacy education have called for increased attention to how the academy privileges one dialect of English, but leaves little room for alternative ways of meaning making, especially with respect to those dialects practiced by people of color and other minoritized communities.

Language is one of the key ways we communicate our inner experiences (our thoughts, ideas, visions, experiences, memories, etc.) with others. How we use language allows us to be recognized as a member of a community, and to assert our belongingness in that community. Each language and dialect has its own history of use, its own history of tying people together through social, political, and economic happenings to specific communities (Gee, 2015). And because of these histories, all language practices are tied to social, economic, and political power (Gee, 2015; Street, 1984). Language can be a sword that asserts social power, or a shield that protects shared identities and resists hegemonic power (Collins and Blot, 2003).

One of the fundamental ideas underlying linguistic justice is that no single dialect is better than another. The popular assumption about language is that each named language (e.g., Chinese, English, French, German, Japanese, etc.) is a monolithic entity that can be associated with a nation state. But language is more complicated than that. For example, the closer you get to the French/German border, the more French sounds like German and the more German sounds like French. The spoken language of northern England is quite different from the spoken language of London. Conceptualizing English as a stable entity that exists in the world overlooks the complexity of English in practice. Assumptions that monolithic languages exist in the world as unitary entities are inaccurate (Canagarajah, 2021; Horner and Alvarez, 2019), for we are always negotiating language with others, and languages don’t exist outside the instances in which they are practiced, spoken, listened to, written down, read, etc. (Pennycook, 2010). The concept of English as a term to describe language is more like a trendline that unifies a variety of dialects than a stable boundary that determines whether language practices are inside/outside of its domain.

All language practices are related to power and ideology (Collins and Blot, 2003; Gee, 2015; Street, 1984). Over time, some dialects get entrenched in social institutions as the standard, dominant language of power. People assume that standard English is the correct English (“fix your grammar”), and that it is a “neutral” dialect available equally to everyone (Davila, 2016). But standard English has its own history: it is one dialect of many practiced around the world, and it emerges from the histories of white, upper-middle-class northern Americans. As such, it is a dialect tied to colonial power throughout the history of the United States, a power that persists in our modern institutions of government, justice, education, and capitalism (Flores and Rosa, 2015). What we call standard English (including academic English) is perhaps better understood as White Mainstream English (Baker-Bell, 2020). It is rooted in white cultural practices and white-dominated social institutions. However, it has no special claim on clarity of thought or sophistication of expression, including in the sciences (Lerner, 2018; Poe, 2013).

Just like power in society at large, power is unevenly credited and recognized across dialects of English. Linguistic justice approaches to language education seek to highlight the distribution of power and make access to power through language more equitable. Unsurprisingly, the languages practiced by people of color are often marginalized in social interactions. No language or dialect is inherently better than another at communicating ideas or experiences. When we claim that there are hierarchical differences, and when we assert dominant power in language by racializing or minoritizing one dialect over another, we are often projecting other ideologies of power onto the language itself. The title of linguistic anthropologist Jonathan Rosa’s book sums up the power dynamics of language: Looking like a Language, Sounding like a Race (Rosa, 2018). Racializing or otherwise minoritizing the value (“appropriateness”) of a dialect is a function of readers and listeners projecting their own ideologies onto the language (Rosa and Flores, 2017). The attitudes we bring to bear on others’ language practices shape whether we honor their language or treat that language with prejudice (McKinney and Hoggan, 2022; Young, 2010).

Language differences aren’t reducible to issues of aesthetics, for language differences give rise to differences in identity and differences in ways of acting, thinking, feeling, and being in the world (Gee, 2015). Students have the right to use their home dialects in the meaning-making process (Pimentel, 2021; Young, 2010). Teachers and scholars have therefore demanded a concerted effort to teach language and writing in ways that support minoritized students’ language practices (see “This Ain’t Another Statement! This is a DEMAND for Black Linguistic Justice!”; CCCC, 2020). White Mainstream English does not reflect the identity, experiences, or cultural values of all Americans. Our students have the right to choose how they want to communicate their ideas with others, both in the classroom and in the workplace.

Problems Posed by GenAI

So how is GenAI related to the issue of linguistic justice in American higher education? In short, the language reified by GenAI is the language of White Mainstream English. The training sets used to develop GenAI platforms primarily consist of internet-based language developed by mostly younger and mostly white writers. The text generated by GenAI collapses style into a vanilla discourse. Like Wonder Bread, it is a generic discourse stripped of the enriching features of language that make communication a human cultural practice. All writing is an expression of social subjectivity. As the widely influential linguist and literacy theorist James Paul Gee (2015) has explained, we are always working to be recognized as a person playing a certain role (engineer, doctor, artist, accountant), doing a specific socially recognizable act in and through language. However, when we write in and through GenAI, “writing as a practice of subjectivity is modulated by the computational unconscious,” which is largely predicated on abled, cisgender, heteronormative, male, and white discourse (Robinson, 2022).

While ChatGPT and other autonomous writing technologies are trained on massive data sets, as linguists Bender et al. (2021) explain, “Size Doesn’t Guarantee Diversity” (p. 613), for current GenAI training practice “privileges the hegemonic viewpoint. In accepting large amounts of web text as ‘representative’ of ‘all’ humanity we risk perpetuating dominant viewpoints, increasing power imbalances, and further reifying inequality” (p. 614). The language models that power the algorithms to generate natural streams of language are primarily constituted by language created by younger, more affluent, white, western communities. Moreover, efforts to detect GenAI text are biased against non-native English writers (Liang et al., 2023).

Certainly, texts generated by communities other than young, western, white communities have already been introduced into the training models, but their voices are underprivileged in the models due to a lack of sufficient incoming and outgoing links to the material (Bender et al., 2021). Although some have called for changing how we train GenAI (Nee et al., 2022), there is no indication at this time that GenAI will be able to reflect the variant discourses shared by communities of color and marginalized groups that often occupy the public sphere in opposition to the hegemonic power of White Mainstream English. Where is the GenAI for Black English? Rural southern English? Spanglish? These are all dialects practiced daily by people doing meaningful, intelligent, rigorous work through language and writing, but the technology excludes the possibility for these dialects to be incorporated into the meaning-making process.

Furthermore, the filtering protocols applied to the training data can reduce the presence of counter-cultural terms. Such terms may be associated with knowledge domains excluded by the organizations that own and operate the algorithms (e.g., pornography), even though alternative communities have remediated them to carry different meanings, as with the term “twink” (Bender et al., 2021, p. 614). The result of these dynamics can be an echo chamber that limits the kinds of words and knowledge permitted into the stream of AI-generated text, as well as the types of discourse accessible to users of GenAI.

The hyper-reduction of difference inherent in GenAI algorithms represents an important limit to the possibilities for our students to create language that represents their diverse experiences in the world. When we work with GenAI, we are writing to/with the algorithm to generate useful texts. However, even as we are working to engineer prompts that result in helpful streams of language, “Algorithmic audiences curate us. They determine what we see on our screen and when we see it” (Gallagher, 2020, p. 4). Difference in language and writing is foundational to meaning making (Lu and Horner, 2013). Every utterance is an act of difference-making as we build from our prior knowledge, memories, perceptions, and creative energies to organize the world into some kind of meaning that connects us to each other and allows us to engage in meaningful social interactions. GenAI certainly offers powerful tools for generating language, but for many of our students, the language produced by the algorithm may well limit the potentials for meaning and eloquence available to them.

References

Baker-Bell, April. (2020). Linguistic justice: Black language, literacy, identity and pedagogy. Routledge.

Bender, Emily M., et al. (2021). “On the dangers of stochastic parrots: Can language models be too big? 🦜” Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), 610–623. Association for Computing Machinery.

Canagarajah, S. (2021). “Diversifying academic communication in anti-racist scholarship: The value of a translingual orientation.” Ethnicities, 14687968211061586.

Collins, James, and Blot, Richard. (2003). Literacy and Literacies: Text, Power, and Identity. Cambridge UP.

Conference on College Composition and Communication. (2020). This ain’t another statement! This is a DEMAND for black linguistic justice! https://cccc.ncte.org/cccc/demand-for-black-linguistic-justice

Davila, Bethany. (2016). “The inevitability of ‘standard’ English: Discursive constructions of standard language ideologies.” Written Communication, 32(2), 127-148. https://doi.org/10.1177/0741088316632186.

Flores, Nelson, and Rosa, Jonathan. (2015). “Undoing appropriateness: Raciolinguistic ideologies and language diversity in education.” Harvard Educational Review, 85(2), 149-171. https://doi.org/10.17763/0017-8055.85.2.149.

Gallagher, John R. (2020). “The ethics of writing for algorithmic audiences.” Computers and Composition 57 (pp. 1-?) doi: 10.1016/j.compcom.2020.102583

Gee, J. (2015). Social linguistics and literacies: Ideology in discourses. Routledge.

Horner, Bruce, and Alvarez, Sara P. (2019). "Defining Translinguality." Literacy in Composition Studies, 7(2), 1-30.

Lerner, Neal. (2018). “WAC Journal interview of Asao B. Inoue.” The WAC Journal, 29(1), 112-118.

Liang, Weixin et al. (2023). “GPT detectors are biased against non-native English writers.” Preprint. Cornell University: https://arxiv.org/abs/2304.02819

Lu, Min-Zhan, and Horner, Bruce. (2013). “Translingual Literacy, Language Difference, and Matters of Agency.” College English, 75(6), 582–607.

McKinney, Emry, and Hoggan, Chad. (2022). “Language, identity, & social equity: Educational responses to dialect hegemony.” International Journal of Lifelong Education, 1-13.

Nee, J., Smith, G. M., Sheares, A., and Rustagi, I. (2022). “Linguistic justice as a framework for designing, developing, and managing natural language processing tools.” Big Data & Society, 9(1). https://doi.org/10.1177/20539517221090930

Pennycook, A. (2010). Language as a local practice. Routledge.

Pimentel, Octavio. (2021). “The push for the 1974 statement...once again.” WPA: Writing Program Administration, 44(3), 63-67.

Poe, Mya. (2013). “Reframing race in teaching writing across the curriculum.” Across the Disciplines, 10(3), 1-14. https://doi.org/10.37514/ATD-J.2013.10.3.06.

Robinson, Bradley. (2022). “Speculative Propositions for Digital Writing Under the New Autonomous Model of Literacy.” Postdigital Science and Education. https://doi.org/10.1007/s42438-022-00358-5

Rosa, Jonathan, and Flores, Nelson. (2017). “Unsettling race and language: Toward a raciolinguistic perspective.” Language in Society, 46(5), 621-647. https://doi.org/10.1017/S0047404517000562.

Rosa, Jonathan. (2018). Looking like a language, sounding like a race: Raciolinguistic ideologies and the learning of Latinidad. Oxford UP.

Street, B. V. (1984). Literacy in theory and practice. Cambridge UP.

Young, Vershawn Ashanti. (2010). “Should writers use they own English?” Iowa Journal of Cultural Studies, 12(1), 110-118. https://doi.org/10.17077/2168-569X.1095.
