A corpus linguist reflects on a historian’s masterclass

Naomi Tadmor: The semantic analysis of keywords in context

Dr Seth Mehl, Research Associate on the Linguistic DNA project (linguisticdna.org), writes:

On 30 October, I was delighted to attend Professor Naomi Tadmor’s masterclass, arranged by SCEMS. As I am primarily a corpus linguist who studies historical semantics, the masterclass was an opportunity to see how a historian engages with some of the same types of issues that I’m accustomed to addressing. In what follows, I briefly summarise Tadmor’s presentation, and then provide some reflections related to my own work.

Tadmor began her presentation by reviewing some seminal texts in the social history of the family (including Aries 1979, Laslett 1965, Macfarlane 1970, Stone 1977, Wrightson 1982) and connecting them with some key texts from the 19th-century study of the family in history (e.g. the works of Tönnies and of Sir Henry Sumner Maine). The review illustrated the polarisation of the field from the 1960s through the 1980s, and Tadmor situated her own early work outside of this polarisation, even as the polarisation was a sort of catalyst for her research.

The key concluding points that Tadmor forwarded are, I think, important for any work with historical texts, and thus also crucial to historical research:

  • Understanding historical language (including word meaning) is necessary for understanding historical texts.
  • To understand historical language we must analyse it in context.
  • Analysing historical language in context requires close reading.

Whether we identify as historians, linguists, corpus linguists, literary scholars, or otherwise, we would do well to keep these points in mind.

Tadmor’s take on historical keywords

Tadmor’s specific arguments in the masterclass focused on kinship terms. In Early Modern English (EModE), there was a broad array of referents for kinship terms such as brother, mother, father, sister, and associated terms like family and friend, which are not likely to be intuitive to a speaker of Present Day English (PDE). Evidence shows, for example, that family often referred to all of the individuals living in a household, including servants, to the possible exclusion of biological relations living outside of the household. The paper Tadmor asked us to read in advance (first published in 1996), supplemented with other examples at the masterclass, provides extensive illustrations of the nuance of family and other kinship terms.

In EModE, there was also a narrow range of semantic or pragmatic implications related to kinship terms: these meanings generally involved social expectations, social networks, or social capital. So, father could refer to ‘biological father’ or ‘father-in-law’ (or even ‘King’), and implied a relationship of social expectation (rather than, for example, a relationship of affection or intimacy, as might be implied in PDE).

By identifying both the array of referents and the implications or senses conveyed by these kinship terms, Tadmor provides a thorough illustration of the terms’ lexical semantics. We can see this method as being motivated by historical questions (about the nature of Early Modern relationships); driven in its first stage by lexicology (insofar as it begins by asking about words, their referents, and senses); and then, in a final stage, employing lexicological knowledge to analyse texts and further address the initial historical questions. Tadmor avoids circularity by using one data set (in her 1996 paper) to identify a hypothesis regarding lexical semantics, and another data set to test her hypothesis. What do these observations about lexical semantics tell us about history? As Tadmor notes, it is by identifying these meanings that we can begin to understand categories of social actions and relationships, as well as motivations for those actions and relationships. Perhaps more fundamentally, it is only by understanding semantics in historical texts that we can begin to understand the texts meaningfully.

A corpus linguist’s take on Tadmor’s methods

Reflecting on Tadmor’s talk, I’m reminded of the utility of the terms semasiology and onomasiology. In semantic research, semasiology is an approach which examines a term as an object of inquiry, and proceeds to identify the meanings of that term. Onomasiology is an approach which begins with a meaning, and then identifies the various terms for expressing it. Tadmor’s method is largely semasiological, insofar as it looks at the meanings of the term family and other kinship terms. This approach begins in a relatively straightforward way: find all of the instances of the word (or lemma), and you can then identify its various senses. The next step is more difficult: how do you distinguish its senses? In linguistics, a range of methods is available, with varying degrees of rigour and reproducibility, and it is important that these methods be outlined clearly. Tadmor’s study is also onomasiological, as she compares the different ways (often within a single text) of referring to a given member of the household family. This approach is less straightforward: how do you identify each time a member of the family is referred to? Again, a range of methods is available, each with its own advantages and disadvantages. A clear statement and justification of the choice of method renders any study more rigorous. In my experience, thinking in terms of onomasiology and semasiology is useful in developing a systematic and rigorous study.
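The first semasiological step described above, finding every instance of a word so that its senses can then be read in context, can be sketched as a simple keyword-in-context (KWIC) concordance. This is only a minimal illustration, not Tadmor’s procedure or any particular concordancer: the function name kwic, the window size, and the sample sentence are all assumptions invented for the example, and the sketch matches a single word form rather than a lemma.

```python
import re

def kwic(text, keyword, window=4):
    """Return (left context, hit, right context) for each occurrence of keyword.

    Hypothetical helper for illustration: matches one word form,
    case-insensitively, and does not lemmatise.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Invented sample text, not a quotation from any historical source.
sample = "My family at home dined together. The servants of the family were well."

for left, hit, right in kwic(sample, "family", window=2):
    print(f"{left:>20} [{hit}] {right}")
```

A listing like this only gathers the instances. Distinguishing the senses of each hit still depends on close reading of the surrounding context.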

Semasiology and onomasiology allow us to distinguish types of study and approaches to meaning, which can in turn help render our methods more explicit and clear. Similarly, distinguishing editorially between a word (e.g. family) and a meaning (e.g. ‘family’) is useful for clarity. Indeed, thinking methodologically in terms of semasiology and onomasiology encourages editorial clarity about terms and meanings. In Tadmor’s 1996 paper, double quotes (e.g. “family”) are used to refer to either the word family or the meaning ‘family’ at various points. At times, such a paper could be rendered clearer, it seems to me, by adopting consistent editorial conventions like those used in linguistics (e.g. quotes or all caps for meanings, italics for terms). The distinction between a term and a meaning is by nature not always clear or certain: that difficulty is all the more reason for journals to adhere to rigorously defined editorial conventions.

From the distinction between terms and meanings, we can move to the distinction between senses and referents. It is important to be explicit both about changes in referent and changes in sense, when discussing semasiological change. For example, as historians and linguists, we must be sure that when we identify changes in a word’s referents (e.g. father referring to ‘father-in-law’), we also identify whether there are changes in its sense (e.g. ‘a relationship of social expectation’ or ‘a relationship of affection and intimacy’). When Thomas Turner refers to his father-in-law as father, he seems to be using the term, as identified by Tadmor, in its Early Modern sense implying ‘a relationship of social expectation’ rather than in the possible PDE sense implying ‘a relationship of affection and intimacy’. The terms referent and sense allow for this distinction, and are useful in practice when conducting this kind of semantic analysis.

Of course, if a term becomes polysemous, it can be applied to a new range of referents, with a new sense, or even with new implicatures or connotations. For example, we can imagine (perhaps counterfactually) a historical development in which family might have come to refer to cohabitants who were not blood relations. At the same time, in referring to those cohabitants who were not blood relations, family might have ceased to imply any kind of social expectation, social network, or social capital. That is, it’s possible for both the referent and the sense to change. In this case, as Tadmor has shown, that doesn’t seem to be what’s happened, but it’s important to investigate such possible polysemies.

Future possibilities: Corpus linguistics

As a corpus linguist, I’d be interested in investigating Tadmor’s semantic findings via a quantitative onomasiological study, looking more closely at selection probabilities. Such a study could ask research questions like:

  • Given that an Early Modern writer is expressing ‘nuclear family’, what is the probability of using term a, b, etc., in various contexts?
  • Given that a writer is expressing ‘household-family’, what is the probability of using term a, b, etc., in various contexts?
  • Given that a writer is expressing ‘spouse’s father’ or ‘brother’s sister’, etc., what is the probability of using term a, b, etc., in various contexts?

These onomasiological research questions (unlike semasiological ones) allow us to investigate logical probabilities of selection processes. This renders statistical analyses more robust. Changes in probabilities of selection over time are a useful illustration of onomasiological change, which is an essential part of semantic change.
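As a rough sketch of what such selection probabilities look like in practice, the probability of choosing a term given a meaning can be estimated as a relative frequency over manually annotated instances. Everything in the example is an assumption for illustration: the function name selection_probabilities is hypothetical, and the annotated pairs (including the alternative terms household and house) are invented toy data, not findings from Tadmor’s sources or from any corpus.

```python
from collections import Counter, defaultdict

def selection_probabilities(observations):
    """Estimate P(term | meaning) as a relative frequency.

    observations: iterable of (meaning, term) pairs, e.g. produced by
    manually annotating which meaning each attested term expresses.
    """
    counts = defaultdict(Counter)
    for meaning, term in observations:
        counts[meaning][term] += 1

    probabilities = {}
    for meaning, term_counts in counts.items():
        total = sum(term_counts.values())
        probabilities[meaning] = {
            term: n / total for term, n in term_counts.items()
        }
    return probabilities

# Toy annotations, invented purely for illustration.
toy = [
    ("household-family", "family"),
    ("household-family", "family"),
    ("household-family", "household"),
    ("household-family", "house"),
    ("spouse's father", "father"),
    ("spouse's father", "father-in-law"),
]

probs = selection_probabilities(toy)
print(probs["household-family"])
```

To address the "in various contexts" part of the questions, or change over time, the conditioning key could be extended from the meaning alone to a tuple such as (meaning, context) or (meaning, period), so that probabilities are estimated separately for each.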

To see what all of this means for the Linguistic DNA project, see the continuation of this blog post on the project website (linguisticdna.org).

