So, where is “Merge”?

brainmap.png

Recently it's become popular for linguists with significant reputations to start blogs where they give their opinions on the state of the field, e.g., my favorite, Norbert Hornstein's Faculty of Language. I definitely do not have the clout or experience to join their ranks. But journal article writing is slow, and tweets are fleeting, so I figured I'd write my first post. This entry's a bit long and maybe a bit feisty, so we'll see if there's a second, or if I'm just totally in over my head.

I'm still relatively new to neurolinguistics. Most of my work up until now has focused on the ‘what’ and the ‘why’ questions in sentence processing, and less on the ‘how’ and the ‘where (in the brain)’ questions. In the literatures I'm typically in conversation with, it's acknowledged that filler-gap dependencies exhibit a different kind of processing mechanics compared to agreement (or quantifier-variable, or anaphor-antecedent, or….) dependencies. They may overlap in mechanisms, e.g., they may both depend on cue-based retrieval, but I would argue they're still qualitatively different. They're also different, representationally speaking. After all, these taxonomic categories are borrowed from syntactic theory, and syntactic theory distinguishes them because their grammatical properties are different.

By contrast, neurolinguists seem to be on a project to find ‘syntax’ in the brain. And, the results are complicated! Pylkkänen (2019) summarizes her group's work disentangling a number of factors associated with syntactic and semantic composition in the left anterior temporal lobe (LATL), and finds that LATL is mostly sensitive to conceptual-semantic features. Fedorenko et al. (2020) argue that no brain areas are more responsive to syntactic composition than to lexico-semantic meaning. Meanwhile, Matchin & Hickok (2019) argue that, roughly, the posterior middle temporal gyrus is responsive to lexico-syntactic representations, while other classic language areas (including LATL and the left inferior frontal gyrus) play supporting roles.

So, where does that leave us? I'm not so sure. With the exception of Matchin & Hickok (whose theory I find attractive), it's not always obvious to me what this field has in mind when it talks about ‘syntax’. I see the term ‘Merge’ pop up frequently to mean something like ‘minimal syntactic computation’ or ‘minimal syntactic representation’. Vaguely, I think this is coextensive with phrase structure, and it's a term borrowed from Chomsky. But should we expect there to be a brain area corresponding to Merge, at least in Chomsky's sense?

Let's take the idea of Merge, in Chomsky's sense, seriously. Merge is an abstract syntactic operation that takes two syntactic objects, α and β, and yields a new syntactic object composed of these parts, {α, β}. Depending on the formulation of Merge, this may also involve a related event of Labeling, e.g., identifying the unit as [αP α β] such that α is the head of the construction, and/or feature checking, e.g., Merge(α, β) ‘checks off’ some feature on α that β supplies. But what are α and β, and what are their features? In the recursive case, they're the outputs of other Merge operations, which yields hierarchical structure and the creativity of language. Otherwise, α and β are whatever the minimal atomic units of the language are, e.g., -s, dog, kick the bucket, [PST], Ø, and their features are whatever their features are.
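To make the formal object concrete, here's a minimal sketch in Python. This is my own illustration, not anyone's proposed implementation: the labeling convention (head is always the first argument) and the feature sets are stipulations for the sake of the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SynObj:
    """A syntactic object: a lexical atom, or the output of a Merge."""
    label: str                        # category label, e.g. 'D', 'N', 'DP'
    parts: tuple = ()                 # empty for lexical atoms
    features: frozenset = frozenset() # e.g. {'def'}, {'sg'} (hypothetical)

def merge(alpha: SynObj, beta: SynObj) -> SynObj:
    """Merge(α, β) → {α, β}; the label projects from the head (stipulated: α)."""
    return SynObj(label=alpha.label + "P", parts=(alpha, beta))

the = SynObj("D", features=frozenset({"def"}))
girl = SynObj("N", features=frozenset({"sg"}))

dp = merge(the, girl)            # [DP the girl]
vp = merge(SynObj("V"), dp)      # recursion: an input is itself Merge output
```

The recursive case falls out for free: `merge` doesn't care whether its arguments are atoms or previously merged structures, which is exactly the point about hierarchical structure above.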

If we take this seriously, then I'm suspicious that there'll be a single neural correlate corresponding to Merge. There are two reasons for this. First, we have to consider what exactly the properties of the merged elements are. As far as linguistic theory goes, positing a lexicon of atomic units and an array of atomic features that Merge is defined over is enough. That is, syntactic theory doesn't care whether Merge (or Agree) checks a category/selectional feature, or number, gender, tense, etc. By contrast, from the perspective of neuroanatomy, it may matter a great deal what the specific, individual features and syntactic units actually are. Carreiras et al. (2010) found that number marking causes the right intraparietal sulcus to light up, an area which they argue is also implicated in numerical cognition. But the same group (Carreiras et al. 2015) found that determiner-noun number agreement (las chicas ‘the.Fem.PL girl.PL’ / la chica ‘the.Fem.SG girl.SG’) and subject-verb number agreement (chicas corren ‘girl.PL run.3.Pres.PL’) share some overlapping networks, but are distinguished by others. Other work by this group on gender and number agreement in Spanish shows that the brain areas involved differ a lot depending on what's agreeing and in what features. In other words, the neural correlates of Merge(α, β) may be quite different depending on the α and the β. (Interestingly, the semantic composition effect in LATL is more construction-neutral, cf. Westerlund et al. 2015.) On a methodological note, this is only visible by comparing many experiments manipulating dozens of variables – large-scale paradigms such as naturalistic speech and Jabberwocky studies may not be so well-suited for sorting out these fine-grained details.

The second reason I don't expect there to be a coherent, single neural correlate of Merge is that language processing is dynamic and predictive, and unfolds at different levels of detail and on different time courses. Returning to the simple case of a determiner and its noun, e.g., the girl, several processes are likely underway before the noun phrase is even complete. It's likely that a partial syntactic representation of the entire sentence has been built, e.g., [S [NP the girl] [VP ]], along with some partial semantic representation, e.g., ∃e,∃x[girl(x) & agent(e,x)]. This anticipatorily-built structure likely constrains other details, e.g., read is now a more likely verb compared to crumble, relative to sentences beginning with the cookie. So, how many times did we Merge in processing a determiner-noun pair? Depending on how detailed and how fast the brain generates this anticipatory structure, it could be any number of Merges at any number of latencies, but surely more than just Merge(the, girl). It's contentious how deep these processes go and how detailed the structure is (Nieuwland et al. 2017 pops to mind; personally, I think syntactic prediction is very detailed – Chacón 2019), but it's hard to solve the Merge equation without solving the prediction one. Plus, this is ignoring effects of context (linguistic and otherwise), which also likely exert huge effects.
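To put a toy number on the ‘how many Merges?’ question, here's a small illustration (again my own, with hypothetical tree encodings): if every binary-branching node corresponds to one Merge, a parser that predictively builds a clause skeleton has already performed more Merges than one that only combines the words it has heard.

```python
def count_merges(tree):
    """Each binary-branching node (a 2-tuple) corresponds to one Merge;
    strings stand in for lexical atoms or predicted empty slots."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + count_merges(left) + count_merges(right)

# Only the completed DP: Merge(the, girl).
dp_only = ("the", "girl")

# A predictive parser's state: [S [NP the girl] [VP __]],
# with "VP_slot" as a hypothetical placeholder for anticipated structure.
predicted = (("the", "girl"), "VP_slot")

count_merges(dp_only)    # 1
count_merges(predicted)  # 2
```

The gap only widens as the predicted skeleton gets more articulated, which is the point: the number of Merge events evoked by two words depends entirely on how much anticipatory structure the parser commits to.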

This may sound pessimistic, but I think the most productive way forward is to return to a kind of soft constructionism. In practice, syntactic theorists regularly make reference to various diagnostics and taxonomic categories, e.g., ‘is this agreement or is it a clitic?’, ‘is this ellipsis or a null argument?’. Psycholinguists do, too, at least implicitly. Even if all constituent structure is built by Merge, and all parses are built with cue-based retrieval (or whatever), representational content still matters a great deal. My impression is that this kind of soft constructionism isn't in vogue for constructing grand theories, but I think it might be a more interesting way to link ideas between these different subfields.