Bridging the Divide Between Linguistics and Artificial Intelligence: From Vectors to Symbols and Back Again
Capturing the structure of language is a central goal of both linguistics and natural language processing (NLP). Despite this overlap in goals, linguistics and NLP differ substantially in how they handle language. Linguistic theories typically assume a symbolic view, in which discrete units (e.g., words) are combined in structured ways (e.g., in syntax trees). In NLP, however, the most successful systems (such as ChatGPT) are based on neural networks, which encode information in continuous vectors that do not appear to have any of the rich structure posited in linguistics. In this talk, I will discuss two lines of work that show how vectors and symbols may not be as incompatible as they seem. First, I will present analyses of how standard neural networks represent syntactic structure, with the conclusion that, despite appearances, these models may not in fact challenge symbolic views of language. Second, I will turn from representations to learning by showing how neural networks can realize strong, structured learning biases of the sort traditionally seen only in symbolic models; this part of the talk will focus on a case study based on an Optimality Theory account of syllable structure. Taken together, these results provide some first steps toward bridging the divide between linguistics and artificial intelligence.
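
To make the contrast at the heart of the abstract concrete, the following is a minimal illustrative sketch (my own, not material from the talk) that represents the same toy sentence in two ways: as a discrete syntax tree of the kind assumed in symbolic linguistic theories, and as a continuous vector of the kind found inside neural networks. The Tree class, the toy embedding table, and the averaging step are all hypothetical choices made purely for illustration.

    # Illustrative sketch only: a symbolic vs. a vector view of one sentence.
    from dataclasses import dataclass
    from typing import List
    import numpy as np

    # --- Symbolic view: discrete units combined into a syntax tree ---------
    @dataclass
    class Tree:
        label: str                # e.g., "S", "NP", "VP", or a word
        children: List["Tree"]    # empty list for leaf nodes (words)

        def __str__(self) -> str:
            if not self.children:
                return self.label
            return "(" + self.label + " " + " ".join(str(c) for c in self.children) + ")"

    # "the dog barks" as a toy constituency tree
    sentence_tree = Tree("S", [
        Tree("NP", [Tree("Det", [Tree("the", [])]), Tree("N", [Tree("dog", [])])]),
        Tree("VP", [Tree("V", [Tree("barks", [])])]),
    ])
    print(sentence_tree)          # (S (NP (Det the) (N dog)) (VP (V barks)))

    # --- Neural view: the same sentence as continuous vectors --------------
    rng = np.random.default_rng(0)
    embedding_dim = 8
    # Toy embedding table; in a real network these vectors would be learned.
    embeddings = {w: rng.normal(size=embedding_dim) for w in ["the", "dog", "barks"]}

    # One vector for the whole sentence (here, simply the average of word vectors).
    sentence_vector = np.mean([embeddings[w] for w in "the dog barks".split()], axis=0)
    print(sentence_vector.shape)  # (8,) -- no overt tree structure, just numbers

The talk's question, in these terms, is whether representations like sentence_vector can nonetheless encode, or be biased toward, the kind of structure made explicit in sentence_tree.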