Data-Intensive Experimental Linguistics

Steven Abney

Abstract


Computational linguistics is not a specialization of linguistics at all; it is a branch of computer science. A large majority of computational linguists have degrees in computer science and positions in computer science departments. It was founded as an offshoot of an engineering discipline (machine translation), and has been subsequently shaped by its place within artificial intelligence, and by a heavy influx of theory and method from speech recognition (another engineering discipline) and machine learning.

 

But computation is a means to an end; the essential feature is data collection, analysis, and prediction on a large scale. I will call it data-intensive experimental linguistics.

I wish to explain how data-intensive linguistics differs from mainstream practice, why I consider it to be genuine linguistics, and why I believe that it enables fundamental advances in our understanding of language.

 


Keywords


syntax; universals
