About Me

I am an Assistant Professor of Computational Linguistics at the Rochester Institute of Technology, where I lead the Language Technology Group. I am affiliated with the College of Liberal Arts and with the Master's in Data Science and PhD programs of the Golisano College of Computing and Information Sciences.

I received my PhD from Saarland University with a thesis on computational models applied to varieties of pluricentric languages. In Saarland, I was a member of the Multilinguality and Language Technology Lab at the German Research Center for Artificial Intelligence (DFKI), part of the Saarland Informatics Campus.

I am a member of the Association for Computational Linguistics (ACL) and the ACL SIGEDU, and a Fellow of the UK Higher Education Academy. I was a member of the INDUS Network, funded by the German Research Foundation, from 2014 to 2018.

Research Interests

I am interested in applying computational models to large collections of texts. My research interests fall into three main areas: 1) language variation, both diatopic and diachronic; 2) language acquisition and educational NLP applications; and 3) pragmatics and offensiveness in user interaction on social media. I am also interested in machine translation and translation technology.

You can find a complete list of my publications and the language resources I've helped develop.

Ongoing Activities

I am one of the organizers of the series of workshops on NLP for Similar Languages, Varieties and Dialects (VarDial). The next edition is VarDial 2020 at COLING in Barcelona, Spain. I am co-organizing the Similar Language Translation Task at WMT 2020 and the SemEval 2020 Task 12 - OffensEval: Multilingual Offensive Language Identification in Social Media.

Together with Preslav Nakov, I am co-editing a book for the series Studies in Natural Language Processing and a special issue of the journal Natural Language Engineering on the same topic. Both are published by Cambridge University Press.

I am writing a book on Automatic Language Identification with Tommi Jauhiainen, Tim Baldwin, and Krister Linden, to appear in the Morgan & Claypool series Synthesis Lectures on Human Language Technologies.

I regularly serve as a program committee member of several workshops and conferences. A full list is available here.


6/12/20 - The TRAC-2 workshop proceedings and shared task report are available here.
3/21/20 - One paper accepted at LREC. Pre-print here.
11/12/19 - I am looking for PhD students to start in Fall 2020. See the PhD Program website for the admission process and requirements. If you qualify and are interested in working with me, please get in touch!
11/12/19 - VarDial 2020 will be co-located with COLING 2020 in Barcelona. The call for papers is out. See here.
10/25/19 - The website for SemEval 2020 Task 12 - OffensEval 2 is online here. The training sets will be available soon.