Professor of Language and Speech Technology; Director of the Meertens Institute

Member of the Centre for Language Studies and the Department of Communication and Information Sciences, Faculty of Arts, Radboud University

Member of the Royal Netherlands Academy of Arts and Sciences; ECCAI Fellow


Ph +31 24 3611647
a.vandenbosch (at)

Visiting address
Radboud University
Erasmusgebouw, room E4.05
Erasmusplein 1
6525 HT Nijmegen
The Netherlands

Faculty of Arts, CIW
Radboud University
P.O. Box 9103
NL-6500 HD Nijmegen
The Netherlands

Research interests

In my research I develop machine learning and language technology. Most of my work involves the intersection of the two fields: computers that learn to understand and generate natural language. Specific interests include memory-based learning, machine translation, the relation between written and spoken language, text mining, the Dutch language, computational humanities, and cultural heritage. My CV has more detailed information.




Since September 2011 I have been Professor of Language and Speech Technology at Radboud University, Nijmegen, the Netherlands, within the Centre for Language Studies (CLS), of which I am the research director. I also co-direct the Language and Speech Technology research group, one of the fifteen groups within CLS. I am a member of the Department of Communication and Information Sciences of the Faculty of Arts.

I am guest professor at CLiPS, the Computational Linguistics and Psycholinguistics Research Centre at the University of Antwerp. I spent many good years at the ILK Research Group at Tilburg University. I am humanities integrator at the Netherlands eScience Center, fellow of the Donders Institute, and member of the Royal Netherlands Academy of Arts and Sciences.



I am General Chair of ACL-2016, the 54th annual meeting of the Association for Computational Linguistics, Berlin, Germany, August 7-12, 2016.


Current projects

  • CLARIAH, Common Lab Research Infrastructure for the Arts and Humanities. I am a board member of this exciting new project that will continue and enlarge the digital infrastructure for the Humanities in the Netherlands.

  • ADNEXT, Adaptive Information Extraction over Time, a work package of the Infiniti project, part of COMMIT.

  • Nederlab, bringing together massive amounts of digitized Dutch texts from the Middle Ages to the present in one user-friendly and tool-enriched web interface. Funded by NWO.

  • Language in Interaction - with Peter Desain I coordinate WP7 'Utilization' of this NWO Gravitation programme. Check out our growing number of language apps for mobile phone and tablet.

  • DISCOSUMO, an NWO Creative Industry project with Tilburg centre for Cognition and Communication, Tilburg University, and Sanoma.

  • TraMOOC, a Horizon 2020 ICT collaborative project aiming at providing reliable machine Translation for Massive Open Online Courses (MOOCs).

  • FutureTDM, a Horizon 2020 coordinate and support action on reducing barriers and increasing uptake of text and data mining for research environments.

  • Notoriously Toxic, a National Endowment for the Humanities project on understanding the language and costs of hate and harassment in online games.


  • Tunes & Tales, a KNAW Computational Humanities project. Try MOMFER (Meertens Online Motif Finder).


  • FACT, Folktales as Classifiable Texts, an NWO CATCH project.


    Public Talks

    Past projects & highlights


    Current (co-) supervised Ph.D. students

    • Sara Ahmadi (with Bert Cranen and Louis ten Bosch)
    • Martijn Bentum (with Mirjam Ernestus and Louis ten Bosch)
    • Peter Berck (with Eric Postma)
    • Maarten van Gompel
    • Ali Hürriyetoğlu (with Nelleke Oostdijk)
    • Florian Kunneman (with Margot van Mulken)
    • Louis Onrust (with Hugo Van hamme)
    • Roel Smeets (with Maarten De Pourcq)
    • Sebastiaan Tesink
    • Chara Tsoukala (with Stefan Frank and Mirjam Boersma)
    • Véronique Verhagen (with Ad Backus and Joost Schilperoord)

    Former (co-) supervised students








    Aside from papers and dissertations, our projects tend to produce software. We make a point of maximizing the availability of this software by releasing the best software projects under open source licenses. Some of our software, such as Timbl and Frog, is packaged and available in Debian Science. Other packages, particularly the ones that perform some natural language processing function, are available as webservices, usually with a web interface; one example is our context-sensitive spelling corrector for Dutch.

    As part of past and ongoing projects with many colleagues I was involved in developing the following software:

    Natural language processing

    • Frog: Dutch tagger-lemmatizer, morphological analyzer, and dependency parser. With the Frog development team.
    • Dutch and English context-sensitive spelling correctors. With Maarten van Gompel, Wessel Stoop, Tanja Gaustad van Zaanen, and Monica Hajek.
    • PBMBMT: Phrase-based memory-based machine translation. With Maarten van Gompel.
    • Mbt: Memory-based tagger-generator and tagger. With Ko van der Sloot, Jakub Zavrel, and Walter Daelemans.
    • WOPR: Memory-based word prediction, language modeling, and spelling correction. Main developer: Peter Berck.

    Machine Learning

    • Timbl: Tilburg memory-based learner. With Ko van der Sloot, Walter Daelemans, and Jakub Zavrel.
    • Dimbl: Distributed Timbl, parallel k-NN classification on multi-CPU machines. Programmer: Ko van der Sloot.
    • Timpute: TiMBL-wrapper for internal database correction through imputation. Programmer: Steve Hunt.
    • paramsearch: automatic parameter optimization for various machine learning algorithms.
    • Fambl: Family-based learner, a generalized-example k-NN classifier. Reference guide.

    Crowd sourcing

    Help research and add your common sense! This crowd-sourcing experiment is run at the CLiPS research group in Antwerp. The top widget asks you to rate Dutch words on their subjectiveness and their polarity (negative to positive); the bottom one shows English words.