#4168

Workshop

Simple NLP for Learner Corpus Analysis: A Hands-On Workshop


In applied linguistics, particularly in ESL and EFL contexts, the ability to build and analyse learner corpora has become increasingly important (Deshors & Gries, 2020). Learner corpora, structured collections of texts produced by language learners, can provide valuable insights into language development and inform pedagogical strategies. Applying natural language processing (NLP) techniques such as part-of-speech tagging and lemmatisation to these corpora supports the automated calculation of lexical richness measures, which in turn offers more nuanced assessments of EFL written (Spring & Johnson, 2022) and spoken (Kyle, 2021) proficiency. NLP can also help in developing targeted pedagogical materials (Granger et al., 2007). This hands-on workshop introduces participants to foundational Python programming and provides practical experience with NLP libraries, including the Natural Language Toolkit (NLTK) and spaCy. It covers installing NLTK and spaCy and using them to process textual data through steps such as tokenisation and part-of-speech tagging. No prior programming knowledge is required; the workshop guides participants through a series of tasks in a shared Jupyter notebook. We will use a Google Colab notebook developed specifically for the workshop, which gives participants step-by-step instructions for writing simple Python code and for importing and analysing text in a practical, hands-on learning environment.
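As a rough illustration of the kind of task involved, the sketch below computes a type-token ratio, one simple lexical richness measure, for a short sample sentence. It is not taken from the workshop notebook: the function names `tokenize` and `type_token_ratio`, the sample sentence, and the regular-expression tokeniser are all assumptions made for illustration, and the tokeniser is only a stand-in for the tokenisers that NLTK and spaCy provide.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens.

    Hypothetical stand-in for an NLTK or spaCy tokeniser: it keeps runs of
    letters, optionally followed by an apostrophe and more letters.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(tokens):
    """Return distinct word types divided by total tokens (0.0 if empty)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented sample sentence, standing in for a learner text.
sample = "The cat sat on the mat. The cat didn't move."
tokens = tokenize(sample)
ttr = type_token_ratio(tokens)

print(f"{len(tokens)} tokens, {len(set(tokens))} types, TTR = {ttr:.2f}")
```

The raw type-token ratio is sensitive to text length, so it is best read here as a starting point and not as a measure to compare across learner texts of different sizes.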