

We are honored to welcome the following instructors to this year's AALA pre-conference workshop.

The workshops will take place on September 2.

Antony Kunnan

with co-instructor Geoff LaFlair


Workshop title:

Using LLMs for content and item generation

Workshop abstract:

This workshop offers a comprehensive introduction and practical examples for educators interested in leveraging large language models (LLMs) for item generation in language assessment development (Attali et al., 2022). The key objectives of the workshop are understanding the foundations of item generation using LLMs and integrating LLMs into the content and item development process. 


During the workshop, participants will learn the steps needed to design and implement item generation using LLMs in language assessments, including selecting appropriate item types, developing content specifications, and using LLMs to generate diverse and engaging items. To ensure a hands-on experience, the workshop will give attendees opportunities to actively engage with automated item generation using LLMs: participants will work with sample prompts and experiment with LLM-generated content. This practical approach will allow participants to gain firsthand experience in using LLMs for language assessment development.
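As a purely illustrative sketch (not material from the workshop), a content specification can be rendered as an item-generation prompt with simple templating before being sent to an LLM. The spec fields, item type, and prompt wording below are hypothetical assumptions:

```python
# Hypothetical sketch: turning a content specification into an LLM prompt
# for automated item generation. The spec fields and prompt wording are
# illustrative assumptions, not workshop materials.

def build_item_prompt(spec: dict) -> str:
    """Render a content specification as an item-generation prompt."""
    return (
        f"You are writing items for a {spec['level']} English "
        f"{spec['skill']} assessment.\n"
        f"Item type: {spec['item_type']}\n"
        f"Topic: {spec['topic']}\n"
        f"Write {spec['n_items']} items. Each item must have one correct "
        f"answer and {spec['n_distractors']} plausible distractors."
    )

spec = {
    "level": "B1",
    "skill": "reading",
    "item_type": "multiple-choice",
    "topic": "workplace email etiquette",
    "n_items": 3,
    "n_distractors": 3,
}
prompt = build_item_prompt(spec)
print(prompt)
```

In practice, the rendered prompt would be passed to an LLM, and the returned items reviewed against the content specifications before use.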

Morning session (10AM-12PM, 2 hours) and afternoon session (1:30PM-3:30PM, 2 hours)


Antony Kunnan

Antony John Kunnan is a Principal Assessment Scientist at Duolingo and a Senior Research Fellow at Carnegie Mellon University. His specializations include research on assessment validation, fairness, and policy. His latest publications include an authored book titled Evaluating Language Assessments (Routledge, 2018), an edited four-volume set titled The Companion to Language Assessment (Wiley, 2014), and a journal article on scenario-based language assessment (with C. Qin and C. Zhao) in Language Assessment Quarterly (2022). Recently, he gave a talk titled "Authenticity and Interactiveness in the Age of AI in Language Assessment" at the Education University of Hong Kong. He was the founding president of AALA, a past president of ILTA, and the founding editor of Language Assessment Quarterly. He has given talks on language assessment in over 35 countries.


Geoff LaFlair

Geoff LaFlair is a Lead Assessment Scientist at Duolingo where he works on research and development of the Duolingo English Test and leads the Test Development team. His research interests are situated at the intersection of applied linguistics, psychometrics, and machine learning. His work has been published in Language Testing, Applied Linguistics, The Modern Language Journal, the Transactions of the Association for Computational Linguistics, Journal of Computer Assisted Learning, Frontiers in Artificial Intelligence, and Empirical Methods in Natural Language Processing.

Rie Koizumi

University of Tsukuba

Workshop title:

Creating L2 classroom-based speaking assessment for learning

Workshop abstract:

This workshop offers a set of guiding principles and practical examples that instructors can use to optimize second-language (L2) classroom assessment of speaking skills for effective learning purposes (Koizumi, 2022; Muñoz & Álvarez, 2010). These principles include the following. During the test development phase, careful planning is needed to integrate L2 speaking assessment into the curriculum. Additionally, instructors should employ a variety of tasks (e.g., monologic and dialogic tasks including teacher-led, paired, and group interactions) and rubrics that reflect teaching content and activities. In the test administration and scoring phases, teachers should discuss the assessment criteria to ensure the reliability of test scores. The feedback stage should preferably involve the prompt provision of score reports to each student by the teacher. This feedback will allow students to compare their results with their self-assessment and peer assessment results, identify areas of strength and weakness, and chart a course for further improvement. Moreover, teachers can use the feedback to gain insight into their students' characteristics and adjust their instruction accordingly. Throughout all stages, feasibility should be taken into account while considering the time and other resources available to teachers.


Although adhering to these principles may present challenges for classroom teachers, the workshop will highlight successful examples from Japanese secondary and tertiary schools that follow these principles. These examples include administering approximately five speaking tests over the course of a school year, conducting brief 10-minute discussions on scoring criteria at the start of each speaking test, having students engage in group discussions while a few teachers evaluate each group, and using artificial intelligence to score primarily monologic speaking tests taken by students at home, with interactive tests taking place in the classroom. The workshop will showcase the results of these practices, as evaluated using student questionnaire responses, as well as analytical methods such as many-facet Rasch measurement and generalizability theory. Finally, workshop participants will have ample opportunities to share their experiences and ideas, experiment with test examples, and discuss feasible and effective methods of implementing L2 classroom-based speaking assessment in their teaching contexts.

Morning session (10AM-12PM, 2 hours) and afternoon session (1:30PM-3:30PM, 2 hours)


Rie Koizumi is a Professor of English at the University of Tsukuba, Japan. Her research interests include assessing and modeling second language speaking ability, performance, and development, as well as learning-oriented language assessment in the classroom. She has published her work in Language Testing, Language Assessment Quarterly, System, and other journals.

Vahid Aryadoust

with co-instructor Xuelian Zhu

Nanyang Technological University


Workshop title:

Getting Started with Linear Mixed Effect Models for Language Assessment

Workshop abstract:

Linear Mixed Effect Models (LMEMs) are statistical models that are widely used across many streams of research. They allow for the analysis of data involving multiple sources of variation, such as individual differences among participants and differences between items. This workshop will provide an introduction to LMEMs using the free software Jamovi. The workshop is targeted at graduate students and researchers who have some basic knowledge of statistics, such as the general linear model (GLM), and who wish to learn how to apply LMEMs in their own research.


The workshop will cover the following topics:

  • What are LMEMs and why are they useful in language assessment research?

  • Setting up and running LMEMs in Jamovi

  • Interpreting the results of LMEMs, such as fit statistics, fixed effects, random effects, and variance components.

By the end of the workshop, participants will have gained a working understanding of LMEMs and will have learned how to utilize the Jamovi software to perform basic LMEM analysis.
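The workshop itself uses Jamovi, but the same model class is available in many tools. As an illustration only, the sketch below fits a minimal LMEM in Python's statsmodels: a fixed effect of task type on speaking scores and a random intercept per examinee. All data here are synthetic, and the variable names are assumptions, not workshop materials.

```python
# Minimal LMEM sketch (synthetic data; Python statsmodels, not Jamovi).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_examinees, n_items = 30, 8

# Simulated data: dialogic tasks score 0.4 points lower on average
# (fixed effect); each examinee has an ability offset (random effect).
ability = rng.normal(0.0, 0.6, n_examinees)
rows = []
for person in range(n_examinees):
    for item in range(n_items):
        task = "dialogic" if item % 2 else "monologic"
        score = (3.5 - 0.4 * (task == "dialogic")
                 + ability[person] + rng.normal(0.0, 0.3))
        rows.append({"examinee": person, "task": task, "score": score})
df = pd.DataFrame(rows)

# Fixed effect of task type; random intercept grouped by examinee.
model = smf.mixedlm("score ~ task", df, groups=df["examinee"])
result = model.fit()
print(result.summary())  # fixed effects, group (random-effect) variance
```

The summary output reports the pieces the workshop covers: fixed-effect estimates (the task contrast), the group variance (between-examinee variation), and residual variance, mirroring what Jamovi displays in its linear mixed models output.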



  • Installing Jamovi: Jamovi is a free statistical software package, which will be used to teach the workshop. Participants will need to have Jamovi installed on their laptops before attending the workshop. They can download Jamovi from the official website and follow the installation instructions provided. It is recommended that participants install Jamovi before the workshop to ensure a smooth and timely start.

  • Familiarity with GLM: It is recommended that participants have a basic understanding of GLMs before attending the workshop. This will help provide a foundation for understanding the concepts and techniques used in LMEMs.

Contingency plan

Because a reliable internet connection is crucial for using Jamovi, participants are also encouraged to install JASP, a free statistical analysis program, as a backup.


Vahid Aryadoust

Vahid Aryadoust, an Associate Professor of language assessment at the National Institute of Education, Nanyang Technological University, Singapore, specializes in language assessment, meta-analysis, scientometrics, and sensor technologies. His research has been published in journals such as Computer Assisted Language Learning, Language Testing, System, Current Psychology, Language Assessment Quarterly, Assessing Writing, Educational Assessment, and Educational Psychology. He has also contributed books and book chapters to publishers such as Routledge, Cambridge University Press, Springer, Cambridge Scholars Publishing, and Wiley Blackwell.


Dr. Aryadoust has led assessment research projects supported by educational funding bodies in Singapore, the USA, the UK, and Canada. He serves on the advisory boards of several international journals and was awarded the Intercontinental Academia Fellowship in 2018-2019. As an advocate for knowledge-sharing, equity, and free education, he established the "Statistics and Theory" YouTube channel to share his expertise in quantitative methods and theory. In recognition of his exceptional use of social media, his YouTube channel received the John Cheung Social Media Award in 2020.

Xuelian Zhu

Xuelian Zhu is a PhD candidate at the National Institute of Education, Nanyang Technological University, Singapore. Xuelian has extensive experience as a conference interpreter and trainer, providing interpreting services at both local and international levels. Her experiences in professional practice and teaching led her to quantitative research in language and interpreting studies. This journey has allowed her to accumulate substantial expertise, as exemplified by a co-authored book chapter on structural equation modeling and publications on utilizing Rasch models in the context of language testing. Xuelian's current research focuses on using linear mixed effect models to conduct rigorous analyses of data collected through sensor technologies, specifically eye-tracking and galvanic skin response.
