Sessions: Room E302


Exploring AI-Generated Synthetic Speech for Perceptual Training and Production Practice #4345

Fri, Jul 18, 17:00-18:00 Asia/Tokyo | LOCATION: Room E302

This 60-minute hands-on workshop invites language teachers interested in CALL tasks to explore the potential of AI assistants in generating level-appropriate elicited imitation (EI) and shadowing (Sh) audio materials. The session focuses on the research question: to what extent can AI assistants generate EI and Sh content that meets targeted levels of linguistic complexity? Participants will critically assess whether synthetic speech meets phonetic and linguistic criteria and learn strategies to “trick” AI into producing more precise English audio outputs. The session will demonstrate how to use the International Phonetic Alphabet (IPA) alongside AI tools to generate audio files, supporting shadowing practice and enhancing perceptual discrimination skills. Interactive activities will compare AI-generated audio with human speech to evaluate naturalness, comprehensibility, and intelligibility. Attendees will gain practical skills in fine-tuning AI prompts, adapting synthetic outputs for diverse learner levels, and assessing speech quality using established linguistic criteria. Take-away benefits include hands-on experience with AI prompt engineering, insights into integrating synthetic speech into language teaching, and practical guidelines for developing CALL materials that support both perceptual and productive language skills. This workshop empowers educators with innovative, empirically grounded strategies for enhancing language learning.
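
As a concrete illustration of the IPA-plus-AI workflow the workshop demonstrates, the sketch below forces a specific pronunciation by embedding IPA in SSML. The abstract does not name a TTS service; Amazon Polly, the voice, the sentence, and the IPA strings are illustrative assumptions only.

```python
# A minimal sketch: controlling pronunciation with IPA via SSML <phoneme> tags,
# here sent to Amazon Polly through boto3. Service, voice, and IPA strings are
# assumptions; the workshop's actual tools are not named in the abstract.
import boto3

polly = boto3.client("polly")

ssml = (
    "<speak>"
    "The word <phoneme alphabet='ipa' ph='təˈmɑːtoʊ'>tomato</phoneme> "
    "can also be said <phoneme alphabet='ipa' ph='təˈmeɪtoʊ'>tomato</phoneme>."
    "</speak>"
)

response = polly.synthesize_speech(
    Text=ssml,
    TextType="ssml",
    OutputFormat="mp3",
    VoiceId="Joanna",
)

with open("ei_prompt.mp3", "wb") as f:   # hypothetical output file for an EI item
    f.write(response["AudioStream"].read())
```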

AI-Enhanced Extensive Reading: Empowering Students as Content Creators #4258

Fri, Jul 18, 18:10-19:10 Asia/Tokyo | LOCATION: Room E302

Extensive Reading (ER) is a powerful approach to language learning, but maintaining engagement and providing individualized support at scale remain challenging. This interactive workshop introduces a custom-built AI-powered platform that transforms students from passive consumers into active creators of reading material.

Participants will gain hands-on experience using the system to generate personalized ER stories across different levels with just a few clicks. The platform incorporates adaptive difficulty scaling, adjusting text complexity based on learner interaction time and feedback while maintaining coherence.
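
The platform’s adaptive logic is not published; purely as a hypothetical sketch of the idea described above, difficulty could be nudged from simple engagement signals like these (every threshold and the 1 to 6 level scale below are invented):

```python
# A hypothetical sketch of adaptive difficulty scaling: adjust a learner's
# reading level from interaction time and feedback. All values are invented;
# the workshop platform's real logic is not described in the abstract.
def adjust_level(level: int, minutes_per_page: float, rating: int) -> int:
    """Return a new reading level (1-6) from simple engagement signals."""
    if minutes_per_page > 5.0 or rating <= 2:      # slow reading or low rating: step down
        level -= 1
    elif minutes_per_page < 1.5 and rating >= 4:   # fast and well-rated: step up
        level += 1
    return max(1, min(6, level))                   # clamp to the level range

print(adjust_level(level=3, minutes_per_page=1.2, rating=5))  # -> 4
```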

Real-time tracking of student interactions — including reading time, vocabulary lookups, and content generation patterns — provides insights into engagement and common language challenges. Features such as context-sensitive vocabulary support, AI-generated audio narration, and optional comprehension activities may further enhance autonomous learning.

A key focus will be how AI-driven analysis of student feedback informs ongoing refinements to both the system and pedagogical approaches. Participants will leave the workshop with practical experience in generating personalized ER stories, exploring learner support features, and considering how learner data can inform teaching practice.

This workshop demonstrates how integrating AI technology into ER programs can create a more engaging, personalized, and data-informed learning environment while maintaining the core principles of extensive reading methodology.

Exploring Interaction with AI Chatbots for Professional Conversation Practice Among L2 English Engineering Students #4353

Sat, Jul 19, 09:00-09:25 Asia/Tokyo | LOCATION: Room E302

AI-driven chatbots are increasingly used in language learning to offer personalized practice and promote learner autonomy. However, chatbot interactions are typically individual and still relatively new, leaving gaps in our understanding of how learners engage with these tools. This study analyzes interactions between engineering students and a Poe-based chatbot designed to act as an investor evaluating student proposals, with the goal of helping students prepare for live assessed simulations. The study employs the concept of 'field' from systemic functional linguistics, that is, 'what the conversation is about.' It specifically examines 'field shifts': moments when learners deviate from the main topic in ways that affect conversation quality. Field shifts can lead to incomplete communication or unmet objectives. Through qualitative analysis, the study identifies several types of field shifts, with 'Zooming Out' (providing overly general answers), 'Wide Miss' (responses that broadly miss the main topic), and 'Blurred Shot' (fragmented, unclear responses) being the most common. These shifts often reduced coherence and negatively impacted learners' ability to achieve their communicative goals. This session targets language instructors and instructional designers interested in chatbot-mediated learning. Understanding common field shifts can help them design more effective chatbot prompts and scaffolding strategies, ultimately fostering more focused and goal-oriented learner interactions.
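
The abstract does not reproduce the bot’s configuration. Purely as a hypothetical sketch, an investor-persona prompt for such a Poe bot might look like the following, written here as a Python constant; note the final instruction, which targets exactly the kinds of field shifts the study documents.

```python
# A hypothetical investor-persona system prompt for a proposal-evaluation
# chatbot. The wording is entirely invented; the study's actual bot
# configuration is not given in the abstract.
INVESTOR_PROMPT = """
You are a skeptical angel investor evaluating an engineering student's
product proposal. Stay strictly on the proposal under discussion: ask one
probing question at a time about the problem, the solution, costs, and the
market. If the student drifts off topic or answers too generally, briefly
redirect them to the question before moving on.
"""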

The Impact of Combining AI and Face-to-Face for Conversational Practice on Non-English Majors #4342

Sat, Jul 19, 09:35-10:00 Asia/Tokyo | LOCATION: Room E302

Enhancing both receptive and productive skills is crucial in English education. A survey by Kato and Yamada (2022) indicates that university students wish to improve their speaking skills more than any other skill. However, students report that opportunities for daily English conversation remain limited (Okayama University, 2023). Artificial Intelligence (AI) offers a potential solution by providing extensive practice opportunities for EFL learners.

Besides limited exposure, speaking anxiety is another significant barrier. Kawauchi (2016) found that speaking English in class significantly contributes to EFL learners’ anxiety. However, AI can facilitate practice and help reduce anxiety, as shown by Hapasari and Wu (2022) and Hawanti and Zubaydulloevna (2023).

This study examines the effectiveness of combining AI-driven chatbot speaking practice using English Central with real conversations involving teachers and classmates. Over four months, the participants showed slight improvements in speaking test scores and fluency, along with enhanced TOEIC listening scores. An in-class survey suggests that students found both AI and face-to-face practice beneficial for improving conversational skills.

Balancing CLIL Instruction and Enhancing Spontaneous Spoken Output in CLIL Classrooms through CALL #4338

Sat, Jul 19, 10:10-10:35 Asia/Tokyo | LOCATION: Room E302

Content and Language Integrated Learning (CLIL) presents unique challenges in linguistically homogeneous university classrooms, particularly in fostering spontaneous spoken output. While students in my courses demonstrate significant gains in academic writing and structured presentations, informal, unplanned spoken interactions about course content remain underdeveloped. Drawing on the Balanced CLIL Framework (Brown & Christmas, unpublished), this presentation examines an action research project designed to address this gap through technology-enhanced language learning.

The study compares two pedagogical interventions in parallel CLIL courses: one class engages in peer-led oral quizzing on key vocabulary and concepts, while the other utilizes ChatGPT-based interactive speaking assignments that provide automated feedback on comprehensibility, pronunciation, and content accuracy. The presentation will outline the rationale behind these interventions, the methods used to assess their effectiveness—including pre- and post-course assessments and student feedback—and preliminary findings on their impact on language confidence and content retention. By integrating CALL strategies to promote spontaneous spoken engagement, this study seeks to contribute to ongoing discussions on achieving greater balance in CLIL instruction.

The Current State of Automatic Speech Recognition for Non-Native English #4218

Sat, Jul 19, 11:35-12:00 Asia/Tokyo | LOCATION: Room E302

Automatic Speech Recognition (ASR), the automated conversion of spoken language into text, is an essential component of computer-assisted language learning (CALL) and computer-assisted language testing (CALT). ASR is, however, a rapidly developing technology, and it has reached its highest levels of accuracy only in the past few years thanks to advances in neural networks and transformer architectures. This study examines five state-of-the-art ASR systems (AssemblyAI's Universal-2, Deepgram's Nova-2, RevAI's V2, Speechmatics' Ursa-2, and OpenAI's Whisper-large-v3) and measures their accuracy on non-native accented English speech from six different L1 backgrounds, in the form of both 2,400 read sentences and 22 spontaneous narrative recordings. All systems achieved a mean Match Error Rate (MER) of less than 0.09, i.e., above 91% accuracy, on read speech. Two systems performed especially well, with no significant difference between them: Whisper had the lowest mean MER at 0.054, followed by AssemblyAI at 0.056. For spontaneous speech, RevAI had the lowest mean MER at 0.074. All five systems outperformed the ASR systems reported on in studies from the last several years, suggesting that accurate transcription of non-native English speech is now possible.
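
For readers who want to reproduce the metric, the sketch below transcribes a recording with the Whisper variant named in the study and scores it with Match Error Rate via the jiwer package; the file name and reference sentence are hypothetical.

```python
# A minimal sketch: transcribe learner audio with Whisper and score it
# against a reference transcript using Match Error Rate (MER).
# Assumes the openai-whisper and jiwer packages; file names are hypothetical.
import whisper
import jiwer

model = whisper.load_model("large-v3")       # the Whisper variant used in the study

reference = "the quick brown fox jumps over the lazy dog"  # hypothetical prompt sentence
result = model.transcribe("learner_reading.wav")           # hypothetical recording

hypothesis = result["text"].lower().strip()
mer = jiwer.mer(reference, hypothesis)       # 0.0 = perfect match
print(f"MER: {mer:.3f}  (accuracy = {1 - mer:.1%})")
```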

Investigating Senior Learners’ Perceptions of AI-Assisted Learning: A Preliminary Study #4271

Sat, Jul 19, 12:10-12:35 Asia/Tokyo | LOCATION: Room E302

This preliminary study explores senior learners’ current understanding of AI, their perceptions of its role in language learning and lifelong education, and their motivation to use it. The study gathers baseline data through two structured questionnaires, the QAIUM and the AIM, and a semi-structured interview (Yurt & Kasarci, 2024; Li, 2025). The 15 participants are around retirement age, ranging from 60 to 76. The research gives educators a deeper understanding of how senior learners view AI in education, and investigates the motives, openness, and potential concerns these learners may have towards the use of technology. Interview answers were recorded, transcribed, and analyzed using thematic analysis. Findings reveal a combination of curiosity to explore AI and hesitation to invest time and effort; some participants reported that their doubts derived from unfamiliarity with AI. The insights generated from this preliminary study, both the learners’ positive feedback and their potential reservations, will provide a foundation for the design of subsequent research.

Technology-Mediated Peer Feedback in EFL Speaking Courses #4229

Sat, Jul 19, 15:10-15:35 Asia/Tokyo | LOCATION: Room E302

This study examines the implementation of a real-time peer feedback system in EFL speaking classes and investigates students’ perceptions of its utility and effectiveness. The system employed a scoring rubric uploaded to Google Forms, enabling students to provide immediate feedback during classmates’ presentations. Feedback was automatically collated into a pre-formatted Google Sheet, allowing presenters to access peer comments and scores immediately after their presentations. In addition to peer feedback, the teacher used the same system to provide feedback on the presentations, enabling a comparison between peer and teacher feedback scores to identify significant differences. The research focuses on the practicality of this process, the alignment between peer and teacher feedback, and students’ reflections on its impact on their learning experience. While it does not measure changes in speaking proficiency, the study highlights the potential for enhancing peer feedback practices, fostering student engagement, and promoting reflective learning. Future directions include the development of a training program to improve the quality of peer feedback and an exploration of its impact on speaking proficiency. This session will be beneficial to educators and researchers seeking innovative approaches to technology-mediated peer feedback practices in language classrooms.
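
The abstract does not specify the statistical test used to compare peer and teacher scores; the sketch below shows one plausible approach, a paired t-test over scores exported from the Google Sheet (the CSV layout, column names, and choice of test are assumptions, not the authors’ method).

```python
# A minimal sketch of a peer-vs-teacher score comparison, assuming rubric
# scores exported from the Google Sheet as a CSV with one row per presenter.
import pandas as pd
from scipy import stats

df = pd.read_csv("feedback_scores.csv")    # hypothetical export

peer = df["peer_mean_score"]               # mean of classmates' rubric scores
teacher = df["teacher_score"]              # teacher's rubric score

t, p = stats.ttest_rel(peer, teacher)      # paired comparison per presenter
print(f"t = {t:.2f}, p = {p:.3f}")
```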

CALL to Enhance Feedback on Speaking #4225

Sat, Jul 19, 15:45-16:10 Asia/Tokyo | LOCATION: Room E302

Hattie and Timperley (2007) argue that the most powerful single influence enhancing achievement is feedback. However, offering feedback on students' speaking is challenging. While students are speaking, listening and referencing assessment criteria at the same time is demanding, and offering feedback during spoken interactions disrupts fluency. After speaking, the language becomes remembered and abstract, making targeted feedback difficult. This presentation details action research cycles exploring the ways in which interactive videos assist teacher feedback on speaking. Students (n = 120) video-recorded conversations with a partner and uploaded them to Moodle. Using H5P software, the teacher was able to re-watch the conversations and add feedback via interactive pop-ups. Qualitative and quantitative data were collected via reflective journals and teacher, self, and peer assessments, and analyzed inductively; qualitative data were triangulated with quantitative data via the Rasch model. Results illustrated that interactive videos allowed greater feedback opportunities in a way that enhanced assessment and student satisfaction. The presentation will cover the theoretical underpinnings of the research, how data were collected and analyzed, and the conclusions. A practical discussion will then demonstrate the interactive video process and show how the research can be adopted by audience members with whom it resonates.

A Mobile App for L2 Fluency Development #4328

Sat, Jul 19, 16:20-16:45 Asia/Tokyo | LOCATION: Room E302

Assessing second language (L2) spoken fluency remains a challenge for instructors, despite its importance in language learning and proficiency (Brown, 2006; Fulcher, 2003; Kang et al., 2019). Spoken fluency, defined as the "smoothness and effortlessness of speech" (Chambers, 1997), is a key indicator of L2 proficiency and a central focus in L2 oral assessments (Ogawa, 2022; Peltonen, 2024). To address this, we developed a mobile app to track and enhance spoken fluency, grounded in theories of speech production (Levelt, 1993) and skill development (DeKeyser, 2007). The app also integrates gamification elements, such as progress checkers, which might increase learner engagement in out-of-class activities (see Huang et al., 2018). This workshop reviews research on spoken fluency development, explores assessment methods, and presents findings from our study of the app’s syllable-counting ability. The study involved 84 first-year undergraduate students at a private university in Tokyo, Japan. Participants recorded one-minute monologues, and the data were analyzed using Bland-Altman analysis to determine which counting method—pitch analysis, Apple’s speech-to-text, or WhisperX’s speech-to-text—best matched human syllable counts.
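
Bland-Altman analysis itself is straightforward to reproduce; the sketch below shows the core computation on hypothetical syllable counts (the study’s actual data pipeline is not described in the abstract).

```python
# A minimal sketch of Bland-Altman agreement analysis, comparing automatic
# syllable counts against human counts. All data values are hypothetical.
import numpy as np

human = np.array([142, 110, 128, 95, 133])   # human syllable counts (hypothetical)
auto = np.array([138, 114, 125, 99, 130])    # app's automatic counts (hypothetical)

diff = auto - human
bias = diff.mean()                           # systematic over/under-counting
loa = 1.96 * diff.std(ddof=1)                # 95% limits of agreement

print(f"bias = {bias:.1f} syllables")
print(f"limits of agreement: {bias - loa:.1f} to {bias + loa:.1f}")
```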

NGSL Profiler: Simplifying EFL Materials the Easy Way! #4251

Sat, Jul 19, 17:00-18:00 Asia/Tokyo | LOCATION: Room E302

The NGSL Profiler is a new corpus-based tool designed to help create and/or simplify learning materials to the level of the learner. It is one of the latest additions to the New General Service List Project, a large and growing collection of free, open-source vocabulary word lists and online teaching and learning tools.

Designed along the lines of other excellent profiling tools such as OGTE (Browne & Waring), AntWordProfiler (Anthony), and VocabProfile (Cobb), the NGSL Profiler focuses more specifically on the needs of teachers and content developers using one or more of the NGSL word lists. In addition to an easy and intuitive profiling tool, there is an AI-powered tool that helps teachers quickly create original fiction texts, as well as another that shortens and simplifies texts to one of five levels of difficulty.
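
The NGSL Profiler’s implementation is not described here, but the core idea behind any vocabulary profiler can be sketched in a few lines: measure what share of a text’s tokens fall inside a target word list. The file name and one-headword-per-line format below are assumptions; real profilers such as those named above additionally handle word families and lemmas.

```python
# A minimal sketch of vocabulary profiling: the proportion of a text's
# tokens covered by a target word list. File name and format are hypothetical.
import re

with open("ngsl_wordlist.txt") as f:          # hypothetical: one headword per line
    wordlist = {line.strip().lower() for line in f if line.strip()}

def coverage(text: str) -> float:
    """Return the share of tokens covered by the word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    covered = sum(1 for t in tokens if t in wordlist)
    return covered / len(tokens) if tokens else 0.0

print(f"{coverage('The quick brown fox jumps over the lazy dog.'):.0%} coverage")
```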

This workshop will introduce and demonstrate the NGSL profiling tools, give participants hands-on practice using them, and offer ideas about how to use them in conjunction with other NGSL tools (such as word lists and placement tests) in pedagogically sound ways designed to improve the efficiency and effectiveness of language learning and language learning programs.

Implementation of AI Spoken Agents in Oral Language Acquisition: An Exploratory Study #4205

Sun, Jul 20, 09:00-09:25 Asia/Tokyo | LOCATION: Room E302

This pilot study investigates the implementation of AI spoken agents in oral language acquisition. While Large Language Models have garnered significant attention in recent years, most research has focused on text-based applications, with little emphasis on oral language. Furthermore, AI spoken agents remain an under-researched area despite their increasing prevalence. This study examines the implementation of the Doubao Agent (Kyle) in facilitating oral Chinese acquisition and explores how the spoken agent contributes to oral Chinese learning. It also examines learners' perceptions of such an application. The participants were four intermediate L2 Chinese learners in China. Data on oral communication were collected weekly over a month, followed by an interview. The results showed that learners demonstrated improved oral communication, suggesting the effectiveness of the spoken agent in stimulating oral learning. However, oral productions were primarily simple, highlighting the need for strategies to elicit more complex language. The interviews yielded notable findings, particularly regarding the dominant role the spoken agent played in communication. Such findings call for more research on strategies for taking advantage of spoken agents in oral language learning.

L2 Listening in a Digital Era: Developing and Validating the Mobile-Assisted Self-Regulated Listening Strategy Questionnaire (MSRLS-Q) #4226

Sun, Jul 20, 09:35-10:00 Asia/Tokyo | LOCATION: Room E302

Mobile technologies have transformed L2 listening. These technologies provide learners with an abundance of materials that transcend the limitations of traditional classroom instruction. Understanding how learners engage with such materials is crucial if teachers are to facilitate students' development of self-regulated listening strategies. This study reports on the development and validation of a new instrument, the Mobile-assisted Self-Regulated Listening Strategy Questionnaire (MSRLS-Q). Informed by a social cognitive understanding of self-regulation, items were generated from existing literature and semi-structured interviews with 16 Chinese undergraduate students. The questionnaire was validated through an exploratory factor analysis with 309 Chinese undergraduate students, followed by a confirmatory factor analysis with a separate sample of 327 students. Results confirmed a 31-item, five-factor model covering students' pre-, during- and post-listening strategies: Goal setting and mobile resource planning, Cognitive and metacognitive multimedia listening, Mobile-assisted motivational control, Structuring online social space, and Listening evaluation and adaptation. Structural equation modelling revealed that four of the five factors significantly predicted students’ international orientation to use English. The results emphasize self-regulated listening as a cyclical process reflective of cognitive, motivational, and social dimensions of strategy use. Implications for research and pedagogical use of the MSRLS-Q are discussed.
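
The exploratory factor analysis step can be illustrated with the factor_analyzer package; the CSV layout and the oblique rotation below are assumptions, not the authors’ reported settings.

```python
# A minimal sketch of exploratory factor analysis on questionnaire responses,
# using the factor_analyzer package. Data file and rotation are assumptions.
import pandas as pd
from factor_analyzer import FactorAnalyzer

items = pd.read_csv("msrls_q_responses.csv")   # hypothetical: one Likert item per column

fa = FactorAnalyzer(n_factors=5, rotation="oblimin")  # five factors, as reported
fa.fit(items)

loadings = pd.DataFrame(fa.loadings_, index=items.columns)
print(loadings.round(2))                       # inspect which items load on which factor
```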

Gemini Listens: Analyzing Speaking Tasks #4201

Sun, Jul 20, 11:40-12:05 Asia/Tokyo | LOCATION: Room E302

Generative AI is transforming language teaching and learning in areas such as translation, feedback, and evaluation. This presentation examines AI’s ability to analyze speaking tasks in the language learning classroom. Most generative AI tools, such as ChatGPT, first convert speech to text and then analyze the transcript—an approach that overlooks important prosodic features. However, Google’s Gemini 2.0 can process raw audio directly, capturing intonation, stress, rhythm, and loudness without relying on text-based transcription. This study compared the accuracy and efficiency of human and AI ratings of pair-work speaking tasks, focusing on Gemini 2.0’s multimodal ability to analyze natural prosody and intonation. The findings revealed a moderate positive correlation between human and AI ratings of speaking tasks, indicating that Gemini 2.0 aligns well with human judgments of intonation and rhythm in language learner speech.
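
A minimal sketch of sending raw audio to Gemini with the google-generativeai package appears below; the model variant string, rubric wording, and file name are assumptions, since the study’s actual prompt is not given.

```python
# A minimal sketch: prompting Gemini directly with raw audio, so prosody is
# analyzed without a transcription step. Prompt and file name are hypothetical.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

audio = genai.upload_file("pair_task.mp3")     # hypothetical pair-work recording
model = genai.GenerativeModel("gemini-2.0-flash")

response = model.generate_content([
    audio,
    "Rate this learner conversation from 1-5 for intonation, stress, and "
    "rhythm, and briefly justify each score.",
])
print(response.text)
```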

Enhancing English Pronunciation Through Speech-to-Text Technology: A Quasi-Experimental Study #4178

Sun, Jul 20, 12:15-12:40 Asia/Tokyo | LOCATION: Room E302

Pronunciation is often seen as the most anxiety-inducing aspect of language learning. Various techniques have been introduced to improve students’ pronunciation, including explicit instruction (Zhang & Yuan, 2020), virtual reality (Alemi & Khatoony, 2020), and speech-to-text technology (Jiang et al., 2021). As explicit instruction requires instructor expertise, speech-to-text technology has recently gained increased attention, even though free versions have been perceived as inferior. This quasi-experimental study examined the effectiveness of free speech-to-text software in improving Japanese university students’ spoken English. A pre-test/post-test comparison was conducted with two groups (N = 77) over 15 weeks, with both groups receiving explicit pronunciation instruction via flipped learning videos. During weekly classes, the control group (n = 28) practiced listen-and-repeat exercises, while the treatment group (n = 49) used free speech-to-text software. Based on recorded speeches rated by three evaluators (α = .800), the treatment group showed statistically significant pronunciation improvement (p = .003). While both groups improved in rhythm, intonation, and intelligibility, the treatment group exhibited greater gains, though the between-group difference was not significant (p > .05). These findings suggest that flipped learning enhances pronunciation instruction, with additional benefits when combined with speech-to-text technology.
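
The abstract does not name the free software used; as one plausible example, the sketch below builds a read-and-check practice loop with the SpeechRecognition package’s free Google Web Speech backend and a simple string-similarity score. The target sentence, file name, and similarity measure are assumptions, not the study’s materials.

```python
# A minimal sketch of speech-to-text pronunciation practice: the learner reads
# a target sentence aloud and sees how closely the transcript matches it.
import difflib
import speech_recognition as sr

target = "She sells seashells by the seashore."  # hypothetical practice sentence

recognizer = sr.Recognizer()
with sr.AudioFile("attempt.wav") as source:      # hypothetical learner recording
    audio = recognizer.record(source)

transcript = recognizer.recognize_google(audio)  # free Google Web Speech API
score = difflib.SequenceMatcher(None, target.lower(), transcript.lower()).ratio()

print(f"You said: {transcript}")
print(f"Match with target: {score:.0%}")
```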