Workshop Program

Saturday, May 16, 2026 | Palma de Mallorca, Spain

Schedule

09:00 - 10:30 Session A - Cabrera 1 (on the 2nd floor)
09:10 - 09:20 A Bolu: A Structured Dataset for the Computational Analysis of Sardinian Improvisational Poetry Silvio Calderaro and Johanna Monti Università di Pisa, "L'Orientale" University of Naples Presentation Slides
09:20 - 09:30 Saar-Voice: A Multi-Speaker Saarbrücken Dialect Speech Corpus Lena Sophie Oberkircher, Jesujoba Alabi, Dietrich Klakow and Jürgen Trouvain Saarland University Presentation Slides
09:30 - 09:40 MD_NLP: Reconstructing an Australian English Heritage Dialect Corpus from the Mitchell–Delbridge Recordings through LLM-Assisted Speaker Attribution Steven Coats University of Oulu Presentation Slides
09:40 - 09:50 Challenges in the Detection of Dialect for Historical Languages; the Case of Old Irish Text Resources Adrian Doyle University of Galway Presentation Slides Presentation Video
10:00 - 10:30 2-minute poster presentations (Cabrera 1)
10:30 - 11:00 Poster Session & coffee break (running in parallel) - Menorca Hall (on the 3rd floor)
11:00 - 13:00 Session B - Cabrera 1 (on the 2nd floor)
11:00 - 11:10 Phonologically-aware Automatic Speech Recognition Evaluation of Low-Resource Languages: The Case of Basque Dialects Christoforos Souganidis, Asier Herranz, Ibon Saratxaga, Eva Navas and Inma Hernaez University of the Basque Country UPV/EHU Presentation Slides
11:10 - 11:20 Systematic Normalization of Spoken Mixed-Language, Mixed-Dialect Data Margaret Blevins The University of Texas at Austin Presentation Slides
11:20 - 11:30 Handling Cross-Dialect Syntactic Variation: a Theory-Driven Web Resource Emanuela Li Destri, Marco Longhin, Gaia Sorge, Sofia Ferroni, Giovanni Battista Matteazzi, Andrea Artioli, Lorenzo Carletti, Federico Motta, Giuseppe Longobardi and Cristina Guardiano Università di Modena e Reggio Emilia, Università di Padova, University of York Presentation Slides
11:30 - 11:40 Can LLM Agents Identify Spoken Dialects like a Linguist? Tobias Bystrich, Lukas Hamm, Maria Hassan Akhter, Lea Fischbach, Lucie Flek and Akbar Karimi University of Bonn, Fraunhofer IAIS, Philipps-Universität Marburg Presentation Slides Presentation Video
11:45 - 12:30 Invited Talk: "Beyond the Standard: Dialectal Variation at the Heart of NLP"
Prof. Barbara Plank, LMU Munich; Visiting Professor, ITU Copenhagen Presentation Slides
12:30 - 13:00 Community Discussion

Accepted Posters

Beyond Accuracy: Analyzing Dialect Confusion in Automatic Speech-Based Dialect Classification Lea Fischbach, Alfred Lameli and Lucie Flek Video
FLEURS-Kobani: Extending FLEURS dataset for Northern Kurdish Daban Q Jaff and Mohammad Mohammadamini Poster Video
Exploring the reusability of Northern Kurdish resources for Badini speech recognition Mohammad Mohammadamini, Aveen Jalal Mohammed, Barzan Hussein Mohammed, Dezheen H. Abdulazeez, Imad Saeed Sadeeq, Dilgash Mohammed Salih, Amera Ismail Melhum and Abuobaida Abdullah Dheyab Poster Video
Wancho Dialectometry: Community-created data and the Living Dictionaries project Kellen Parker van Dam Video
Dialectometry and Evaluation of the ePark Corpus for Low-Resource Formosan Language Dialects Henry Gagnier Video
A Dialectal Corpus for Ukrainian: Collection, Classification, and Standardization Yuliia Frund and Sina Ahmadi Video
German Dialects across Situations, Generations, and Regions: The REDE Corpus as an Oral Resource for NLP Hanna Fischer and Alfred Lameli Video
A Catalog of Basque Dialectal Resources: Online Collections and Standard-to-Dialectal Adaptations Jaione Bengoetxea, Itziar Gonzalez-Dios and Rodrigo Agerri Poster Video
WoVis: Interactive Visualization of Word Embeddings for Semantic Change in Historical and Dialectal Language Resources Filip Miletić, Maximilian Henkel, Rene Cutura, Sophie Sadler, Quynh Quang Ngo, Michael Sedlmair and Sabine Schulte im Walde Video
Speaker Normalization via Voice Conversion Reveals a Human-Machine Dissociation in Dialect Classification Caroline Kleen, Lea Fischbach, Akbar Karimi, Lucie Flek and Alfred Lameli Video
South Tyrolean Dialect-to-Standard Speech Translation Greta H. Franzini and Luca Ducceschi Poster Video
TransVar – the Corpus for Variation and Change Study of the Historical Transcarpathian lects Ilia Afanasev Poster Video
The Generator-Eraser Paradox: Community Guidelines for Responsible LLM-Assisted Dialect Resource Creation Wajdi Zaghouani Video
The Texas German Dialect Project Corpus as a Diachronic Resource for Investigating Language Contact Thomas Schmidt, Margaret M. Blevins, Hans C. Boas and Glenn Gilbert Video
Pontic Greek in the Caucasus: an online corpus Svetlana Berikashvili and Stavros Skopeteas Poster Video
Meaning Over Morphology: A Multi-Metric Benchmark of LLMs for Bangla Dialect Translation Soumik Deb Niloy, Subhey Sadi Rahman, Mahbub E Sobhani, Md. Golam Rabiul Alam, Farig Yousuf Sadeque and Md. Rezuwan Hassan Poster Video
Sociolinguistic aspects of crowdsourcing for a vocal corpus of Alsatian Pascale Erhart, Lucile Hamm, Sam Bigeard, Carole Werner, Malek Yaich and Slim Ouni Poster Video
HeptaTAX: A Neuro-Symbolic Pipeline and Benchmark for Classifying 16th-Century Heptanesian Notarial Acts Stergios Chatzikyriakidis, Eleni Karantzola and Vasiliki Makri Video
Towards Semantic Access and Interoperability in Digital Dialectal Atlases. A Case Study Paola Marongiu and Simonetta Montemagni Poster Video
A CLDF-Compliant Database for Modern Greek Dialects: Resource Design and Dialectometric Analysis Stavros Bompolas, Natalia Chousou-Polydouri, Manuela Genitsaridi, Danae Karatzanou, Georgios Kostopoulos, Elena Anagnostopoulou and Dimitra Melissaropoulou Poster
A Speech Resource for the Pontic Greek Dialect: Transcription Choices and Baseline ASR Evaluation Rodanna Konstantinidou, Chara Tsoukala, Vivian Stamou, Voula Giouli and Stella Markantonatou Poster Video
First Steps in ASR for Cypriot Greek: Challenges and Insights Vivian Stamou, Spyros Armostis, Antigoni Klimi, Georgios Paraskevopoulos, Vassilis Katsouros and Antonios Anastasopoulos Poster
Structural Divergence under Shared Language-Level Specification: Griko in Universal Dependencies Stavros Bompolas, Emanuela Pinna, Josep Quer, Marika Lekakou and Stella Markantonatou Poster
Digital Preservation of Aromanian Through Knowledge Management and Automatic Speech Recognition Evaluation Marija Pendevska and Hristina Nastevska Poster
A Novel Typology of Mutually Intelligible Words: The Case of Slavic Languages Edward Klyshinsky and Yulia Badryzlova Video
Transfer Learning for an Endangered Slavic Variety: Dependency Parsing in Pomak Across Contact-Shaped Dialects Sercan Karakas Video