Workshop Program

Saturday, May 16, 2026 | Palma de Mallorca, Spain

Indicative Schedule

09:00 - 10:30 Session A - Room 4
09:10 - 09:20 A Bolu: A Structured Dataset for the Computational Analysis of Sardinian Improvisational Poetry Silvio Calderaro and Johanna Monti Università di Pisa, "L'Orientale" University of Naples
09:20 - 09:30 Saar-Voice: A Multi-Speaker Saarbrücken Dialect Speech Corpus Lena Sophie Oberkircher, Jesujoba Alabi, Dietrich Klakow and Jürgen Trouvain Saarland University
09:30 - 09:40 MD_NLP: Reconstructing an Australian English Heritage Dialect Corpus from the Mitchell–Delbridge Recordings through LLM-Assisted Speaker Attribution Steven Coats University of Oulu
09:40 - 09:50 Challenges in the Detection of Dialect for Historical Languages; the Case of Old Irish Text Resources Adrian Doyle University of Galway
10:00 - 10:30 2-minute poster presentations
10:30 - 11:00 Poster Session & coffee break (Running in parallel)
11:00 - 13:00 Session B - Room 4
11:00 - 11:10 Phonologically-aware Automatic Speech Recognition Evaluation of Low-Resource Languages: The Case of Basque Dialects Christoforos Souganidis, Asier Herranz, Ibon Saratxaga, Eva Navas and Inma Hernaez University of the Basque Country UPV/EHU
11:10 - 11:20 Systematic Normalization of Spoken Mixed-Language, Mixed-Dialect Data Margaret Blevins The University of Texas at Austin
11:20 - 11:30 Evaluating Cross-Dialect Syntactic Variation: a Theory-Driven Web Resource Emanuela Li Destri, Marco Longhin, Gaia Sorge, Sofia Ferroni, Giovanni Battista Matteazzi, Andrea Artioli, Lorenzo Carletti, Federico Motta, Giuseppe Longobardi and Cristina Guardiano Università di Modena e Reggio Emilia, Università di Padova, University of York
11:30 - 11:40 Can LLM Agents Identify Spoken Dialects like a Linguist? Tobias Bystrich, Lukas Hamm, Maria Hassan Akhter, Lea Fischbach, Lucie Flek and Akbar Karimi University of Bonn, Fraunhofer IAIS, Philipps-Universität Marburg
11:45 - 12:30 Invited Talk
Prof. Barbara Plank, LMU Munich; Visiting Professor, ITU Copenhagen
12:30 - 13:00 Community Discussion

Accepted Posters

Beyond Accuracy: Analyzing Dialect Confusion in Automatic Speech-Based Dialect Classification Lea Fischbach, Alfred Lameli and Lucie Flek
FLEURS-Kobani: Extending FLEURS dataset for Northern Kurdish Daban Q Jaff and Mohammad Mohammadamini
Exploring the reusability of Northern Kurdish resources for Badini speech recognition Mohammad Mohammadamini, Aveen Jalal Mohammed, Barzan Hussein Mohammed, Dezheen H. Abdulazeez, Imad Saeed Sadeeq, Dilgash Mohammed Salih, Amera Ismail Melhum and Abuobaida Abdullah Dheyab
Wancho Dialectometry: Community-created data and the Living Dictionaries project Kellen Parker van Dam
Dialectometry and Evaluation of the ePark Corpus for Low-Resource Formosan Language Dialects Henry Gagnier
A Dialectal Corpus for Ukrainian: Collection, Classification, and Standardization Yuliia Frund and Sina Ahmadi
German Dialects Across Situations, Generations, and Regions: The REDE corpus as an Oral Resource for NLP Hanna Fischer and Alfred Lameli
A Catalog of Basque Dialectal Resources: Online Collections and Standard-to-Dialectal Adaptations Jaione Bengoetxea, Itziar Gonzalez-Dios and Rodrigo Agerri
WoVis: Interactive Visualization of Word Embeddings for Semantic Change in Historical and Dialectal Language Resources Filip Miletić, Maximilian Henkel, Rene Cutura, Sophie Sadler, Quynh Quang Ngo, Michael Sedlmair and Sabine Schulte im Walde
Speaker Normalization via Voice Conversion Reveals a Human-Machine Dissociation in Dialect Classification Caroline Kleen, Lea Fischbach, Akbar Karimi, Lucie Flek and Alfred Lameli
South Tyrolean Dialect-to-Standard Speech Translation Greta H. Franzini and Luca Ducceschi
TransVar – the Corpus for Variation and Change Study of the Historical Transcarpathian lects Ilia Afanasev
The Generator-Eraser Paradox: Community Guidelines for Responsible LLM-Assisted Dialect Resource Creation Wajdi Zaghouani
The Texas German Dialect Project Corpus as a Diachronic Resource for Investigating Language Contact Thomas Schmidt, Margaret M. Blevins, Hans C Boas and Glenn Gilbert
Pontic Greek in the Caucasus: an online corpus Svetlana Berikashvili and Stavros Skopeteas
Meaning Over Morphology: A Multi-Metric Benchmark of LLMs for Bangla Dialect Translation Soumik Deb Niloy, Subhey Sadi Rahman, Mahbub E Sobhani, Md. Golam Rabiul Alam, Farig Yousuf Sadeque and Md. Rezuwan Hassan
Sociolinguistic aspects of crowdsourcing for a vocal corpus of Alsatian Pascale Erhart, Lucile Hamm, Sam Bigeard, Carole Werner, Malek Yaich and Slim Ouni
HeptaTAX: A Neuro-Symbolic Pipeline and Benchmark for Classifying 16th-Century Heptanesian Notarial Acts Stergios Chatzikyriakidis, Eleni Karantzola and Vasiliki Makri
Towards Semantic Access and Interoperability in Digital Dialectal Atlases. A Case Study Paola Marongiu and Simonetta Montemagni
A CLDF-Compliant Lexical Database for Modern Greek Dialects: Resource Design and Dialectometric Analysis Stavros Bompolas, Natalia Chousou-Polydouri, Manuela Genitsaridi, Danae Karatzanou, Georgios Kostopoulos, Elena Anagnostopoulou and Dimitra Melissaropoulou
A Speech Resource for the Pontic Greek Dialect: Transcription Choices and Baseline ASR Evaluation Rodanna Konstantinidou, Chara Tsoukala, Vivian Stamou, Voula Giouli and Stella Markantonatou
First Steps in ASR for Cypriot Greek: Challenges and Insights Vivian Stamou, Spyros Armostis, Antigoni Klimi, Georgios Paraskevopoulos, Vassilis Katsouros and Antonios Anastasopoulos
Structural Divergence under Shared Language-Level Specification: Griko in Universal Dependencies Stavros Bompolas, Emanuela Pinna, Josep Quer, Marika Lekakou and Stella Markantonatou
Digital Preservation of Aromanian Through Knowledge Management and Automatic Speech Recognition Evaluation Marija Pendevska and Hristina Nastevska
A Novel Typology of Mutually Intelligible Words: The Case of Slavic Languages Edward Klyshinsky and Yulia Badryzlova
Transfer Learning for an Endangered Slavic Variety: Dependency Parsing in Pomak Across Contact-Shaped Dialects Sercan Karakas