20/05/2026
Last week was a busy one for TartuNLP: our researchers were out sharing ideas, presenting their work, and representing Estonian and Finno-Ugric NLP at international conferences 📚
At 𝐋𝐑𝐄𝐂 𝟐𝟎𝟐𝟔 in Palma de Mallorca, our team presented four papers:
🔹 "Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation"
by Marii Ojastu, Hele-Andra Kuulmets, Aleksei Dorkin, Marika Borovikova, Dage Särg and Kairit Sirts
🔹 "Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation"
by Neha Sharma, Navneet Agarwal and Kairit Sirts
🔹 "Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale"
by Karl Gustav Gailit, Kadri Muischnek and Kairit Sirts
🔹 "Document-Level Text Simplification in Estonian Using Large Language Models"
by Meeri-Ly Muru and Eduard Barbu
At the 𝐑𝐄𝐒𝐎𝐔𝐑𝐂𝐄𝐅𝐔𝐋-𝟐𝟎𝟐𝟔 workshop, Mark Fishel gave the keynote talk:
🎤 "Translating and modelling under-resourced languages and dialects, and how (not) to do it"
At the 𝐊𝐆-𝐋𝐋𝐌 @ 𝐋𝐑𝐄𝐂 𝟐𝟎𝟐𝟔 workshop, two papers were presented:
🔹 "ReX-GG: A LLM Ensemble Pipeline for Relation-extraction and Graph Generation"
by Giacomo Magnifico and Eduard Barbu
🔹 "Large Language Models for Knowledge Graph Extraction: A Schema-Constrained Evaluation Framework"
by Markus Ilves, Eduard Barbu and Jaan Übi
Meanwhile, in Helsinki at 𝐈𝐅𝐔𝐒𝐂𝐎 𝐗𝐋𝐈, Britt-Kathleen Mere presented:
🔹 "Are We Low-Resource by Choice? Rethinking Finno-Ugric NLP"
(Title in Komi: "Лоам-ӧ ми "этша ресурса" асланым бӧрйӧмӧн? Финн-йӧгра NLP вылӧ мӧдног видзӧдлӧм")
From datasets to language models for under-resourced languages, we’re happy to see our researchers contributing to so many important conversations 🗣️💻