We're thrilled to announce that a research paper by our Senior NLP Scientist Adam King has been accepted into the 25th Annual Conference of the European Association for Machine Translation! 🎉
Adam's research focuses on training more performant multilingual classification models using translated training data, pushing the boundaries of what's possible in machine translation. 🌐✨ The conference will be held at the University of Sheffield from June 24-27.
Check out his research paper here: https://lnkd.in/g3zZDcDR #EAMT2024
Partial or full automation of translation quality evaluation (TQE) is the Holy Grail. If algorithms can reliably indicate (the vast majority of) segments that (are most likely to) require human post-editing, machine translation (MT) could be used at scale for a broader range of content.
Maria Stasimioti from Slator reports on promising research that combines Large Language Models (LLMs) with human annotations, so that the LLM receives feedback that ends up being quite similar to that of human evaluators. The idea is to train the model until it is ready to categorise errors (using the same Multidimensional Quality Metrics framework, aka MQM, that human evaluators use) and to derive a score from that fine-grained auto-generated feedback.
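For readers unfamiliar with how MQM-style error annotations are turned into a single quality score, here is a minimal sketch. The severity weights and the per-100-words normalisation below are illustrative assumptions, not the exact configuration used in the paper.

```python
# Illustrative MQM-style scoring: an annotator (human or LLM) labels each
# error with a severity; weighted penalties are normalised by segment length.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights

def mqm_score(error_severities, word_count):
    """Return a 0-100 quality score from a list of severity labels."""
    penalty = sum(SEVERITY_WEIGHTS[s] for s in error_severities)
    # Normalise the penalty per 100 words, then subtract from a perfect 100.
    return 100 - 100 * penalty / word_count

# Example: one minor and one major error in a 100-word segment.
print(mqm_score(["minor", "major"], word_count=100))  # → 94.0
```

A score like this can then serve as the training signal when fine-tuning the LLM annotator against human judgments.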
I find this rather exciting.
Will read more tonight (it's a 19-page paper available on arXiv: https://lnkd.in/ejkU6M9b).
https://ow.ly/GS6U50PFkJV
Last week at AACL-IJCNLP 2023, Jay Gala, Pranjal Chitale and I delivered a tutorial on "Massively Multilingual Machine Translation for Related Languages". If you are interested in this but could not attend, we are making everything available:
https://lnkd.in/djf46rQa
The GitHub repo contains our slides, the recorded talk, and all the papers we consulted while preparing the tutorial. We are happy to present this tutorial again on request, so please feel free to reach out to us.
We hope that this helps motivate further research into language relatedness for massively multilingual machine translation. A big thanks to Prof Kurohashi for motivating us to submit a tutorial application. Also special thanks to Varun Gumma for their feedback.
This tutorial is a part of the series of tutorials on:
a. NMT (https://lnkd.in/dnTbgMPW) and
b. Multilingual Machine Translation (https://lnkd.in/d6tepmwu)
Feel free to take a look and reach out if you have any questions.
“Some uses [of machine translation] are low-risk, others much higher. You have to consider the context in which these tools are used and their impact.”
– Dr. Benoît Dubreuil, Quebec’s French language commissioner
Read more:
Machine translation: a game changer in science – The English language may be king in science, but neural machine translation could well put an end to its dominance.
https://lnkd.in/emz6eSH4
[🤖] "We show that Claude 3 Opus, a large language model (LLM) released by Anthropic in March 2024, exhibits stronger machine translation competence than other LLMs. Though we find evidence of data contamination with Claude on FLORES-200, we curate new benchmarks that corroborate the effectiveness of Claude for low-resource machine translation into English. We find that Claude has remarkable resource efficiency – the degree to which the quality of the translation model depends on a language pair’s resource level. Finally, we show that advancements in LLM translation can be compressed into traditional neural machine translation (NMT) models. Using Claude to generate synthetic data, we demonstrate that knowledge distillation advances the state-of-the-art in Yoruba-English translation, meeting or surpassing strong baselines like NLLB-54B and Google Translate."
🌟 Excited to share insights from my research into #GenAI for Machine Translation! 🚀🔍
In my Master's dissertation, I've explored the dynamic world of Neural Machine Translation (NMT), focusing specifically on the intricate nuances of spatial language within TED Talk subtitles. 🌍💬
Research Questions:
🔹 How do leading open-source Large Language Models (#LLMs) like Llama 2, Gemma, and Mistral compare against established NMT giants such as Google and DeepL?
🔹 What potential lies in leveraging #GenAI for machine translation in terms of accuracy, fluency, and post-editing needs when translating spatial prepositions like "across", "through", "into", and "onto", which are challenging to render from English into Portuguese?
🔹 How does human translation compare? Is it always "the gold standard"?
To achieve this, I'm examining how these systems tackle issues of prepositional semantics such as cross-linguistic variation, polysemy, and idiomatic expressions, all intrinsic factors in spatial language. 🧠💭
🔶Evaluation Metrics:
To evaluate the effectiveness of these systems, I'm analyzing a range of established metrics such as BLEU, METEOR, BERTScore, and COMET☄️. Additionally, I'm using human evaluation to assess these scores' accuracy in capturing spatial preposition nuances and overall translation quality.
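To make the first of those metrics concrete: BLEU measures n-gram overlap between a hypothesis and a reference, scaled by a brevity penalty. Below is a minimal, stdlib-only sketch of a smoothed sentence-level BLEU; real evaluations should use a standard implementation such as sacreBLEU, which handles tokenisation and corpus-level statistics properly.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Simplified smoothed sentence-level BLEU (whitespace tokenisation)."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        # Clipped overlap: each reference n-gram can only be matched once.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smoothing avoids log(0)
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # → 1.0
```

String-overlap metrics like this are exactly why embedding-based metrics (BERTScore, COMET) and human evaluation are needed: a valid preposition paraphrase can score poorly on BLEU despite being a good translation.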
Stay tuned for forthcoming insights and revelations from my research journey!
🎓✨ #LLMs #GenAI #MachineTranslation #NeuralMT #SpatialLanguage #ResearchInsights #LinkedInLearning
🚀 We're thrilled to share the preprint of our paper on multilingual Community Question-Answering (CQA) portals. Our research addresses language barriers, particularly the challenge of translating noisy questions.
Paper Link: https://lnkd.in/d9AfNjBB
🔑 Our approach centers around "Reference-Free Domain Adaptation for Translation of Noisy Questions with Question-Specific Rewards." In this paper, we introduce several key contributions:
Reference-Free Training: We've devised a novel methodology for fine-tuning Neural Machine Translation (NMT) systems using only source-side data. This means we no longer rely on synthetic target data, making our method more robust.
Adequacy and Fluency Balance: To ensure our translations maintain both adequacy and fluency, we employ a combination of BERTScore and Masked Language Model (MLM) Score.
Impressive Results: Our model outperforms the traditional Maximum Likelihood Estimation (MLE) based fine-tuning approach by a remarkable 1.9 BLEU points.
Open Source Initiative: In the spirit of collaboration and advancing the field, we've made our code and datasets publicly available. You can access them here: https://lnkd.in/dsXTZjsH
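The adequacy/fluency balance described above can be sketched as a weighted reward over two scores: an adequacy score (e.g. BERTScore F1 against the source or a pseudo-reference) and a fluency score (e.g. a masked-language-model probability for the hypothesis). The weighting scheme and score ranges here are illustrative assumptions, not the paper's exact formulation.

```python
def combined_reward(bertscore_f1, mlm_score, alpha=0.5):
    """Blend adequacy (BERTScore F1) and fluency (MLM score), both in [0, 1].

    alpha is an assumed hyperparameter: higher values favour adequacy,
    lower values favour fluency.
    """
    return alpha * bertscore_f1 + (1 - alpha) * mlm_score

# Example: an adequate but slightly disfluent hypothesis.
reward = combined_reward(bertscore_f1=0.8, mlm_score=0.6)
print(round(reward, 3))  # → 0.7
```

In a reference-free setup, a reward like this can drive fine-tuning directly (e.g. via a policy-gradient objective), since neither component requires target-side references.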
Published at: Findings of EMNLP 2023
📚 If you're curious to delve deeper into the details, feel free to check out the full paper: "Reference-Free Domain Adaptation for Translation of Noisy Questions with Question-Specific Rewards," authored by Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal, and Muthusamy Chelliah.
Link: https://lnkd.in/d9AfNjBB
#AI #NMT #Research #EMNLP2023 #LanguageTechnology #Community #QuestionAnswering #Innovation #OpenSource #Collaboration
The Power of Algorithms: Unveiling the Secrets Behind Machine Translation
In today's interconnected world, language barriers can hinder effective communication and limit opportunities for global collaboration. With the advent of machine translation, however, these barriers are gradually being broken down. Machine translation, powered by sophisticated algorithms, has revolutionized the way we communicate across languages. […]
https://lnkd.in/dqd8bcHk
Call for Papers: Humor and Artificial Intelligence Panel
We invite 20-minute presentations on AI-based technology for generating, processing, or analyzing humor for our dedicated panel, which kicks off the ISHS's 2024 webinar series.
Application areas include, but are not limited to:
* human–computer interaction
* computer-mediated communication
* intelligent writing assistants
* conversational agents
* machine and computer-assisted translation
* digital humanities
* natural language processing
yaaaaas Adam King!!! smarty pants!! 🤩