Explosion is a software company specializing in developer tools and tailored solutions for Artificial Intelligence and Natural Language Processing. We're the makers of spaCy, one of the leading open-source libraries for Natural Language Processing, and Prodigy, a modern annotation tool for creating training data for machine learning models.
Staring down the barrel of a data annotation task? I strongly recommend checking out Prodigy.
If you dread labeling data for NLP tasks, text analysis, and LLMs, this tool is your new best friend. Don’t let the deceptively simple interface fool you — aside from being incredibly intuitive, here’s how it’s helped my team:
- elevating data labeling to a first-order concern in the machine learning workflow
- enabling us to collaborate on measures of inter-rater reliability
- making the labeling options super unambiguous for data annotators
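For readers who have not computed inter-rater reliability before, here is a minimal sketch of one common measure, Cohen's kappa, for two annotators labeling the same items. It is a plain-Python illustration and not Prodigy's implementation; the function name `cohen_kappa` and the example labels are made up for this sketch.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six spans by two annotators.
annotator_1 = ["PER", "ORG", "PER", "LOC", "ORG", "PER"]
annotator_2 = ["PER", "ORG", "LOC", "LOC", "ORG", "ORG"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A kappa near 1 means the annotators agree far more often than chance would predict, while a value near 0 suggests the label scheme or guidelines are ambiguous.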
🎙️ Upcoming podcast livestream with spaCy / Explosion creators Ines Montani & Matthew Honnibal on Open Source ML/AI and Sustainable Tech!
🧠 spaCy & open-source AI
🔍 Democratizing AI tech
🤖 Human-in-the-loop NLP
🚀 Sustainable open-source biz
Link to register in comment 😍
Matthew Honnibal & Ines Montani are both forces of nature and massive benefactors of the open-source AI community so I can't wait to see what they'll build in the coming years.
Also, it's a great practical example of how collaboration > competition in AI: spaCy & HF/transformers were kind of competing libraries early on, and we both decided to collaborate instead of competing, for the benefit of the open-source AI community IMO.
Long live Explosion! https://lnkd.in/eFTp7SCx
I want to take this opportunity to put the spotlight on some wonderful (now ex-) colleagues who helped maintain spaCy and its open-source ecosystem for so many years.
- First and foremost: Adriane B., the team's technical lead, ML engineer, resident computational linguist. She has fought all the Python packaging battles, trained spaCy models for 25+ languages, helped users on the forum and StackOverflow, and came up with efficiency tricks such as floret text embeddings (https://lnkd.in/e-FuYWgh).
- Daniël de Kok and Madeeswaran Kannan, who have been responsible for countless efficiency improvements throughout the OS stack. This super productive duo also created 'curated-transformers' which supports transformer architectures built from the ground up.
- Raphael Mitsch, a solid engineer who is as good at scoping client projects as at collaborating on GitHub to implement novel functionality. Raphael has been the driving force behind 'spacy-llm', allowing the integration of LLM prompting into structured spaCy pipelines.
- Ákos Kádár (PhD): an ML engineer with a strong academic background. Ákos cuts through the hype/bullshit to get to the core of a problem and its solution. He also implemented the experimental spaCy coref module, based on an end-to-end neural approach published at EMNLP 2021 (https://lnkd.in/eYziEP5w).
- Peter Baumgartner, who helped us shape the consultancy efforts and implemented various tailored solutions for clients, successfully marrying our open-source efforts with consulting (win-win!).
I've had the honour to work with many more wonderful colleagues, including Basile Dura (now a freelance AI & Data consultant), Edward Schmuhl and Victoria Slocum (now at Weaviate), Lester James Miranda (currently at AI2), Paul McCann (https://lnkd.in/epGCi_EU), Richard Paul Hudson (author of coreferee) and Vinit Ravishankar.
Then there are all the other Explosion (ex-) colleagues who worked on Prodigy (Teams), admin or customer success. Some of them have already moved on to the next challenge, others have taken a small break and are available right now if you're looking to hire exceptional NLP experts / ML engineers.
If you're unsure who to reach out to but have some job openings that could be relevant - feel free to shoot me a message.
Myself - I'm continuing with my one-woman consulting company OxyKodit, while also being actively engaged with open-source maintenance, such as Sebastián Ramírez Montaño's library Typer.
Finally, I want to thank Matthew Honnibal & Ines Montani for having given me the opportunity to work on my two biggest passions for so many years - NLP and open-source, and for giving me the opportunity to lead the OS team even though probably none of us really knew what we were doing. Nevertheless, I'm so proud of everything we've accomplished. I wish Explosion the best of luck going forward, as I'm 100% convinced that spaCy and Prodigy should, can and will survive the current LLM-hype.
Company update: We're going back to our roots! We're back to running Explosion as a smaller, independent-minded and self-sufficient company. spaCy and Prodigy will stay stable and sustainable and we'll keep updating our stack with the latest technologies, without changing its core identity or purpose 💙
https://lnkd.in/evAaa4pn
Thank you so much for all your support! Matthew Honnibal also wrote a more detailed blog post to share more background and some personal reflections: https://lnkd.in/gkRaQ5ad
⚗️ Develop small and performant models using LLMs
Large language models (LLMs) work really well out of the box ✨ and as a result they are excellent tools for prototyping solutions and features that require AI. In production, on the other hand, you need reliable, auditable and cost-effective workflows, something LLMs are not that great at yet 👎 To make things worse, moving from an end-to-end, LLM-based solution to a production workflow with multiple components is not trivial, since it requires a significant shift not only in the design of the solution but also in the frameworks involved 🤯
To combat this, you can architect your solution in a more production-like fashion from the start: keep the same components and the same frameworks, prototype with an LLM, and deploy smaller modular models. An excellent example of this way of thinking is spaCy and spacy-llm, which let you quickly build an information extraction pipeline using your LLM of choice, have humans correct the resulting data, and train much smaller, more performant models from it 🚀
You can think of this process as distilling the knowledge from LLMs and humans into a compact model 🌟
🔗 Read more in this excellent blog from Explosion: https://lnkd.in/e82Efdti
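The workflow in the post above (LLM pre-annotation, human correction, then training a small model) can be sketched in a few lines. This is a toy, standard-library-only illustration and not the spaCy or spacy-llm API: `llm_label` is a hypothetical stand-in for a real LLM call, `TinyNaiveBayes` stands in for the compact model, and the texts and labels are invented.

```python
import math
from collections import Counter, defaultdict


def llm_label(text):
    """Hypothetical stand-in for an LLM call; a keyword rule used only for illustration."""
    lowered = text.lower()
    return "POSITIVE" if "great" in lowered or "love" in lowered else "NEGATIVE"


def build_training_data(texts, human_corrections):
    """Pre-annotate with the LLM stub, then let human corrections override its labels."""
    return [(t, human_corrections.get(t, llm_label(t))) for t in texts]


class TinyNaiveBayes:
    """Small bag-of-words classifier standing in for the compact production model."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus log likelihoods with add-one smoothing.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


texts = [
    "great product love it",
    "love the fast shipping",
    "terrible support",
    "broken on arrival",
    "not great at all",  # the LLM stub gets this one wrong
]
# A human reviewer fixes the label the LLM stub got wrong.
human_corrections = {"not great at all": "NEGATIVE"}

train = build_training_data(texts, human_corrections)
model = TinyNaiveBayes().fit(train)
prediction = model.predict("love it")
```

The point of the sketch is the division of labor: the LLM only produces draft labels, humans fix them, and the model that ships is the small one trained on the corrected data.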
🌟 We’re super excited to have Ines Montani join us as a keynote speaker at #PDAmsterdam2024! Ines is ready to take us on an epic adventure through the world of Applied NLP in the age of Generative AI.
🚀 What’s on the Agenda?
Get ready for a deep dive with Ines into the realm of Large Language Models (LLMs) and in-context learning, where the magic words are “prompts are all you need.” While prototyping has gotten a heck of a lot easier, making the leap to production can still be a wild ride. Ines will be sharing the secret spells—uh, lessons—learned from the front lines of real-world information extraction battles. Plus, she’ll reveal some game-changing strategies for building NLP systems that are not only smart but also tough enough for the real world.
👩‍💻 Meet Ines Montani:
Ines isn’t just the co-founder and CEO of Explosion; she’s also a core developer of spaCy, a popular open-source library for Natural Language Processing in Python, and Prodigy, a modern annotation tool for creating training data for machine learning models.
🎟️ Why You Can’t Miss This:
Join us to geek out over data science and snag some sage advice from Ines and other Data and AI wizards this September at PyData Amsterdam 2024.
Grab your tickets now and prepare for an unforgettable experience!
🔗 https://lnkd.in/ek6FaNgh
Can’t wait to see you there for some serious NLP fun and groundbreaking insights!
#NLP #AI #MachineLearning #PyData #DataScience #GenerativeAI #LLMs #spaCy #Prodigy
In our latest episode, we're excited to host Ines Montani, a developer specializing in AI and NLP technology. She is the cofounder and CEO of Explosion and a co-developer of spaCy, the leading open-source library for NLP tasks in Python, as well as Prodigy, a cutting-edge annotation tool for creating training data for machine learning models.
During our discussion, we explore: https://lnkd.in/g5XtvW4p
➡️ The focus areas for spaCy and how it all comes together
➡️ Applying generative AI technologies to specific industry problems
➡️ Common pain points in the field
➡️ Challenges faced by developers
➡️ The future of generative AI
➡️ An exciting Rapid-Fire
Tune in now for a conversation that blends leadership wisdom, technological innovation, and practical advice.
Kunal Jain #LeadingWithData #Explosion #spacy #ines #DataScience #AI #Podcast
Check out my latest InfoQ w/ Ines Montani about the transformative power of open-source in AI! From democratizing tech to creating task-specific models, discover how open-source is shaping the future of AI and ensuring transparency, privacy, and innovation. #AI #OpenSource Read more here: https://lnkd.in/e477irds