
Spark NLP 5.0.2: Introducing ONNX Support for ALBERT, CamemBERT, and XLM-RoBERTa, a new Zero-Shot Classifier for XLM-RoBERTa transformer, 200+ new ONNX models and more!

Principal AI / ML Engineer and a Senior Team Lead

Spark NLP 5.0.2 comes with new ONNX support for the ALBERT, CamemBERT, and XLM-RoBERTa annotators, a new Zero-Shot Classifier for the XLM-RoBERTa transformer, 200+ new ONNX models, and bug fixes! We want to thank our community for their valuable feedback, feature requests, and contributions. Our Models Hub now contains over 18,000 free and truly open-source models & pipelines. 🎉

New Features

  • NEW: Introducing support for ONNX Runtime in ALBERT, CamemBERT, and XLM-RoBERTa annotators. We have already converted 200+ models to ONNX format for these annotators for our community
  • NEW: Implement XlmRoBertaForZeroShotClassification annotator for Zero-Shot multi-class & multi-label text classification based on XLM-RoBERTa transformer
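To make the multi-class versus multi-label distinction concrete, here is a minimal pure-Python sketch of how zero-shot scores are commonly derived when a classifier is built on NLI-style logits. It is an illustration under assumptions, not Spark NLP's implementation: the function names and logit values are hypothetical, and the scoring scheme (softmax across labels for multi-class, entailment versus contradiction per label for multi-label) is the one typically used by NLI-based zero-shot pipelines.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def multi_class_scores(nli_logits):
    """Softmax over the entailment logits of all labels: scores sum to 1."""
    labels = list(nli_logits)
    probs = softmax([nli_logits[label][1] for label in labels])
    return dict(zip(labels, probs))


def multi_label_scores(nli_logits):
    """Entailment vs. contradiction per label: each score is independent."""
    return {
        label: softmax([contradiction, entailment])[1]
        for label, (contradiction, entailment) in nli_logits.items()
    }


# Hypothetical (contradiction, entailment) logits for one input text.
example_logits = {
    "travel": (0.0, 2.0),
    "urgent": (0.0, 2.0),
    "sports": (2.0, 0.0),
}

multi_class = multi_class_scores(example_logits)
multi_label = multi_label_scores(example_logits)
print(multi_class)
print(multi_label)
```

With two equally plausible labels, the multi-class scores split the probability mass between them, while the multi-label scores can both be high at the same time.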

Bug Fixes & Enhancements

  • Fix MarianTransformers annotator breaking with java.lang.ClassCastException in Python
  • Fix accuracy values falling outside the 0.0 to 1.0 range in SentenceDetectorDL and MultiClassifierDL annotators
  • Fix BART issue with a low-temperature value that only occurred when there are no non-infinite logits satisfying the low temperature and top_k values
  • Add missing E5Embeddings and InstructorEmbeddings annotators to annotators in Scala for easy all-in-one import

Documentation

Community support

  • Slack: live discussion with the Spark NLP community and the team
  • GitHub: bug reports, feature requests, and contributions
  • Discussions: engage with other community members, share ideas, and show off how you use Spark NLP!
  • Medium: Spark NLP articles
  • JohnSnowLabs: official Medium
  • YouTube: Spark NLP video tutorials

Installation

Python

#PyPI

pip install spark-nlp==5.0.2
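
After installing from PyPI, one way to confirm that the pinned release is the one Python actually sees is to query the package metadata. The sketch below uses only the standard library; the helper names are hypothetical, and the expected version is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED_VERSION = "5.0.2"  # the release pinned in the pip command above


def installed_spark_nlp_version():
    """Return the installed spark-nlp version string, or None if absent."""
    try:
        return version("spark-nlp")
    except PackageNotFoundError:
        return None


def describe_install(installed, expected=EXPECTED_VERSION):
    """Turn the detected version into a short human-readable status."""
    if installed is None:
        return "spark-nlp is not installed"
    if installed == expected:
        return f"spark-nlp {installed} is installed"
    return f"spark-nlp {installed} is installed, expected {expected}"


print(describe_install(installed_spark_nlp_version()))
```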

Spark Packages

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.0.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.0.2

GPU

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.0.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.0.2

Apple Silicon (M1 & M2)

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.0.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.0.2

AArch64

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.0.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.0.2
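
The commands above all follow one naming pattern: the artifact name gains a suffix for the hardware variant, followed by the Scala version and the release number. As a minimal sketch, the pattern can be captured in a small helper for scripts that need to pick a variant at runtime. The function names and the variant keys ("cpu", "gpu", "silicon", "aarch64") are hypothetical; only the resulting strings come from the commands listed above.

```python
SPARK_NLP_VERSION = "5.0.2"
SCALA_VERSION = "2.12"

# Artifact suffixes, taken from the package names listed above.
VARIANT_SUFFIXES = {
    "cpu": "",
    "gpu": "-gpu",
    "silicon": "-silicon",
    "aarch64": "-aarch64",
}


def package_coordinate(variant="cpu"):
    """Build the Maven coordinate passed to --packages for a variant."""
    if variant not in VARIANT_SUFFIXES:
        raise ValueError(f"unknown variant: {variant!r}")
    suffix = VARIANT_SUFFIXES[variant]
    return (
        f"com.johnsnowlabs.nlp:spark-nlp{suffix}_{SCALA_VERSION}"
        f":{SPARK_NLP_VERSION}"
    )


def launch_command(shell, variant="cpu"):
    """Build the full spark-shell or pyspark launch command."""
    if shell not in ("spark-shell", "pyspark"):
        raise ValueError(f"unknown shell: {shell!r}")
    return f"{shell} --packages {package_coordinate(variant)}"


for name in VARIANT_SUFFIXES:
    print(launch_command("pyspark", name))
```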

Maven

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:

		<dependency>
			<groupId>com.johnsnowlabs.nlp</groupId>
			<artifactId>spark-nlp_2.12</artifactId>
			<version>5.0.2</version>
		</dependency>

spark-nlp-gpu:

		<dependency>
			<groupId>com.johnsnowlabs.nlp</groupId>
			<artifactId>spark-nlp-gpu_2.12</artifactId>
			<version>5.0.2</version>
		</dependency>

spark-nlp-silicon:

		<dependency>
			<groupId>com.johnsnowlabs.nlp</groupId>
			<artifactId>spark-nlp-silicon_2.12</artifactId>
			<version>5.0.2</version>
		</dependency>

spark-nlp-aarch64:

		<dependency>
			<groupId>com.johnsnowlabs.nlp</groupId>
			<artifactId>spark-nlp-aarch64_2.12</artifactId>
			<version>5.0.2</version>
		</dependency>

FAT JARs

Our additional expert:
Maziyar Panahi is a Principal AI / ML Engineer and a Senior Team Lead with over a decade of experience in public research. He leads the team behind Spark NLP at John Snow Labs, one of the most widely used NLP libraries in the enterprise. He develops scalable NLP components using the latest techniques in deep learning and machine learning, including classic ML, Language Models, Speech Recognition, and Computer Vision. He is an expert in designing, deploying, and maintaining ML and DL models in the JVM ecosystem and on a distributed computing engine (Apache Spark) at the production level. He has extensive experience in computer networks and DevOps, and has been designing and implementing scalable solutions on Cloud platforms such as AWS, Azure, and OpenStack for the last 15 years. In the past, he also worked as a network engineer after completing his Microsoft and Cisco training (MCSE, MCSA, and CCNA). He is a lecturer at The National School of Geographical Sciences, teaching Big Data Platforms and Data Analytics. He is currently employed by The French National Centre for Scientific Research (CNRS) as an IT Project Manager, working at the Institute of Complex Systems of Paris (ISCPIF).

