
Legal NLP 1.4.0 for Spark NLP: 80+ new state-of-the-art models!

We are excited to announce version 1.4.0 of Legal NLP, which includes the following new capabilities:

Spark Ecosystem

Legal NLP is built on top of Spark NLP, which uses Spark MLlib pipelines. This means you can build a single pipeline with any component of Spark NLP or Spark MLlib. You can also combine it with the rest of our licensed libraries, such as Visual NLP, Healthcare NLP or Finance NLP. The library works on top of Transformers and other Deep Learning architectures, providing state-of-the-art models that can be run on Spark clusters. Remember, Spark NLP is natively scalable for parallel computing, and therefore so is Legal NLP.

New Models

Models Hub

Named Entity Recognition:

legner_sigma_absa_people : This is a Legal NER model trained on the Sigma Absa Dataset for legal sentiment analysis on legal parties, including coreference pronouns (he, him, their…). It is the first of two components and extracts people’s names and pronouns as NER entities.

The second component, legassertion_sigma_absa_sentiment, uses Assertion Status to retrieve the sentiment for each extracted entity.
Predicted Entities: PER, O
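
How the two components fit together can be sketched in plain Python. The snippet below does not call Legal NLP: it assumes the NER step has already produced PER chunks and the assertion step has produced one sentiment label per chunk, and it only shows how the two outputs might be joined. The PartySentiment class, the join_outputs helper and the sample values are hypothetical; the three sentiment labels are the ones reported for the assertion model in this post.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PartySentiment:
    chunk: str      # text span tagged PER by the NER step
    begin: int      # character offset of the span in the sentence
    sentiment: str  # label assigned by the assertion step


ALLOWED_SENTIMENTS = {"neutral", "positive", "negative"}


def join_outputs(
    ner_chunks: List[Tuple[str, int]],
    sentiments: List[str],
) -> List[PartySentiment]:
    """Pair each PER chunk with the sentiment predicted for it.

    Assumes both lists are aligned: one sentiment per chunk, same order.
    """
    if len(ner_chunks) != len(sentiments):
        raise ValueError("expected exactly one sentiment per NER chunk")
    results = []
    for (chunk, begin), sentiment in zip(ner_chunks, sentiments):
        if sentiment not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unexpected sentiment label: {sentiment!r}")
        results.append(PartySentiment(chunk, begin, sentiment))
    return results


# Made-up example values, for illustration only.
sentence = "Smith testified that he was threatened by Jones."
chunks = [("Smith", 0), ("he", 21), ("Jones", 42)]
labels = ["neutral", "negative", "negative"]

for item in join_outputs(chunks, labels):
    print(f"{item.chunk!r} at {item.begin}: {item.sentiment}")
```

Keeping the character offset next to each chunk lets a pronoun such as "he" be traced back to its position in the sentence when the same word appears more than once.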

legner_notice_clause : This is a NER model aimed to be used in notice clauses, to retrieve entities such as NOTICE_METHOD, NOTICE_PARTY, ADDRESS, EMAIL, etc.
Predicted Entities: ADDRESS, DEPARTMENT, EMAIL, FAX, NAME, NOTICE_METHOD, NOTICE_PARTY, PERSON, PHONE, TITLE

Text Classification

Document classifiers:

Binary Document Classification Models: We have added 60+ different Binary Document Classification models in this release:

Using LongformerEmbeddings:

legclf_deposit_agreement, legclf_indemnity_agreement, legclf_investment_advisory_agreement, legclf_option_agreement, legclf_pledge_agreement, legclf_restricted_stock_unit_agreement, legclf_revolving_credit_agreement, legclf_severance_agreement, legclf_underwriting_agreement, legclf_agreement_and_plan_of_reorganization, legclf_administrative_services_agreement, legclf_aircraft_lease_agreement, legclf_custody_agreement, legclf_expense_limitation_agreement, legclf_joinder_agreement, legclf_plan_and_agreement_of_merger, legclf_reference_trust_agreement, legclf_restricted_stock_agreement, legclf_share_purchase_agreement, legclf_subadvisory_agreement, legclf_warrant_agreement, legclf_supplemental_indenture, legclf_sub_advisory_agreement, legclf_stock_option_agreement, legclf_registration_rights_agreement, legclf_management_agreement, legclf_limited_liability_company_agreement, legclf_license_agreement, legclf_indenture, legclf_agreement

Using BertSentenceEmbeddings:

legclf_administrative_services_agreement_bert, legclf_aircraft_lease_agreement_bert, legclf_custody_agreement_bert, legclf_expense_limitation_agreement_bert, legclf_joinder_agreement_bert, legclf_plan_and_agreement_of_merger_bert, legclf_reference_trust_agreement_bert, legclf_restricted_stock_agreement_bert, legclf_share_purchase_agreement_bert, legclf_subadvisory_agreement_bert, legclf_deposit_agreement_bert, legclf_indemnity_agreement_bert, legclf_investment_advisory_agreement_bert, legclf_option_agreement_bert, legclf_pledge_agreement_bert, legclf_restricted_stock_unit_agreement_bert, legclf_revolving_credit_agreement_bert, legclf_severance_agreement_bert, legclf_underwriting_agreement_bert, legclf_agreement_and_plan_of_reorganization_bert, legclf_warrant_agreement_bert, legclf_survival_bert, legclf_supplemental_indenture_bert, legclf_successors_and_assigns_bert, legclf_sub_advisory_agreement_bert, legclf_stock_option_agreement_bert, legclf_representations_and_warranties_of_the_company_bert, legclf_representations_and_warranties_bert, legclf_registration_rights_agreement_bert, legclf_management_agreement_bert, legclf_limited_liability_company_agreement_bert, legclf_limitation_of_liability_bert, legclf_license_agreement_bert, legclf_indenture_bert, legclf_general_provisions_bert, legclf_events_of_default_bert, legclf_defined_terms_bert, legclf_confidential_information_bert, legclf_applicable_law_bert, legclf_agreement_bert

We include 2 types of embeddings:

1. Bert-based (512 tokens)
2. Longformer-based (4096 tokens)

Bert-based embeddings take less time to train and to run inference with than Longformer-based embeddings, but Longformer-based embeddings are more accurate when the relevant information extends beyond the 512th token.
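
The trade-off above can be sketched as a small helper that picks an embedding family from a document's token count. This is an illustrative sketch only: the function names and the whitespace-based token estimate are assumptions and not part of Legal NLP. The only facts taken from this post are the 512- and 4096-token limits and the naming pattern of the models listed above, where Bert-based classifiers carry a _bert suffix and Longformer-based ones do not.

```python
BERT_MAX_TOKENS = 512          # limit stated in this post
LONGFORMER_MAX_TOKENS = 4096   # limit stated in this post


def choose_embedding_family(num_tokens: int) -> str:
    """Return "bert" for short inputs, "longformer" otherwise (hypothetical helper)."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return "bert" if num_tokens <= BERT_MAX_TOKENS else "longformer"


def classifier_name(base_name: str, num_tokens: int) -> str:
    """Build a model name following the naming pattern in the lists above:
    Longformer-based models use the bare name, Bert-based ones add "_bert"."""
    family = choose_embedding_family(num_tokens)
    return f"{base_name}_bert" if family == "bert" else base_name


# Rough token estimate by whitespace splitting; real tokenizers produce
# subword tokens, so treat this count as an approximation.
text = "This Warrant Agreement is made as of the date set forth below"
n_tokens = len(text.split())
print(classifier_name("legclf_warrant_agreement", n_tokens))
print(classifier_name("legclf_warrant_agreement", 3000))
```

Documents longer than 4096 tokens are outside what this sketch covers; how such inputs are handled depends on the pipeline configuration.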

Demo visualizing Classification Of Legal Clauses


Clause Classifiers

Binary Clause Classification Models: We have added 20+ different Binary Clause Classification models in this release:

legclf_governing_law_clause, legclf_entire_agreement_clause, legclf_demand_registration_clause, legclf_rules_and_regulations_clause, legclf_health_and_safety_clause, legclf_electronic_communications_clause, legclf_replacement_of_lenders_clause, legclf_language_clause, legclf_business_day_clause, legclf_ti_allowance_clause, legclf_eminent_domain_clause, legclf_erisa_reports_clause, legclf_injury_pay_clause, legclf_no_appraisal_rights_clause, legclf_payment_of_interest_defaulted_interest_clause, legclf_principal_underwriter_clause, legclf_retention_of_sub_adviser_clause, legclf_right_to_cure_clause, legclf_sick_days_clause, legclf_successor_to_the_bank_clause, legclf_notice_clause, legclf_survival_clause, legclf_limitation_of_liability_clause, legclf_successors_and_assigns_clause, legclf_applicable_law_clause, legclf_events_of_default_clause, legclf_confidential_information_clause, legclf_representations_and_warranties_clause, legclf_defined_terms_clause, legclf_general_provisions_clause, legclf_representations_and_warranties_of_the_company_clause

New Assertion Model

legassertion_sigma_absa_sentiment : This model was trained to be benchmarked against SigmaLaw’s official Aspect-based Sentiment Analysis model, based on the ABSA dataset, in which several parties in legal texts were tagged with their sentiments.
Predicted Entities: neutral, positive, negative

New Relation Extraction Model

legre_notice_clause_xs : This is a Relation Extraction model aimed to be used in notice clauses, to retrieve relations between entities such as NOTICE_PARTY, ADDRESS, EMAIL, TITLE, etc.
Predicted Entities: has_notice_party, has_address, has_person, has_phone, has_fax, has_title, has_email, has_department
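
To make this kind of output more concrete, the sketch below groups relation triples into one record per source entity. It is plain Python and does not call Legal NLP: the triple layout (relation, source chunk, target chunk), the group_by_source helper and the sample values are all assumptions made for illustration. Only the relation labels come from the list above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Relation labels as listed for legre_notice_clause_xs in this post.
RELATION_LABELS = {
    "has_notice_party", "has_address", "has_person", "has_phone",
    "has_fax", "has_title", "has_email", "has_department",
}

# Assumed layout: (relation, source chunk, target chunk).
Triple = Tuple[str, str, str]


def group_by_source(triples: Iterable[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Collect, for each source chunk, the targets linked by each relation."""
    grouped: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for relation, source, target in triples:
        if relation not in RELATION_LABELS:
            raise ValueError(f"unknown relation label: {relation!r}")
        grouped[source][relation].append(target)
    # Convert nested defaultdicts to plain dicts for predictable printing.
    return {src: dict(rels) for src, rels in grouped.items()}


# Made-up triples, for illustration only.
sample = [
    ("has_person", "Acme Corp.", "Jane Doe"),
    ("has_email", "Acme Corp.", "legal@example.com"),
    ("has_address", "Acme Corp.", "1 Main Street"),
]
print(group_by_source(sample))
```

Rejecting labels outside the known set makes it easier to notice when the records were produced by a different model or a mismatched version.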

New Sentence Embeddings

We have released 30+ new Portuguese Bert Sentence Embeddings for the legal domain, based on Legal Transformers. With these, there are now more than 60 legal embeddings in various languages.

sbert_Legal_BERTimbau_base_TSDAE_sts, sbert_Legal_BERTimbau_large_GPL_sts, sbert_Legal_BERTimbau_large_TSDAE_sts, sbert_Legal_BERTimbau_large_TSDAE_sts_v2, sbert_Legal_BERTimbau_large_TSDAE_sts_v4, sbert_Legal_BERTimbau_large_TSDAE_v4_GPL_sts, sbert_Legal_BERTimbau_large_v2_sts, sbert_Legal_BERTimbau_sts_base_ma, sbert_Legal_BERTimbau_sts_base_ma_v2, sbert_Legal_BERTimbau_sts_base, sbert_Legal_BERTimbau_sts_large_ma, sbert_Legal_BERTimbau_sts_large_ma_v3, sbert_Legal_BERTimbau_sts_large, sbert_Legal_BERTimbau_sts_large_v2, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.10, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.1, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.2, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.3, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.4, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.5, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.7, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.8, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.9, sbert_bert_large_portuguese_cased_legal_mlm_sts_v1.0, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_gpl_nli_sts_v0, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_gpl_nli_sts_v1, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_nli_sts_v0, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_nli_sts_v1, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_sts_v0, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_sts_v1

New Demo

You can find our existing demos on the Demo site, which showcases some of the models available in Models Hub.

  • Notice Clause Relation Extraction → This demo shows how to extract relations between entities such as NOTICE_PARTY, NAME, TITLE, ADDRESS, EMAIL, etc. from notice clauses.

Demo for “legre_notice_clause_xs”


Docker Webapps to check Legal Zero-shot NER

Do you want to try Zero-shot Legal NER? Use our dockerized webapps for streamlit or flask+jinja2 and learn about prompt engineering, while you speed up the prototyping of your NER models without any training data. Available at https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/legal-nlp/platforms/docker


Party-specific Sentiment Analysis (from SigmaLaw benchmark)

We have benchmarked Legal NLP Assertion Status against the results of the different DL architectures reported in this paper by SigmaLaw: https://metatext.io/datasets/sigmalaw-absa

We are happy to announce that we have improved on the claimed state of the art by 0.03 for Sentiment Analysis at the NER-chunk level (for parties). The models we used are:
1) legner_sigma_absa_people
2) legassertion_sigma_absa_sentiment

SigmaLaw benchmark

Databricks Notebooks

All Legal NLP notebooks can be run in Databricks. If you wish to use our ready-to-use versions of Spark NLP in Databricks, please find the notebooks here.

Want to see more?

How to install?

!pip install johnsnowlabs
from johnsnowlabs import *

# Before 4.2.3
jsl.install(json_license_path=[your_legal_license_path])
jsl.start(json_license_path=[your_legal_license_path])

# After 4.2.3
nlp.install(json_license_path=[your_legal_license_path])
nlp.start(json_license_path=[your_legal_license_path])

Do you want to get certified in Legal NLP?

We will hold a 4-hour Certification training session in January 2023. If you are interested, please check the dates and register here: https://www.johnsnowlabs.com/spark-nlp-training/

Our additional expert:
👋 I am a Computer Science Senior who juggles his studies and life as a Data Scientist. ⚓ I am Currently working at John Snow Labs as a Data Scientist and maintaining/contributing to the SparkNLP library.
