We are excited to welcome version 1.4.0 of NLP for Legal, which includes the following new capabilities:
Spark Ecosystem
Legal NLP is built on top of Spark NLP, which uses Spark MLlib pipelines. This means you can build a common pipeline with any component of Spark NLP or Spark MLlib. You can also combine it with the rest of our licensed libraries, such as Visual NLP, Healthcare NLP or Finance NLP. The library works on top of Transformers and other Deep Learning architectures, providing state-of-the-art models that can be run on Spark clusters. Remember that Spark NLP is the only library natively scalable for parallel computing, and so is Legal NLP.
New Models
Named Entity Recognition:
legner_sigma_absa_people
: This is a Legal NER model trained on the SigmaLaw ABSA dataset for sentiment analysis of legal parties, including coreference pronouns (he, him, their…). It is the first of two components: it extracts people’s names and pronouns as NER entities.
The second component, legassertion_sigma_absa_sentiment, applies Assertion Status to retrieve the sentiment for each extracted entity.
Predicted Entities: PER
, O
legner_notice_clause
: This is an NER model for notice clauses. It retrieves entities such as NOTICE_METHOD, NOTICE_PARTY, ADDRESS, EMAIL, etc.
Predicted Entities: ADDRESS
, DEPARTMENT
, EMAIL
, FAX
, NAME
, NOTICE_METHOD
, NOTICE_PARTY
, PERSON
, PHONE
, TITLE
Text Classification
Document classifiers:
Binary Document Classification Models: We have added 60+ different Binary Document Classification models in this release:
Using LongformerEmbeddings:
legclf_deposit_agreement
, legclf_indemnity_agreement
, legclf_investment_advisory_agreement
, legclf_option_agreement
, legclf_pledge_agreement
, legclf_restricted_stock_unit_agreement
, legclf_revolving_credit_agreement
, legclf_severance_agreement
, legclf_underwriting_agreement
, legclf_agreement_and_plan_of_reorganization
, legclf_administrative_services_agreement
, legclf_aircraft_lease_agreement
, legclf_custody_agreement
, legclf_expense_limitation_agreement
, legclf_joinder_agreement
, legclf_plan_and_agreement_of_merger
, legclf_reference_trust_agreement
, legclf_restricted_stock_agreement
, legclf_share_purchase_agreement
, legclf_subadvisory_agreement
, legclf_warrant_agreement
, legclf_supplemental_indenture
, legclf_sub_advisory_agreement
, legclf_stock_option_agreement
, legclf_registration_rights_agreement
, legclf_management_agreement
, legclf_limited_liability_company_agreement
, legclf_license_agreement
, legclf_indenture
, legclf_agreement
Using BertSentenceEmbeddings:
legclf_administrative_services_agreement_bert
, legclf_aircraft_lease_agreement_bert
, legclf_custody_agreement_bert
, legclf_expense_limitation_agreement_bert
, legclf_joinder_agreement_bert
, legclf_plan_and_agreement_of_merger_bert
, legclf_reference_trust_agreement_bert
, legclf_restricted_stock_agreement_bert
, legclf_share_purchase_agreement_bert
, legclf_subadvisory_agreement_bert
, legclf_deposit_agreement_bert
, legclf_indemnity_agreement_bert
, legclf_investment_advisory_agreement_bert
, legclf_option_agreement_bert
, legclf_pledge_agreement_bert
, legclf_restricted_stock_unit_agreement_bert
, legclf_revolving_credit_agreement_bert
, legclf_severance_agreement_bert
, legclf_underwriting_agreement_bert
, legclf_agreement_and_plan_of_reorganization_bert
, legclf_warrant_agreement_bert
, legclf_survival_bert
, legclf_supplemental_indenture_bert
, legclf_successors_and_assigns_bert
, legclf_sub_advisory_agreement_bert
, legclf_stock_option_agreement_bert
, legclf_representations_and_warranties_of_the_company_bert
, legclf_representations_and_warranties_bert
, legclf_registration_rights_agreement_bert
, legclf_management_agreement_bert
, legclf_limited_liability_company_agreement_bert
, legclf_limitation_of_liability_bert
, legclf_license_agreement_bert
, legclf_indenture_bert
, legclf_general_provisions_bert
, legclf_events_of_default_bert
, legclf_defined_terms_bert
, legclf_confidential_information_bert
, legclf_applicable_law_bert
, legclf_agreement_bert
We include 2 types of embeddings:
1. Bert-based (512 tokens)
2. Longformer-based (4096 tokens)
Bert-based embeddings take less time to train and infer than Longformer-based embeddings, but Longformer-based embeddings are more accurate when the relevant information extends beyond the first 512 tokens.
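The trade-off between the two embedding types can be turned into a simple selection rule. The sketch below is illustrative only: `pick_classifier` is a hypothetical helper, and the whitespace token count is a rough assumption that usually undercounts the subword tokens the embeddings actually see. The only thing taken from this release is the naming pattern in which Bert-based classifiers carry a `_bert` suffix.

```python
BERT_TOKEN_LIMIT = 512  # Bert-based embeddings cover up to 512 tokens


def pick_classifier(base_name: str, text: str) -> str:
    """Return the Bert variant for short texts, the Longformer variant otherwise.

    Hypothetical helper: the token count is approximated by splitting on
    whitespace, which is an assumption and not the models' real tokenizer.
    """
    n_tokens = len(text.split())
    if n_tokens <= BERT_TOKEN_LIMIT:
        return f"{base_name}_bert"
    return base_name


short_doc = "This Warrant Agreement is made between the parties."
long_doc = "term " * 3000  # stand-in for a long agreement

print(pick_classifier("legclf_warrant_agreement", short_doc))
print(pick_classifier("legclf_warrant_agreement", long_doc))
```

Documents longer than the Longformer limit of 4096 tokens are not handled here; they would need truncation or splitting before classification.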
Demo visualizing Classification Of Legal Clauses
Clause Classifiers
Binary Clause Classification Models: We have added 20 different Binary Clause Classification models in this release:
legclf_governing_law_clause
, legclf_entire_agreement_clause
, legclf_demand_registration_clause
, legclf_rules_and_regulations_clause
, legclf_health_and_safety_clause
, legclf_electronic_communications_clause
, legclf_replacement_of_lenders_clause
, legclf_language_clause
, legclf_business_day_clause
, legclf_ti_allowance_clause
, legclf_eminent_domain_clause
, legclf_erisa_reports_clause
, legclf_injury_pay_clause
, legclf_no_appraisal_rights_clause
, legclf_payment_of_interest_defaulted_interest_clause
, legclf_principal_underwriter_clause
, legclf_retention_of_sub_adviser_clause
, legclf_right_to_cure_clause
, legclf_sick_days_clause
, legclf_successor_to_the_bank_clause
, legclf_notice_clause
, legclf_survival_clause
, legclf_limitation_of_liability_clause
, legclf_successors_and_assigns_clause
, legclf_applicable_law_clause
, legclf_events_of_default_clause
, legclf_confidential_information_clause
, legclf_representations_and_warranties_clause
, legclf_defined_terms_clause
, legclf_general_provisions_clause
, legclf_representations_and_warranties_of_the_company_clause
New Assertion Model
legassertion_sigma_absa_sentiment
: This model was trained to be benchmarked against SigmaLaw’s official Aspect-based Sentiment Analysis (ABSA) model. It is based on the ABSA dataset, in which parties in legal texts are tagged with their sentiment.
Predicted Entities: neutral
, positive
, negative
New Relation Extraction Model
legre_notice_clause_xs
: This is a Relation Extraction model for notice clauses. It retrieves relations between entities such as NOTICE_PARTY, ADDRESS, EMAIL, TITLE, etc.
Predicted Entities: has_notice_party
, has_address
, has_person
, has_phone
, has_fax
, has_title
, has_email
, has_department
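Once relations like these have been extracted, a common next step is to group them into one contact record per entity. The following is a minimal sketch under stated assumptions: the triples are invented for illustration, the `(label, head, tail)` shape and the head/tail direction are assumptions and not the model's documented output format, and `group_relations` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical relation triples in the form (relation_label, head, tail).
# The texts are invented, and the head/tail direction is an assumption.
sample_relations = [
    ("has_address", "Acme Corp", "123 Main Street"),
    ("has_email", "Acme Corp", "legal@acme.example"),
    ("has_person", "Acme Corp", "Jane Doe"),
    ("has_title", "Jane Doe", "General Counsel"),
]


def group_relations(relations):
    """Group tail entities under each head entity, keyed by relation label."""
    grouped = defaultdict(lambda: defaultdict(list))
    for label, head, tail in relations:
        grouped[head][label].append(tail)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {head: dict(by_label) for head, by_label in grouped.items()}


contacts = group_relations(sample_relations)
print(contacts["Acme Corp"])
```

Keeping a list per label allows for clauses that give several addresses or emails for the same party.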
New Sentence Embeddings
We have released 30+ new Portuguese BERT Sentence Embeddings for the legal domain, based on Legal Transformers. With this release, there are now more than 60 legal embeddings in various languages.
sbert_Legal_BERTimbau_base_TSDAE_sts
, sbert_Legal_BERTimbau_large_GPL_sts
, sbert_Legal_BERTimbau_large_TSDAE_sts
, sbert_Legal_BERTimbau_large_TSDAE_sts_v2
, sbert_Legal_BERTimbau_large_TSDAE_sts_v4
, sbert_Legal_BERTimbau_large_TSDAE_v4_GPL_sts
, sbert_Legal_BERTimbau_large_v2_sts
, sbert_Legal_BERTimbau_sts_base_ma
, sbert_Legal_BERTimbau_sts_base_ma_v2
, sbert_Legal_BERTimbau_sts_base
, sbert_Legal_BERTimbau_sts_large_ma
, sbert_Legal_BERTimbau_sts_large_ma_v3
, sbert_Legal_BERTimbau_sts_large
, sbert_Legal_BERTimbau_sts_large_v2
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.10
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.1
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.2
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.3
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.4
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.5
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.7
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.8
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v0.9
, sbert_bert_large_portuguese_cased_legal_mlm_sts_v1.0
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_gpl_nli_sts_v0
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_gpl_nli_sts_v1
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_nli_sts_v0
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_nli_sts_v1
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_sts_v0
, sbert_bert_large_portuguese_cased_legal_mlm_v0.11_sts_v1
New Demo
You can find all existing demos on our Demo site, which showcases some of the models available in Models Hub.
- Notice Clause Relation Extraction → This demo shows how to extract relations between entities such as NOTICE_PARTY, NAME, TITLE, ADDRESS, EMAIL, etc. from notice clauses.
Demo for “legre_notice_clause_xs”
Docker Webapps to check Legal Zero-shot NER
Do you want to try Zero-shot Legal NER? Use our dockerized webapps for Streamlit or Flask+Jinja2 to learn about prompt engineering while you speed up the prototyping of your NER models without any training data. Available at https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/legal-nlp/platforms/docker
Party-specific Sentiment Analysis (from SigmaLaw benchmark)
We have benchmarked Legal NLP Assertion Status comparing it to the results of many different DL architectures reported in this paper by SigmaLaw: https://metatext.io/datasets/sigmalaw-absa
We are happy to announce that we have improved on the reported state of the art by 0.03 for Sentiment Analysis at the NER-chunk level (for parties). The models we have used are:
1) legner_sigma_absa_people
2) legassertion_sigma_absa_sentiment
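To make "Sentiment Analysis at the NER-chunk level" concrete, the sketch below tallies sentiment labels per extracted party. It is only an illustration under stated assumptions: the chunk/label pairs are hand-written stand-ins, not real output of the two models, and `sentiment_by_party` is a hypothetical helper. The three labels are the ones listed for legassertion_sigma_absa_sentiment.

```python
from collections import Counter, defaultdict

VALID_LABELS = {"neutral", "positive", "negative"}

# Hypothetical stand-ins for the combined output of the two models:
# each PER chunk from the NER step paired with the sentiment label
# that the assertion step would assign to it.
chunks_with_sentiment = [
    ("Smith", "negative"),
    ("he", "negative"),
    ("Jones", "positive"),
    ("Smith", "neutral"),
]


def sentiment_by_party(pairs):
    """Count how often each sentiment label was assigned to each chunk text."""
    tallies = defaultdict(Counter)
    for chunk, label in pairs:
        if label not in VALID_LABELS:
            raise ValueError(f"unexpected sentiment label: {label!r}")
        tallies[chunk][label] += 1
    return {party: dict(counts) for party, counts in tallies.items()}


summary = sentiment_by_party(chunks_with_sentiment)
print(summary["Smith"])
```

Pronouns such as "he" are counted as separate chunks here; linking them back to a named party would need an extra coreference step that this sketch does not attempt.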
Databricks Notebooks
All Legal NLP notebooks can be run in Databricks. If you wish to use our ready-to-use versions of Spark NLP in Databricks, please find the notebooks here.
Want to see more?
- Check our Models Hub
- Check our Notebooks
- Check our Demos
How to install?
!pip install johnsnowlabs
from johnsnowlabs import *
# Before 4.2.3
jsl.install(json_license_path=[your_legal_license_path])
jsl.start(json_license_path=[your_legal_license_path])
# After 4.2.3
nlp.install(json_license_path=[your_legal_license_path])
nlp.start(json_license_path=[your_legal_license_path])
Do you want to get certified in Legal NLP?
We will hold a 4-hour Certification training session in January 2023. If you are interested, please check the dates and register here: https://www.johnsnowlabs.com/spark-nlp-training/