Flan T5 finetuned for Multichoice on MPRE
The new model is a fine-tuned version of Flan T5 that extends its capabilities to reasoning in the legal domain. Most US states require candidates to pass the Multistate Professional Responsibility Examination (MPRE) as a prerequisite or co-requisite for admission to the bar as an attorney at law.
To create the pipeline, simply use the TextGenerator from Legal NLP:
from johnsnowlabs import nlp, legal

spark = nlp.start()

# Convert the raw "question" column into a document annotation
document_assembler = (
    nlp.DocumentAssembler()
    .setInputCol("question")
    .setOutputCol("document_question")
)

# Load the pretrained generator; it reads the assembler's output column
leg_gen = (
    legal.TextGenerator.pretrained(
        "leggen_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question"])
    .setOutputCol("generated_text")
    .setMaxNewTokens(150)
    .setStopAtEos(True)
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_gen])
Then, to use the model, pass a question in MPRE format: a short context followed by the choices for the model to select from. For example:
Conglomerate Corporation owns a little more than half the stock of Giant Company. Conglomerate’s stock, in turn, is public, available on the public stock exchange, as is the remainder of the stock in Giant Company. The president of Conglomerate Corporation has asked Attorney Stevenson to represent Giant Company in a deal by which Giant would make a proposed transfer of certain real property to Conglomerate Corporation. The property in question is unusual because it contains an underground particle collider used for scientific research, but also valuable farmland on the surface, as well as some valuable mineral rights in another part of the parcel. These factors make the property value difficult to assess by reference to the general real-estate market, which means it is difficult for anyone to determine the fairness of the transfer price in the proposed deal. Would it be proper for Attorney Stevenson to facilitate this property transfer at the behest of the president of Conglomerate, if Attorney Stevenson would be representing Giant as the client in this specific matter?
Yes, because Conglomerate Corporation owns more than half of Giant Company, so the two corporate entities are one client for purposes of the rules regarding conflicts of interest.
Yes, because the virtual impossibility of obtaining an appraisal of the fair market value of the property means that the lawyer does not have actual knowledge that the deal is unfair to either party.
No, because the attorney would be unable to inform either client fully about whether the proposed transfer price would be in their best interest.
No, not unless the attorney first obtains effective informed consent of the management of Giant Company, as well as that of Conglomerate, because the ownership of Conglomerate and Giant is not identical, and their interests materially differ in the proposed transaction.
Load the data into a Spark DataFrame and run the pipeline to obtain the answer:
data = spark.createDataFrame(
    [
        [
            """question: Conglomerate Corporation owns a little more than half the stock of Giant Company. Conglomerate's stock, in turn, is public, available on the public stock exchange, as is the remainder of the stock in Giant Company. The president of Conglomerate Corporation has asked Attorney Stevenson to represent Giant Company in a deal by which Giant would make a proposed transfer of certain real property to Conglomerate Corporation. The property in question is unusual because it contains an underground particle collider used for scientific research, but also valuable farmland on the surface, as well as some valuable mineral rights in another part of the parcel. These factors make the property value difficult to assess by reference to the general real-estate market, which means it is difficult for anyone to determine the fairness of the transfer price in the proposed deal. Would it be proper for Attorney Stevenson to facilitate this property transfer at the behest of the president of Conglomerate, if Attorney Stevenson would be representing Giant as the client in this specific matter? Yes, because Conglomerate Corporation owns more than half of Giant Company, so the two corporate entities are one client for purposes of the rules regarding conflicts of interest. Yes, because the virtual impossibility of obtaining an appraisal of the fair market value of the property means that the lawyer does not have actual knowledge that the deal is unfair to either party. No, because the attorney would be unable to inform either client fully about whether the proposed transfer price would be in their best interest.
No, not unless the attorney first obtains effective informed consent of the management of Giant Company, as well as that of Conglomerate, because the ownership of Conglomerate and Giant is not identical, and their interests materially differ in the proposed transaction.""",
        ]
    ]
).toDF("question")

# Fit and run the pipeline, then read the column set with setOutputCol
results = pipeline.fit(data).transform(data)
results.select("generated_text.result").show(truncate=False)
In this example, the model returns:
Not if the attorney first obtainses efficient informed consent of the administration of Giants, as well and of Conglomerate
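The input string used above can also be assembled programmatically. The following is a minimal sketch, not part of Legal NLP: `build_mpre_prompt` is a hypothetical helper, and both the `question: ` prefix and the single-space joining are assumptions taken from the example in this post rather than a documented requirement of the model.

```python
from typing import Sequence


def build_mpre_prompt(
    context: str, choices: Sequence[str], prefix: str = "question: "
) -> str:
    """Join an MPRE-style context and its answer choices into one input string."""
    if not choices:
        raise ValueError("an MPRE-style question needs at least one choice")
    # Collapse runs of whitespace so stray line breaks do not reach the prompt
    parts = [" ".join(context.split())]
    parts.extend(" ".join(choice.split()) for choice in choices)
    # Assumption: context and choices are separated by single spaces
    return prefix + " ".join(parts)


prompt = build_mpre_prompt(
    "Would it be proper for the attorney to proceed?",
    [
        "Yes, because the entities are one client.",
        "No, not without informed consent.",
    ],
)
print(prompt)
```

The resulting string can be placed in the "question" column of the DataFrame in the same way as the hard-coded example.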
Question-Answering on MPRE
This new model is trained on a modified version of the MPRE dataset, recast as a question-answering task in which the model is not given choices to select from.
To use the model, add the question and context as input to the annotator:
context = """Mr. Burns, the chief executive officer of Conglomerate Corporation, now faces criminal charges of discussing prices with the president of a competing firm. If found guilty, both Mr. Burns and Conglomerate Corporation will be subject to civil and criminal penalties under state and federal antitrust laws. An attorney has been representing Conglomerate Corporation. She has conducted a thorough investigation of the matter, and she has personally concluded that no such pricing discussions occurred. Both Conglomerate Corporation and Mr. Burns plan to defend on that ground. Mr. Burns has asked the attorney to represent him, as well as Conglomerate Corporation, in the proceedings. The legal and factual defenses of Conglomerate Corporation and Mr. Burns seem completely consistent at the outset of the matter."""

# The full question sentence belongs in `question`; it must not be split after "Mr."
question = """Would the attorney need to obtain informed consent to a conflict of interest from both Mr. Burns and a separate corporate officer at Conglomerate Corporation before proceeding with this dual representation?"""

data = spark.createDataFrame([[question, context]]).toDF("question", "context")
Then, build the pipeline using the QuestionAnswering annotator and the pretrained model legqa_flant5_mpre:
# Convert both raw text columns into document annotations
document_assembler = (
    nlp.MultiDocumentAssembler()
    .setInputCols("question", "context")
    .setOutputCols("document_question", "document_context")
)

# Load the pretrained QA model and set the prompt template it receives
leg_qa = (
    legal.QuestionAnswering.pretrained(
        "legqa_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question", "document_context"])
    .setCustomPrompt("question: {QUESTION} context: {CONTEXT}")
    .setMaxNewTokens(50)
    .setOutputCol("answer")
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_qa])
Finally, run the model:
result = pipeline.fit(data).transform(data)
result.select("answer.result").show(truncate=False)
Obtaining:
Yes, because the conflicting positions in the legal and factual defenses require the attorney to obtain the informed consent of both clients before proceeding with the representation.
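To make the custom prompt used above more concrete, here is a small sketch of what the template could look like once its placeholders are filled in. This is plain Python for illustration only: `fill_prompt` is a hypothetical helper, and the assumption that `{QUESTION}` and `{CONTEXT}` are substituted literally is ours, since this post does not describe how the annotator expands the template internally.

```python
def fill_prompt(template: str, question: str, context: str) -> str:
    """Substitute the {QUESTION} and {CONTEXT} placeholders literally."""
    # str.replace is used rather than str.format so that any other braces
    # in the template or the legal text do not raise formatting errors
    filled = template.replace("{QUESTION}", question.strip())
    return filled.replace("{CONTEXT}", context.strip())


template = "question: {QUESTION} context: {CONTEXT}"
filled = fill_prompt(
    template,
    " Is informed consent required? ",
    " The attorney represents both parties. ",
)
print(filled)
```

Printing the filled template before running the pipeline is a cheap way to check that the question and context columns have not been swapped.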
New demo for Law Stack Exchange Classifier
The Law Stack Exchange model, released in the previous version of Legal NLP, now has a new demo app that shows how the model can be used in practice.
Bug fixes
We fixed bugs in our pretrained deidentification pipelines, which were incompatible with newer versions of Spark. The pipelines are now compatible with all major releases of the library.
The pipeline can remove identifying information by masking it with entity labels or special characters, or by obfuscating it (replacing it with synthetic data). Use it with the PretrainedPipeline named legpipe_deid:
Obtaining:
Masking with entity labels:
Masking with special chars:
Masking with fixed-length chars:
Obfuscated:
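The four output modes listed above can be illustrated without the library. The following is a toy sketch and not the legpipe_deid implementation: the `deidentify` function, the mode names, the entity spans, and the replacement values are all hypothetical, and reading "special chars" as same-length asterisks and "fixed-length chars" as four asterisks is an assumption.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical (start, end, label) character span; a real pipeline finds these with NER
Span = Tuple[int, int, str]

# Hypothetical synthetic values, used only by the "obfuscate" mode
FAKE_VALUES: Dict[str, List[str]] = {
    "PERSON": ["Jane Roe", "John Doe"],
    "ORG": ["Acme LLC", "Globex Inc"],
}


def find_span(text: str, value: str, label: str) -> Span:
    """Locate the first occurrence of `value` and tag it with `label`."""
    start = text.index(value)
    return (start, start + len(value), label)


def deidentify(text: str, spans: List[Span], mode: str, seed: int = 0) -> str:
    """Replace each non-overlapping span according to the chosen mode."""
    rng = random.Random(seed)
    out: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])  # keep the text before the entity
        if mode == "entity_labels":
            out.append(f"<{label}>")
        elif mode == "special_chars":
            out.append("*" * (end - start))  # same length as the entity
        elif mode == "fixed_length_chars":
            out.append("****")  # always four characters
        elif mode == "obfuscate":
            out.append(rng.choice(FAKE_VALUES[label]))
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    out.append(text[cursor:])  # keep the text after the last entity
    return "".join(out)


text = "Attorney Stevenson represents Giant Company."
spans = [
    find_span(text, "Stevenson", "PERSON"),
    find_span(text, "Giant Company", "ORG"),
]
for mode in ("entity_labels", "special_chars", "fixed_length_chars", "obfuscate"):
    print(mode, "->", deidentify(text, spans, mode))
```

The actual pipeline detects the entities itself and may format each mode differently.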
Fancy trying?
We offer 30-day free licenses with technical support from our legal team of technical and subject-matter experts (SMEs). This trial includes complete access to more than 926 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, Question Answering, etc., and 120+ legal language models.
Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!
Don’t forget to check our notebooks and demos.
How to run
Legal NLP is extremely easy to run on both clusters and driver-only environments using the johnsnowlabs library:
! pip install johnsnowlabs
from johnsnowlabs import nlp
nlp.install(force_browser=True)
# Start Spark Session
spark = nlp.start()
# Import the Legal NLP module
from johnsnowlabs import legal
For alternative installation methods in specific environments, please check the docs. Also visit our other product pages: Healthcare NLP, Biomedical NLP, Clinical NLP, Finance NLP, and Legal NLP.