Legal NLP releases new Multichoice and Question-Answering models for Multistate Professional Responsibility Examination (MPRE) questions, and more

08.09.2023

David Cecchini

Data Scientist at John Snow Labs

Flan T5 finetuned for Multichoice on MPRE

The new model is a fine-tuned version of Flan T5 that extends its reasoning capabilities to the legal domain. Most US states require passing the Multistate Professional Responsibility Examination (MPRE) as a prerequisite or co-requisite for admission to the bar as an attorney at law.

To create the pipeline, simply use the TextGenerator from Legal NLP:

from johnsnowlabs import nlp, legal

spark = nlp.start()

document_assembler = (
    nlp.DocumentAssembler()
    .setInputCol("question")
    .setOutputCol("document_question")
)

leg_gen = (
    legal.TextGenerator.pretrained(
        "leggen_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question"])
    .setOutputCol("generated_text")
    .setMaxNewTokens(150)
    .setStopAtEos(True)
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_gen])

Then, to use the model, pass a question in the MPRE format: a short context followed by the choices for the model to select from. For example:

Conglomerate Corporation owns a little more than half the stock of Giant Company. Conglomerate’s stock, in turn, is public, available on the public stock exchange, as is the remainder of the stock in Giant Company. The president of Conglomerate Corporation has asked Attorney Stevenson to represent Giant Company in a deal by which Giant would make a proposed transfer of certain real property to Conglomerate Corporation. The property in question is unusual because it contains an underground particle collider used for scientific research, but also valuable farmland on the surface, as well as some valuable mineral rights in another part of the parcel. These factors make the property value difficult to assess by reference to the general real-estate market, which means it is difficult for anyone to determine the fairness of the transfer price in the proposed deal. Would it be proper for Attorney Stevenson to facilitate this property transfer at the behest of the president of Conglomerate, if Attorney Stevenson would be representing Giant as the client in this specific matter?

Yes, because Conglomerate Corporation owns more than half of Giant Company, so the two corporate entities are one client for purposes of the rules regarding conflicts of interest.

Yes, because the virtual impossibility of obtaining an appraisal of the fair market value of the property means that the lawyer does not have actual knowledge that the deal is unfair to either party.

No, because the attorney would be unable to inform either client fully about whether the proposed transfer price would be in their best interest.

No, not unless the attorney first obtains effective informed consent of the management of Giant Company, as well as that of Conglomerate, because the ownership of Conglomerate and Giant is not identical, and their interests materially differ in the proposed transaction.
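A minimal sketch of assembling such an input, assuming the scenario and the choices are stored separately. `build_mpre_prompt` is a hypothetical helper, not part of Legal NLP; the `question:` prefix mirrors the input string used in this post.

```python
from typing import Sequence


def build_mpre_prompt(context: str, choices: Sequence[str]) -> str:
    """Join a scenario and its answer choices into one MPRE-style input string."""
    # Strip stray whitespace so the pieces join with single spaces.
    parts = [context.strip()] + [choice.strip() for choice in choices]
    # The "question:" prefix follows the example input shown in this post.
    return "question:\n" + " ".join(parts)


prompt = build_mpre_prompt(
    "Would it be proper for the attorney to facilitate this transfer?",
    ["Yes, because the two entities are one client.", "No, not without informed consent."],
)
print(prompt)
```

The resulting string can be placed in the `question` column of the DataFrame.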

Load the question into a Spark DataFrame and run the pipeline to obtain the answer:

data = spark.createDataFrame(
    [
        [
            """question:
Conglomerate Corporation owns a little more than half the stock of Giant 
Company. Conglomerate's stock, in turn, is public, available on the public 
stock exchange, as is the remainder of the stock in Giant Company. 
The president of Conglomerate Corporation has asked Attorney Stevenson to 
represent Giant Company in a deal by which Giant would make a proposed 
transfer of certain real property to Conglomerate Corporation. 
The property in question is unusual because it contains an underground 
particle collider used for scientific research, but also valuable farmland 
on the surface, as well as some valuable mineral rights in another part of 
the parcel. These factors make the property value difficult to assess by 
reference to the general real-estate market, which means it is difficult for 
anyone to determine the fairness of the transfer price in the proposed deal. 
Would it be proper for Attorney Stevenson to facilitate this property 
transfer at the behest of the president of Conglomerate, if Attorney 
Stevenson would be representing Giant as the client in this specific matter? 
Yes, because Conglomerate Corporation owns more than half of Giant Company, 
so the two corporate entities are one client for purposes of the rules 
regarding conflicts of interest. Yes, because the virtual impossibility 
of obtaining an appraisal of the fair market value of the property means 
that the lawyer does not have actual knowledge that the deal is unfair to 
either party. No, because the attorney would be unable to inform either 
client fully about whether the proposed transfer price would be in their 
best interest. No, not unless the attorney first obtains effective informed 
consent of the management of Giant Company, as well as that of Conglomerate, 
because the ownership of Conglomerate and Giant is not identical, and their 
interests materially differ in the proposed transaction.""",
        ]
    ]
).toDF("question")

results = pipeline.fit(data).transform(data)

results.select("generated_text.result").show(truncate=False)

In this example, the model returns:

Not if the attorney first obtainses efficient informed consent of the administration of Giants, as well and of Conglomerate
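The generated text paraphrases one of the choices rather than copying it verbatim, so mapping it back to an option takes an extra step. The sketch below shows one possible approach and is not part of Legal NLP: it scores each choice by word overlap (Jaccard similarity) with the generated text and picks the highest. The function names are hypothetical, and the choices are shortened for brevity.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def closest_choice(generated, choices):
    """Return the index of the choice with the highest Jaccard word overlap."""
    gen = tokens(generated)

    def jaccard(choice):
        other = tokens(choice)
        union = gen | other
        return len(gen & other) / len(union) if union else 0.0

    return max(range(len(choices)), key=lambda i: jaccard(choices[i]))


generated = (
    "Not if the attorney first obtainses efficient informed consent of the "
    "administration of Giants, as well and of Conglomerate"
)
# Choices shortened from the example in this post.
choices = [
    "Yes, because Conglomerate Corporation owns more than half of Giant Company",
    "Yes, because the lawyer does not have actual knowledge that the deal is unfair",
    "No, because the attorney would be unable to inform either client fully",
    "No, not unless the attorney first obtains effective informed consent of the "
    "management of Giant Company, as well as that of Conglomerate",
]
print(closest_choice(generated, choices))
```

For this example the fourth choice (index 3) has the largest overlap. Word overlap is a crude measure, so treat the match as a hint rather than a guarantee.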

Question-Answering on MPRE

This new model is trained on a modified version of the MPRE dataset that frames it as a question-answering task: the model receives a question and its context, without a list of choices to select from.

To use the model, add the question and context as input to the annotator:

context = """
Mr. Burns, the chief executive officer of Conglomerate Corporation, now 
faces criminal charges of discussing prices with the president of a competing 
firm. If found guilty, both Mr. Burns and Conglomerate Corporation will be 
subject to civil and criminal penalties under state and federal antitrust 
laws. An attorney has been representing Conglomerate Corporation. 
She has conducted a thorough investigation of the matter, and she has 
personally concluded that no such pricing discussions occurred. Both 
Conglomerate Corporation and Mr. Burns plan to defend on that ground. 
Mr. Burns has asked the attorney to represent him, as well as Conglomerate 
Corporation, in the proceedings. The legal and factual defenses of 
Conglomerate Corporation and Mr. Burns seem completely consistent at 
the outset of the matter."""

question = """
Would the attorney need to obtain informed consent to a conflict of interest 
from both Mr. Burns and a separate corporate officer at Conglomerate 
Corporation before proceeding with this dual representation?"""

data = spark.createDataFrame([[question, context]]).toDF("question", "context")

Then, build the pipeline using the QuestionAnswering annotator and the pretrained model legqa_flant5_mpre:

document_assembler = (
    nlp.MultiDocumentAssembler()
    .setInputCols("question", "context")
    .setOutputCols("document_question", "document_context")
)

leg_qa = (
    legal.QuestionAnswering.pretrained(
        "legqa_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question", "document_context"])
    .setCustomPrompt("question: {QUESTION} context: {CONTEXT}")
    .setMaxNewTokens(50)
    .setOutputCol("answer")
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_qa])
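To see what the custom prompt looks like once it is filled in, the plain-Python sketch below substitutes the two placeholders by hand. This assumes the annotator replaces `{QUESTION}` and `{CONTEXT}` with the column contents; `fill_prompt` is a hypothetical helper written for illustration, not the library's implementation.

```python
PROMPT_TEMPLATE = "question: {QUESTION} context: {CONTEXT}"


def fill_prompt(template: str, question: str, context: str) -> str:
    """Substitute the {QUESTION} and {CONTEXT} placeholders in the template."""
    # Assumption: simple literal substitution of the two placeholders.
    return template.replace("{QUESTION}", question).replace("{CONTEXT}", context)


example = fill_prompt(
    PROMPT_TEMPLATE,
    "Would the attorney need informed consent?",
    "An attorney has been representing Conglomerate Corporation.",
)
print(example)
```

Previewing the filled prompt this way can help when experimenting with a different template.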

Finally, run the model:

result = pipeline.fit(data).transform(data)

result.select("answer.result").show(truncate=False)

Obtaining:

Yes, because the conflicting positions in the legal and factual defenses require the attorney to obtain the informed consent of both clients before proceeding with the representation.

New demo for Law Stack Exchange Classifier

Released in the previous version of Legal NLP, the Law Stack Exchange classifier now has a new demo app that shows how the model can be used in practice.

Bug fixes

We fixed bugs in our pretrained deidentification pipelines, which were incompatible with newer versions of Spark. The pipelines are now compatible with all major releases of the library.

The pipeline removes sensitive information by masking it with entity labels, special characters, or fixed-length characters, or by obfuscating it (replacing it with synthetic data). Use it with the PretrainedPipeline named legpipe_deid, which returns the text in four forms: masked with entity labels, masked with special chars, masked with fixed-length chars, and obfuscated.
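As a rough illustration of how these four output styles differ, the plain-Python sketch below applies each one to a made-up sentence. It is not the code that legpipe_deid runs: the `deidentify` function, the entity spans, the labels, and the synthetic values are all hypothetical.

```python
# Hypothetical synthetic replacements used for the "obfuscate" style.
FAKE_VALUES = {"PARTY": "John Doe", "DATE": "01/01/2000"}


def deidentify(text, entities, mode):
    """Replace each (start, end, label) span in `text` according to `mode`."""
    pieces, cursor = [], 0
    for start, end, label in sorted(entities):
        pieces.append(text[cursor:start])  # keep the text before the entity
        if mode == "entity_labels":
            pieces.append(f"<{label}>")
        elif mode == "special_chars":
            pieces.append("*" * (end - start))  # one asterisk per character
        elif mode == "fixed_length_chars":
            pieces.append("****")  # same width regardless of entity length
        elif mode == "obfuscate":
            pieces.append(FAKE_VALUES[label])
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    pieces.append(text[cursor:])  # keep the text after the last entity
    return "".join(pieces)


text = "Signed by Jane Roe on 08/09/2023."
entities = [(10, 18, "PARTY"), (22, 32, "DATE")]
for mode in ("entity_labels", "special_chars", "fixed_length_chars", "obfuscate"):
    print(mode, "->", deidentify(text, entities, mode))
```

Fixed-length masking hides the length of the original value, while obfuscation keeps the text readable by substituting realistic-looking data.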

Fancy trying?

We offer 30-day free licenses with technical support from our legal team of technical experts and SMEs. The trial includes complete access to more than 926 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, and Question Answering, as well as 120+ legal language models.

Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!

Don’t forget to check our notebooks and demos.

How to run

Legal NLP is easy to run on both clusters and driver-only environments using the johnsnowlabs library:

! pip install johnsnowlabs

from johnsnowlabs import nlp

nlp.install(force_browser=True)

# Start Spark Session
spark = nlp.start()

# Import the Legal NLP module
from johnsnowlabs import legal

For alternative installation methods in specific environments, please check the docs. Also visit our other product pages: Healthcare NLP, Biomedical NLP, Clinical NLP, Finance NLP, and Legal NLP.