was successfully added to your cart.

    Legal NLP releases Subpoenas section classification model and more LLM examples and use cases

    Avatar photo
    Data Scientist at John Snow Labs

    The latest version of Legal NLP, 1.15 introduces numerous additional features to the existing collection of 926+ models and 125+ Language Models from previous releases of the library. Let’s examine each of these new capabilities in detail.

    New Subpoenas section classifier

    This model can identify important sections of subpoena documents, such as `INSTRUCTION`, `ARGUMENT`, `DEFINITION`, `NOTIFICATION`, `DOCUMENT_REQUEST`, `STATEMENT_OF_FACTS`, `CONCLUSION`, among others.

    With the sections classified, we could run other models that are specialized in finding insights on each specific section.

    Updated LLM examples

    With the increase in the capabilities of the library, we added new examples to help users understand how to perform certain specific tasks:

    • Text summarization

    The updated notebook now shows an example of how to perform summarization on long documents. This is one approach to the challenging problem of how to process long documents with the limitations of the current models in terms of number of tokens they can process on the input texts.
    By splitting the document into chunks and taking into consideration the number of tokens that can be processed by the model at each run, the approach we used was able to summarize a long document by split-and-merge strategy.

    • Text Generation

    In this notebook, we show how to use the Flan-T5-based model to continue generating texts in the Legal domain (text generation), finetuned on in-house data.

    • Normalizing date mentions in text

    This notebook shows how to use Legal Natural Language Processing to standardize date mentions in the texts to a unique format. When working with data coming from various sources, we may incur the problem of some of the sources using the format mm/dd/yyyy, while other sources use dd/mm/yyyy, and any other format. By standardizing the date mentions, we can easily apply other analytics on the texts to obtain insights from the data.

    The legal.DateNormalizer annotator is even capable of standardizing relative date (current day is customizable).

    • Extracting important key phrases from text

    With the legal.ChunkKeyPhraseExtraction annotator, it is possible to extract the most relevant phrases given candidates coming from either N-Grams or NER entities.

    • Drawing boxes around entities in PDF files with Visual NLP and Legal NLP

    This example notebook shows how to combine the power of Visual NLP and Finance NLP to identify entities coming from PDF/Image files by first extracting the text from the file and using one of the Legal NLP pretrained NER models. Finally, mapping the found entities back to the file and marking them visually.

    Fancy trying?

    We’ve got 30-days free licenses for you with technical support from our legal team of technical and SME. This trial includes complete access to more than 926 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, Question Answering, etc. and 120+ legal language models.

    Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!

    Don’t forget to check our notebooks and demos.

    How to run

    Legal NLP is extremely easy to run on both clusters and driver-only environments using johnsnowlabs library:

    !pip install johnsnowlabs
    from johnsnowlabs import nlp
    
    nlp.install(force_browser=True)
    
    # Start Spark Session
    spark = nlp.start()

    How useful was this post?

    Avatar photo
    Data Scientist at John Snow Labs
    Our additional expert:
    Ph.D. at Tsinghua-Berkeley Shenzhen Institute | Data Scientist

    Legal NLP releases new LLM demos, LLM-based Question Answering, Longformer and Camembert models , Greek Regulation Classification and notebooks and demos

    Legal NLP 1.14 comes with a lot of new capabilities added to the 926+ models and 125+ Language Models already available in...
    preloader