
Finance NLP releases a new Financial Summarization Annotator, new Embeddings and NER models, and a bunch of Financial Visual NLP demos.

John Snow Labs Finance NLP 1.11 comes with a lot of new capabilities added to the 135+ models and 35+ Language Models already available in previous versions of the library. Let’s take a look at each of them!

Native Financial Text Summarization: new annotator and 2 models

Financial texts, including reports, filings, etc., may be very long, verbose and complicated.

With our new Financial Summarizer() module, you can generate short versions of your financial documents using state-of-the-art summarization models, while keeping the key information.

We included 2 models for Financial Summarization:

  • The base model, with generic capabilities for summarizing financial documents.
  • A fine-tuned NLP fintech model trained specifically to summarize sections of Financial Reports. For this task, we fine-tuned our base model on more than 8K sections from different SEC Financial Reports.

Verbose sections (left) and their summaries (right) from an SEC 10-K filing

New Financial Word and Sentence Embeddings

Language Models provide a numeric representation of texts, which makes it possible to train better models. Word Embeddings are used to train token (word) based classifiers, such as Named Entity Recognition. Sentence Embeddings are used to train Text Classifiers (at document, section or sentence level) or to calculate Sentence (text) Similarity.

It’s crucial that Language Models are trained on domain-specific texts, so that they understand the nuances of financial documents. We are happy to announce the inclusion of the following models:

Word Embeddings: English Financial Roberta and Bert, German, Japanese and Chinese Word Embeddings models.

Sentence Embeddings: Chinese Financial Bert, Chinese Financial DistilBert, English Financial Embeddings (from SetFit).

Example of embeddings used to calculate Sentence Similarity:

We generate revenue primarily from subscription fees
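
Sentence Similarity over embeddings is commonly scored with cosine similarity. The sketch below shows that computation on small made-up vectors. The vectors and the cosine_similarity helper are illustrative assumptions, not output or API of the Finance NLP models, whose embeddings have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for sentence embeddings (made up).
query = [0.9, 0.1, 0.3, 0.0]
close_sentence = [0.8, 0.2, 0.4, 0.1]
unrelated_sentence = [0.0, 0.9, 0.0, 0.7]

score_close = cosine_similarity(query, close_sentence)
score_unrelated = cosine_similarity(query, unrelated_sentence)
print(f"close: {score_close:.3f}  unrelated: {score_unrelated:.3f}")
```

A higher score means the two sentences point in a more similar direction in the embedding space, so the closest candidate is simply the one with the largest score.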

New mixed Financial and Legal NER model on SEC documents

Identify up to 9 entities in SEC documents, including Businesses / Companies (ORG), People (PER), Legislation / Acts / Regulations (LAW), Locations (LOC), Government Institutions (INST), Courts (COURT), other proper nouns (MISC), aliases of concepts or references (ALIAS) and Tickers (TICKER).

In our opinion, the accompanying consolidated balance sheets and the related consolidated statements of operations, of changes in stockholders’ equity, and of cash flows present fairly, in all material respects, the financial position of SunGard Capital Corp. II and its subsidiaries ( SCC II ) at December 31, 2010, and 2009, and the results of their operations and their cash flows for each of the three years in the period ended December 31, 2010, in conformity with accounting principles generally accepted in the United States of America.

New Visual NLP + Finance NLP demos

Financial documents usually contain tables, forms and balance sheets. These should be analyzed at image level, keeping the visual information, since converting them to text loses the layout in which the information is displayed.

By using Visual NLP and NLP for financial services together, you can carry out Table Question Answering (first extracting the table with Visual NLP and then carrying out Table Understanding with Finance NLP) and Visual Question Answering.

Check out some of the new demos for the combination of our 2 John Snow Labs libraries.

Visual Question Answering on Balance Sheets

➤Q1: What type of report is this report? 1,189

➤Q2: How many Ordinary Stocks are in circulation as of February 16, 2010? 168,620

➤Q3: How many billion dollars is the total market value of the voting shares held by the Registrant’s non-affiliates as of June 28, 2009? 684

➤Q4: What is the title of each class? 338,923

Table Question Answering

What sets Table Question Answering (Table Understanding) apart is that the response can also encode mathematical operations, such as COUNT, AVERAGE, MEAN, SUM, etc.

➤Q1: What is the sum of Employees (%) for Firm 1, 2, 3, 4, 5 and 6?

SUM(40.0000, 30.0000, 50.0000, 40.0000, 20.0000, 40.0000)

➤Q2: What is the average of Environment (%) for Firm 1, 2, 3, 4, 5 and 6?

AVERAGE(57.69, 38.46, 34.62, 23.08, 19.23, 3.85)

➤Q3: How many Firms, among Firm 1,2,3,4,5 and 6, have a Risk (%) value of more than 30.00?

COUNT(2, 3, 4)
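
The responses above are aggregation expressions rather than final numbers, so a small post-processing step is needed to obtain the value. The following is a minimal sketch of such a step, assuming the response always has the form OPERATOR(v1, v2, ...) as in the three examples. The evaluate_aggregation helper is hypothetical and not part of Finance NLP, and it only covers the three operators shown here.

```python
import re

# Aggregators seen in the examples above, mapped to plain Python functions.
AGGREGATORS = {
    "SUM": sum,
    "AVERAGE": lambda values: sum(values) / len(values),
    "COUNT": len,
}

def evaluate_aggregation(response):
    """Evaluate a string such as 'SUM(40.0, 30.0)' and return a number."""
    match = re.fullmatch(r"\s*([A-Z]+)\((.*)\)\s*", response)
    if match is None:
        raise ValueError(f"not an aggregation expression: {response!r}")
    operator, arguments = match.groups()
    if operator not in AGGREGATORS:
        raise ValueError(f"unsupported operator: {operator}")
    # Split the comma-separated cell values and convert them to floats.
    values = [float(item) for item in arguments.split(",") if item.strip()]
    if not values:
        raise ValueError("no values to aggregate")
    return AGGREGATORS[operator](values)

total = evaluate_aggregation("SUM(40.0000, 30.0000, 50.0000, 40.0000, 20.0000, 40.0000)")
average = evaluate_aggregation("AVERAGE(57.69, 38.46, 34.62, 23.08, 19.23, 3.85)")
count = evaluate_aggregation("COUNT(2, 3, 4)")
print(total, round(average, 2), count)
```

Note that COUNT returns the number of listed items, not their sum: in Q3 the three items listed are what the model selected as matching the condition, so the answer is 3.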

Fancy trying?

We offer 30-day free licenses, with technical support from our financial team of technical experts and SMEs. The trial includes complete access to more than 135 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, Question Answering, etc., and 35+ financial language models.

Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!

Don’t forget to check out our notebooks and demos.

How to run

Finance NLP is very easy to run on both clusters and driver-only environments using the johnsnowlabs library:

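!pip install johnsnowlabs
from johnsnowlabs import nlp
nlp.install(force_browser=True)  # opens a browser window to authorize and fetch your license
nlp.start()  # starts a Spark session with the licensed libraries loaded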

Our additional expert:
Juan Martinez is a Sr. Data Scientist, working at John Snow Labs since 2021. He graduated in Computer Engineering in 2006, and since then his main focus has been the application of Artificial Intelligence to texts and unstructured data. To better understand the intersection between Language and AI, he complemented his technical background with a Linguistics degree from Moscow Pushkin State Language Institute in 2012 and later at the University of Alcala (2014). He is part of the Healthcare Data Science team at John Snow Labs. His main activities are training and evaluation of Deep Learning, Semantic and Symbolic models within the Healthcare domain, benchmarking, research and team coordination tasks. His other areas of interest are Machine Learning operations and Infrastructure.
