| Language | nlu.load() reference | Spark NLP Model reference | Type |
|---|---|---|---|
| Arabic | ar.ner | arabic_w2v_cc_300d | Named Entity Recognizer |
| Arabic | ar.embed.aner | aner_cc_300d | Word Embedding |
| Arabic | ar.embed.aner.300d | aner_cc_300d | Word Embedding (Alias) |
| Bengali | bn.stopwords | stopwords_bn | Stopwords Cleaner |
| Bengali | bn.pos | pos_msri | Part of Speech |
| Thai | th.segment_words | wordseg_best | Word Segmenter |
| Thai | th.pos | pos_lst20 | Part of Speech |
| Thai | th.sentiment | sentiment_jager_use | Sentiment Classifier |
| Thai | th.classify.sentiment | sentiment_jager_use | Sentiment Classifier (Alias) |
| Chinese | zh.pos.ud_gsd_trad | pos_ud_gsd_trad | Part of Speech |
| Chinese | zh.segment_words.gsd | wordseg_gsd_ud_trad | Word Segmenter |
| Bihari | bh.pos | pos_ud_bhtb | Part of Speech |
| Amharic | am.pos | pos_ud_att | Part of Speech |

NLU 1.1.1 New English Models and Pipelines

| Language | nlu.load() reference | Spark NLP Model reference | Type |
|---|---|---|---|
| English | en.sentiment.glove | analyze_sentimentdl_glove_imdb | Sentiment Classifier |
| English | en.sentiment.glove.imdb | analyze_sentimentdl_glove_imdb | Sentiment Classifier (Alias) |
| English | en.classify.sentiment.glove.imdb | analyze_sentimentdl_glove_imdb | Sentiment Classifier (Alias) |
| English | en.classify.sentiment.glove | analyze_sentimentdl_glove_imdb | Sentiment Classifier (Alias) |
| English | en.classify.trec50.pipe | classifierdl_use_trec50_pipeline | Question Classifier |
| English | en.ner.onto.large | onto_recognize_entities_electra_large | Named Entity Recognizer |
| English | en.classify.questions.atis | classifierdl_use_atis | Intent Classifier |
| English | en.classify.questions.airline | classifierdl_use_atis | Intent Classifier (Alias) |
| English | en.classify.intent.atis | classifierdl_use_atis | Intent Classifier (Alias) |
| English | en.classify.intent.airline | classifierdl_use_atis | Intent Classifier (Alias) |
| English | en.ner.atis | nerdl_atis_840b_300d | Aspect based NER |
| English | en.ner.airline | nerdl_atis_840b_300d | Aspect based NER (Alias) |
| English | en.ner.aspect.airline | nerdl_atis_840b_300d | Aspect based NER (Alias) |
| English | en.ner.aspect.atis | nerdl_atis_840b_300d | Aspect based NER (Alias) |
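
Several rows in the table above point to the same Spark NLP model under different nlu.load() references. The short sketch below groups the references by model so the aliases are easy to see. The pairs are copied from the table; the names `REFERENCES`, `group_by_model` and `ALIASES` are made up for this illustration and are not part of NLU.

```python
from collections import defaultdict

# (nlu.load() reference, Spark NLP model) pairs copied from the table above.
REFERENCES = [
    ("en.sentiment.glove", "analyze_sentimentdl_glove_imdb"),
    ("en.sentiment.glove.imdb", "analyze_sentimentdl_glove_imdb"),
    ("en.classify.sentiment.glove.imdb", "analyze_sentimentdl_glove_imdb"),
    ("en.classify.sentiment.glove", "analyze_sentimentdl_glove_imdb"),
    ("en.classify.trec50.pipe", "classifierdl_use_trec50_pipeline"),
    ("en.ner.onto.large", "onto_recognize_entities_electra_large"),
    ("en.classify.questions.atis", "classifierdl_use_atis"),
    ("en.classify.questions.airline", "classifierdl_use_atis"),
    ("en.classify.intent.atis", "classifierdl_use_atis"),
    ("en.classify.intent.airline", "classifierdl_use_atis"),
    ("en.ner.atis", "nerdl_atis_840b_300d"),
    ("en.ner.airline", "nerdl_atis_840b_300d"),
    ("en.ner.aspect.airline", "nerdl_atis_840b_300d"),
    ("en.ner.aspect.atis", "nerdl_atis_840b_300d"),
]


def group_by_model(pairs):
    """Map each model name to the list of references that load it (hypothetical helper)."""
    groups = defaultdict(list)
    for reference, model in pairs:
        groups[model].append(reference)
    return dict(groups)


ALIASES = group_by_model(REFERENCES)

# Print one line per model, listing every reference that resolves to it.
for model, references in ALIASES.items():
    print(model, "<-", ", ".join(references))
```

Any reference in a group should load the same model, so pick whichever name reads best in your code.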

New Easy NLU 1-liner Examples:

Extract aspects and entities from airline questions (ATIS dataset)

nlu.load("en.ner.atis").predict("i want to fly from baltimore to dallas round trip")
output: ["baltimore", "dallas", "round trip"]
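
The 1-liners on this page call nlu.load() and predict() in a single expression. When you have several texts, it is usually cheaper to load the pipeline once and pass all texts to predict() as a list. The sketch below shows that pattern. `predict_many` and its `load` parameter are hypothetical names for this illustration; the `load` hook only exists so the function can be tried without Spark installed.

```python
def predict_many(reference, texts, load=None):
    """Load one NLU pipeline and run it over several texts (hypothetical helper)."""
    if load is None:
        # Assumes `pip install nlu pyspark` and a working Java/Spark setup.
        import nlu
        load = nlu.load
    pipeline = load(reference)             # load the model once
    return pipeline.predict(list(texts))   # predict() also accepts a list of strings


# Example call (not executed here, since it needs the model to be available):
# predict_many("en.ner.atis", [
#     "i want to fly from baltimore to dallas round trip",
#     "what is the price of flight from newyork to washington",
# ])
```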

Intent Classification for Airline Traffic Information System queries (ATIS dataset)

nlu.load("en.classify.questions.atis").predict("what is the price of flight from newyork to washington")
output: "atis_airfare"

Recognize Entities OntoNotes – ELECTRA Large

nlu.load("en.ner.onto.large").predict("Johnson first entered politics when elected in 2001 as a member of Parliament. He then served eight years as the mayor of London.")
output: ["Johnson", "first", "2001", "eight years", "London"]

Question classification of open-domain and fact-based questions Pipeline – TREC50

nlu.load("en.classify.trec50.pipe").predict("When did the construction of stone circles begin in the UK? ")
output: LOC_other

Traditional Chinese Word Segmentation

# 'However, this treatment also creates some problems' in Chinese
nlu.load("zh.segment_words.gsd").predict("然而,這樣的處理也衍生了一些問題。")
output: ["然而",",","這樣","的","處理","也","衍生","了","一些","問題","。"]

Part of Speech for Traditional Chinese

# 'However, this treatment also creates some problems' in Chinese
nlu.load("zh.pos.ud_gsd_trad").predict("然而,這樣的處理也衍生了一些問題。")

Output:

Token POS
然而 ADV
, PUNCT
這樣 PRON
的 PART
處理 NOUN
也 ADV
衍生 VERB
了 PART
一些 ADJ
問題 NOUN
。 PUNCT

Thai Word Segment Recognition

# 'Mona Lisa is a 16th-century oil painting created by Leonardo held at the Louvre in Paris' in Thai
nlu.load("th.segment_words").predict("Mona Lisa เป็นภาพวาดสีน้ำมันในศตวรรษที่ 16 ที่สร้างโดย Leonardo จัดขึ้นที่พิพิธภัณฑ์ลูฟร์ในปารีส")

Output:

token
M
o
n
a
Lisa
เป็น
ภาพ
สีน้ำ
มัน
ใน
ศตวรรษ
ที่
16
ที่
สร้าง
L
e
o
n
a
r
d
o
จัด
ขึ้น
ที่
พิพิธภัณฑ์
ลูฟร์
ใน
ปารีส

Part of Speech for Bengali (POS)

# 'The village is also called 'Mod' in Tora language' in Bengali
nlu.load("bn.pos").predict("বাসস্থান-ঘরগৃহস্থালি তোড়া ভাষায় গ্রামকেও বলে ` মোদ ' ৷")

Output:

token pos
বাসস্থান-ঘরগৃহস্থালি NN
তোড়া NNP
ভাষায় NN
গ্রামকেও NN
বলে VM
` SYM
মোদ NN
' SYM
৷ SYM

Stop Words Cleaner for Bengali

# 'This language is not enough' in Bengali
df = nlu.load("bn.stopwords").predict("এই ভাষা যথেষ্ট নয়")

Output:

cleanTokens token
ভাষা এই
যথেষ্ট ভাষা
নয় যথেষ্ট
None নয়
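
The two columns in this output look shifted against each other because the cleaned token list is shorter than the original one. The sketch below reproduces that effect in plain Python. It is only an illustration of the alignment, not NLU's actual implementation, and the stop word set is an assumption: the output above only tells us that "এই" was removed.

```python
from itertools import zip_longest

# Assumed stop word set; the real stopwords_bn model contains many more entries.
STOPWORDS = {"এই"}


def clean_tokens(tokens, stopwords):
    """Drop every token that appears in the stop word set (hypothetical helper)."""
    return [token for token in tokens if token not in stopwords]


tokens = "এই ভাষা যথেষ্ট নয়".split()
cleaned = clean_tokens(tokens, STOPWORDS)

# Pair both lists position by position, padding the shorter one with None.
rows = list(zip_longest(cleaned, tokens))
for clean, token in rows:
    print(clean, token)
```

Read each column on its own: row i of cleanTokens is the i-th surviving token, not the cleaned form of the token next to it.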

Part of Speech for Bihari

# 'The people of Ohu know that the foundation of Bhojpuri was shaken' in Bihari
nlu.load('bh.pos').predict("ओहु लोग के मालूम बा कि श्लील होखते भोजपुरी के नींव हिल जाई")

Output:

pos token
DET ओहु
NOUN लोग
ADP के
NOUN मालूम
VERB बा
SCONJ कि
ADJ श्लील
VERB होखते
PROPN भोजपुरी
ADP के
NOUN नींव
VERB हिल
AUX जाई

Amharic Part of Speech (POS)

# ' "Son, finish the job," he said.' in Amharic
nlu.load('am.pos').predict('ልጅ ኡ ን ሥራ ው ን አስጨርስ ኧው ኣል ኧሁ"')

Output:

pos token
NOUN ልጅ
DET ኡ
PART ን
NOUN ሥራ
DET ው
PART ን
VERB አስጨርስ
PRON ኧው
AUX ኣል
PRON ኧሁ
PUNCT
NOUN

Thai Sentiment Classification

# 'I love peanut butter and jelly!' in Thai
nlu.load('th.classify.sentiment').predict('ฉันชอบเนยถั่วและเยลลี่!')[['sentiment','sentiment_confidence']]

Output:

sentiment sentiment_confidence
positive 0.999998

Arabic Named Entity Recognition (NER)

# 'In 1918, the forces of the Arab Revolt liberated Damascus with the help of the British' in Arabic
nlu.load('ar.ner').predict('في عام 1918 حررت قوات الثورة العربية دمشق بمساعدة من الإنكليز',output_level='chunk')[['entities_confidence','ner_confidence','entities']]

Output:

entity_class ner_confidence entities
ORG [1.0, 1.0, 1.0, 0.9997000098228455, 0.9840999841690063, 0.9987999796867371, 0.9990000128746033, 0.9998999834060669, 0.9998999834060669, 0.9993000030517578, 0.9998999834060669] قوات الثورة العربية
LOC [1.0, 1.0, 1.0, 0.9997000098228455, 0.9840999841690063, 0.9987999796867371, 0.9990000128746033, 0.9998999834060669, 0.9998999834060669, 0.9993000030517578, 0.9998999834060669] دمشق
PER [1.0, 1.0, 1.0, 0.9997000098228455, 0.9840999841690063, 0.9987999796867371, 0.9990000128746033, 0.9998999834060669, 0.9998999834060669, 0.9993000030517578, 0.9998999834060669] الإنكليز
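
With output_level='chunk' each row is one entity rather than one token. The sketch below shows the general idea of merging token-level IOB tags into chunks. It is a simplified illustration, not NLU's or Spark NLP's internal code, and the tag sequence is an assumption chosen to match the three entities in the output above. Note that the ner_confidence list has 11 values, which appears to be one per whitespace-separated token of the sentence.

```python
def iob_to_chunks(tokens, tags):
    """Merge token-level IOB tags into (entity_class, text) chunks (hypothetical helper)."""
    chunks, current_class, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current_class
        # Close the open chunk when the current token does not continue it.
        if current_tokens and not continues:
            chunks.append((current_class, " ".join(current_tokens)))
            current_class, current_tokens = None, []
        if prefix == "B" or (prefix == "I" and not continues):
            current_class, current_tokens = label, [token]   # start a new chunk
        elif continues:
            current_tokens.append(token)                      # extend the open chunk
    if current_tokens:
        chunks.append((current_class, " ".join(current_tokens)))
    return chunks


tokens = "في عام 1918 حررت قوات الثورة العربية دمشق بمساعدة من الإنكليز".split()
# Assumed tags, one per token; the real model output may differ.
tags = ["O", "O", "O", "O", "B-ORG", "I-ORG", "I-ORG", "B-LOC", "O", "O", "B-PER"]

chunks = iob_to_chunks(tokens, tags)
for entity_class, text in chunks:
    print(entity_class, text)
```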