In this talk, we present recent updates to our natural language analysis toolkits, CoreNLP and Stanza, and discuss recent work at Stanford NLP on building and using large language models (LLMs). CoreNLP and Stanza, annotation tools in Java and Python respectively, incorporate new research to expand their capabilities and improve existing models, including an improved constituency parsing module and a worldwide English NER dataset. Stanford's Center for Research on Foundation Models (CRFM) has introduced two new open source tools for building LLMs and measuring their performance, HELM and Levanter. We also present a biomedical LLM, a neural search engine, ColBERT, and an LLM query compiler, DSPy. I will talk about Stanza's neural architectural design, its simple user interface, and its improved performance against existing toolkits over a range of datasets covering 70 languages. Our latest updates include NER support for English from around the world, an interface to edit dependencies, a state-of-the-art constituency parser, and more. I will close with our future plans for the Stanza library.
In this fascinating world of machine learning, we're all just explorers, trying to make sense of the mammoth creatures we call large language models. They're like gigantic dinosaurs of the...
In 1690, the English philosopher John Locke developed and popularized an ancient concept called tabula rasa, or blank slate. It refers to the philosophical concept that suggests individuals are...