Fighting Coronavirus with Technology: Another Breakthrough for Alibaba in NLP Research

The technology also plays a key role in the fight against COVID-19

Hangzhou, China, March 6, 2020 — Alibaba’s DAMO Academy, the group’s global research program, has had another major breakthrough in the machine-reading capabilities that underpin success in artificial intelligence.

On March 3, DAMO’s Natural Language Processing (NLP) model topped the rankings of the GLUE benchmark, an industry leaderboard widely regarded as the most important baseline test for NLP models. Alibaba’s model also significantly outperformed the human baseline, marking a key milestone in the development of robust natural language understanding systems.

DAMO’s existing model has already been deployed widely across Alibaba’s ecosystem, powering its customer-service AI chatbot and the search engine on Alibaba’s retail platforms, as well as anonymized healthcare data analysis. In the fight against COVID-19, the model was used by CDCs in several cities in China for text analysis of medical records and for epidemiological investigation.

StructBERT: Alibaba’s Intelligent NLP Model

This is not the first time an Alibaba machine-learning model has topped the rankings. On June 20, 2019, Alibaba’s model surpassed human scores on the Microsoft Machine Reading Comprehension (MS MARCO) dataset, one of the artificial-intelligence world’s most challenging tests of reading comprehension. The model scored 0.54 on the MS MARCO question-answering task, outperforming the human score of 0.539, a benchmark provided by Microsoft. In 2018, Alibaba also scored higher than the human benchmark on the Stanford Question Answering Dataset, another of the most popular machine reading-comprehension challenges worldwide.

What Makes StructBERT So Good?

Existing pre-trained models have limitations: they do not make the most of underlying language structures. Since language fluency is determined by the ordering of words and sentences, finding the best permutation of a set of words and sentences is an essential problem in many NLP tasks, such as machine translation and natural language understanding (NLU). StructBERT incorporates language structures into BERT pre-training through two novel linearization strategies. Specifically, in addition to the existing masking strategy, StructBERT extends BERT by leveraging two kinds of structural information: word-level ordering and sentence-level ordering.
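To make the idea of word-level ordering concrete, here is a minimal Python sketch of how a training example for such an objective might be built: a short span of tokens is shuffled, and the original tokens at those positions become the prediction targets. The function name shuffle_span, the whitespace tokenization, and the span length of three are assumptions made for illustration; the article does not specify them, and this is not Alibaba's implementation.

```python
import random


def shuffle_span(tokens, start, span_len=3, rng=None):
    """Shuffle tokens[start:start + span_len] and return training data.

    Returns (shuffled_tokens, positions, targets), where targets are the
    original tokens that a model would be asked to restore at positions.
    Hypothetical helper for illustration only.
    """
    if rng is None:
        rng = random.Random()
    end = start + span_len
    if start < 0 or end > len(tokens):
        raise ValueError("span is out of range for this token list")

    targets = tokens[start:end]      # original order = what to predict
    permuted = targets[:]            # copy so the input list is untouched
    rng.shuffle(permuted)            # may occasionally leave order unchanged

    shuffled = tokens[:start] + permuted + tokens[end:]
    positions = list(range(start, end))
    return shuffled, positions, targets


# Usage: corrupt one span of a toy sentence.
tokens = "the model learns to restore word order".split()
shuffled, positions, targets = shuffle_span(tokens, start=2, rng=random.Random(0))
print(shuffled)
print(positions, targets)
```

Only the tokens inside the chosen span move; everything outside it stays in place, so the model sees mostly intact context while learning to recover the local word order.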

In StructBERT, model pre-training is augmented with two new structural objectives on the inner-sentence and inter-sentence structures, respectively. In this way, the linguistic aspects are explicitly captured during the pre-training procedure. With structural pre-training, StructBERT encodes dependencies between words as well as between sentences in the contextualized representation, which provides the model with better generalizability and adaptability.
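The inter-sentence objective can be illustrated in a similar way. The Python sketch below builds sentence pairs labeled by how the second sentence relates to the first: it follows it, precedes it, or comes from a different document. The three-way labeling, the LABELS mapping, and the function name make_sentence_pair are assumptions of this sketch rather than details given in the article.

```python
import random

# Hypothetical label scheme for the sentence-ordering task.
LABELS = {"next": 0, "previous": 1, "random": 2}


def make_sentence_pair(doc, index, other_docs, rng=None):
    """Build one (sentence_a, sentence_b, label) example for doc[index].

    doc is a list of sentences; other_docs is a non-empty list of
    non-empty sentence lists used to draw unrelated sentences.
    """
    if rng is None:
        rng = random.Random()

    # Only offer relations that exist for this position in the document.
    kinds = ["random"]
    if index + 1 < len(doc):
        kinds.append("next")
    if index > 0:
        kinds.append("previous")

    kind = rng.choice(kinds)
    if kind == "next":
        sentence_b = doc[index + 1]
    elif kind == "previous":
        sentence_b = doc[index - 1]
    else:
        sentence_b = rng.choice(rng.choice(other_docs))
    return doc[index], sentence_b, LABELS[kind]


# Usage: sample a few pairs around the middle sentence of a toy document.
doc = ["It rained all night.", "The river rose.", "Roads were closed."]
other_docs = [["Stocks fell on Monday.", "Analysts were surprised."]]
rng = random.Random(0)
for _ in range(3):
    print(make_sentence_pair(doc, 1, other_docs, rng))
```

A classifier trained on such pairs has to attend to the order of sentences, not just whether they belong together, which is the kind of inter-sentence dependency the passage above describes.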

Fighting Coronavirus with Technology

For more details about the GLUE Benchmark, please visit: GLUE benchmark leaderboard

For Alibaba’s research paper on StructBERT, please visit: StructBERT

For Alibaba’s overall technology support in fighting the coronavirus, please visit: Fighting Coronavirus with Alibaba Cloud

While continuing to wage war against the worldwide outbreak, Alibaba Cloud will play its part and will do all it can to help others in their battles with the coronavirus. Learn how we can support your business continuity at

