Natural language processing: Difference between revisions

From David's Wiki
 
Latest revision as of 19:48, 15 January 2021


Natural language processing (NLP)

Classical NLP

Classical NLP consists of building a pipeline of processors that produce annotations from text files.
Below are a few example processors.

  • Tokenization
    • Convert a paragraph of text or a file into an array of words.
  • Part-of-speech annotation
  • Named Entity Recognition
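
A minimal sketch of the pipeline idea, assuming each processor is a plain function that reads a document dictionary and adds one annotation to it. The names `tokenize`, `flag_capitalized`, and `run_pipeline` are hypothetical and do not come from any particular library. The regex tokenizer is deliberately simplistic, and `flag_capitalized` is only a toy stand-in showing where a later processor (such as part-of-speech tagging or named entity recognition) would plug in.

```python
import re


def tokenize(doc):
    """Annotate the document with a list of word and punctuation tokens."""
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
    doc["tokens"] = re.findall(r"\w+|[^\w\s]", doc["text"])
    return doc


def flag_capitalized(doc):
    """Toy annotation: mark tokens that start with an uppercase letter.

    This is NOT real named entity recognition, only a placeholder
    for a processor that consumes the output of an earlier one.
    """
    doc["capitalized"] = [tok[0].isupper() for tok in doc["tokens"]]
    return doc


def run_pipeline(text, processors):
    """Run each processor in order, accumulating annotations in one dict."""
    doc = {"text": text}
    for processor in processors:
        doc = processor(doc)
    return doc


result = run_pipeline("Hello, world!", [tokenize, flag_capitalized])
print(result["tokens"])       # ['Hello', ',', 'world', '!']
print(result["capitalized"])  # [True, False, False, False]
```

The order of the processor list matters here: `flag_capitalized` reads the `tokens` annotation, so it has to run after `tokenize`.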

Machine Learning

Datasets and Challenges

SQuAD

Link
The Stanford Question Answering Dataset. There are two versions of this dataset, 1.1 and 2.0.
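
As a rough illustration of how the two versions differ in practice: the public SQuAD JSON files nest articles, paragraphs, and question-answer entries, and version 2.0 adds an `is_impossible` flag for questions that the paragraph does not answer. The sketch below walks that layout. The `sample` record is a made-up toy example, not real SQuAD data, and `iter_questions` is a hypothetical helper name.

```python
# Toy record in the SQuAD JSON layout (illustrative only, not real data).
context = "The Transformer was introduced in 2017."
sample = {
    "version": "v2.0",
    "data": [
        {
            "title": "Toy article",
            "paragraphs": [
                {
                    "context": context,
                    "qas": [
                        {
                            "id": "q1",
                            "question": "When was the Transformer introduced?",
                            "is_impossible": False,
                            "answers": [
                                # answer_start is a character offset into the context.
                                {"text": "2017", "answer_start": context.index("2017")}
                            ],
                        },
                        {
                            "id": "q2",
                            "question": "Who funded the work?",
                            "is_impossible": True,
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def iter_questions(squad):
    """Yield (question, answerable, answer_texts) for every entry."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Version 1.1 files have no is_impossible key, so default to answerable.
                answerable = not qa.get("is_impossible", False)
                texts = [answer["text"] for answer in qa["answers"]]
                yield qa["question"], answerable, texts


for question, answerable, texts in iter_questions(sample):
    print(question, answerable, texts)
```

Defaulting the missing flag to "answerable" lets the same loop read files from either version.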

Transformer

Attention is all you need paper

A neural network architecture by Google which uses encoder-decoder attention and self-attention.
As of this revision, it gives the best results on most NLP tasks and has largely replaced RNNs for them.
However, its computational complexity is quadratic in the number of input and output tokens, because attention compares every token with every other token.
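
To make the quadratic cost concrete, here is a small pure-Python sketch of single-head scaled dot-product self-attention. It is a simplification under stated assumptions: the queries, keys, and values are all taken to be the input itself (a real Transformer applies learned projection matrices first), and there is no masking or multi-head split. The point is the `scores` matrix, which has one entry per pair of tokens, so its size grows as n × n for n tokens.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (an assumption).

    x is a list of n token vectors, each of dimension d.
    Returns (outputs, weights), where weights is an n x n matrix.
    """
    n, d = len(x), len(x[0])
    # One dot product per (query, key) pair: n * n entries in total.
    scores = [
        [sum(q * k for q, k in zip(x[i], x[j])) / math.sqrt(d) for j in range(n)]
        for i in range(n)
    ]
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted average of all n value vectors.
    outputs = [
        [sum(weights[i][j] * x[j][c] for j in range(n)) for c in range(d)]
        for i in range(n)
    ]
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
print(len(weights), "x", len(weights[0]))  # 3 x 3
```

Doubling the number of tokens quadruples the number of entries in `weights`, which is where the quadratic term comes from.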

Guides and explanations

Google Bert

Github Link Paper Blog Post
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A pretrained NLP neural network. Note the code is written in TensorFlow 1.

Albert

Github

A Lite BERT for Self-supervised Learning of Language Representations

This is a variant of BERT with a reduced number of parameters.

Libraries

Apache OpenNLP

Link