Natural language processing
==Classical NLP==
Classical NLP consists of creating a pipeline of processors that produce annotations from text files.<br>
Below is an example of a few processors; a short pipeline sketch follows the list.<br>
* Tokenization
* Part-of-speech annotation
* Named Entity Recognition
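A minimal sketch of such a pipeline, here using spaCy purely as an illustration (the library and the <code>en_core_web_sm</code> model name are assumptions, not something this page prescribes):

<syntaxhighlight lang="python">
# Illustrative classical-style pipeline: tokenization, part-of-speech tagging and NER.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline (assumed model name)
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Tokenization and part-of-speech annotations
for token in doc:
    print(token.text, token.pos_)

# Named Entity Recognition annotations
for ent in doc.ents:
    print(ent.text, ent.label_)
</syntaxhighlight>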
==Machine Learning==
===Datasets and Challenges===
====SQuAD====
[https://rajpurkar.github.io/SQuAD-explorer/ Link]<br>
The Stanford Question Answering Dataset, a reading-comprehension benchmark. There are two versions of this dataset: 1.1 and 2.0; version 2.0 adds questions that cannot be answered from the given passage.
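A quick way to inspect the data, assuming the Hugging Face <code>datasets</code> library (the library and the dataset identifiers <code>squad</code> / <code>squad_v2</code> are assumptions, not part of the official site linked above):

<syntaxhighlight lang="python">
# Illustrative only: load both SQuAD versions from the Hugging Face Hub.
from datasets import load_dataset

squad_v1 = load_dataset("squad")     # version 1.1
squad_v2 = load_dataset("squad_v2")  # version 2.0, adds unanswerable questions

example = squad_v1["train"][0]
print(example["question"])
print(example["context"][:200])
print(example["answers"])  # answer texts plus character start offsets
</syntaxhighlight>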
===Transformer===
{{main|Transformer (machine learning model)}}
[https://arxiv.org/abs/1706.03762 Attention is all you need paper]

A neural network architecture by Google which uses encoder-decoder attention and self-attention.
It currently achieves state-of-the-art results on most NLP tasks and has largely replaced RNNs for these tasks.
However, its computational complexity is quadratic in the number of input and output tokens due to attention.
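The quadratic cost comes from the n × n matrix of attention scores between all pairs of tokens. Below is a minimal NumPy sketch of single-head scaled dot-product self-attention (random weights, no masking; illustration only, not the reference implementation):

<syntaxhighlight lang="python">
# Single-head scaled dot-product self-attention, sketched in NumPy for illustration.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # x: (n_tokens, d_model). The score matrix below is (n_tokens, n_tokens),
    # which is the source of the quadratic cost in sequence length.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # weighted sum of value vectors

rng = np.random.default_rng(0)
n_tokens, d_model = 8, 16
x = rng.normal(size=(n_tokens, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (8, 16)
</syntaxhighlight>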
;Guides and explanations
* [https://nlp.seas.harvard.edu/2018/04/03/attention.html The Annotated Transformer]
* [https://www.youtube.com/watch?v=iDulhoQ2pro Youtube Video]
===Google BERT===
{{main|BERT (language model)}}
[https://github.com/google-research/bert Github Link]
[https://arxiv.org/abs/1810.04805 Paper]
A pretrained NLP neural network based on the Transformer encoder.
Note that the reference code is written in TensorFlow 1.
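A common alternative for experimenting with the pretrained weights is the Hugging Face <code>transformers</code> library; the sketch below assumes that library and the <code>bert-base-uncased</code> checkpoint (neither is part of the TensorFlow 1 repository linked above):

<syntaxhighlight lang="python">
# Illustrative only: load pretrained BERT via Hugging Face transformers (PyTorch backend).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Natural language processing with BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch size, sequence length, hidden size)
</syntaxhighlight>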
====ALBERT====
[https://github.com/google-research/google-research/tree/master/albert Github]<br>
;A Lite BERT for Self-supervised Learning of Language Representations
This is a parameter-reduced version of BERT, achieved mainly through factorized embedding parameters and cross-layer parameter sharing.
==Libraries==
===Apache OpenNLP===
[https://opennlp.apache.org/ Link]