Natural language processing

__FORCETOC__
Natural language processing (NLP) is the automated analysis and generation of human language.


==Classical NLP==
Classical NLP consists of building a pipeline of processors that create annotations from text files.<br>
Below are a few example processors.<br>
* Tokenization
** Converts a paragraph of text or a file into an array of words.
* Part-of-speech annotation
* Named Entity Recognition
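The tokenization step above can be sketched in a few lines. This is a minimal regex-based illustration, not the trained tokenizer models that real pipelines use:

```python
import re

def tokenize(text):
    """Split a paragraph of text into an array of word tokens.

    A toy sketch: \\w+ captures words and numbers, [^\\w\\s] captures
    each punctuation character as its own token. Production pipelines
    use trained tokenizer models instead.
    """
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("NLP pipelines annotate text."))
# → ['NLP', 'pipelines', 'annotate', 'text', '.']
```

A part-of-speech or named-entity processor would then consume this token array and attach further annotations to each token.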
==Machine Learning==
===Datasets and Challenges===
====SQuAD====
[https://rajpurkar.github.io/SQuAD-explorer/ Link]<br>
The Stanford Question Answering Dataset. There are two versions of this dataset, 1.1 and 2.0.
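SQuAD is distributed as nested JSON (articles contain paragraphs, which contain question-answer pairs). The record below is a hand-made toy example following the published SQuAD 1.1 schema, not data from the actual files:

```python
# A minimal record in the SQuAD 1.1 JSON layout: data -> paragraphs -> qas.
# The context, question, and answer here are invented for illustration.
squad_like = {
    "version": "1.1",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The Transformer was introduced in 2017.",
            "qas": [{
                "id": "q1",
                "question": "When was the Transformer introduced?",
                # answer_start is a character offset into the context
                "answers": [{"text": "2017", "answer_start": 34}],
            }],
        }],
    }],
}

def iter_qas(dataset):
    """Yield (context, question, answer_text) triples from SQuAD-style data."""
    for article in dataset["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    yield para["context"], qa["question"], ans["text"]

for context, question, answer in iter_qas(squad_like):
    print(question, "->", answer)
```

In version 2.0 some questions are unanswerable and carry an empty answer list plus an `is_impossible` flag.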
===Transformer===
[https://arxiv.org/abs/1706.03762 Attention is all you need paper]<br>
A neural network architecture introduced by Google.
It achieves state-of-the-art results on many NLP tasks and has largely replaced RNNs for them.
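The core operation of the Transformer is scaled dot-product attention, defined in the paper as softmax(QK^T / sqrt(d_k))V. Below is a pure-Python single-head toy sketch with lists of row vectors standing in for matrices; real implementations use batched tensor libraries and multiple heads:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q, K, V are lists of row vectors. For each query, score it against
    every key, normalize the scores, and take the weighted sum of values.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with tiny gradients.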
===Google BERT===
[https://github.com/google-research/bert GitHub Link]<br>
[https://arxiv.org/abs/1810.04805 Paper]<br>
[https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html Blog Post]<br>
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<br>
A pretrained language-representation model built on the Transformer encoder.
Note that the reference code is written in TensorFlow 1.
==Libraries==
===Apache OpenNLP===
[https://opennlp.apache.org/ Link]