Natural language processing

__FORCETOC__
Natural language processing (NLP) is the automated analysis and generation of human language.


==Classical NLP==
Classical NLP consists of building a pipeline of processors that create annotations from text files.<br>
Below are a few example processors.<br>
* Tokenization
** Converts a paragraph of text or a file into an array of words.
* Part-of-speech annotation
* Named Entity Recognition
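The tokenization step above can be sketched in a few lines. This is a minimal regex-based illustration, not the trained tokenizer models that real pipelines use:

```python
import re

def tokenize(text):
    """Split a paragraph of text into an array of word tokens.

    A toy sketch: \\w+ captures words and numbers, [^\\w\\s] captures
    each punctuation character as its own token. Production pipelines
    use trained tokenizer models instead.
    """
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("NLP pipelines annotate text."))
# → ['NLP', 'pipelines', 'annotate', 'text', '.']
```

A part-of-speech or named-entity processor would then consume this token array and attach further annotations to each token.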
==Machine Learning==
===Datasets and Challenges===
====SQuAD====
[https://rajpurkar.github.io/SQuAD-explorer/ Link]<br>
The Stanford Question Answering Dataset. There are two versions of this dataset, 1.1 and 2.0.
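SQuAD is distributed as nested JSON (articles contain paragraphs, which contain question-answer pairs). The record below is a hand-made toy example following the published SQuAD 1.1 schema, not data from the actual files:

```python
# A minimal record in the SQuAD 1.1 JSON layout: data -> paragraphs -> qas.
# The context, question, and answer here are invented for illustration.
squad_like = {
    "version": "1.1",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The Transformer was introduced in 2017.",
            "qas": [{
                "id": "q1",
                "question": "When was the Transformer introduced?",
                # answer_start is a character offset into the context
                "answers": [{"text": "2017", "answer_start": 34}],
            }],
        }],
    }],
}

def iter_qas(dataset):
    """Yield (context, question, answer_text) triples from SQuAD-style data."""
    for article in dataset["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    yield para["context"], qa["question"], ans["text"]

for context, question, answer in iter_qas(squad_like):
    print(question, "->", answer)
```

In version 2.0 some questions are unanswerable and carry an empty answer list plus an `is_impossible` flag.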
===Transformer===
[https://arxiv.org/abs/1706.03762 Attention is all you need paper]<br>
A neural network architecture introduced by Google.
It achieves state-of-the-art results on many NLP tasks and has largely replaced RNNs for them.
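The core operation of the Transformer is scaled dot-product attention, defined in the paper as softmax(QK^T / sqrt(d_k))V. Below is a pure-Python single-head toy sketch with lists of row vectors standing in for matrices; real implementations use batched tensor libraries and multiple heads:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q, K, V are lists of row vectors. For each query, score it against
    every key, normalize the scores, and take the weighted sum of values.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with tiny gradients.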
===Google BERT===
[https://github.com/google-research/bert GitHub Link]<br>
[https://arxiv.org/abs/1810.04805 Paper]<br>
[https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html Blog Post]<br>
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<br>
A pretrained language-representation model built on the Transformer encoder.
Note that the reference code is written in TensorFlow 1.
==Libraries==
===Apache OpenNLP===
[https://opennlp.apache.org/ Link]