June 27, 2020 11:04 am GMT

Getting Started in NLP

When I first started learning about NLP (Natural Language Processing, i.e. processing text data), I wanted a beginner's guide that gave me a framework for understanding the topics and terminology I would need to search for when looking for learning resources. I struggled to find one. However, just this week, @omarsar0 ("elvis") on Twitter posted some mind maps that look really useful, so I'm sharing them here in case other beginners find them useful too.

Text Mining Mind Map:

[Image: text mining mind map]

NLP Mind Map:

[Image: NLP mind map]

Also, here are some beginner's tutorials and code examples (in Python) that I've found really helpful for getting started:

...and a useful book is "Applied Text Analysis with Python"

There are many more complicated "state of the art" (SOTA) methods not covered in the resources above (e.g. Word2Vec, GloVe, ELMo, BERT, and the SOTA models since BERT), but I recommend staying away from those until you understand text mining with the traditional methods.
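To give a flavour of what "traditional methods" can look like, here is a minimal sketch of two of them, bag of words and TF-IDF, written in plain Python. The three example sentences are made up for illustration, the tokeniser is deliberately naive (lowercase and split on spaces), and the weighting used is the basic tf * log(N / df) form. Libraries such as scikit-learn use slightly different, smoothed formulas, so treat this as a way to see the idea rather than as a reference implementation.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
documents = [
    "the food was great and the service was great",
    "the food was terrible",
    "great service and friendly staff",
]


def tokenise(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokeniser)."""
    return text.lower().split()


def bag_of_words(tokens):
    """Count how many times each token occurs in one document."""
    return Counter(tokens)


def tf_idf(corpus):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenised = [tokenise(doc) for doc in corpus]
    n_docs = len(tokenised)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    weights = []
    for tokens in tokenised:
        counts = bag_of_words(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


bags = [bag_of_words(tokenise(doc)) for doc in documents]
scores = tf_idf(documents)

print(bags[0])    # raw word counts for the first document
print(scores[1])  # TF-IDF weights for the second document
```

In the second document, "terrible" gets a higher weight than "the" because it appears in only one of the three documents, which is the intuition behind TF-IDF: words that are rare across the corpus say more about an individual document.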

There are also many different tasks that can be performed using NLP techniques (e.g. translating between languages, summarising text, question answering, and more). I recommend starting out with "text classification" or "sentiment analysis" (which is a type of text classification). There are lots of free tutorials and examples online for sentiment analysis, e.g. classifying whether a Yelp review is positive or negative. Perhaps even before that, I'd recommend importing some text data and creating a word cloud (this tutorial will help). If you don't know what a word cloud is, below is an example. It's a way to visualise the frequency of each word in some text.

[Image: example word cloud]

I created the word cloud above using this code:

# import matplotlib so that the wordcloud can be displayed
import matplotlib.pyplot as plt
%matplotlib inline

# import wordcloud so that the wordcloud can be created
from wordcloud import WordCloud

# create a string of text
text_string = "NLP, NLP, NLP, NLP, NLP, NLP, NLP, NLP, NLP, \
                text, text, text, spacy, spacy,\
                sentiment analysis, translation, stopwords,\
                tokenisation, tokenisation, tokenisation,\
                part-of-speech tagging, bag of words, TF-IDF,\
                embedding, summarisation, language modelling,\
                question answering, text classification,\
                text classification, RNN, LSTM"

# create a wordcloud from the string of text
my_wordcloud = WordCloud(background_color="white",
                         max_words=50,
                         ).generate(text_string)

# display the wordcloud
plt.imshow(my_wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

Note: you may need to install wordcloud first (e.g. with !pip install wordcloud if you're writing Python code in a Jupyter Notebook).
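If you want to see the numbers behind a picture like this, a rough sketch is to count the terms yourself with Python's built-in collections.Counter. The string below is a shortened version of the one used for the word cloud, and splitting on commas is my own simplification so that multi-word terms such as "sentiment analysis" stay together. The wordcloud library does its own tokenising, so its internal counts may not match these exactly.

```python
from collections import Counter

# A shortened version of the comma-separated string used for the word cloud.
text_string = ("NLP, NLP, NLP, text, text, spacy, "
               "sentiment analysis, tokenisation, tokenisation")

# Split on commas and strip whitespace so multi-word terms stay together.
terms = [term.strip() for term in text_string.split(",") if term.strip()]

# Count how often each term occurs.
term_counts = Counter(terms)

# Print the three most frequent terms with their counts.
for term, count in term_counts.most_common(3):
    print(f"{term}: {count}")
```

The terms with the highest counts are the ones a word cloud would typically draw largest.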

NLP is a massive field and it can be daunting and confusing to get started. It's a really interesting field though and well worth the effort. I'm planning on writing more about NLP in the future, as I'm learning a lot about it as part of my Data Science MSc project. In the meantime, I hope the resources I've mentioned here can help to make the journey a bit easier for total beginners.


Original Link: https://dev.to/nicfoxds/getting-started-in-nlp-b0e
