Introduction to NLP, NLP Pipeline

Priyansh Neema
3 min read · Sep 27, 2022


This blog gives you a quick introduction to NLP.

Topics covered in this blog:

Agenda

What is NLP?

Why NLP?

Real World Applications

Common NLP Tasks

Approaches to NLP

Challenges in NLP

NLP Pipeline

What is NLP?

Natural Language Processing (NLP) is a subfield of linguistics (the study of human language), computer science, and artificial intelligence. It is concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.

Source: Google

Why NLP?

  • Text is the largest repository of human knowledge and is growing quickly, but it is highly unstructured.
  • NLP lets computer programs understand text and speech, and makes it possible to apply machine learning algorithms to them.
  • NLP has many applications.
  • NLP is hard.

Real World Applications

  • Contextual Advertisements
  • Email Clients -> Spam filtering, Smart reply
  • Social Media -> Removing adult content, opinion mining
  • Search Engines -> Use NLP to interpret queries
  • Chatbots

Common NLP Tasks

  1. Text/Document Classification
  2. Sentiment Analysis
  3. Information Retrieval -> Regular expressions are one tool used here
  4. Parts of Speech (POS) Tagging
  5. Language Detection and Machine Translation
  6. Conversational Agents -> These are a kind of chatbot
    We have two types of chatbots:
    1. Text-based -> Used in Swiggy, Zomato, etc.
    2. Speech-based -> Siri, etc.
  7. Knowledge Graph and Q/A System -> Used by Google
  8. Text Summarization -> Used by Inshorts
  9. Topic Modeling
  10. Text Generation -> Used in mobile keyboards, which predict the next word as we type.
  11. Spell Checking and Grammar Correction -> Best example: Grammarly
  12. Speech to Text
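
To make the keyboard-style next-word prediction from the task list a little more concrete, here is a minimal sketch based on bigram counts. It is only an illustration under simple assumptions: the function names `train_bigrams` and `predict_next` and the tiny corpus are made up for this example, and real keyboards use far more sophisticated models.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus, just for illustration.
corpus = [
    "i am going home",
    "i am going to work",
    "i am happy",
]
model = train_bigrams(corpus)
print(predict_next(model, "am"))
```

In this toy corpus "going" follows "am" twice and "happy" once, so the sketch suggests "going" after "am".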

Approaches to NLP

Heuristic Methods -> These use a rule-based approach.
Examples: regular expressions, WordNet (a lexical dictionary)
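
As a small illustration of the rule-based approach, the sketch below uses a regular expression to pull email-like strings out of text. The pattern and the helper name `extract_emails` are assumptions made for this example; the pattern is deliberately loose and is not a complete email validator.

```python
import re
from typing import List

# Hypothetical rule: a simple pattern for email-like strings.
# Real email validation is considerably more involved.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(text: str) -> List[str]:
    """Return every substring of `text` that matches EMAIL_PATTERN."""
    return EMAIL_PATTERN.findall(text)

sample = "Contact alice@example.com or bob.smith@example.org for details."
emails = extract_emails(sample)
print(emails)
```

No training data is needed here: the behavior comes entirely from a hand-written rule, which is what makes the method heuristic.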

Machine Learning-based Methods -> In use since the 1990s
ML Workflow

Deep Learning-based Methods -> In use since around 2010

Challenges in NLP

> Ambiguity
> Contextual words
> Synonyms
> Spelling errors
> Creativity and Diversity

NLP Pipeline

An NLP pipeline is the set of steps followed to build end-to-end NLP software.

An NLP pipeline consists of the following steps:

  1. Data Acquisition
  2. Text preprocessing
  3. Feature Engineering -> Bag of Words (BoW), TF-IDF
  4. Modeling
  5. Deployment
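
The first three steps can be sketched in plain Python as follows. This is a rough illustration under stated assumptions: the hard-coded corpus stands in for data acquisition, the helper names (`preprocess`, `bag_of_words`, `tf_idf`) are made up for this example, and the TF-IDF variant used here (tf = count / document length, idf = log(N / df)) is just one of several common formulas. Modeling and deployment are left out.

```python
import math
import re
from collections import Counter

def preprocess(text):
    """Text preprocessing: lowercase and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(tokens, vocabulary):
    """Bag of Words: count how often each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def tf_idf(tokenized_docs, vocabulary):
    """TF-IDF with tf = count / doc length and idf = log(N / df).

    Assumes every document has at least one token.
    """
    n_docs = len(tokenized_docs)
    # Document frequency: in how many documents each word appears.
    doc_freq = {w: sum(1 for doc in tokenized_docs if w in doc) for w in vocabulary}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        vectors.append([
            (counts[w] / len(doc)) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return vectors

# Data acquisition: a tiny hard-coded corpus stands in for real data.
documents = ["The cat sat.", "The dog sat.", "The dog barked loudly!"]

tokenized = [preprocess(d) for d in documents]
vocabulary = sorted({tok for doc in tokenized for tok in doc})
bow = [bag_of_words(doc, vocabulary) for doc in tokenized]
weights = tf_idf(tokenized, vocabulary)

print(vocabulary)
print(bow[0])
```

Because "the" appears in every document, its idf is log(3/3) = 0, so TF-IDF gives it zero weight, while Bag of Words still counts it. The resulting vectors are what a model would take as input in the modeling step.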

Note:
This is not a universal NLP pipeline.
Deep learning pipelines are slightly different.
The pipeline is non-linear: you may go back to an earlier step at any point.

If you liked the story and want to show your appreciation, you can clap as much as you like. Also, share it with your friends and LinkedIn network, and connect with us on...

LinkedIn : https://www.linkedin.com/in/priyansh-neema-3899a0175/

GitHub: https://github.com/Priyansh-jsk

Happy Learning!
