Introduction to NLP, NLP Pipeline

Priyansh Neema
Sep 27, 2022


This blog gives you a quick introduction to NLP.

Topics covered in this blog:


What is NLP?

Why NLP?

Real World Applications

Common NLP Tasks

Approaches to NLP

Challenges in NLP

NLP Pipeline

What is NLP?

Natural Language Processing (NLP) is a subfield of linguistics (human language), computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.

  • Text is the largest repository of human knowledge. It is growing quickly and is highly unstructured.
  • NLP lets computer programs understand text or speech, so that machine learning algorithms can be applied to them.
  • NLP has many applications.
  • NLP is hard.

Real World Applications

  • Contextual Advertisements
  • Email Clients-> Spam filtering, Smart reply
  • Social Media-> Removing adult content, opinion mining
  • Search Engines-> use NLP to interpret queries
  • Chatbots

Common NLP Tasks

  1. Text/Document Classification
  2. Sentiment Analysis
  3. Information Retrieval-> Regular expressions are a common tool here
  4. Part-of-Speech (POS) Tagging
  5. Language Detection and Machine Translation
  6. Conversational Agents-> These are chatbots, which come in two types:
    1. Text-based-> Used in Swiggy, Zomato, etc.
    2. Speech-based-> Siri, etc.
  7. Knowledge Graphs and Q/A Systems-> Used by Google
  8. Text Summarization-> Used by Inshorts
  9. Topic Modeling
  10. Text Generation-> Used in mobile keyboards, which predict the next words as we type
  11. Spell Checking and Grammar Correction-> Best example: Grammarly
  12. Speech to text
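
To make the text generation task a bit more concrete, here is a minimal sketch of next-word prediction using bigram counts. This is only an illustration, not how any real keyboard works: the toy corpus and the function names `train_bigrams` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for current_word, next_word in zip(tokens, tokens[1:]):
            following[current_word][next_word] += 1
    return following

def predict_next(following, word, k=2):
    """Return up to k most frequent followers of `word` (empty list if unseen)."""
    counts = following.get(word.lower())
    if not counts:
        return []
    return [w for w, _ in counts.most_common(k)]

# Toy corpus, made up purely for illustration.
corpus = [
    "i am going home",
    "i am going to work",
    "i am happy",
    "you are going home",
]
model = train_bigrams(corpus)
print(predict_next(model, "am"))     # ['going', 'happy']
print(predict_next(model, "going"))  # ['home', 'to']
```

The model simply suggests the words that most often followed the current word in the training text. Production systems typically rely on much larger corpora and neural language models.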

Approaches to NLP

Heuristic Methods-> These use a rule-based approach.
Examples: regular expressions, WordNet (a lexical database)
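
As a rough sketch of what a rule-based approach can look like, the snippet below uses Python's built-in `re` module. The email pattern is deliberately simplified, and the keyword list and the helper names `extract_emails` and `looks_like_spam` are assumptions made up for this example.

```python
import re

# Simplified pattern for email-like strings (an assumption, not a full validator).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical hand-written keyword rules for flagging spam.
SPAM_KEYWORDS = {"free", "winner", "lottery", "prize"}

def extract_emails(text):
    """Return all email-like substrings found by the regex rule."""
    return EMAIL_PATTERN.findall(text)

def looks_like_spam(text, threshold=2):
    """Flag text as spam if it contains at least `threshold` distinct spam keywords."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & SPAM_KEYWORDS) >= threshold

message = "You are a WINNER! Claim your free prize: mail claims@example.com"
print(extract_emails(message))   # ['claims@example.com']
print(looks_like_spam(message))  # True
```

Rules like these are easy to write and explain, but they have to be maintained by hand and miss anything the author did not anticipate.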

Machine Learning-based Methods-> Starting in the 1990s
[Figure: ML workflow]

Deep Learning-based Methods-> Starting around 2010

Challenges in NLP

  • Ambiguity
  • Contextual words
  • Synonyms
  • Spelling errors
  • Creativity and Diversity
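
To illustrate the spelling errors challenge, here is a small sketch that maps a misspelled word to the closest word in a known vocabulary using `difflib.get_close_matches` from Python's standard library. The vocabulary, the cutoff of 0.8, and the helper name `correct_word` are assumptions chosen for this example.

```python
import difflib

# Tiny made-up vocabulary; a real system would use a full dictionary.
VOCABULARY = ["language", "processing", "pipeline", "grammar", "sentiment"]

def correct_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary word, or the word unchanged if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word

print(correct_word("grammer"))   # grammar
print(correct_word("langauge"))  # language
print(correct_word("bank"))      # bank (no close match, left unchanged)
```

This only handles spelling. A word like "bank" is still ambiguous (river bank or financial bank), and resolving that needs context, which is part of why NLP is hard.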

NLP Pipeline

An NLP pipeline is a set of steps followed to build end-to-end NLP software.

It consists of the following steps:

  1. Data Acquisition
  2. Text preprocessing
  3. Feature Engineering-> Bag of Words (BoW), TF-IDF
  4. Modeling
  5. Deployment

This is not a universal NLP pipeline.
Deep learning pipelines are slightly different.
The pipeline is also non-linear: in practice you may go back to an earlier step and repeat it.
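
The first three pipeline steps can be sketched in plain Python as shown below. This is a simplified illustration under several assumptions: the corpus and stop-word list are made up, the helper names are hypothetical, and TF-IDF is computed with the basic textbook formula (term frequency times log(N / document frequency)). Libraries such as scikit-learn use smoothed variants, so their numbers will differ. Modeling and deployment are left out.

```python
import math
import re
from collections import Counter

# Step 1, data acquisition: a tiny made-up corpus stands in for real data.
documents = [
    "The movie was great, really great!",
    "The movie was boring.",
    "Great acting and a great story.",
]

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"the", "was", "and", "a"}

def preprocess(text):
    """Step 2, text preprocessing: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(tokens, vocabulary):
    """Step 3 (BoW): count of each vocabulary word in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def tf_idf(tokenized_docs, vocabulary):
    """Step 3 (TF-IDF): term frequency times log(N / document frequency)."""
    n_docs = len(tokenized_docs)
    doc_freq = {w: sum(1 for doc in tokenized_docs if w in doc) for w in vocabulary}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append([
            (counts[w] / length) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return vectors

tokenized = [preprocess(doc) for doc in documents]
vocabulary = sorted({token for doc in tokenized for token in doc})
bow_vectors = [bag_of_words(doc, vocabulary) for doc in tokenized]
tfidf_vectors = tf_idf(tokenized, vocabulary)

print(vocabulary)      # ['acting', 'boring', 'great', 'movie', 'really', 'story']
print(bow_vectors[0])  # [0, 0, 2, 1, 1, 0]
```

The resulting vectors are what a model would consume in the modeling step. Words that appear in fewer documents, such as "boring" here, get a higher TF-IDF weight than words that appear in many.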

