Introduction to NLP, NLP Pipeline
This blog gives you a quick introduction to NLP.
Topics covered in this blog:
Agenda
What is NLP?
Why NLP?
Real World Applications
Common NLP Tasks
Approaches to NLP
Challenges in NLP
NLP Pipeline
What is NLP?
NLP (Natural Language Processing) is a subfield of linguistics (the study of human language), computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.
Why NLP?
- Text is the largest repository of human knowledge. It is growing quickly and is highly unstructured.
- NLP lets us build computer programs that understand text or speech, and apply machine learning algorithms to text and speech.
- NLP has many applications.
- NLP is hard.
Real World Applications
- Contextual Advertisements
- Email Clients-> Spam filtering, Smart reply
- Social Media-> Removing adult content, opinion mining
- Search Engines-> Use NLP to interpret search queries
- Chatbots
Common NLP Tasks
- Text/Document Classification
- Sentiment Analysis
- Information Retrieval-> Regular expressions are one tool used here
- Part-of-Speech (POS) Tagging
- Language Detection and Machine Translation
- Conversational Agents-> These are chatbots
We have two types of chatbots-
1. Text-based-> Used in Swiggy, Zomato, etc.
2. Speech-based-> Siri, etc.
- Knowledge Graph and Q/A Systems-> Used by Google
- Text Summarization-> Used by Inshorts
- Topic Modeling
- Text Generation-> Used in mobile keyboards: as we type, the keyboard predicts the next words.
- Spell Checking and Grammar Correction-> Best example: Grammarly
- Speech to text
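To make the text generation task above a little more concrete, here is a minimal sketch of next-word prediction based on bigram counts. It is only an illustration: the tiny corpus and the helper names `train_bigrams` and `predict_next` are assumptions made up for this example, and real keyboards rely on far larger models.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word, k=1):
    """Return up to k of the most frequent words seen after `word`."""
    return [w for w, _ in following[word.lower()].most_common(k)]

# Hypothetical toy corpus, used only for illustration.
corpus = [
    "i love natural language processing",
    "i love machine learning",
    "i love natural language",
]
model = train_bigrams(corpus)
print(predict_next(model, "love"))     # ['natural']
print(predict_next(model, "natural"))  # ['language']
```

A word that never appeared in the corpus simply gets an empty list of suggestions.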
Approaches to NLP
Heuristic Methods-> These use a rule-based approach.
Example- Regular expressions, WordNet (a lexical dictionary)
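As a rough sketch of what a heuristic method can look like, the snippet below uses Python's built-in `re` module to pull e-mail addresses out of text and to flag a message as spam with a hand-written keyword rule. The pattern is deliberately simplified, and the keyword list and the function names `extract_emails` and `looks_like_spam` are assumptions for illustration only.

```python
import re

# Simplified pattern (an assumption for this example); real e-mail validation is more involved.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical hand-picked keywords for a rule-based spam check.
SPAM_KEYWORDS = {"free", "winner", "lottery"}

def extract_emails(text):
    """Return every substring that matches the simplified e-mail pattern."""
    return EMAIL_PATTERN.findall(text)

def looks_like_spam(text):
    """Rule: flag the message if it contains at least two spam keywords."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & SPAM_KEYWORDS) >= 2

message = "You are a WINNER! Claim your FREE prize at claims@example.com"
print(extract_emails(message))   # ['claims@example.com']
print(looks_like_spam(message))  # True
```

Rules like these are quick to write and easy to explain, but every new case needs a new rule, which is one reason machine learning methods took over for many tasks.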
Machine Learning-based Methods-> Started in the 1990s
ML Workflow
Deep Learning-based Methods-> Started around 2010
Challenges in NLP
> Ambiguity
> Contextual words
> Synonyms
> Spelling errors
> Creativity and Diversity
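To give a feel for the spelling error challenge listed above, here is a small sketch that maps a misspelled word to the closest entry in a known vocabulary using `difflib` from the Python standard library. The vocabulary, the `cutoff` value, and the `suggest` helper are assumptions chosen for this example; it works only because the vocabulary is tiny.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; a real system would use a much larger word list.
VOCABULARY = ["language", "processing", "grammar", "classification", "sentiment"]

def suggest(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the input unchanged if nothing is close."""
    matches = get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word

print(suggest("grammer"))          # grammar
print(suggest("classsification"))  # classification
```

This only compares characters, so it cannot tell which correction fits the sentence, which is where the other challenges such as ambiguity and context come in.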
NLP Pipeline
An NLP pipeline is a set of steps followed to build end-to-end NLP software.
An NLP pipeline typically consists of the following steps:
- Data Acquisition
- Text preprocessing
- Feature Engineering-> Bag of Words (BoW), TF-IDF
- Modeling
- Deployment
Note-
This is not a universal NLP pipeline.
Deep learning pipelines are slightly different.
The pipeline is non-linear: earlier steps are often revisited.
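The first three pipeline steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the three-sentence corpus stands in for real data acquisition, the function names are made up for this example, and the TF-IDF here is the plain `tf * log(N / df)` form, which is one common formulation (libraries often use smoothed variants). Modeling and deployment are left out.

```python
import math
import re
from collections import Counter

# Data acquisition: a tiny in-memory corpus stands in for real data (hypothetical).
documents = [
    "NLP is fun!",
    "NLP is hard.",
    "Deep learning is fun.",
]

def preprocess(text):
    """Text preprocessing: lowercase and split into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(tokens, vocabulary):
    """Feature engineering: count how often each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def tf_idf(tokenized_docs, vocabulary):
    """Feature engineering: weight term frequency by inverse document frequency."""
    n_docs = len(tokenized_docs)
    doc_freq = {w: sum(w in doc for doc in tokenized_docs) for w in vocabulary}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        vectors.append([
            (counts[w] / len(doc)) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return vectors

tokenized = [preprocess(d) for d in documents]
vocabulary = sorted({t for doc in tokenized for t in doc})
bow = [bag_of_words(doc, vocabulary) for doc in tokenized]
weights = tf_idf(tokenized, vocabulary)

print(vocabulary)  # ['deep', 'fun', 'hard', 'is', 'learning', 'nlp']
print(bow[0])      # [0, 1, 0, 1, 0, 1]
```

Notice that the word "is" appears in every document, so its TF-IDF weight is zero, while Bag of Words still counts it. The resulting vectors are what would be passed on to the modeling step.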
If you liked this story and want to show your appreciation, you can clap for it. You can also share it with your friends and LinkedIn network, and connect with me on:
LinkedIn : https://www.linkedin.com/in/priyansh-neema-3899a0175/
GitHub: https://github.com/Priyansh-jsk
Happy Learning!