Let's Dive In

Text classification is a problem in computer science in which the task is to assign a text document to one or more classes or categories. This can be done manually or automatically. In the 21st century, web pages, emails, science journals, e-books, learning content, news and social media are all full of textual data, and the goal is to create, analyze and report information quickly. This is where automated text classification comes in: machine learning can automate these tasks, making the whole process much faster and more efficient.

Supervised Text Classification

This classification technique relies on defining features and labels for each text document, and it works on a train-and-test principle. We feed labeled data to the machine learning algorithm for training. During the testing phase, the algorithm is given unseen data and classifies it into categories based on what it learned during training. It essentially tries to mimic human learning.

Examples --

  • Email Spam Filtering
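To make the train-then-test idea concrete, here is a minimal sketch of a spam filter using a multinomial Naive Bayes classifier written in plain Python. Naive Bayes is just one common choice picked for illustration; the function names (`tokenize`, `train`, `classify`) and the toy messages are assumptions invented for this example, not part of any particular library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def train(labeled_docs):
    """Training phase: count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Testing phase: return the label with the highest log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero out the score.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy labeled data, invented for illustration only.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the team on monday", "ham"),
]
model = train(training_data)
print(classify("claim your free prize", model))
```

A real spam filter would need far more training data and a better tokenizer, but the shape stays the same: learn from labeled examples first, then predict labels for unseen text.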

Unsupervised Text Classification

This classification technique doesn't require labeled input for training. Instead, the algorithm tries to discover natural structure in the data by identifying similar patterns among the data points and grouping them into clusters. It is language-agnostic in the sense that it can operate on any textual data without explicit labels and still generate insights from it. It also follows the Train Once, Test Anywhere paradigm!

Examples --

  • Sentiment Analysis
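The clustering idea described above can be sketched with a small k-means implementation over bag-of-words vectors. This is only one possible clustering method, chosen here for illustration. The helper names (`vectorize`, `kmeans`, and so on) and the sample documents are hypothetical, and starting the centroids at the first `k` documents is an assumption made purely to keep the example deterministic.

```python
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def squared_distance(a, b):
    """Squared Euclidean distance between two sparse vectors (dicts)."""
    return sum((a.get(w, 0.0) - b.get(w, 0.0)) ** 2 for w in set(a) | set(b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {w: c / len(vectors) for w, c in total.items()}


def kmeans(vectors, k, iterations=10):
    """Plain k-means. Centroids start at the first k vectors, an assumption
    made here for determinism, not a recommended initialization."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(v, centroids[i]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = mean_vector(members)
    return labels


# Unlabeled toy documents, invented for illustration only.
docs = [
    "cat dog pet food",
    "stock market price trade",
    "dog pet vet cat",
    "market trade stock fund",
]
labels = kmeans([vectorize(d) for d in docs], k=2)
print(labels)
```

No labels are supplied anywhere: the two pet-related documents end up in one cluster and the two finance-related ones in the other purely because of the words they share.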

One Application?

Natural Language Processing

How about Implementation!

There are plenty of great resources out there to help you get started in the domain of text classification. Here's one as a token of appreciation for reading this article!