AI in the real world: 2. Sentiment Analysis using spaCy pipelines

Vishnu Nandakumar
3 min read · Nov 21, 2021

The way and tone in which people convey their thoughts strongly shape how the world evolves. If everyone understood and practised humility towards each other, there would be nothing less than happiness and peace. That brings us to how sentiment analysis of simple sentences can have a large impact in any domain. Sentiment analysis can be approached with two learning methods:

  • Unsupervised learning: No labelled data is needed here. We create an embedding for each group of words, then measure the semantic similarity between the embedding of a sentence and the embedding of each group to find the group the sentence most likely belongs to.
  • Supervised learning: The counterpart to the unsupervised method. Here we train a model on labelled data.
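
To make the unsupervised idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the seed words are invented for illustration only (they are assumptions, not real embeddings); in practice you would take the vectors from a pretrained model. A sentence is averaged into one vector and assigned to the group whose averaged seed vector is closest by cosine similarity.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would use pretrained embeddings instead.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "love": [0.8, 0.2, 0.1],
    "awful": [0.1, 0.9, 0.0],
    "hate": [0.0, 0.8, 0.2],
    "movie": [0.3, 0.3, 0.9],
}

# Hypothetical seed words that define each sentiment group.
GROUPS = {"positive": ["great", "love"], "negative": ["awful", "hate"]}


def mean_vector(words):
    """Average the vectors of the known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_group(sentence):
    """Return the name of the group whose seed embedding is most similar."""
    sentence_vec = mean_vector(sentence.lower().split())
    scores = {
        name: cosine(sentence_vec, mean_vector(seeds))
        for name, seeds in GROUPS.items()
    }
    return max(scores, key=scores.get)


print(closest_group("I love this movie"))
print(closest_group("what an awful movie"))
```

With these toy vectors the first sentence lands in the positive group and the second in the negative one. The quality of a real setup depends entirely on the embeddings and seed words you choose.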

spaCy pipelines:

If you haven’t heard of spaCy before, you should quickly check it out; it is one of the pioneers of the NLP world. spaCy provides robust, efficient pipelines and features compared with libraries such as TextBlob and NLTK. It can now also incorporate transformer models for training and inference. Unfortunately, spaCy doesn’t have a built-in sentiment analysis component, but fortunately it has a text classification component that can be custom trained and served as a model. All you need to do is pick a dataset, label it and convert it to spaCy’s format, and pick a base model configuration. I have trained and packaged a simple sentiment analysis pipeline that can be easily installed on your machine via pip.

Training:

Training a spaCy pipeline is now easier than ever: get the data into the required format, select a base model configuration and train your pipeline. For the sentiment analysis model, I used the IMDB review comments from a dataset in the UCI ML repository. To train your own pipeline, have a look at the notebook for this article, which includes a sample training snippet.
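
As a rough idea of what "the required format" can look like, here is a minimal sketch that is not taken from the notebook. It assumes spaCy v3, where training data is stored as a binary .spacy file built with DocBin and each Doc carries its labels in doc.cats. The label names "positive" and "negative" and the helper names to_cats and write_docbin are assumptions made for this illustration.

```python
def to_cats(label, labels=("positive", "negative")):
    """Build the category dict for one example: 1.0 for the gold label, 0.0 for the rest."""
    if label not in labels:
        raise ValueError(f"Unknown label: {label!r}")
    return {name: 1.0 if name == label else 0.0 for name in labels}


def write_docbin(examples, output_path, labels=("positive", "negative")):
    """Serialise (text, label) pairs to a .spacy file. Requires spaCy v3."""
    # Imported here so that to_cats can be used without spaCy installed.
    import spacy
    from spacy.tokens import DocBin

    nlp = spacy.blank("en")  # tokenizer only, no trained components needed
    doc_bin = DocBin()
    for text, label in examples:
        doc = nlp.make_doc(text)
        doc.cats = to_cats(label, labels)
        doc_bin.add(doc)
    doc_bin.to_disk(output_path)


# Example call (commented out because it needs spaCy installed):
# write_docbin([("Loved every minute of it", "positive")], "train.spacy")
```

With train.spacy and dev.spacy written this way, training in spaCy v3 is typically started with `python -m spacy train config.cfg --paths.train train.spacy --paths.dev dev.spacy --output ./output`, where config.cfg is the base configuration you picked.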

Packaging

After training and saving the model you created with spaCy pipelines, you can package it and serve it as an installable Python package. To package your saved model, use the following spaCy CLI command.

python -m spacy package input_dir output_dir
  • input_dir: Directory containing your trained pipeline
  • output_dir: Directory in which the distributable package, installable with pip, is created
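
If you prefer to trigger packaging from a Python script, for example at the end of a training job, the same command can be assembled and run with subprocess. This is only a sketch: the helper names build_package_command and package_pipeline are hypothetical, and it assumes spaCy v3, whose package command accepts the --name and --version options.

```python
import subprocess
import sys


def build_package_command(input_dir, output_dir, name=None, version=None):
    """Assemble the `spacy package` call as an argument list."""
    command = [sys.executable, "-m", "spacy", "package", input_dir, output_dir]
    if name:
        command += ["--name", name]
    if version:
        command += ["--version", version]
    return command


def package_pipeline(input_dir, output_dir, **options):
    """Run the command; raises CalledProcessError if spaCy exits with an error."""
    subprocess.run(build_package_command(input_dir, output_dir, **options), check=True)


# Show the command that would be run, without executing it.
print(build_package_command("input_dir", "output_dir", version="0.1.0")[1:])
```

Using sys.executable keeps the call inside the same Python environment in which spaCy is installed.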

For other parameters, check the spaCy documentation for the package command. After packaging, follow the usual steps for uploading the contents of the dist/ folder to PyPI if you wish to make it available to the community.

eng_spacysentiment

Now that we have seen how to train and package a spaCy pipeline, let us see one in action. The following is a simple Python library for sentiment analysis of English sentences. You can install the package with the following command:

pip install eng-spacysentiment

  • Simple implementation
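
A minimal usage sketch, assuming the installed package exposes the load() function that spaCy generates for packaged pipelines and that the scores come back in doc.cats, as they do for spaCy's text classifier. The helper names top_label and analyse are hypothetical, and the label names you get back depend on how the pipeline was trained.

```python
def top_label(cats):
    """Return the label with the highest score from a doc.cats-style dict."""
    return max(cats, key=cats.get)


def analyse(text, nlp=None):
    """Return (best_label, all_scores) for one sentence.

    If no pipeline is passed in, the packaged one is loaded, which requires
    `pip install eng-spacysentiment` to have been run first.
    """
    if nlp is None:
        import eng_spacysentiment

        nlp = eng_spacysentiment.load()
    doc = nlp(text)
    return top_label(doc.cats), dict(doc.cats)


# Example call (commented out because it needs the package installed):
# label, scores = analyse("The movie was a delight from start to finish")
# print(label, scores)
```

When scoring many sentences, load the pipeline once and pass it in through the nlp argument instead of letting analyse reload it on every call.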

Okay, that’s a wrap, buddies. We can now easily train a pipeline using spaCy, then package and deploy it as an installable Python package. As always, I am immensely grateful and glad for the support you have been showing me. Until next time, take care, fellas, and clap for the article if you like it. Follow me for more and I will do the same so we can all learn from each other.

Vishnu Nandakumar

Machine Learning Engineer, Cloud Computing (AWS), Arsenal Fan. Have a look at my page: https://bit.ly/m/vishnunandakumar