Machine Learning Feature Engineering
Feature engineering is the process of transforming raw data into inputs for a machine learning algorithm. To be used in machine learning algorithms, features have to be put into feature vectors, which are vectors of numbers representing the value for each feature. For sentiment analysis, textual data has to be put into word vectors, which are vectors of numbers representing the value for each word. Input text can be encoded into word vectors using counting techniques such as Bag of Words (BoW), bag-of-n-grams, or Term Frequency/Inverse Document Frequency (TF-IDF).
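For illustration, here is a minimal sketch of BoW, bag-of-n-grams, and TF-IDF encoding; scikit-learn (1.x) is assumed as the vectorization library, and the example documents are purely illustrative.

```python
# Minimal sketch of counting-based word vectors (scikit-learn assumed; documents are illustrative).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the movie was great", "the movie was terrible"]

bow = CountVectorizer()                       # Bag of Words: raw term counts
print(bow.fit_transform(docs).toarray())      # one count vector per document
print(bow.get_feature_names_out())            # the learned vocabulary

ngrams = CountVectorizer(ngram_range=(1, 2))  # bag-of-n-grams: unigrams and bigrams
print(ngrams.fit_transform(docs).toarray())

tfidf = TfidfVectorizer()                     # TF-IDF: counts reweighted by document rarity
print(tfidf.fit_transform(docs).toarray())
```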
Sentiment Classification Using Supervised Machine Learning
After the input text has been converted into word vectors, classification machine learning algorithms can be used to classify the sentiment. Classification is a family of supervised machine learning algorithms that identifies which category an item belongs to (such as whether a text is negative or positive) based on labeled data (such as text labeled as positive or negative).
Classification machine learning algorithms that can be used for sentiment analysis include the following (a short example applying them appears after this list):
- Naïve Bayes is a family of probabilistic algorithms that apply Bayes' theorem to determine the conditional probability of a class given the input data.
- Support vector machines find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points.
- Logistic regression uses a logistic function to model the probability of a certain class.
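The sketch below ties the two steps together: text is converted to TF-IDF word vectors and then passed to a classifier. scikit-learn is assumed, and the tiny labeled corpus is illustrative only.

```python
# Minimal sketch: TF-IDF word vectors feeding a supervised classifier (scikit-learn assumed).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny labeled dataset: 1 = positive, 0 = negative (illustrative only).
texts = ["I love this movie", "What a great product", "Terrible service", "I hate the ending"]
labels = [1, 1, 0, 0]

# Logistic regression is shown; MultinomialNB or LinearSVC could be swapped in the same way.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["great movie", "terrible product"]))  # predicted labels for new reviews
```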
Sentiment Analysis Using Deep Learning
Deep learning (DL) is a subset of machine learning (ML) that uses multi-layered artificial neural networks to deliver state-of-the-art accuracy in tasks such as NLP. DL word embedding techniques such as Word2Vec encode words in meaningful ways by learning word associations, semantics, and syntax. DL algorithms also enable end-to-end training of NLP models without the need to hand-engineer features from raw input data.
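As a minimal sketch of learning word embeddings, the example below trains a tiny Word2Vec model; the gensim library (4.x API) is an assumption, and the corpus and parameters are illustrative rather than realistic.

```python
# Minimal Word2Vec sketch (gensim >= 4.0 assumed); a real corpus would be far larger.
from gensim.models import Word2Vec

sentences = [
    ["the", "movie", "was", "great"],
    ["the", "film", "was", "fantastic"],
    ["the", "movie", "was", "terrible"],
]

model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)

print(model.wv["movie"][:5])           # first few dimensions of the learned embedding
print(model.wv.most_similar("movie"))  # words close to "movie" in the embedding space
```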
There are different variations of deep learning algorithms. Recurrent neural networks (RNNs) are the mathematical engines that parse language patterns and sequential data. They're the natural language processing brains that give ears and speech to Amazon's Alexa, and they're also used in language translation, stock prediction, and algorithmic trading.
Transformer deep learning models, such as BERT (Bidirectional Encoder Representations from Transformers), are an alternative to recurrent neural networks that apply an attention technique: parsing a sentence by focusing on the most relevant words that come before and after each word. BERT revolutionized progress in NLP by offering accuracy comparable to human baselines on benchmarks for intent recognition, sentiment analysis, and more. It's deeply bidirectional and can understand and retain context better than other text encoding mechanisms. A key challenge with training language models is the lack of labeled data; BERT is instead pretrained on unsupervised tasks, generally using unstructured text from a books corpus, English Wikipedia, and more.
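A pretrained BERT-style model can be applied to sentiment analysis with very little code. The sketch below uses the Hugging Face transformers library, which is an assumption here (the source names BERT but no specific toolkit); the default sentiment pipeline loads a distilled BERT variant fine-tuned for positive/negative classification.

```python
# Minimal sketch: sentiment analysis with a pretrained transformer (Hugging Face transformers assumed).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a distilled BERT model fine-tuned on sentiment
print(classifier(["I loved this movie.", "The plot made no sense."]))
# e.g. [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]
```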
GPUs: Accelerating NLP and Sentiment Analysis
A major driver of NLP's growth is the recent and ongoing advancements and breakthroughs in the field, not the least of which is the deployment of GPUs to crunch through increasingly massive and highly complex language models.
A GPU is composed of hundreds of cores that can handle thousands of threads in parallel. GPUs have become the platform of choice to train ML and DL models and perform inference because they can deliver 10X higher performance than CPU-only platforms.
State-of-the-art deep learning neural networks can have from millions to well over one billion parameters to adjust via back-propagation. They also require a large amount of training data to achieve high accuracy, meaning hundreds of thousands to millions of input samples have to be run through both a forward and a backward pass. Because neural nets are created from large numbers of identical neurons, they're highly parallel by nature. This parallelism maps naturally to GPUs, providing a significant computation speed-up over CPU-only training. GPUs have become the platform of choice for training large, complex neural network-based systems for this reason, and the parallel nature of inference operations also lends itself well to execution on GPUs. In addition, Transformer-based deep learning models, such as BERT, don't require sequential data to be processed in order, allowing for much more parallelization and reduced training time on GPUs than RNNs.
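As a minimal sketch of what a forward and backward pass on a GPU looks like in practice, the example below uses PyTorch (a framework choice assumed here, not named in this passage) with a toy model and random data standing in for a real network and dataset.

```python
# Minimal sketch of a forward and backward pass on the GPU (PyTorch assumed; model and data are illustrative).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny feed-forward network standing in for a much larger model.
model = nn.Sequential(nn.Linear(1000, 256), nn.ReLU(), nn.Linear(256, 2)).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# A batch of random feature vectors and labels; all samples in the batch are processed in parallel.
x = torch.randn(64, 1000, device=device)
y = torch.randint(0, 2, (64,), device=device)

loss = loss_fn(model(x), y)  # forward pass
loss.backward()              # backward pass (back-propagation)
optimizer.step()             # parameter update

print(f"device={device}, parameters={sum(p.numel() for p in model.parameters()):,}")
```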
NVIDIA GPU-Accelerated AI Libraries
With NVIDIA GPUs and CUDA-X AI™ libraries, massive, state-of-the-art language models can be rapidly trained and optimized to run inference in just a couple of milliseconds, or thousandths of a second. This is a major stride towards ending the trade-off between an AI model that's fast and one that's large and complex. The parallel processing capabilities and Tensor Core architecture of NVIDIA GPUs allow for higher throughput and scalability when working with complex language models, enabling record-setting performance for both the training and inference of BERT.
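To make the latency claim concrete, the sketch below times a single inference of a BERT-style sentiment model on a GPU. The tooling (PyTorch plus Hugging Face transformers) and the model name are assumptions for illustration; this is not the NVIDIA-optimized CUDA-X AI inference path itself.

```python
# Sketch of measuring transformer inference latency on a GPU (PyTorch + transformers assumed;
# illustrates the measurement only, not NVIDIA's optimized inference stack).
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"  # a small BERT-style sentiment model
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).to(device).eval()

inputs = tokenizer("The battery life is outstanding.", return_tensors="pt").to(device)
with torch.no_grad():
    model(**inputs)                      # warm-up run
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    logits = model(**inputs).logits      # timed inference
    if device == "cuda":
        torch.cuda.synchronize()

print(f"inference latency: {(time.perf_counter() - start) * 1000:.2f} ms")
print(logits.softmax(-1))                # class probabilities (negative, positive)
```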
NVIDIA GPU-Accelerated, End-to-End Data Science
The NVIDIA RAPIDS™ suite of software libraries, built on CUDA-X AI, gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
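As a minimal sketch of that Python interface, the example below uses cuDF, the RAPIDS DataFrame library, to run familiar pandas-style operations on the GPU; the DataFrame contents are illustrative, and a CUDA-capable GPU is assumed.

```python
# Minimal RAPIDS cuDF sketch: pandas-like operations executed on the GPU (illustrative data).
import cudf

reviews = cudf.DataFrame({
    "text": ["great phone", "battery died fast", "love the screen"],
    "stars": [5, 1, 4],
})

# Familiar DataFrame operations run entirely in GPU memory.
reviews["positive"] = reviews["stars"] >= 4
print(reviews.groupby("positive").size())
```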
NVIDIA GPU-Accelerated Deep Learning Frameworks
GPU-accelerated DL frameworks offer flexibility to design and train custom deep neural networks and provide interfaces to commonly used programming languages such as Python and C/C++. Widely used deep learning frameworks such as MXNet, PyTorch, TensorFlow, and others rely on NVIDIA GPU-accelerated libraries to deliver high-performance, multi-GPU accelerated training.
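For illustration, here is a minimal PyTorch sketch of spreading a model across the available GPUs; torch.nn.DataParallel is shown for brevity (DistributedDataParallel is the more common choice for serious multi-GPU training), and the model and data are placeholders.

```python
# Minimal multi-GPU sketch with PyTorch (DataParallel shown for brevity; model and data are illustrative).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 2))

if torch.cuda.is_available():
    model = model.to("cuda")
    if torch.cuda.device_count() > 1:
        # Replicates the model on each GPU and splits each batch across them.
        model = nn.DataParallel(model)

batch = torch.randn(256, 512, device="cuda" if torch.cuda.is_available() else "cpu")
print(model(batch).shape)  # torch.Size([256, 2])
```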