Those of you who are interested in machine learning will likely have heard of Google’s TensorFlow. While R is not officially supported, RStudio has developed a wrapper that makes TensorFlow usable from R. More information, and a few tutorials, are available on the website, but I’ll add to that list with some Natural Language Processing (NLP) examples, since they do not seem to be overly abundant online.
Installation
To install the tensorflow package, ensure you have a working copy of Python available. RStudio recommends Anaconda, but any distribution should work.
To install, first get the package from GitHub:
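A sketch of that step, assuming you have (or install) devtools:

```r
# install.packages("devtools")  # if you don't already have it
devtools::install_github("rstudio/tensorflow")
```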
Then install TensorFlow using the package:
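With the package loaded, `install_tensorflow()` downloads TensorFlow into its own Python environment:

```r
library(tensorflow)
install_tensorflow()
```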
Finally, test your installation:
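If everything worked, the canonical hello-world should run without error:

```r
library(tensorflow)
sess <- tf$Session()
hello <- tf$constant("Hello, TensorFlow!")
sess$run(hello)
```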
There is a GPU version of TensorFlow available if you have an NVIDIA graphics card. More information on this and other installation options is available on the tensorflow site.
Example
We’ll mirror the work done in this blog post from Towards Data Science by Rowel Atienza, as it’s a good, simple example. The post contains a number of graphics and a very good explanation of the model itself, but in short: we are using a Long Short-Term Memory (LSTM) model, a type of recurrent neural network (RNN), for next-word prediction.
The text used in the post and here is one of Aesop’s fables, cleaned and shown below:
Text Preprocessing
The first thing to do is convert the text to a form that the model can read. Models can only operate on numerical systems, so each word and punctuation mark needs to be swapped for a number, producing a list (vector) of numbers instead of words to represent the text. The list of words and their numbers will be referred to as a dictionary. We’ll sort it to make it easier to comprehend, like a regular paper dictionary.
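A minimal sketch of that step, assuming the cleaned fable is stored in a hypothetical string `fable_text`, with punctuation already split out as space-separated tokens:

```r
# Split the text into tokens (words and punctuation marks)
words <- unlist(strsplit(fable_text, " "))

# Sorted, de-duplicated word list: the "dictionary"
dictionary <- sort(unique(words))

# Replace each token with its (1-based) position in the dictionary
coded <- match(words, dictionary)
head(coded, 10)
```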
Now, we can see that what was originally ‘long ago , the mice had a general council to …’ is now coded as ‘49, 6, 1, 86, 57, 38, 4, 36, 29, 93’. While that’s less human readable, it’s easier for the model to understand.
Setting up the RNN
The first thing to do is set up the neural network function itself. Then, similar to the blog post, we’ll initialize some variables and parameters:
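A sketch of such a setup, mirroring the original post’s two-layer LSTM through TensorFlow’s 1.x API; the parameter values follow the post, and `dictionary` is assumed from the preprocessing step:

```r
library(tensorflow)

n_input  <- 3L    # number of words fed in at each step
n_hidden <- 512L  # LSTM units per layer
vocab_size <- length(dictionary)

RNN <- function(x, weights, biases) {
  x <- tf$reshape(x, shape(-1L, n_input))
  x <- tf$split(x, n_input, 1L)
  # two stacked LSTM layers, as in the original post
  rnn_cell <- tf$contrib$rnn$MultiRNNCell(list(
    tf$contrib$rnn$BasicLSTMCell(n_hidden),
    tf$contrib$rnn$BasicLSTMCell(n_hidden)))
  outputs_states <- tf$contrib$rnn$static_rnn(rnn_cell, x, dtype = tf$float32)
  outputs <- outputs_states[[1]]
  # predict the next word from the final LSTM output
  tf$matmul(outputs[[length(outputs)]], weights$out) + biases$out
}

x <- tf$placeholder(tf$float32, shape(NULL, n_input, 1L))
y <- tf$placeholder(tf$float32, shape(NULL, vocab_size))
weights <- list(out = tf$Variable(tf$random_normal(shape(n_hidden, vocab_size))))
biases  <- list(out = tf$Variable(tf$random_normal(shape(vocab_size))))
pred <- RNN(x, weights, biases)
```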
We need to know what the correct prediction is, so we prepare that and a measure of accuracy:
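Assuming `pred` (the model output) and the placeholder `y` from the setup step, the post’s loss, optimizer, and accuracy translate roughly as:

```r
learning_rate <- 0.001

# Loss: softmax cross-entropy against the one-hot target word
cost <- tf$reduce_mean(
  tf$nn$softmax_cross_entropy_with_logits(logits = pred, labels = y))
optimizer <- tf$train$RMSPropOptimizer(learning_rate = learning_rate)$minimize(cost)

# Accuracy: fraction of steps where the most likely word matches the target
correct_pred <- tf$equal(tf$argmax(pred, 1L), tf$argmax(y, 1L))
accuracy <- tf$reduce_mean(tf$cast(correct_pred, tf$float32))
```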
Train the Model
With all of our variables and parameters prepared, we can initialize TensorFlow, then start a Session.
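In the 1.x API that amounts to:

```r
init <- tf$global_variables_initializer()
session <- tf$Session()
session$run(init)
```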
Now to train our model:
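A sketch of the training loop, assuming the names `coded`, `n_input`, `vocab_size`, `x`, `y`, `optimizer`, and `session` from the earlier steps:

```r
training_iters <- 50000L

for (step in seq_len(training_iters)) {
  # pick a random window of n_input words; the word after it is the target
  offset <- sample(length(coded) - n_input, 1)
  input_words <- coded[offset:(offset + n_input - 1)]
  batch_x <- array(input_words, dim = c(1, n_input, 1))

  # one-hot encode the target word
  batch_y <- matrix(0, nrow = 1, ncol = vocab_size)
  batch_y[1, coded[offset + n_input]] <- 1

  session$run(optimizer, feed_dict = dict(x = batch_x, y = batch_y))
}
```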
Play with the model
What can we do with this? We can use the trained model to generate text for us. The generated text may seem similar to the supplied text, or even repeat it, because of the small amount of input data. However, with a larger input we could get some genuinely novel speech patterns out.
If we supply some starting words, the model will use them to generate a few sentences.
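A sketch of the generation loop, assuming `dictionary`, `n_input`, `pred`, and `session` from the earlier steps; the seed words here are hypothetical and must appear in the dictionary:

```r
seed_words <- c("the", "mice", "had")
sentence <- match(seed_words, dictionary)

for (i in 1:32) {
  # feed the last n_input words, take the most likely next word
  context <- tail(sentence, n_input)
  batch_x <- array(context, dim = c(1, n_input, 1))
  logits <- session$run(pred, feed_dict = dict(x = batch_x))
  sentence <- c(sentence, which.max(logits[1, ]))
}

paste(dictionary[sentence], collapse = " ")
```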
Obviously, the story is just recycled back to us; the model is overfit to the data. In the next post we’ll look at a larger corpus of text as a starting point.