Using a char-rnn on the R manual

Character recurrent neural networks have become very popular in recent years in the deep learning community for learning probability distributions over sequences of characters. More concretely, such a model estimates the probability of an “o” given the preceding sequence of letters “hell”. There are many ways to model sequences of characters, and it does not have to be done with a recurrent neural network or at the character level. For example, simple Markov models can perform reasonably well in this context (for a loose definition of “reasonably”), or a model could be trained on words instead. I won’t attempt an in-depth explanation here; instead I’ll link to two posts on the matter that explain it far more lucidly than I could.
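
If “probability distribution over sequences of characters” sounds abstract, here is a toy version of the P(“o” | “hell”) idea using plain character 4-gram counts, no neural net required (the corpus string is made up for illustration):

    from collections import Counter, defaultdict

    # Count which character follows each 4-character context in the corpus.
    corpus = "hello hello hell is hellish"
    counts = defaultdict(Counter)
    for i in range(len(corpus) - 4):
        counts[corpus[i:i+4]][corpus[i+4]] += 1

    def p_next(context, char):
        # Estimated probability of `char` given the 4 preceding characters.
        seen = counts[context]
        return seen[char] / sum(seen.values()) if seen else 0.0

    print(p_next("hell", "o"))  # 0.5 on this toy corpus

A char-rnn is, loosely, a much more flexible version of the same idea: instead of a fixed-length lookup table, the network conditions on the whole preceding sequence.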

Anyway, as far as I’m aware, the canonical blog post on the subject comes from Andrej Karpathy here. Karpathy open sourced a version of the network written in Torch. The model was later rewritten in TensorFlow in this implementation here. This is the code I used for this project, for no particular reason other than curiosity.

As far as I can tell, LSTMs are generally more popular these days than vanilla recurrent neural networks. An excellent post on the subject is here.

These models can produce some entertaining results. A prime example is the @deepdrumpf Twitter account, which tweets samples from a trained recurrent neural network.

In a more nerdy direction, I trained a char-rnn on that sexy corpus of text known as the R manuals. I also added a small sprinkle of text from Hadley Wickham’s Advanced R. The hope was that my neural net would produce some fabulous new pieces of R wizardry.

I trained my model in the official TensorFlow Docker container. I did this to avoid all the headache of setting up a proper Python / Jupyter / TensorFlow environment. The model had a single hidden layer with 350 hidden units and some standard amount of dropout that I don’t remember. Training took several hours since it ran on my laptop’s CPU.
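
I used the linked TensorFlow implementation as-is, so the following is not the code I ran, just a sketch in Keras of a model with the same shape: a single LSTM layer with 350 units plus dropout. The 0.2 dropout rate is a stand-in for whatever I actually used, and the vocabulary size is made up:

    from tensorflow import keras

    vocab_size = 100  # number of distinct characters in the corpus (made up here)
    model = keras.Sequential([
        keras.layers.Embedding(vocab_size, 64),        # map characters to vectors
        keras.layers.LSTM(350, dropout=0.2),           # 350 hidden units, as above
        keras.layers.Dense(vocab_size, activation="softmax"),  # next-char distribution
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

Training then amounts to feeding in character windows from the corpus as inputs and the following character as the target.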

The incredible results

    Next: build update that you should use file if the representation on the following table: these can save a little might be imported from by information and opened by complex types are handled used to create them elseugh R installed in others. R malloc(), but is checked siges to implement frequently.

The spelling is decent - remember that the model has no concept of grammar and has to learn everything from the corpus.

This is mostly gibberish. =( Let’s try another!

A little better this time. The model learned some stuff about header files and Fortran being part of what drives R. It also learned about the write.table function and R CMD check.

Let’s do one last sample:

Well, I primed this sample with “function”, but it looks like the model spit out some stuff about compiling and C++. There’s also another mention of Fortran here.
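
For anyone wondering what “priming” means mechanically, here is a rough sketch (not the repo’s actual sampling script): feed the prime text through the model, then repeatedly sample the next character from the predicted distribution and append it to the context. The char_to_ix / ix_to_char lookup tables are assumed to have been built from the corpus:

    import numpy as np

    def sample(model, char_to_ix, ix_to_char, prime="function", n=200):
        text = list(prime)
        for _ in range(n):
            x = np.array([[char_to_ix[c] for c in text]])      # context so far
            probs = model.predict(x, verbose=0)[0].astype("float64")
            probs /= probs.sum()                               # renormalize for np.random.choice
            text.append(ix_to_char[np.random.choice(len(probs), p=probs)])
        return "".join(text)

So the prime steers the hidden state before any sampling happens, which is why a prime like “function” nudges the output toward code-flavored text.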

Ending thoughts

The results were not as amusing as I had hoped - perhaps that’s merely a product of this being a ridiculous endeavor. Anyway, it was worthwhile getting a little more familiar with Docker and TensorFlow. Obvious limitations include training a small network on a small corpus. The source for this is here.
