The open-source h2o package from h2o.ai provides h2o.deeplearning() for deep learning in R.
Here's a write-up: http://www.r-bloggers.com/things-to-try-after-user-part-1-deep-learning-with-h2o/
And code: https://gist.github.com/woobe/3e728e02f6cc03ab86d8#file-link_data-r
######## *Convert Breast Cancer data into H2O*
library(mlbench)              # provides the BreastCancer data set
data(BreastCancer)
dat <- BreastCancer[, -1]     # remove the ID column
dat_h2o <- as.h2o(localH2O, dat, key = 'dat')
######## *Import MNIST CSV as H2O*
dat_h2o <- h2o.importFile(localH2O, path = ".../mnist_train.csv")
######## *Using the DNN model for predictions*
h2o_yhat_test <- h2o.predict(model, test_h2o)
######## *Converting H2O format into data frame*
df_yhat_test <- as.data.frame(h2o_yhat_test)
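Once the predictions are back in an ordinary data frame, a quick accuracy check can be done in base R. A minimal sketch, assuming a vector test_labels (not created above) holds the true classes and that the predicted class sits in the predict column of the frame returned by h2o.predict:
######## *Quick accuracy check in base R (sketch)*
conf_mat <- table(predicted = df_yhat_test$predict, actual = test_labels)  # confusion matrix
sum(diag(conf_mat)) / sum(conf_mat)                                        # overall accuracy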
######## Start a local cluster with 2GB RAM
library(h2o)
localH2O = h2o.init(ip = "localhost", port = 54321, startH2O = TRUE,
                    Xmx = '2g')
######## Execute deep learning
model <- h2o.deeplearning(x = 2:785,               # column numbers for predictors
                          y = 1,                   # column number for label
                          data = train_h2o,        # data in H2O format
                          activation = "TanhWithDropout",  # or 'Tanh'
                          input_dropout_ratio = 0.2,       # % of inputs dropout
                          hidden_dropout_ratios = c(0.5, 0.5, 0.5),  # % for nodes dropout
                          balance_classes = TRUE,
                          hidden = c(50, 50, 50),  # three layers of 50 nodes
                          epochs = 100)            # max. no. of epochs
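To connect this to the prediction snippet above, test_h2o would be imported the same way as the training file before scoring. A small sketch; the test-file name is an assumption and the directory is again left as a placeholder:
######## *Import the test set and score the trained model (sketch)*
test_h2o <- h2o.importFile(localH2O, path = ".../mnist_test.csv")  # hypothetical test CSV
h2o_yhat_test <- h2o.predict(model, test_h2o)                      # predictions in H2O format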
In my opinion, it's both. The idea is referenced many times in the highly cited paper on convolutional neural networks, Gradient-Based Learning Applied to Document Recognition, by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner.
The idea is that it is quite hard to hand-design a rich and complex feature hierarchy. For low-level features, we see that conv-nets learn edges or color blobs. This makes intuitive sense, and from early computer vision work we already have good hand-crafted edge detectors. But composing these low-level features into richer, more complex features is not a simple task to do by hand. Now imagine trying to design a 10-level feature hierarchy.
Instead, you can tie the representation-learning and classification tasks together, as is done in deep networks, and let the data drive the feature-learning mechanism.
Deep architectures are designed to learn a hierarchy of features from the data, as opposed to ad-hoc, hand-crafted features designed by humans. Most importantly, the features are learned with the explicit objective of achieving low error on the loss function that measures the deep net's performance. A priori, given some hand-crafted features, one does not know how good they are for the task at hand. With end-to-end learning, the desired high performance on the task drives the quality of the learned features, and the two become inextricably linked.
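One way to write this down (my notation, not taken from the paper): if f with parameters \theta_f is the learned feature extractor and g with parameters \theta_c is the classifier on top of it, both are trained jointly on the same loss L:
$$\min_{\theta_f,\,\theta_c}\;\frac{1}{N}\sum_{i=1}^{N} L\big(g_{\theta_c}(f_{\theta_f}(x_i)),\,y_i\big)$$
Gradients of the task loss flow all the way back into f, which is what it means for the data, rather than a human designer, to drive the feature learning.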
This end-to-end training/classification pipeline has been one of the big ideas in the design of computer vision architectures.
Best Answer
With static graphs, you first define the graph completely and then inject data to run it (define-and-run). With dynamic graphs, the graph structure is defined on the fly by the actual forward computation, which is a far more natural style of programming (define-by-run).
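A framework-free way to see the difference in plain R (toy variable names of my own; real libraries build much richer graphs, but the control flow is the point):
######## *Define-and-run vs. define-by-run, in miniature*
graph <- quote(w * x + b)                 # define-and-run: a symbolic expression, nothing computed yet
eval(graph, list(w = 2, x = 3, b = 1))    # inject data and execute the stored "graph"

forward <- function(w, x, b) {            # define-by-run: the computation is just ordinary code,
  h <- w * x + b                          # so the effective graph can depend on the data itself,
  if (h > 0) h else 0                     # e.g. via a branch like this one
}
forward(2, 3, 1)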
When you want to distribute the computation over multiple machines, it is easier to do so with static graphs; achieving the same with a dynamic computation graph would be far more problematic.
Some other comparisons:
Having a static graph enables a lot of convenient operations: storing a fixed graph data structure, shipping models that are independent of code, and performing graph transformations. On the other hand, static graphs are a bit more awkward than dynamic ones for some models (for example, recursive neural networks). When you're working with new architectures, you want the most flexibility possible, and dynamic-graph frameworks allow for that.
This answer is gathered from the reference below; I hope my conclusions are correct and helpful.
Reference: https://news.ycombinator.com/item?id=13428098