I found a couple of excellent videos on YouTube and would like to give them a thumbs up here and share them with my blog followers.
After watching the two videos below, you will get a much clearer picture of the maths and the meaning behind the jargon of overly hyped buzzwords like "deep learning". They cover neural networks, recurrent neural networks and LSTMs. The explanation is simple and easy to digest, as the presenter progresses from a weak model to a strong model with >99% accuracy, unravelling "smart" modifications along the way to combat the weaknesses he observes.
In fact, it is still an iterative optimisation workflow; however, some of the "smart" modifications are…
- converting raw scores into a probabilistic output, and using alternative activation functions, e.g. ReLU, besides the sigmoid function
- "dropout" as a solution to counter the high degree of freedom arising from convolutional layers, where simple averaging misses important relationships between the components being summed and averaged. Dropout is best applied at the layers with the highest degree of freedom; it effectively prevents the test error from diverging from the training error
- where sequencing matters, e.g. time series or sentences of words, a recurrent neural network layer with LSTM, which feeds the previous output back in as the current input, allows making an educated guess of what comes next on the basis of what it has read so far, after xx.. epochs of training
- LSTM allows information to bypass/ignore the activation function, i.e. to persist along the chain like a series circuit
- he also briefly highlights the difficulty that multiple local minima complicate finding the unique global minimum; the trick of decaying the learning-rate hyperparameter so that accuracy gets an initial boost and then stabilises over many iterations without rolling back; and the divergence of test error from training error, as well as the vanishing of the gradient that gradient descent needs to locate a minimum…
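To make the first bullet concrete, here is a minimal NumPy sketch of my own (not code from the videos) contrasting the sigmoid and ReLU activations, plus a softmax as one common way to turn raw scores into a probabilistic output:

```python
import numpy as np

def sigmoid(x):
    # squashes any input into (0, 1); gradients shrink towards 0 for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # passes positive inputs through unchanged, zeroes the rest;
    # the gradient is 1 on the positive side, which helps training
    return np.maximum(0.0, x)

def softmax(scores):
    # turns raw scores into a probability distribution that sums to 1
    e = np.exp(scores - np.max(scores))  # shift for numerical stability
    return e / e.sum()

x = np.array([-2.0, 0.0, 3.0])
print(sigmoid(x))
print(relu(x))     # [0. 0. 3.]
print(softmax(x))  # sums to 1
```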
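The "bypass" idea in the LSTM bullets can be seen in a single forward step of a standard LSTM cell. This is a textbook-style NumPy sketch of my own, with hypothetical sizes, not the network from the videos: the cell state c is updated additively through gates rather than being squashed by an activation at every step, which is what lets information persist across a long sequence.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # one LSTM time step; W holds the weights of all four gates stacked
    z = W @ np.concatenate([x, h_prev]) + b
    n = h_prev.size
    f = sigmoid(z[:n])         # forget gate: how much old cell state to keep
    i = sigmoid(z[n:2*n])      # input gate: how much new content to write
    g = np.tanh(z[2*n:3*n])    # candidate values to write
    o = sigmoid(z[3*n:])       # output gate: how much cell state to expose
    c = f * c_prev + i * g     # additive update: the "series circuit" bypass
    h = o * np.tanh(c)         # hidden state passed to the next step/layer
    return h, c

# hypothetical tiny example: 3-dim inputs, 4-dim hidden state
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):  # read a length-5 sequence
    h, c = lstm_step(x, h, c, W, b)
```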
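Finally, the learning-rate decay trick from the last bullet can be sketched on a toy problem (my own illustration with a hypothetical 1/t decay schedule, not the presenter's setup): large early steps give the fast initial improvement, and shrinking later steps let the estimate stabilise near the minimum instead of oscillating.

```python
# minimise f(w) = (w - 3)^2 with gradient descent and a decaying learning rate
w = 10.0
for t in range(200):
    grad = 2.0 * (w - 3.0)        # df/dw
    lr = 0.4 / (1.0 + 0.1 * t)    # hypothetical 1/t decay schedule
    w -= lr * grad
print(w)  # ≈ 3.0, the minimum
```

On a bumpy loss surface with many local minima there is no such guarantee of finding the global minimum, which is exactly the difficulty the presenter highlights.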
If you are interested in more specifics, read a well-summarised KDnuggets post (with more links) by Matthew Mayo here.