Data Science Interview Challenge
Welcome to today's data science interview challenge! Here it goes:
One of the characteristics about a RNN language model is that it applies the same weight matrix repeatedly. So what is the derivative of our loss function, let's say on step T, what is the derivative of that loss with respect to the repeated weight matrix of the hidden state Wh? And how d…
Keep reading with a 7-day free trial
Subscribe to The MLnotes Newsletter to keep reading this post and get 7 days of free access to the full post archives.

