Back propagation: the most intuitive lesson, yet the hardest to grasp in mathematical terms. We will first
explore the intuition behind it, then look into the mathematics.
And yes, the math is optional, but we would like to encourage you to look at it to gain a better understanding.
You shouldn't skip this lesson, though, as it's the fun part of back propagation.
First, I'd like to recap what we know so far. We've seen and understood the logic of how layers are stacked.
We’ve also explored a few activation functions and spent extra time showing they are central to the
concept of stacking layers.
Moreover, by now we have said 100 times that the training process consists of updating parameters through
gradient descent to optimize the objective function.
In supervised learning, the process of optimization consisted of minimizing the loss.
Our updates were directly related to the partial derivatives of the loss and indirectly related to the
errors, or deltas, as we called them.
Let me remind you that the deltas were the differences between the targets and the outputs.
All right. As we will see later, deltas for the hidden layers are trickier to define.
Still, they have a similar meaning.
The procedure for calculating them is called back propagation of errors. Having these deltas allows us
to vary parameters using the familiar update rule.
Let's start from the other side of the coin: forward propagation. Forward propagation is the process of
pushing inputs through the net.
At the end of each epoch, the obtained outputs are compared to the targets to form the errors.
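To make forward propagation concrete, here is a minimal sketch in NumPy. The layer sizes, the tanh activation, the random data, and the names `forward` and `params` are all assumptions chosen for illustration; they are not taken from the lesson.

```python
import numpy as np

def forward(x, params):
    """Push a batch of inputs through a small net with one hidden layer."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(x @ W1 + b1)   # hidden layer with a tanh nonlinearity (assumed)
    outputs = hidden @ W2 + b2      # linear output layer
    return hidden, outputs

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))          # 5 observations, 3 input features (made up)
targets = rng.normal(size=(5, 1))    # made-up targets
params = (rng.normal(size=(3, 4)), np.zeros(4),
          rng.normal(size=(4, 1)), np.zeros(1))

hidden, outputs = forward(x, params)

# Compare outputs to targets to form the errors.
# The sign convention (outputs minus targets) is an assumption.
errors = outputs - targets
print(outputs.shape, errors.shape)
```

Nothing is learned here yet. The sketch only shows the forward direction: inputs go in, outputs come out, and the errors are formed at the end.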
Then we back propagate through partial derivatives and change each parameter so errors at the next epoch are minimized.
For the minimal example, the back propagation consisted of a single step: aligning the weights given the
errors we obtained.
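As a reminder of what that single step looks like, here is a hedged sketch of a linear model trained with gradient descent. The data-generating rule, the learning rate `eta`, the mean-squared loss, and the sign convention for the deltas are assumptions for illustration, not the exact setup of the minimal example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-10, 10, size=(100, 2))
targets = x @ np.array([[2.0], [-3.0]]) + 5.0   # hypothetical rule to be learned

w = rng.uniform(-0.1, 0.1, size=(2, 1))          # small random initial weights
b = np.zeros(1)
eta = 0.02                                       # learning rate (assumed)

def loss(w, b):
    """Half of the mean squared error over all observations."""
    return np.mean((x @ w + b - targets) ** 2) / 2

losses = [loss(w, b)]
for epoch in range(100):
    outputs = x @ w + b                   # forward propagation
    deltas = outputs - targets            # errors at the output (sign is an assumption)
    # With no hidden layers, the update needs only the deltas and the inputs.
    w = w - eta * x.T @ deltas / len(x)
    b = b - eta * deltas.mean(axis=0)
    losses.append(loss(w, b))

print(losses[0], losses[-1])
print(w.ravel(), b)
```

Because there is no hidden layer, the deltas at the output are all we need. The weights can be updated directly from them in one step per epoch.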
Here's where it gets a little tricky. When we have a deep net,
we must update all the weights related to the input layer and the hidden layers.
For example, in this famous picture we have 270 weights, and yes, this means we had to manually draw all
270 arrows you see here.
So updating all 270 weights is a big deal.
We also introduced activation functions.
This means we have to update the weights accordingly,
considering the nonlinearities used and their derivatives.
Finally, to update the weights, we must compare the outputs to the targets.
This is done for each layer, but we have no targets for the hidden units.
We don't know the errors. So how do we update the weights?
That’s what back propagation is all about.
We must arrive at the appropriate updates as if we had targets.
Now, the way academics solve this issue is through errors.
The main point is that we can trace the contribution of each unit, hidden or not, to the error of the output.
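Here is a minimal sketch of that tracing idea for a net with one hidden layer. The sizes, the tanh activation, the squared-error loss, and the omission of biases are assumptions made to keep the example short. The hidden deltas are obtained by sending the output deltas back through the output weights and scaling by the derivative of the activation, and the result is compared against a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(6, 3))              # made-up inputs
targets = rng.normal(size=(6, 2))        # made-up targets
W1 = rng.normal(size=(3, 4)) * 0.5       # input -> hidden weights
W2 = rng.normal(size=(4, 2)) * 0.5       # hidden -> output weights

def loss(W1, W2):
    """Half of the summed squared error of the net's outputs."""
    hidden = np.tanh(x @ W1)
    outputs = hidden @ W2
    return 0.5 * np.sum((outputs - targets) ** 2)

# Forward propagation
hidden = np.tanh(x @ W1)
outputs = hidden @ W2

# Back propagation of errors
delta_out = outputs - targets                            # we have targets here
delta_hidden = (delta_out @ W2.T) * (1 - hidden ** 2)    # traced back through W2, times tanh'
grad_W2 = hidden.T @ delta_out
grad_W1 = x.T @ delta_hidden

# Finite-difference check on a single weight of the first layer
eps = 1e-6
W1_shift = W1.copy()
W1_shift[0, 0] += eps
numeric = (loss(W1_shift, W2) - loss(W1, W2)) / eps
print(grad_W1[0, 0], numeric)
```

The two printed numbers should agree closely. That suggests the hidden deltas act as if we had targets for the hidden units, even though we never defined any.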
OK. In the next lesson, we will provide an illustration of the back propagation concept.
Thanks for watching.