English text of the lesson
Let’s continue exploring this table, which contains mostly Greek letters. We said the softmax function
has no definite graph. Why so? While this function is different, if we take a careful look at its formula,
we see that the key difference between this function and the others is that it takes as its argument the whole
vector a instead of individual elements.
So the softmax function is equal to the exponential of the element at position i,
divided by the sum of the exponentials of all elements of the vector.
So while the other activation functions get an input value and transform it regardless of the other
elements, the softmax considers the information about the whole set of numbers we have.
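The definition just described can be sketched in a few lines of plain Python. This is a minimal illustration rather than a library implementation; the function name `softmax` and the sample inputs are assumptions made for the example.

```python
import math

def softmax(a):
    """Return the softmax of a list of numbers (illustrative helper)."""
    exps = [math.exp(x) for x in a]   # exponential of every element
    total = sum(exps)                 # one shared denominator for the whole vector
    return [e / total for e in exps]  # element i: exp(a_i) / sum of all exponentials

# Two equal inputs (hypothetical values) receive equal shares.
print(softmax([1.0, 1.0]))
```

Note that the denominator is computed once from the whole vector, which is exactly why no single output can be obtained from its own input alone.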
Time for an example.
Let a be equal to xw + b, which is our well-known model.
Then we can say the output y will be the softmax of a.
Let’s look at a hidden layer with three units.
So a here is equal to hw + b. After transforming it through a linear combination, we obtain a vector
with three elements: -0.21, 0.47, and 1.72.
Now, if we used a different activation, such as the sigmoid, we would simply apply the formula to each
of the three numbers, and we would obtain a new vector containing three new numbers.
But softmax is special.
Each element in the output depends on the entire set of elements of the input.
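One way to see this difference is to change a single input and watch which outputs move. The sketch below uses the three values from the example (-0.21, 0.47, 1.72); the changed value 3.0 and the helper names are assumptions chosen only for illustration.

```python
import math

def sigmoid(x):
    # Element-wise activation: depends only on its own input.
    return 1.0 / (1.0 + math.exp(-x))

def softmax(a):
    # Vector-wise activation: every output shares one denominator.
    exps = [math.exp(x) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

a = [-0.21, 0.47, 1.72]
a_changed = [-0.21, 0.47, 3.0]  # only the last element differs (hypothetical value)

sig_before = [sigmoid(x) for x in a]
sig_after = [sigmoid(x) for x in a_changed]
soft_before = softmax(a)
soft_after = softmax(a_changed)

# Sigmoid: the first two outputs are untouched by the change.
print(sig_before[:2] == sig_after[:2])
# Softmax: the first two outputs shrink, because the denominator grew.
print(soft_before[:2], soft_after[:2])
```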
Let’s find the softmax of a.
First, I’ll calculate the denominator.
It is given by e to the power of -0.21, plus e to the power of 0.47,
plus e to the power of 1.72.
That’s approximately 8.
Then we must divide each exponential by this denominator to get the new vector.
The result is 0.1, 0.2, and 0.7.
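The arithmetic of this example can be checked with a short script. It is only a sketch of the calculation described here, using the values -0.21, 0.47, and 1.72; the variable names are assumptions.

```python
import math

a = [-0.21, 0.47, 1.72]

exps = [math.exp(x) for x in a]      # numerators: e^-0.21, e^0.47, e^1.72
denominator = sum(exps)              # close to 8
y = [e / denominator for e in exps]  # softmax output

print(round(denominator, 2))
print([round(p, 1) for p in y])      # [0.1, 0.2, 0.7]
```

The unrounded outputs are roughly 0.101, 0.200, and 0.698, so the figures quoted in the lesson are one-decimal roundings.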
This is our output layer. OK.
A key aspect of the softmax transformation is that the values it outputs are in the range from 0 to 1.
Their sum is exactly 1. What else has such a property?
Probabilities. Yes, probabilities indeed.
The point of the softmax transformation is to transform a bunch of arbitrarily large or small numbers
that come out of previous layers and fit them into a valid probability distribution.
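Both properties can be checked on inputs of very different sizes. The scores below are hypothetical, picked only to mix large negative, small, and large positive numbers; the check itself is a sketch, not a proof.

```python
import math

def softmax(a):
    exps = [math.exp(x) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

scores = [-12.0, 0.5, 3.0, 30.0]  # hypothetical outputs of a previous layer
probs = softmax(scores)

in_range = all(0.0 <= p <= 1.0 for p in probs)  # every value lies between 0 and 1
sums_to_one = math.isclose(sum(probs), 1.0)     # values add up to 1 (up to rounding)

print(in_range, sums_to_one)
```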
This is extremely important and useful.
Remember our example with cats, dogs, and horses we saw earlier?
One photo was described by a vector containing 0.1, 0.2, and 0.7.
We promised we would tell you how to do that.
Well, that’s how: through a softmax transformation. We kept our promise. Now that we know we are talking
about probabilities, we can comfortably say we are 70 percent certain the image is a picture of a horse.
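Reading a prediction off such a vector amounts to picking the class with the highest probability. The sketch below assumes the vector is ordered as cat, dog, horse, which matches the 70 percent for the horse mentioned here; the label list and variable names are illustrative.

```python
labels = ["cat", "dog", "horse"]  # assumed ordering of the classes
probs = [0.1, 0.2, 0.7]           # softmax output for one photo

# Index of the largest probability.
best = max(range(len(probs)), key=lambda i: probs[i])

print(f"{labels[best]}: {probs[best]:.0%}")  # horse: 70%
```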
This makes everything so intuitive and useful that the softmax activation is often used as the activation
of the final output layer in classification problems.
So no matter what happens before, the final output of the algorithm is a probability distribution.
All right, this will do for now.
Thanks for watching.