Activation functions

English transcript of the lesson

Welcome back.

We’ve talked about non-linearities for some time now without going into detail.

And that’s a topic we would like to address in this lesson in a machine learning context.

Non-linearities are also called activation functions.

From now on, that’s how we will refer to them.

Activation functions transform inputs into outputs of a different kind.

Think about the temperature outside.

I assume you wake up and the sun is shining.

So you put on some light clothes, go out, and feel warm and comfortable.

You carry your jacket in your hands.

In the afternoon, the temperature starts decreasing.

Initially, you don’t feel a difference.

At some point, though, your brain says it’s getting cold.

You listen to your brain and put on your jacket.

The input you got was the change in the temperature.

The activation function transformed this input into an action: put on the jacket or continue carrying it.

This is also the output after the transformation.

It is a binary variable: jacket or no jacket.
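
The jacket example can be sketched as a simple threshold (step) function. This is only an illustration: the function name `jacket_decision` and the 15-degree threshold are hypothetical values made up for this sketch, not something stated in the lesson.

```python
def jacket_decision(temperature_c, threshold_c=15.0):
    """Map a continuous input (temperature) to a binary output.

    Returns 1 for "put on the jacket" and 0 for "keep carrying it".
    The 15-degree threshold is a hypothetical value chosen for illustration.
    """
    return 1 if temperature_c < threshold_c else 0


# The temperature falls steadily (a linear trend) through the afternoon...
temperatures = [24.0, 21.0, 18.0, 15.0, 12.0, 9.0]

# ...but the decision is not linear: it stays at 0, then flips to 1.
decisions = [jacket_decision(t) for t in temperatures]
print(decisions)  # [0, 0, 0, 0, 1, 1]
```

A hard step like this is only an analogy for the idea of turning one kind of quantity into another; the activation functions covered in the rest of the lesson are continuous rather than jumping abruptly.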

That’s the basic logic behind non-linearities.

The change in the temperature was following a linear model, as it was steadily decreasing.

The activation function transformed this relationship into an output that is linked to the temperature but is of a different kind.

OK. In machine learning, we have different activation functions, but a few are used much more frequently than others.

Here is a table with four of the most commonly used activation functions.

Let’s go through one row to see how the table is organized.

First, we have the sigmoid, also known as the logistic function.

Here is its formula.

Next to it, we have its derivative.

As you may recall, the derivative is an essential part of gradient descent.

Naturally, when we work with TensorFlow, we won’t need to calculate the derivative, as TensorFlow does that automatically.

Anyhow, the purpose of this lesson is understanding these functions, their graphs, and their ranges in a way that would allow us to acquire intuition about how they behave.

Here’s the function’s graph.

And finally, we have its range.

Once we have applied the sigmoid as an activator, all the outputs will be contained in the range from 0 to 1.

So the output is somewhat standardized.
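
As a rough sketch of the sigmoid row, the plain-Python code below assumes the standard definitions sigmoid(x) = 1 / (1 + e^(-x)) and derivative sigmoid(x) * (1 - sigmoid(x)). The lesson's table is not reproduced in this transcript, so these formulas and the function names are supplied here as assumptions. It shows that the outputs stay between 0 and 1.

```python
import math


def sigmoid(x):
    """Logistic function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, written in terms of the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


inputs = [-10.0, -1.0, 0.0, 1.0, 10.0]
outputs = [sigmoid(x) for x in inputs]

for x, y in zip(inputs, outputs):
    print(f"sigmoid({x:+.1f}) = {y:.4f}")

# Every output here lies strictly between 0 and 1.
print(all(0.0 < y < 1.0 for y in outputs))  # True
print(sigmoid(0.0), sigmoid_derivative(0.0))  # 0.5 0.25
```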

All right, here are the other three common activators: the tanh, also known as the hyperbolic tangent; the ReLU, aka the rectified linear unit; and the softmax activator.

You can see their formulas, derivatives, graphs, and ranges.
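
A minimal sketch of two of these activators, assuming the standard definitions: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)), with outputs between -1 and 1, and relu(x) = max(0, x), with outputs from 0 upward. The table is not reproduced in this transcript, so the formulas and function names here are assumptions rather than the lesson's own notation.

```python
import math


def tanh(x):
    """Hyperbolic tangent; outputs lie between -1 and 1."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


inputs = [-3.0, -0.5, 0.0, 0.5, 3.0]

print([round(tanh(x), 4) for x in inputs])
print([relu(x) for x in inputs])  # [0.0, 0.0, 0.0, 0.5, 3.0]
```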

The softmax graph is not missing.

The reason we don’t have it here is that it is different every time.
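
One way to see why the softmax picture changes every time: unlike the other three activators, softmax is applied to a whole list of values at once, so the output for one value depends on all the others. The sketch below assumes the standard definition softmax(x)_i = e^(x_i) / sum_j e^(x_j); subtracting the maximum first is a common numerical-stability step added here, not something mentioned in the lesson.

```python
import math


def softmax(values):
    """Turn a list of scores into positive numbers that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# The same score (2.0) gets a different output depending on its neighbours.
a = softmax([2.0, 1.0, 0.0])
b = softmax([2.0, 4.0, 6.0])

print([round(p, 4) for p in a])
print([round(p, 4) for p in b])
print(a[0] > b[0])  # True
```

Because the curve for any single input shifts with the rest of the list, there is no one fixed graph to draw.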

Pause this video for a while and examine the table in more detail.

You can also find this table in the course notes.

So all these functions are activators, right?

What makes them similar?

Well, let’s look at their graphs: all are monotonic, continuous, and differentiable.

These are important properties needed for the optimization process.

As we are not there yet, we will leave this issue for later.
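
These properties can be checked numerically. The sketch below uses the sigmoid as the example and assumes its standard formula. A central finite difference approximates the slope and is compared with the analytic form sigmoid(x) * (1 - sigmoid(x)). The helper name `numerical_derivative` and the step size `h` are choices made for this illustration. One caveat worth keeping in mind: ReLU has a kink at zero, so its derivative is not defined at exactly that point.

```python
import math


def sigmoid(x):
    """Logistic function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def numerical_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


points = [-4.0, -2.0, 0.0, 2.0, 4.0]

# Monotonic: outputs never decrease as the input increases.
outputs = [sigmoid(x) for x in points]
print(all(a <= b for a, b in zip(outputs, outputs[1:])))  # True

# Differentiable: the numerical slope agrees with s(x) * (1 - s(x)).
for x in points:
    analytic = sigmoid(x) * (1.0 - sigmoid(x))
    approx = numerical_derivative(sigmoid, x)
    print(f"x={x:+.1f}  analytic={analytic:.6f}  numerical={approx:.6f}")
```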

Before we conclude I would like to make this remark.

Activation functions are also called transfer functions because of their transformation properties.

The two terms are used interchangeably in a machine learning context but have different meanings in other fields.

Therefore, to avoid confusion, we will stick to the term activation functions.

All right, let’s wrap it up here.

Thanks for watching.
