# Preprocessing introduction

### Lesson transcript

Hey, this is the last theoretical section of the course.

It is about the first activity you want to do when you start creating a machine learning algorithm: preprocessing.

Preprocessing refers to any manipulation we apply to the dataset before running it through the model.

Everything we have seen so far was conditioned on the fact that we had already preprocessed our data in a way suitable for training.

You've already seen some preprocessing: in the TensorFlow intro we created an .npz file, and all the training we did came from there.

So if you must work with data in an Excel file, a .csv, or whatever, saving it into an .npz file would be a type of preprocessing.

In this section, though, we will mainly focus on data transformations rather than reordering, as before.
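As a rough illustration of that kind of format conversion, here is a minimal sketch that reads comma-separated values with NumPy and stores them in an .npz archive. The column layout, the key names `inputs` and `targets`, and the file name `example_data.npz` are all assumptions made up for this example, not something taken from the course files.

```python
import io
import os
import tempfile

import numpy as np

# Hypothetical CSV content standing in for a file exported from a spreadsheet.
csv_text = """rate,volume,target
1.08,120000,1
1.10,95000,0
1.07,150000,1
"""

# Parse the numeric rows, skipping the header line.
raw = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

# Assumed layout: the last column is the target, the rest are inputs.
inputs = raw[:, :-1]
targets = raw[:, -1].astype(int)

# Save both arrays into a single .npz archive under illustrative key names.
npz_path = os.path.join(tempfile.mkdtemp(), "example_data.npz")
np.savez(npz_path, inputs=inputs, targets=targets)

# Load the archive back, the way a training script might.
with np.load(npz_path) as data:
    loaded_inputs = data["inputs"]
    loaded_targets = data["targets"]

print(loaded_inputs.shape, loaded_targets.shape)
```

With a real file on disk, the `io.StringIO` wrapper would simply be replaced by the path to the .csv file.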

What is the motivation for preprocessing?

There are several important points.

The first one is about compatibility with the libraries we use.

As we saw earlier, TensorFlow works with tensors and not Excel spreadsheets.

In data science, you will often be given data in whatever format, and you must make it compatible with the tools you use.

Second, we may need to adjust inputs of different magnitude.

Let's say we are Forex traders.

If one input we are working with is the end-of-day euro/dollar exchange rate, it would be a value around 1.

However, if another input is the daily trading volume, we would have values like 100,000 and higher.

Obviously, the orders of magnitude are quite different.

A linear combination of numbers based on such different scales is problematic in purely mathematical terms.

A value of 1 is negligible compared with a value of 100,000.

As all the inputs are on an equal footing in a vector or a matrix, the algorithm is likely to ignore all values around 1.

These values essentially represent the euro/dollar exchange rate itself, so they are often more important than the volume of trading.

Obviously, something needs to be done to solve this issue.
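To make the scale problem concrete, here is a small sketch using made-up numbers for the exchange rate and the trading volume. It applies standardization (subtract the column mean, divide by the column standard deviation), which is one common way to put inputs on a comparable scale. The values and the equal weights are assumptions chosen purely for illustration.

```python
import numpy as np

# Made-up data: column 0 is a EUR/USD rate (around 1),
# column 1 is a daily trading volume (around 100,000).
inputs = np.array([
    [1.08, 120000.0],
    [1.10,  95000.0],
    [1.07, 150000.0],
    [1.12, 110000.0],
])

# With equal (hypothetical) weights, each feature's contribution to a
# linear combination is dominated by the volume column.
weights = np.array([1.0, 1.0])
raw_contributions = inputs * weights
print("raw contributions of first sample:", raw_contributions[0])

# Standardize each column: zero mean and unit standard deviation.
mean = inputs.mean(axis=0)
std = inputs.std(axis=0)
scaled = (inputs - mean) / std

# After scaling, both columns vary over a similar range.
print("scaled first sample:", scaled[0])
print("column means:", scaled.mean(axis=0))
print("column stds:", scaled.std(axis=0))
```

In practice, the mean and standard deviation would usually be computed on the training data only and then reused for any new data passed to the model.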

A third reason is generalization.

Problems that seem different can often be solved by similar models.

Standardizing inputs of different problems allows us to reuse the exact same models.

Sometimes there are cases when we can even reuse already trained networks.

Imagine that you have trained a model previously and you face a new problem.

You test your model and it works like a charm.

That's not unusual in machine learning.

In the next few lessons, we will focus on these concepts and introduce several preprocessing techniques.

Thanks for watching.
