Machine Learning with Python

[Machine Learning] What is machine learning? What is ML?

by CodeCrafter 2024. 3. 6.

 

1. What is machine learning

1) Machine Learning : Finding regularities in massive datasets

 

2) Regularities : Expressed as knowledge forms (e.g., rules, decision trees)

- ML usually makes predictions from inductive knowledge, that is, knowledge generalized from observed examples.

- The procedure of ML : Data -> Finding regularities -> Representing them in one of several forms -> Prediction

 

3) Machine Learning (Compared to traditional programming)

- ML : Input data -> ML algorithm -> Knowledge forms (the rules are learned from data)

- Traditional programming : Rule-based (the rules are written by hand)
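
To make the contrast concrete, here is a minimal sketch in plain Python. The temperature data, the 80.0 threshold, and the function names are all assumptions invented for illustration; real ML algorithms are far more general than this one-threshold search.

```python
# Traditional programming: a person writes the rule by hand.
def rule_based_is_fault(temperature):
    # Hypothetical hand-picked threshold, for illustration only.
    return temperature > 80.0


# Machine learning: the rule (here, a single threshold) is found from data.
def learn_threshold(samples):
    """Return the threshold that misclassifies the fewest training samples.

    samples: list of (temperature, is_fault) pairs.
    """
    values = sorted(t for t, _ in samples)
    # Candidate thresholds: midpoints between neighbouring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        # Count samples where "temperature > threshold" disagrees with the label.
        return sum((t > threshold) != label for t, label in samples)

    return min(candidates, key=errors)


# Toy labeled data, invented for this sketch.
data = [
    (60.0, False), (65.0, False), (70.0, False),
    (75.0, True), (85.0, True), (90.0, True),
]
learned = learn_threshold(data)
print("hand-written rule says 75.0 is a fault:", rule_based_is_fault(75.0))
print("learned threshold:", learned)
```

The hand-written rule stays fixed whatever the data says, while the learned threshold moves when the training samples change.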

 

4) Applications of ML

- Web search, Computational biology, Finance, E-commerce, Space exploration, Robotics, Social networks, etc.

 

5) Machine learning is the study of algorithms that improve their performance P at some task T with experience E

- T : Tasks

- P : Performance

- E : Experience

a) Example 1 : Autonomous driving

* T : Driving on four-lane highways using vision sensors

* P : average distance traveled before an error

* E : Sequence of images and steering commands recorded while observing a human driver

b) Example 2 : Semiconductor manufacturing process (predicting normal or fault)

* T : predict process result (normal or fault) of a semiconductor manufacturing tool

* P : Prediction accuracy (proportion of wafers that are correctly classified)

* E : Sequences of process monitoring data (Sensor data)

 

c) Example 3 : Credit system (bank)

* T : predict profitable customers

* P : prediction accuracy

* E : sequences of credit records kept in a bank
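
The performance measure P used in Examples 2 and 3, prediction accuracy, can be sketched in a few lines of Python. The wafer labels below are made up for illustration, and `accuracy` is a hypothetical helper name, not part of any library.

```python
def accuracy(predicted, actual):
    """Proportion of samples whose predicted label equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one sample")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical wafer labels, invented for this sketch.
actual = ["normal", "normal", "fault", "normal", "fault"]
predicted = ["normal", "fault", "fault", "normal", "fault"]

score = accuracy(predicted, actual)
print("accuracy:", score)
```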

 

6) Notation of ML : Input, Output, Target function, training data, Test data

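- X : input samples, Y : output samples, f : target function that maps X to Y

- Training data : samples used to learn an estimate of f that generalizes beyond them

- Test data : held-out samples used to check how well that estimate predicts the output for new samples in the future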

 

* Input is also referred to as : predictor, independent variable, attribute, feature, explanatory variable, covariate, or regressor

* Output is also referred to as : response, dependent variable

* Training : fit F_hat on the training data so that y_hat = F_hat(X_training) is close to the known y

* Testing : y_hat = F_hat(X_test) -> compare y_hat with y -> if the difference is small, the performance is good -> the model generalizes well
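
A minimal sketch of this train-then-test procedure, assuming a straight-line model fitted by least squares and mean squared error as the measure of the difference between y_hat and y. The data values are invented for illustration, and the notes above do not prescribe any particular model or error measure.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns the estimated function f_hat."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def mean_squared_error(f_hat, xs, ys):
    """Average squared difference between predictions f_hat(x) and true outputs y."""
    return sum((f_hat(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data invented for this sketch (roughly y = 2x + 1 with small noise).
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.1, 2.9, 5.2, 6.8, 9.0]
x_test = [5.0, 6.0]
y_test = [11.1, 12.8]

# Training: estimate f_hat from the training samples only.
f_hat = fit_line(x_train, y_train)

# Testing: compare y_hat = f_hat(x) with y on samples not used for fitting.
train_error = mean_squared_error(f_hat, x_train, y_train)
test_error = mean_squared_error(f_hat, x_test, y_test)
print("training error:", train_error)
print("test error:", test_error)
```

A test error close to the training error suggests that the fitted function generalizes; a much larger test error would suggest it does not.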

 

7) Functional Categorization of ML 

- Regression : Training data consist of <input, real-valued output>

   / Task is to predict outputs of new samples <input, ?>

- Classification : Training data consist of <input, class label>

   / Task is to predict class labels of new samples <input, ?>

- Clustering : Training data consist of <attribute values (inputs)> only, with no outputs

   / Task is to group samples such that each group contains samples with similar attribute values

=> Clusters found this way can then be used to label the input data (Ex. Shopping : VIP or Normal customers)

- Association : Given an item set I, a training sample (also called a transaction) consists of the items in I that were bought together (Ex. Market basket data)

   / Training data consist of such samples

   / Task is to find association rules of the form X -> Y, where X and Y are subsets of I

  (In this case, there is no fixed distinction between inputs and outputs)

     Ex. Baskets containing diapers -> also contain beer in over 90% of cases -> a regularity has been found
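
The strength of an association rule X -> Y is commonly measured by its support (the fraction of all transactions containing both X and Y) and its confidence (the fraction of transactions containing X that also contain Y). Below is a minimal sketch of both measures; the baskets are invented for illustration, so the numbers do not reproduce the 90% figure above.

```python
def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)


def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions with `antecedent` that also hold `consequent`.

    Raises ZeroDivisionError if no transaction contains the antecedent.
    """
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)


# Hypothetical market-basket data, invented for this sketch.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
]

rule_support = support(baskets, {"diapers", "beer"})
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
print("support(diapers, beer):", rule_support)
print("confidence(diapers -> beer):", rule_confidence)
```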

 

8) Categorization based on types of training samples

- Supervised Learning : Training data includes desired outputs (ex. Regression & Classification)

- Unsupervised Learning : Training data does not include desired outputs (ex. Clustering)

- Semi-supervised Learning : Training data includes desired outputs for only a few samples; most of the data have no outputs (ex. Autoencoder-based DL)

- Reinforcement Learning : An agent learns from the rewards it receives for a sequence of actions in an environment (Agent in Environment -> Action -> Reward -> State transition)
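
The agent-environment loop for reinforcement learning can be sketched as follows. `CorridorEnv` and the `always_right` policy are hypothetical names made up for this example, and the sketch shows only the interaction loop (action -> reward -> state transition); it does not include a learning rule.

```python
class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, with the goal at position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        """Apply an action (-1 or +1) and return (next_state, reward, done)."""
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def always_right(state):
    """Fixed policy for illustration: ignore the state and move right."""
    return +1


env = CorridorEnv()
state = env.state
total_reward = 0.0
done = False
trajectory = []

while not done:
    action = always_right(state)                  # agent chooses an action
    next_state, reward, done = env.step(action)   # environment responds
    trajectory.append((state, action, reward, next_state))
    total_reward += reward
    state = next_state                            # state transition

print("steps taken:", len(trajectory))
print("total reward:", total_reward)
```

A learning agent would use the recorded rewards to adjust its policy instead of keeping it fixed as this sketch does.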

 
