This is a brief summary of the Machine Learning course by Andrew Ng and Stanford on Coursera.
You can find the lecture videos and additional materials at
https://www.coursera.org/learn/machine-learning/home/welcome
Objective: How to fit the best possible straight line to our data.
Hypothesis: $h_{\theta}(x) = \theta_0 + \theta_1 x $
$\theta_i$ : Parameters
How to choose $\theta_i$?
Quiz: Which values of $(\theta_0, \theta_1)$ correspond to a good fit to the data (from the plot shown in the lecture)?
1. (0, 1)
2. (0.5, 1)
3. (1, 0.5)
4. (1, 1)
Answer: 2, i.e. $\theta_0 = 0.5$, $\theta_1 = 1$.
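As a quick worked check, plugging the answer into the hypothesis gives $h_{\theta}(x) = 0.5 + 1 \cdot x$; for example (the inputs here are chosen just for illustration), $h_{\theta}(1) = 1.5$ and $h_{\theta}(2) = 2.5$.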
Idea: Choose $\theta_0, \theta_1$ so that $h_{\theta}(x)$, the value we predict on input x, is close to y for the training examples (x,y)
$$\min_{\theta_0, \theta_1} \; \frac{1}{2m} \sum_{i=1}^m \left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^2$$
where $h_{\theta}(x^{(i)}) = \theta_0 + \theta_1 x^{(i)}$, i.e. we want this average squared error to be small.
The objective function for linear regression is to find the values of $\theta_0$ and $\theta_1$ that minimize $\frac{1}{2m}$ times the sum of squared errors between the predictions on the training set and the actual values. This quantity is the cost function, $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^m (h_{\theta}(x^{(i)}) - y^{(i)})^2$, also known as the squared error function. The squared error cost function is a reasonable choice and works well for most regression problems; it is the most commonly used cost function for regression.
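Below is a minimal Python/NumPy sketch of this cost function (the course itself uses Octave/MATLAB; the function names and the tiny training set here are assumptions made purely for illustration):

```python
import numpy as np

def hypothesis(theta0, theta1, x):
    """h_theta(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    """J(theta0, theta1) = (1 / 2m) * sum((h_theta(x_i) - y_i)^2)"""
    m = len(x)
    errors = hypothesis(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)

# Hypothetical training set: y happens to equal 0.5 + x, so the
# parameters from the quiz answer give zero cost.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 2.5, 3.5])

print(cost(0.5, 1.0, x, y))  # 0.0   (perfect fit)
print(cost(0.0, 1.0, x, y))  # 0.125 (worse fit, higher cost)
```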
Lecturer's Note
We can measure the accuracy of our hypothesis function by using a cost function. It takes an average of the squared differences between the results of the hypothesis on inputs from the x's and the actual outputs y's.
To break it apart, it is $\frac{1}{2} \overline{x}$, where $\overline{x}$ is the mean of the squares of $h_\theta (x^{(i)}) - y^{(i)}$, or the difference between the predicted value and the actual value.
This function is otherwise called the "squared error function", or "mean squared error". The mean is halved ($\frac{1}{2}$) as a convenience for the computation of gradient descent, as the derivative term of the square function will cancel out the $\frac{1}{2}$ term.
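To see why the halving is convenient, here is the derivative with respect to $\theta_1$ (the same cancellation happens for $\theta_0$):

$$\frac{\partial}{\partial \theta_1} \frac{1}{2m} \sum_{i=1}^m \left(h_{\theta}(x^{(i)}) - y^{(i)}\right)^2 = \frac{1}{2m} \sum_{i=1}^m 2 \left(h_{\theta}(x^{(i)}) - y^{(i)}\right) x^{(i)} = \frac{1}{m} \sum_{i=1}^m \left(h_{\theta}(x^{(i)}) - y^{(i)}\right) x^{(i)}$$

The factor of 2 from differentiating the square cancels the $\frac{1}{2}$, leaving a clean average.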