This is a brief summary of the Machine Learning course taught by Andrew Ng of Stanford on Coursera.
You can find the lecture videos and additional materials at
https://www.coursera.org/learn/machine-learning/home/welcome
Polynomial regression
Sometimes a non-linear model fits the data better, and polynomial regression should be applied.
A quadratic model might not work here, because the curve would eventually turn back down and predict lower prices for larger houses.
Feature scaling becomes necessary in a case like the one below, where the squared size and the cubed size are used as features.
Choice of features
You can use the square root of the size instead of the cubed size, as sketched below.
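A minimal sketch of the two alternatives (NumPy; the sizes are made-up illustrative values):

```python
import numpy as np

size = np.array([100.0, 400.0, 900.0])  # hypothetical house sizes in sq. ft.

# Cubic features: the fit can keep rising with size, but the feature ranges explode
X_cubic = np.column_stack([size, size ** 2, size ** 3])

# Square-root alternative: also monotonically increasing, but it flattens out,
# matching the intuition that price grows more slowly for very large houses
X_sqrt = np.column_stack([size, np.sqrt(size)])
```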
Quiz: Suppose you want to predict a house's price as a function of its size. Your model is
$h_{\theta} (x) = \theta_0 + \theta_1(size) + \theta_2 \sqrt{size}$
Suppose size ranges from 1 to 1000 ( $ft^2$ ). You will implement this by fitting a model
$h_{\theta} (x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 $
Finally, suppose you want to use feature scaling (without mean normalization).
Which of the following choices for $x_1$ and $x_2$ should you use? (Note: $\sqrt{1000} \approx 32$)
Answer: $x_1 = \frac{size}{1000}, x_2 = \frac{\sqrt{size}}{32}$
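As a rough sketch of how that scaling would look in code (NumPy; the variable names are mine):

```python
import numpy as np

size = np.array([1.0, 250.0, 1000.0])  # sizes spanning the 1 to 1000 sq. ft. range

# Divide each feature by (roughly) its maximum so both land in [0, 1]
x1 = size / 1000.0          # size / 1000
x2 = np.sqrt(size) / 32.0   # sqrt(size) / 32, using sqrt(1000) ~ 32

print(x1.max(), x2.max())   # both close to 1.0
```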
Lecturer's Note
We can improve our features and the form of our hypothesis function in a couple of different ways.
We can combine multiple features into one. For example, we can combine $x_1$ and $x_2$ into a new feature $x_3$ by taking $x_1 \cdot x_2$.
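For instance, the lectures combine a lot's frontage and depth into a single area feature; a minimal sketch (NumPy; the numbers are made up):

```python
import numpy as np

frontage = np.array([50.0, 60.0, 80.0])    # x1: lot frontage in feet (made up)
depth    = np.array([100.0, 90.0, 120.0])  # x2: lot depth in feet (made up)

# Combine the two features into one: land area x3 = x1 * x2
area = frontage * depth
```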
Polynomial Regression
Our hypothesis function need not be linear (a straight line) if that does not fit the data well.
We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic or square root function (or any other form).
For example, if our hypothesis function is $h_{\theta}(x) = \theta_0 + \theta_1 x_1$, then we can create additional features based on $x_1$ to get the quadratic function $h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2$ or the cubic function $h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3$.
In the cubic version, we have created new features $x_2$ and $x_3$, where $x_2 = x_1^2$ and $x_3 = x_1^3$.
To make it a square root function, we could do: $h_{\theta} (x) = \theta_0 + \theta_1 x_1 + \theta_2 \sqrt{x_1}$
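As a concrete sketch (NumPy; the data is synthetic, and I use a plain least-squares solve rather than gradient descent), fitting the cubic model is just linear regression on the expanded features:

```python
import numpy as np

# Synthetic data: price loosely follows a cubic in size (illustrative only)
size = np.linspace(1.0, 10.0, 20)
price = 3.0 + 2.0 * size + 0.5 * size ** 2 - 0.01 * size ** 3

# Design matrix [1, x1, x1^2, x1^3]: the new features are just powers of x1
X = np.column_stack([np.ones_like(size), size, size ** 2, size ** 3])

# Least-squares fit (equivalent to solving the normal equation)
theta, *_ = np.linalg.lstsq(X, price, rcond=None)
print(theta)  # approximately [3.0, 2.0, 0.5, -0.01]
```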
One important thing to keep in mind: if you choose your features this way, then feature scaling becomes very important.
e.g., if $x_1$ ranges from 1 to 1000, then $x_1^2$ ranges from 1 to 1,000,000 and $x_1^3$ from 1 to 1,000,000,000.
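A quick sketch of that blow-up, and of dividing each feature by its maximum to bring them back onto a comparable scale (NumPy):

```python
import numpy as np

x1 = np.arange(1.0, 1001.0)  # x1 ranges over 1 to 1000

for name, f in [("x1", x1), ("x1^2", x1 ** 2), ("x1^3", x1 ** 3)]:
    print(f"{name}: 1 to {f.max():.0f}")
# x1: 1 to 1000
# x1^2: 1 to 1000000
# x1^3: 1 to 1000000000

# Dividing each feature by its maximum puts them all in (0, 1]
x1_s, x2_s, x3_s = x1 / 1e3, x1 ** 2 / 1e6, x1 ** 3 / 1e9
```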