# Use SAS to answer question

Help me study for my Computer Science class. I’m stuck and don’t understand.

ECO520 Homework 5

Regression Analysis on House Price in Chicago

Revisiting House Prices in Chicago

Let’s work with the Chicago house price data. Here is the code we used to load the data into SAS and produce some descriptive statistics.

filename webdat url "http://bigblue.depaul.edu/jlee141/econdata/housing/mls2018_smpl.csv" ;

/* Import Chicago Community data*/

PROC IMPORT OUT= mls

DATAFILE= webdat

DBMS=CSV REPLACE;

GETNAMES=YES;

DATAROW=2;

RUN;

proc contents ; run ;

/*To check the relationship between the variables */

proc sgscatter data = mls ;

plot log_price *(log_sqft bedroom bathroom garage agebld fireplace sold_30day) ;

run;

data mls ; set mls ;

agesq = agebld**2 ;

run ;
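The listing above loads the data and plots it but does not print summary statistics. A minimal sketch of one way to get them, assuming the variables used in the scatter plots are all numeric in `mls`:

```sas
/* Sketch: summary statistics for the variables plotted above.   */
/* Assumes every variable listed is numeric in the mls data set. */
proc means data=mls n mean std min max ;
  var log_price log_sqft bedroom bathroom garage agebld fireplace sold_30day ;
run ;
```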

- Nonlinear Regression Model
- Suppose we want to estimate a piecewise linear regression in agebld. Set the knot at agebld = 70 and estimate the model.
- Let’s build a categorical variable (HPRICE) that defines four classes of houses by price range.
- Carefully explain the following terms in the context of a regression model.

Let’s estimate the following two regression models and compare the relationship between the age of the building and the log of house price in each.

proc reg data=mls ;

model log_price = log_sqft agebld ;

model log_price = log_sqft agebld agesq ;

run ;
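To compare the two models, it may help to write out the effect of building age that each one implies. This is a sketch only: the symbols β (first model) and γ (second model) are my own labels for the coefficients, not notation from the assignment.

```latex
\begin{align*}
\text{Model 1:}\quad
  & \frac{\partial\,\text{log\_price}}{\partial\,\text{agebld}} = \beta_2
  && \text{(constant effect of one more year)} \\
\text{Model 2:}\quad
  & \frac{\partial\,\text{log\_price}}{\partial\,\text{agebld}}
    = \gamma_2 + 2\,\gamma_3\,\text{agebld}
  && \text{(effect changes with age)} \\
  & \text{agebld}^{*} = -\frac{\gamma_2}{2\,\gamma_3}
  && \text{(turning point, if } \gamma_3 \neq 0\text{)}
\end{align*}
```

Because the dependent variable is in logs, each of these effects is read as an approximate proportional change in price for one more year of age, holding log_sqft fixed.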

$$
\text{log\_price}_i = b_0 + b_1\,\text{log\_sqft}_i + b_2\,\text{agebld}_i + b_3\,(\text{agebld}_i - 70)\,\text{agebld}_i + e_i
$$

/* Piecewise Nonlinear Regression */

data mls1 ; set mls ;

if agebld > 70 then kdum =1 ; else kdum = 0 ;

age_dum = agebld*kdum ;

run ;

proc reg data= mls1 ;

model log_price = kdum log_sqft agebld age_dum ;

run ;

quit ;
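The code above lets both the intercept and the slope change at the knot, so the fitted line may jump at agebld = 70. A common alternative forces the two segments to join at the knot by using a single hinge term. The sketch below shows that variant; the data set name `mls_sp` and the variable `age_sp` are hypothetical names, and this is not necessarily the specification the assignment intends.

```sas
/* Sketch: continuous piecewise linear (linear spline) with a knot at 70. */
/* mls_sp and age_sp are hypothetical names introduced for illustration.  */
data mls_sp ; set mls ;
  age_sp = max(agebld - 70, 0) ;   /* 0 up to the knot, agebld - 70 beyond it */
run ;

proc reg data=mls_sp ;
  /* slope is the agebld coefficient up to 70, and agebld + age_sp after 70 */
  model log_price = log_sqft agebld age_sp ;
run ;
quit ;
```

Comparing this fit with the unrestricted version shows whether allowing a jump at the knot changes the estimated age effect much.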

Carefully explain the regression output from the piecewise regression model in terms of age of building (agebld).
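One way to organize the explanation is to write out what the model, as coded with `kdum` and `age_dum`, implies on each side of the knot. In this sketch α₀ is the intercept and α_k, α₁, α₂, α₃ are my own labels for the coefficients on kdum, log_sqft, agebld and age_dum; they are not the assignment's notation.

```latex
\begin{align*}
\text{agebld} \le 70:\quad
  & \text{log\_price} = \alpha_0 + \alpha_1\,\text{log\_sqft} + \alpha_2\,\text{agebld} + e \\
\text{agebld} > 70:\quad
  & \text{log\_price} = (\alpha_0 + \alpha_k) + \alpha_1\,\text{log\_sqft}
    + (\alpha_2 + \alpha_3)\,\text{agebld} + e \\
\text{jump at the knot:}\quad
  & \alpha_k + 70\,\alpha_3
\end{align*}
```

So the agebld coefficient is the age slope for buildings up to 70 years old, the age_dum coefficient is the change in that slope beyond 70, and the two segments meet at the knot only if α_k = −70 α₃.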

HPRICE classes by price range:

- 1st: 0-99,999
- 2nd: 100,000-249,999
- 3rd: 250,000-499,999
- 4th: 500,000+

Estimate the simple regression of log_price on log_sqft, including the category variable you created. (Keep in mind that you have to use dummy variables for the category variable.) Explain the findings by category.
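A minimal sketch of how the category variable and its dummies could be built and used follows. It rests on assumptions that should be checked against the data: that log_price is the natural log of the sale price (so the price can be recovered with `exp`), and that the lowest class serves as the reference group. The names `mls_cat`, `price`, `hprice`, `hp2`, `hp3` and `hp4` are hypothetical.

```sas
/* Sketch: build HPRICE and dummy variables; all new names are hypothetical. */
data mls_cat ; set mls ;
  price = exp(log_price) ;          /* assumes log_price is a natural log */
  if missing(price) then hprice = . ;
  else if price < 100000 then hprice = 1 ;
  else if price < 250000 then hprice = 2 ;
  else if price < 500000 then hprice = 3 ;
  else hprice = 4 ;
  if not missing(hprice) then do ;  /* class 1 is the omitted reference group */
    hp2 = (hprice = 2) ;
    hp3 = (hprice = 3) ;
    hp4 = (hprice = 4) ;
  end ;
run ;

/* Check how many houses fall in each class */
proc freq data=mls_cat ;
  tables hprice ;
run ;

proc reg data=mls_cat ;
  model log_price = log_sqft hp2 hp3 hp4 ;
run ;
quit ;
```

Only three dummies enter the model. Adding a fourth for class 1 alongside the intercept would make the regressors perfectly collinear, which is the dummy variable trap. Each dummy coefficient is then read as the difference in log price relative to class 1 at the same log_sqft.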

- Spurious Correlation
- Least Squares Estimation
- Perfect Multicollinearity and Dummy Variable Trap
- Irrelevant variables
- Omitted variables
- Stepwise regression
- Forward selection
- Backward selection
- Sequential replacement procedure
- Best-subset procedure
- Overfitting
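Several of the selection procedures in this list correspond to the SELECTION= option of the MODEL statement in PROC REG. The following is a minimal sketch rather than a required answer: the candidate regressors are simply the ones plotted earlier, the macro variable `xvars` and the model labels are hypothetical names, and the 0.05 entry and stay thresholds are arbitrary choices. Treating MAXR as the counterpart of sequential replacement is my assumption, since SAS does not use that name.

```sas
/* Sketch: variable selection in PROC REG; xvars and the labels are hypothetical. */
%let xvars = log_sqft bedroom bathroom garage agebld fireplace sold_30day ;

proc reg data=mls ;
  /* Forward selection: start empty, add the most significant variable each step */
  fwd:  model log_price = &xvars / selection=forward slentry=0.05 ;

  /* Backward selection: start full, drop the least significant variable each step */
  bwd:  model log_price = &xvars / selection=backward slstay=0.05 ;

  /* Stepwise: forward steps, rechecking variables already in the model */
  stp:  model log_price = &xvars / selection=stepwise slentry=0.05 slstay=0.05 ;

  /* Swapping search (assumed analogue of sequential replacement) */
  swap: model log_price = &xvars / selection=maxr ;

  /* Best subsets: rank candidate models by adjusted R-square, show the top 5 */
  sub:  model log_price = &xvars / selection=adjrsq best=5 ;
run ;
quit ;
```

Comparing which variables survive under each method, and how adjusted R-square changes as variables are added, is one way to illustrate irrelevant variables and overfitting on this data set.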