Second November Post
- ynishimura73
- Nov 28, 2017
- 2 min read
Today I installed R, a programming language that provides a wide variety of statistical and graphical techniques, along with RStudio, an integrated development environment for working with it. I followed a video (https://www.youtube.com/watch?v=7cGwYMhPDUY) that explained how they work and some of the things we can do with R.
The RStudio window looks like the picture below. There are four panes: the source editor (to write, edit, and save programs), the console (to run commands and show their output), the environment and history pane (to check on data and see what commands have been executed), and a pane for files, plots, packages, and other information. First of all, I learned that assignment in R is usually written with "<-" rather than "=" (the equal sign), so if I type "a <- 1", it is read as "a gets 1", meaning a now holds the value 1. The values are shown under "Environment".
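As a small sketch of assignment, the lines below can be typed into the console one at a time. The name b is only an illustration and does not come from the video.

```r
# Assign the value 1 to the name a
a <- 1

# A value that was stored earlier can be reused in a later expression
b <- a + 2

# Printing shows the stored values in the console;
# both names also appear in the Environment pane
print(a)
print(b)
```

At the top level "a = 1" has the same effect as "a <- 1", but "<-" is the form most R code uses.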

In order to practice writing code, I used the data below about cars, which was used as an example in the video.

Once I imported the table, I named the data "mydata", so every time I want to use the data, I type "mydata". First, I typed "summary(mydata)" and the table below appeared in the console.
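One way this step might look in code is sketched below. It assumes the table is a CSV file; the rows and the car names are made up for illustration and are not the data from the video.

```r
# Write a tiny made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "make,price,mpg,weight,length,foreign",
  "Car A,4000,22,2900,186,0",
  "Car B,4700,17,3300,173,0",
  "Car C,6200,23,2600,177,1",
  "Car D,9700,25,2000,163,1"
), csv_path)

# Read the file into a data frame and give it the name mydata
mydata <- read.csv(csv_path)

# Look at the first rows, then at the per-column summary
head(mydata)
summary(mydata)
```

With a real file, csv_path would simply be the path to that file.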

This table summarizes each column (price, miles per gallon, repairs, weight, length, and foreign), specifically the minimum, first quartile, median, mean, third quartile, and maximum. The first quartile is the value that 25% of the observations fall below, and similarly, the third quartile is the value that 75% of the observations fall below.
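To make the quartiles concrete, here is a minimal sketch on a made-up vector of prices (not the values from the video). It uses quantile() with R's default method, which interpolates between neighbouring observations.

```r
# Made-up prices, only for illustration
price <- c(4000, 4700, 6200, 9700, 5100, 3800, 7800)

# The same six numbers that summary(mydata) reports for each column
summary(price)

# First and third quartiles on their own
q <- quantile(price, probs = c(0.25, 0.75))
q

# Share of observations at or below each quartile;
# with only a few observations this is roughly, not exactly, 25% and 75%
mean(price <= q[1])
mean(price <= q[2])
```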
Then I typed "sort(mydata$make)", which gave me the makes in alphabetical order.
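A short sketch of sorting, again on made-up rows. sort() returns only the sorted column, so the example also shows order(), which is one way to rearrange the whole table by that column.

```r
# Made-up stand-in for the cars table
mydata <- data.frame(
  make  = c("Car C", "Car A", "Car D", "Car B"),
  price = c(6200, 4000, 9700, 4700)
)

# Alphabetical order of the make column only
sorted_make <- sort(mydata$make)
sorted_make

# Reverse alphabetical order
sort(mydata$make, decreasing = TRUE)

# Whole rows rearranged by make: order() returns row positions
sorted_rows <- mydata[order(mydata$make), ]
sorted_rows
```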

In addition, typing "table(mydata$make, mydata$foreign)" gives a table of counts with make on the rows and foreign on the columns. A value of 0 for foreign means the car is domestic, and 1 means it is foreign.
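The cross-table can be sketched on a few made-up rows like this. Each cell counts how many rows have that combination of make and foreign.

```r
# Made-up stand-in for the cars table
mydata <- data.frame(
  make    = c("Car A", "Car B", "Car C", "Car D"),
  foreign = c(0, 0, 1, 1)
)

# Rows are makes, columns are the values of foreign (0 = domestic, 1 = foreign)
tab <- table(mydata$make, mydata$foreign)
tab

# With a single column, table() simply counts each value
table(mydata$foreign)
```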

To see the correlation between car prices and miles per gallon, I typed "cor(mydata$price, mydata$mpg)". The result was -0.438. The negative correlation indicates that as the prices of cars increase, miles per gallon tends to decrease.
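As a sketch of what cor() computes, the example below uses made-up numbers (so the result will not be -0.438) and checks the built-in value against the definition of the Pearson correlation: the covariance divided by the product of the standard deviations.

```r
# Made-up values in which dearer cars get fewer miles per gallon
price <- c(3800, 4000, 4700, 5100, 6200, 7800, 9700)
mpg   <- c(30, 28, 25, 24, 20, 18, 15)

# Pearson correlation, always between -1 and 1
r <- cor(price, mpg)
r

# The same number from the definition
r_manual <- cov(price, mpg) / (sd(price) * sd(mpg))
r_manual
```

A value near -1 or 1 means the points lie close to a straight line, and a value near 0 means there is little linear relationship.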
To estimate an Ordinary Least Squares (OLS) linear regression, I typed "olsreg <- lm(mydata$mpg ~ mydata$weight + mydata$length + mydata$foreign)" and then "summary(olsreg)", where mpg is the dependent variable and the independent variables are joined with the "+" sign.
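The same kind of model can also be written with the data argument of lm(), which avoids repeating "mydata$" and gives shorter coefficient names. This is a sketch on made-up rows, so the estimates will differ from the ones in the video.

```r
# Made-up stand-in for the cars table
mydata <- data.frame(
  mpg     = c(22, 17, 23, 25, 15, 24, 31, 18),
  weight  = c(2900, 3300, 2600, 2000, 3700, 2200, 1800, 3400),
  length  = c(186, 173, 177, 163, 212, 165, 157, 200),
  foreign = c(0, 0, 1, 1, 0, 1, 1, 0)
)

# mpg is the dependent variable; the predictors are joined with "+"
olsreg <- lm(mpg ~ weight + length + foreign, data = mydata)

# Full regression table: estimates, standard errors, t values, R-squared
summary(olsreg)

# Just the intercept and the slopes
coef(olsreg)
```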

The numbers here essentially mean
mpg = 44.968582 + (weight * -0.005008) + (length * -0.043056) + (foreign * -1.260211). Therefore, by plugging in a weight, a length, and whether the car is foreign or not, miles per gallon can be estimated. Also, as the results show, the higher the weight, the lower the miles per gallon, and likewise for the other variables, because the slope is negative for each variable.
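Plugging values into the fitted equation can be sketched as below. The coefficients are the ones reported in this post, with a negative sign on every slope as described above. The function name predict_mpg and the example car (3000 for weight, 180 for length, domestic) are made up for illustration.

```r
# Estimates reported in the post: intercept, then one slope per variable
intercept <- 44.968582
b_weight  <- -0.005008
b_length  <- -0.043056
b_foreign <- -1.260211

# Hypothetical helper: estimated mpg for a given weight, length,
# and foreign indicator (0 = domestic, 1 = foreign)
predict_mpg <- function(weight, length, foreign) {
  intercept + b_weight * weight + b_length * length + b_foreign * foreign
}

# A made-up domestic car
domestic <- predict_mpg(3000, 180, 0)
domestic

# The same car marked as foreign differs only by the foreign coefficient
foreign_car <- predict_mpg(3000, 180, 1)
foreign_car - domestic
```

When the fitted model object is available, predict(olsreg, newdata = ...) does the same calculation without copying the numbers by hand.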
R is very practical because with it we are able to clean up data and compare variables against each other. It can also help consumers compare products, such as cars, before deciding which one to buy.