## Thursday, January 27, 2011

### Example: Data Distributions

Given the following data:

240 240 240 240 240 240 240 240 255 255
265 280 280 290 300 305 325 330 340 265

Below is a graphical representation of the distribution of this data. The bar chart is a histogram, and the green curve is a kernel density estimate, that is, an estimate of the probability density function of the data.
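As a quick sanity check on what "estimate of the probability density function" means, the sketch below confirms that the curve returned by `density()` is non-negative and encloses an area of roughly 1. It assumes the default settings of `density()` (Gaussian kernel, default bandwidth); the names `step` and `area` are only for illustration.

```r
# Same data as listed above
salary <- c(240, 240, 240, 240, 240, 240, 240, 240, 255, 255,
            265, 280, 280, 290, 300, 305, 325, 330, 340, 265)

d <- density(salary)  # kernel density estimate with default settings

# density() evaluates the estimate on an evenly spaced grid of x values,
# so a simple Riemann sum approximates the area under the curve
step <- diff(d$x)[1]
area <- sum(d$y) * step

print(area)           # should be close to 1
print(all(d$y >= 0))  # a density is never negative
```

The area is only approximately 1 because the grid is finite and the tails of the curve are cut off a few bandwidths beyond the data.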

The R code used to generate this graph and to calculate the mean is below:

```r
salary <- c(240, 240, 240, 240, 240, 240, 240, 240, 255, 255,
            265, 280, 280, 290, 300, 305, 325, 330, 340, 265)

mean(salary)  # calculate the mean (270.5)

# plot the distribution as a histogram and store
# the data about the distribution as the variable hs
hs <- hist(salary)

d <- density(salary)  # calculate the kernel density estimate for salary

# rescale the density so that its peak matches the tallest bar
# and it can be graphed on the same plot as the counts
rs <- max(hs$counts) / max(d$y)

lines(d$x, d$y * rs, type = "l", col = 51)  # graph the rescaled density curve
```
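The code above stretches the density curve so that its peak lines up with the tallest bar of the count histogram. An alternative, offered here as a sketch and not as the method used in this post, is to draw the histogram itself on the density scale, so that no rescaling factor is needed. The `ylim` calculation and the plot labels are assumptions added for illustration.

```r
salary <- c(240, 240, 240, 240, 240, 240, 240, 240, 255, 255,
            265, 280, 280, 290, 300, 305, 325, 330, 340, 265)

hs <- hist(salary, plot = FALSE)  # compute the bins without drawing
d  <- density(salary)             # kernel density estimate

# freq = FALSE draws bar heights as densities (total bar area = 1),
# which puts the bars and the curve on the same vertical scale
plot(hs, freq = FALSE,
     ylim = c(0, max(hs$density, d$y)),  # leave room for the taller of the two
     main = "Histogram of salary", xlab = "salary")
lines(d, col = "green")

# the bar areas sum to 1, just like the area under the density curve
print(sum(hs$density * diff(hs$breaks)))
```

With this approach the vertical axis shows density, not counts, so the original version remains the better choice when the counts themselves need to be read off the plot.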