## Sunday, February 20, 2011

### Do Farm Subsidies Benefit the Largest Farms the Most?

A common argument for eliminating farm subsidies is that they mostly benefit large farms rather than saving the family farm. It's true that many subsidies are tied to commodity production. Farms that grow more commodities (i.e. larger farms) therefore get more money from the government, so larger producers take in a larger share of all subsidies (especially those related to commodities). However, subsidies account for a much smaller percentage of income for large producers, and make up a much larger percentage of total income for medium and small producers.

Definitions:
- Commercial farms: sales >= \$250,000
- Farms with sales < \$250,000 include:
  1) Intermediate farms: full-time operators
  2) Rural residence farms

As the chart above (from the USDA) shows, in 2008 farms earning less than \$250,000/yr received a much greater percentage of their income in the form of government payments, while subsidies accounted for only 4% of income for producers with the largest incomes. The chart below indicates that this relationship has held across the last decade.

References:
USDA Report- Government Payments and the Farm Sector: Who Benefits and How Much?
http://www.ers.usda.gov/Briefing/FarmPolicy/gov-pay.htm

USDA Report-Farm Income and Costs: Farms Receiving Government Payments
http://www.ers.usda.gov/Briefing/FarmIncome/govtpaybyfarmtype.htm

## Wednesday, February 9, 2011

### Chebyshev's Theorem, the Empirical Rule, and a Few Other Basic Concepts

Chebyshev's theorem can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean, regardless of the distribution.
The empirical rule can be used to determine the percentage of data values that must be within one, two, and three standard deviations of the mean for data having a bell-shaped distribution.
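A quick illustration of the difference (not from the original handout, just a sketch using simulated data): Chebyshev's theorem guarantees at least 1 - 1/k² of the data within k standard deviations of the mean for *any* distribution, while the empirical rule gives the tighter percentages (roughly 68%, 95%, and 99.7%) that apply only to bell-shaped data.

```r
# Compare Chebyshev's bound with the empirical rule using
# simulated bell-shaped data.
set.seed(42)       # make the example reproducible
x <- rnorm(10000)  # 10,000 draws from a standard normal distribution

for (k in 2:3) {
  within_k  <- mean(abs(x - mean(x)) <= k * sd(x))  # observed proportion
  chebyshev <- 1 - 1 / k^2                          # Chebyshev's lower bound
  cat("k =", k,
      "| observed proportion:", round(within_k, 3),
      "| Chebyshev bound:", round(chebyshev, 3), "\n")
}
```

For bell-shaped data the observed proportions come out near 95% (k = 2) and 99.7% (k = 3), well above Chebyshev's guarantees of 75% and about 88.9%.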

Correlation coefficient: measures the strength of the linear relationship between x and y.

POPULATION PARAMETERS VS SAMPLE STATISTICS
When we take sample data and calculate a mean, we are calculating a sample statistic. The sample mean is used to ‘estimate’ the actual mean of the population we are sampling from. The population mean is referred to as a population parameter. Sample statistics are used to estimate corresponding population parameters. For this reason, sample statistics are often referred to as ‘estimators.’ In statistics we often use different symbols to represent sample statistics vs. sample parameters.
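The estimator idea above can be sketched in R. This is a hypothetical example (the population here is simulated, with the true mean set by us at 50), just to show the sample statistic x-bar estimating the population parameter mu:

```r
# Sample statistics estimate population parameters.
set.seed(123)
population <- rnorm(100000, mean = 50, sd = 10)  # treat this as the population
mu <- mean(population)                           # population parameter

samp <- sample(population, 30)                   # draw a sample of n = 30
xbar <- mean(samp)                               # sample statistic (estimator)

mu    # very close to 50
xbar  # estimate of mu; varies from sample to sample
```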

For those interested, below is the R code used in tonight's handout.

```r
# *------------------------------------------------------------------
# | PROGRAM NAME: EX_CORRELATION_R
# | DATE: 2/8/11
# | CREATED BY: MATT BOGARD
# | PROJECT FILE:
# *----------------------------------------------------------------
# | PURPOSE: DEMONSTRATION OF CORRELATION USING R
# |
# *------------------------------------------------------------------
# |
# |  1:
# |  2:
# |  3:
# |*------------------------------------------------------------------
# | DATA USED:
# |
# |
# |*------------------------------------------------------------------
# | CONTENTS:
# |
# |  PART 1: positive linear relationship
# |  PART 2: negative linear relationship
# |  PART 3: no linear relationship
# *-----------------------------------------------------------------
# |
# |
# *------------------------------------------------------------------

# *------------------------------------------------------------------
# |
# |PART 1: positive linear relationship
# |
# |
# *-----------------------------------------------------------------

x <- c(1,2,3,4,5,6,7,8,9,10)

y <- c(2,3,4,4,5,6,9,8,8,10)

# plot x and y

plot(x,y)
title("positive linear relationship")

# fit a linear regression line to the data (a topic for later in the semester)

reg1 <- lm(y~x)
print(reg1) # output
abline(reg1) # plot line
title("positive linear relationship")

cov(x,y) # covariance between x and y

sd(x) # standard deviation of x
sd(y) # standard deviation of y

cor(x,y) # correlation coefficient for x and y

# *------------------------------------------------------------------
# |
# |PART 2: negative linear relationship
# |
# |
# *-----------------------------------------------------------------

# let's keep the same x as above, but look at new data for y:

y2 <- c(9,10,8,7,5,4,6,4,2,1) # read in data for y2

# plot x and y2

plot(x,y2)

# fit line to x and y2

reg2 <- lm(y2 ~ x) # note the order: y2 ~ x, so abline() matches the plot
summary(reg2)
abline(reg2)
title("negative linear relationship")

cov(x,y2) # covariance between x and y2
sd(x) # standard deviation of x
sd(y2) # standard deviation of y2
cor(x,y2) # correlation between x and y2

# *------------------------------------------------------------------
# |
# |PART 3: no linear relationship
# |
# |
# *-----------------------------------------------------------------

y3 <- c(5,7,10,5,1,8,7,4,5,9) # read in y3 data

plot(x,y3) # plot x and y3 data

# fit line to x and y3

reg3 <- lm(y3 ~ x) # note the order: y3 ~ x, so abline() matches the plot
summary(reg3)
abline(reg3)
title("no linear relationship")

cov(x,y3) # covariance between x and y3
sd(x) # standard deviation of x
sd(y3) # standard deviation of y3
cor(x,y3) # correlation between x and y3
```
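The three quantities computed in each part of the handout are related: the correlation coefficient is the covariance divided by the product of the two standard deviations. A quick check with the Part 1 data:

```r
# The correlation coefficient is the covariance scaled by
# the two standard deviations.
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(2,3,4,4,5,6,9,8,8,10)

r_builtin <- cor(x, y)                     # R's built-in correlation
r_by_hand <- cov(x, y) / (sd(x) * sd(y))   # cov scaled by the sd's

all.equal(r_builtin, r_by_hand)  # TRUE
```

This is why correlation, unlike covariance, is unit-free and always falls between -1 and 1.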

## Wednesday, February 2, 2011

### Commands for Calculating Mean, Variance, & Standard Deviation

Commands in Excel

Statistic: MEAN
Command: =AVERAGE(left click and select data)

Statistic: VARIANCE
Command: =VAR(left click and select data)

Statistic: STANDARD DEVIATION
Command: =STDEV(left click and select data)

Commands in R

```r
garst <- c(148, 150, 152, 146, 154)    # enter values for 'garst'

print(garst)     # see if the data is there

mean(garst)     # compute mean

var(garst)      # compute variance

sd(garst)      # compute standard deviation

plot(garst)   # plot

hist(garst) # histogram or distribution
```