Sunday, January 31, 2010

Animal Cruelty and Statistical Reasoning

A recent article describes how animal rights activists (Mercy for Animals, or MFA) went undercover and made some observations about animal abuse on dairy farms. See:
Governor Paterson, Shut This Dairy Down

The author of the above article states:

"But the grisly footage that every farm randomly chosen for investigation--MFA has investigated 11--seems to yield, indicates the violence is not isolated, not coincidental, but agribusiness-as-usual."

Where the statement above could get carried away is if someone tried to apply it not only to the population of dairy farmers in that state or region, but to the industry as a whole. It's not clear how broadly the author is using the term 'agribusiness-as-usual', but let's say a reader of the article wanted to apply it to the entire dairy industry.

This is exactly why economists and scientists employ statistical methods. Anyone can make outrageous claims about a number of policies, but are these claims really consistent with evidence? How do we determine if some claims are more valid than others?

Statistical inference is the process by which we take a sample and then try to make statements about the population based on what we observe from the sample. If we take a sample (like a sample of dairy farms) and make observations, the fact that our sample was 'random' doesn't necessarily make our conclusions about the population it came from valid.

Before we can say anything about the population, we need to know 'how rare is this sample?' We need to know something about our 'sampling distribution' to make these claims.

According to the USDA, in 2006 there were 75,000 dairy operations in the U.S. According to the activists' claims, they 'randomly' sampled 11 dairies and found abuse on all of them. That represents just 0.0147% of all dairies. Suppose we wanted to estimate the proportion of dairy farms that were abusing animals, we wanted to be 90% confident in our estimate (that is, construct a 90% confidence interval), and we wanted the estimate to be within a margin of error of .05. The sample size required to estimate this proportion is then given by the following formula:

n = (z/(2E))^2 where

z = value from the standard normal distribution associated with a 90% confidence interval

E = the margin of error

The sample size we would need is: (1.645/(2*.05))^2 = (16.45)^2 = 270.60 ~ 271 farms!

To do this we have to make some assumptions:

Since we don't know the actual proportion of dairy farms that abuse animals, the most objective estimate may be 50%. It is also the most conservative choice, since it gives the largest required sample size. The formula above is derived based on that assumption. (If we assumed 90%, the math (not shown) gives the same sample size as if we assumed that only 10% of farms abused their animals: about 98, which is still far more than 11.) This also assumes that the sample proportion is approximately normally distributed. But to calculate anything, we would still have to depend on someone's subjective opinion of whether a farm was engaging in abuse or not.
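For readers who want to check the arithmetic, here is a minimal Python sketch of the general version of the formula, n = z^2 * p(1-p) / E^2, which reduces to (z/(2E))^2 when p = 0.5. The function name and its arguments are my own for illustration, and the z value is computed from the standard normal distribution rather than rounded to 1.645, so the unrounded results differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence, margin, p=0.5):
    """Unrounded sample size needed to estimate a proportion.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / E^2.
    With p = 0.5 this is the same as (z / (2E))^2.
    """
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / margin ** 2


# Assumed proportion of 50% (the conservative case used above).
n_half = sample_size_for_proportion(0.90, 0.05)

# Assumed proportions of 10% and 90% give the same answer,
# because p * (1 - p) is the same for both.
n_tenth = sample_size_for_proportion(0.90, 0.05, p=0.10)
n_ninety = sample_size_for_proportion(0.90, 0.05, p=0.90)

# Always round up: a fraction of a farm cannot be sampled.
print(math.ceil(n_half))    # 271
print(math.ceil(n_tenth))   # 98
print(math.ceil(n_ninety))  # 98
```

Under any of these assumed proportions, the required sample is many times larger than 11 farms.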

I'm sure the article that I'm referring to above was never intended to be scientific, but the author should have chosen their words more carefully. What they have is allegedly a 'random' observation and nothing more. They have no 'empirical' evidence to infer from their 'random' samples that these abuses are 'agribusiness-as-usual' for the whole population of dairy farmers.

While MFA may have evidence sufficient for taking action against these individual dairies, the question becomes: how high should the burden of proof be to support an increase in government oversight of the industry as a whole (which seems to be the goal of many activist organizations)? This kind of analysis involves consideration of the tradeoffs involved. This may depend partly on subjective views. We can use statistics to validate claims made on both sides of the debate, but statistical tests have no 'power' in weighing one person's preferences over another. Economics has no way to make interpersonal comparisons of utility.

Note: The University of Iowa has a great number of statistical calculators for doing these sorts of calculations. The sample size option can be found here. In the box, just select 'CI for one proportion'. Deselect finite population (since the population of dairies is quite large at 75,000), then select your level of confidence and margin of error.
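To see why the finite population option hardly matters here, the sketch below applies the standard finite population correction, n = n0 / (1 + (n0 - 1)/N), to the sample size computed earlier. This is my own illustration with a hypothetical function name; I am not claiming it is the exact calculation the Iowa calculator performs.

```python
import math


def finite_population_sample_size(n0, population):
    """Adjust an infinite-population sample size n0 for a population of size N.

    Standard finite population correction: n = n0 / (1 + (n0 - 1) / N).
    """
    return n0 / (1 + (n0 - 1) / population)


# Unadjusted sample size from the post: 90% confidence, margin of error .05.
n0 = (1.645 / (2 * 0.05)) ** 2

# Adjust for the 75,000 dairy operations reported by the USDA.
n_adjusted = finite_population_sample_size(n0, 75_000)

print(math.ceil(n0))          # 271
print(math.ceil(n_adjusted))  # 270
```

With 75,000 dairies the correction removes only about one farm from the required sample, so treating the population as effectively infinite changes nothing of substance.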

References:

Profits, Costs, and the Changing Structure of Dairy Farming / ERR-47
Economic Research Service/USDA Link

"Governor Paterson, Shut This Dairy Down", Jan 27, 2010. OpEdNews.com