Piracy has become an unavoidable and, I think, unremovable part of life on the internet, and the mainstream media constantly refers to it as harmful. Whether it is or isn’t doesn’t concern me in this post; all I’m interested in is whether it’s possible to know, with any level of certainty, if it’s harmful or not.
To prove that piracy is harmful you have to show that it has caused a loss of income, and for me this is where the trouble comes in: how do you do this? Economic systems are very complicated because they contain people, and lots of them. People have a tendency to behave however they want, and can be hard to predict with any great accuracy. Not only are their individual habits hard to predict, but so are the large-scale features of their behavior.
Cum Hoc Ergo Propter Hoc
When presented with information about the effect of internet piracy on the media (or other industries) it is important to remember one key thing: correlation does not imply causation. Links have to be proved with data, and you should always question how the data was obtained and how the conclusions were reached. Theories have to be tested and investigated. Unfortunately, when it comes to internet piracy this is extremely difficult, maybe even impossible.
In order to accurately test the effects of piracy on something like the profit of a film, you really need a second Earth. You would have to simultaneously release your film on Earth-1, where the film can be pirated, and on Earth-2, where it is impossible to pirate, and see how much money you make on each. If the profit on Earth-1 is less than the profit on Earth-2, all other things being equal, then you have proof that piracy has a detrimental effect on sales in this one case. After lots of similar experiments have been done you can build up a general trend or rule.
Of course it is impossible in practice to set up this kind of experiment, so shouldn’t we just lower our standards of proof? I’ve seen some people argue that you can try to use stats from groups with different levels of internet use. If the population of a certain group uses the internet less, then it would seem reasonable to assume there is less chance for internet piracy; however, I don’t think this is going to work. For all you know there could be a link between the low internet uptake in this community and low cinema attendance, which would mean this situation could not be used to infer any general rules. There could also be prejudices built into the community that mean certain films wouldn’t do as well, regardless of the level of piracy. Even if you did manage to find two situations with similar cinema attendance and diametrically opposite levels of internet use, you are not guaranteed an accurate comparison, as you are dealing with different people. The problem of not having an adequate control limits the amount you can infer from the data you would get.
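To make the worry about a hidden link concrete, here is a minimal toy simulation in Python. Everything in it is made up for illustration: the hidden factor (which I have called remoteness), the noise levels, and the assumption that internet use stands in for piracy are all hypothetical, and none of it is real data. It only sketches how a shared cause can produce a correlation between internet use and cinema sales whatever the true effect of piracy is.

```python
import random


def simulate_groups(n_groups=2000, piracy_effect=0.0, seed=0):
    """Return (internet_use, cinema_sales) for made-up communities.

    A hidden factor ("remoteness") drives both quantities, so they end up
    correlated whatever the true effect of piracy is.
    """
    rng = random.Random(seed)
    internet_use, cinema_sales = [], []
    for _ in range(n_groups):
        remoteness = rng.random()  # hypothetical hidden factor in [0, 1)
        # More remote communities use the internet less (plus some noise).
        use = (1.0 - remoteness) + rng.gauss(0, 0.1)
        # They also buy fewer tickets; piracy_effect is the assumed true
        # effect of internet use (standing in for piracy) on sales.
        sales = 100.0 * (1.0 - remoteness) + piracy_effect * use + rng.gauss(0, 5)
        internet_use.append(use)
        cinema_sales.append(sales)
    return internet_use, cinema_sales


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Compare a world where piracy does nothing with one where it hurts sales.
    for effect in (0.0, -20.0):
        use, sales = simulate_groups(piracy_effect=effect)
        print(f"true effect {effect:+.0f}: correlation = {pearson(use, sales):.2f}")
```

In this toy model the correlation between internet use and sales comes out strongly positive both when piracy has no effect and when it genuinely reduces sales, so the correlation alone tells you nothing about which of the two worlds you are in.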
Are there any ways around this? My knowledge when it comes to performing social experiments like this is very limited, so maybe I’m missing something big. Maybe it is possible to create groups big enough to be very accurately representative, and similar enough in all respects except piracy to work as a control. The problem I have is quantifying the level of “sameness” between the different groups. Even if there’s a way to account for and reduce the “errors” from using groups of different people, can you quantify the value to which the errors in your data have been reduced? Can you then repeat the experiment with the exact same group? Using a lot of different groups once is no good, as you cannot infer much from a set of independent singular data points. There needs to be a high level of repeatability and control.
Maybe I’m being far too strict, but as a scientist there are certain levels of data confidence I am used to expecting before I will agree with a conclusion, and I haven’t seen any examples of this in the debate about internet piracy. I welcome someone pointing out some good evidence about piracy, as I am very interested in how data for situations like this is collected and analysed, but as it stands I feel you cannot prove or say with certainty that piracy has decreased sales. The system is far too complicated for confident conclusions to be drawn unless impractically high standards of control are implemented.
Could anyone have predicted the internet phenomenon that is LOLcats?
Of course you still have to provide evidence that internet use is correlated with internet piracy.