In 2006 Netflix, at the time an online DVD rental company, ran a competition to improve their recommendation system. Their existing recommendation system, called Cinematch, predicted how many stars a user would give a movie, based on how they had rated the movies they had watched before. If Cinematch thought a user would give a film a high star rating, and the user had not already seen it, then Netflix would promote that movie to the user as their next rental choice.
They wanted to improve on Cinematch, so they offered a prize of $1,000,000 to any team that could improve the accuracy of predictions by 10%. Cinematch was not a sophisticated system, and teams in the competition quickly worked out how to get a 7% improvement.
Getting the full 10% improvement was significantly more difficult. Eventually the prize was claimed in 2009 by a team called BellKor’s Pragmatic Chaos, a collaboration by several competitors who combined their techniques to get over the 10% line.
During the three years the competition ran, Netflix had moved on, and the winning system was never implemented. A sequel to the competition was cancelled amid privacy concerns – Netflix had claimed that the data it released about star ratings as part of the first competition was anonymised and could not be tied back to any individual user, but two researchers from Texas had proved this claim to be false. However, the company still felt Big Data was going to be the key to their future.
The process of recommending a movie for users can be broken down into two steps. The first is to discover what kind of movies the user likes; the second is to find a movie that fits this category which the user hasn’t seen. What happens if your system says users want to watch, say, a Washington-based, big-name political drama and they have watched all such existing material?
One answer is to make more, which is what Netflix did in producing the TV series House of Cards, a political drama featuring Kevin Spacey and directed by David Fincher. While the precise impact of Big Data on the production process of House of Cards is unclear, Netflix certainly gained much press attention for their use of extremely detailed data about movie consumption when deciding what kind of show to make. As a result, the media have taken House of Cards to presage a statistical era of cultural production where Big Data is king, with headlines such as ‘How Netflix is Turning Viewers Into Puppets’, ‘“House of Cards” and Our Future of Algorithmic Programming’ and ‘The Secret Sauce Behind Netflix’s Hit, “House Of Cards”: Big Data’.
In this paper, three problems with the use of Big Data when making cultural artefacts will be discussed:
1) Big data approaches struggle when failure is expensive
2) Cultural output may become boringly repetitive if statistical methods are applied
3) Cultural output is reflexive, and relies on a shifting context that might be hard to capture by analysing existing behaviour
As a result, big data is most likely to be used as a rough guide to cultural artefacts, rather than to specify them completely – unlike the way it has been used in other industries – since most creative endeavours already use informal historical data to forecast success anyway. For these reasons, I suggest that, in contrast to other arenas, big data will represent only an incremental change in approach in culture.
The need for human intervention
It’s hard to discover exactly how instrumental the use of ‘big data’ was in the formulation of House of Cards, and how much of the programme’s success is due to Netflix’s marketing power and the other novel features of the production – such as publishing the whole series simultaneously.
Casting a big star with a track record of successful films is not a new idea, so choosing Kevin Spacey isn’t intrinsically revolutionary – nor is it unusual to use well-known directors or target genres that are known to be popular. To know for sure whether Big Data was important we have to know that Kevin Spacey was better than other actors who might have been cast in the role without the use of ‘big data’ statistical techniques.
This is clearly impossible, and perhaps an unreasonable demand. However, we can get a feel for the Netflix approach by looking in more detail at the data they assemble and the way they use it.
Ian Bogost and Alexis Madrigal took apart the Netflix system for recommendations, reverse engineering it to see how it works – at least the parts of it that are visible on their website. Subsequently Netflix cooperated with them to explain some of their processes, granting some window into the company’s inner workings. In addition to the data they have about who rents which movies, they have panels of expert reviewers who
…receive a 36-page training document that teaches them how to rate movies on their sexually suggestive content, goriness, romance levels, and even narrative elements like plot conclusiveness. They capture dozens of different movie attributes.
Using this data, films are classified into 76,897 “altgenres” that can then be used for recommendations (e.g. ‘African-American Crime Documentaries’). Presumably similar data was used to inform the production of House of Cards.
Altgenres frequently include the name of an actor or director – ‘Dramas Starring Sylvester Stallone’ or, more pertinently here, ‘Mysteries starring Raymond Burr’. Raymond Burr, star of the 1950s TV series Perry Mason, is the single most mentioned actor in the categories. Perry Mason director Christian I. Nyby II is the most mentioned director, and Barbara Hale, who starred alongside Raymond Burr, is also very high, in fact directly above Clint Eastwood. Madrigal calls this the “Perry Mason Mystery” – why does the system rank all things Perry Mason so highly? When questioned about the Perry Mason Mystery, Todd Yellin, the designer of the system, simply says ‘These ghosts in the machine are always going to be a by-product of the complexity’.
In a similar quirk, a previous attempt of mine to measure the importance of historical figures using data from Wikipedia revealed Mircea Eliade, a virtually unknown Romanian historian, to be the fifth most linked person on the site.
Clearly, it would not make sense for Netflix to cast Raymond Burr in House of Cards, even if he were alive. This highlights the obvious point that Netflix are at most using their data to guide their intuition about the production – they aren’t going to make bizarre casting choices just because the system says so. This is not a step change from how TV production worked before: casting directors, for example, have always been guided by previous successes, perhaps now even by actual statistics, in making their choices.
The T-Shirt manufacturer Solid Gold Bomb is a useful illustration of what can happen when algorithms are allowed to make creative decisions on their own. Their system advertised thousands of different T-Shirts with automatically generated slogans printed on them through Amazon – hoping to find success through weight of numbers. If someone bought one it would be printed on demand, to avoid the expense of actually making such a variety of T-Shirts, most of which would never be purchased. Unfortunately, their system automatically generated misogynistic slogans such as “Keep Calm and Hit Her”, among others, causing outrage and the removal of all their products from Amazon.
Big Data has been most successful in scenarios where occasional failures can be tolerated. For example we can accept a credit card being blocked if a fraud detection system suggests it has been compromised, as long as we can remove the block if the alert is wrong.
In the case of designing T-Shirts, society found the failure of the algorithm morally unacceptable. In the case of producing a TV show, failure is too expensive: it’s impossible to imagine a commissioner defying a strong intuition and casting a seemingly inappropriate actor on the basis of statistical evidence.
The success of House of Cards will ensure that next time Netflix look at their data, Kevin Spacey, Political Dramas and director David Fincher will seem even more popular. If they were blindly to go by the numbers, they might see that the best thing to make was another House of Cards.
This is a problem that advertising platforms, such as Google AdWords, also face. They want to show the adverts that have been clicked on most, because they are the ones that will most likely be clicked on in the future. However, if this were all they did then there would be no opportunity to expose new, potentially even more effective adverts to users – because a new advert will, by definition, never have been clicked on. This is described as the Multi-Armed Bandit problem, and there are a number of mathematical solutions to it.
Another way to understand this problem is as it was posed by Richard Feynman. He thought about the problem of choosing what to eat in a restaurant – should you have the dish you know you like, or try something else, which could be disappointing, or even better?
All of the mathematical solutions to this problem balance choosing the option currently thought to be best with occasionally trying out riskier ones. Applied to the Netflix problem, this suggests commissioning some shows that are very likely to be winners (House of Cards), but also trying a few riskier things which might prove successful, but whose success is not so well predicted by the data. In the cut-and-dried, high-frequency world of online advertising, this might be a useful result. However, in terms of TV commissioning, isn’t that what already happens?
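One of the simplest strategies of this kind, usually called epsilon-greedy, can be sketched as follows. This is an illustrative sketch only, not a description of how AdWords or Netflix actually work: the function name `epsilon_greedy`, its parameters and the click probabilities are all hypothetical. On each round the simulation shows a randomly chosen advert with probability `epsilon` (exploring), and otherwise shows the advert with the best observed click rate so far (exploiting).

```python
import random


def epsilon_greedy(click_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate advert selection with hidden click probabilities.

    click_rates: the true (unknown to the algorithm) chance that each
    advert is clicked when shown. Returns (shows, clicks) per advert.
    """
    rng = random.Random(seed)
    n = len(click_rates)
    shows = [0] * n
    clicks = [0] * n

    def observed_rate(i):
        # An advert that has never been shown has no evidence, so it scores 0.
        return clicks[i] / shows[i] if shows[i] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try any advert, including ones with a poor record.
            arm = rng.randrange(n)
        else:
            # Exploit: show the advert that looks best on current evidence.
            arm = max(range(n), key=observed_rate)
        shows[arm] += 1
        if rng.random() < click_rates[arm]:
            clicks[arm] += 1
    return shows, clicks


if __name__ == "__main__":
    shows, clicks = epsilon_greedy([0.02, 0.05, 0.10])
    for i, (s, c) in enumerate(zip(shows, clicks)):
        print(f"advert {i}: shown {s} times, clicked {c} times")
```

Setting `epsilon=0` reproduces the trap described above: with no exploration, the first advert is shown on every round and the others never get the chance to prove themselves.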
Social scientist Donald T. Campbell formulated the following adage, which captures something important about the problems that Netflix, and cultural producers more generally, might face in formulating statistical forecasts of success:
The more any quantitative social indicator (or even some qualitative indicator) is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Economists have the same concept, due to Charles Goodhart:
Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
Big hit Hollywood films already follow a well observed formula [13,14], which is frequently derided as mechanical. In this area, we might expect Big Data to allow a fine tuning of ‘safe bet’ mainstream productions – but surely not to revolutionise the industry’s approach, since it’s already so mechanistic.
But in general, even if it is the case that House of Cards made important use of Big Data, it may not be the case that this approach can be continued into the future. Audiences, especially more discerning ones, might come to recognise the kinds of patterns it produces and become tired of them.
More importantly, if Big Data driven content becomes a regular occurrence, writers will respond to this new context – they might parody or satirise the phenomenon, choosing to deliberately ignore, exaggerate or lampoon certain effects to play on the audience’s expectations – in much the same way as the Spike Jonze film Adaptation pokes fun at script writing courses.
The three reasons given above are intended to describe some limitations of big data in the creative industries. One reason why Big Data, as a slogan, has gained so much traction is its facility for very directly increasing revenue by improving advertising effectiveness. In this sphere it suffers much less from the problems above: a poorly targeted advert rarely causes offence, algorithms can optimise while preventing convergence, and there is much less reflexivity.
A cynic might think Netflix already knows this. Perhaps their real play is to understand their customers in order to serve better targeted adverts in their films, not to directly shape their productions.
What could big data do?
In describing how Netflix likely uses its data, as a guide to temper intuitions, a new horizon for big data in the creative process may have been opened. Rather than looking at movie consumption data as a way of algorithmically generating TV programmes, perhaps a more productive approach would be to think of the data as more grist to the writer’s mill.
In this case, why stop at Netflix’s internal data? Projects such as IBM’s trend forecasting look as though they provide data which is just as relevant. By crunching through social media data, IBM attempt to distill what’s capturing people’s imagination. In two published examples, they predicted in 2012 (incorrectly?) that 2014 would be the year of Steampunk, and more plausibly that cycling would be the flavour of the zeitgeist for 2014. Even more ambitiously, proposals have been mooted to try to simulate the entire social world, which, as well as sounding like the plot of a film, might augment a company’s ability to formulate novel creative output that keys into the collective psyche.
In some sectors, human input is a (prohibitively) expensive, error-prone factor to be eliminated, but in the creative world it’s an intrinsic, and valued, part of the product. This does not mean that Big Data cannot play a part, but, unlike in other applications, in this case its role is to provide novel, inspirational support to the creative – and essentially human – process, rather than a substitute for it.
 Rajaraman, Anand, and Jeffrey David Ullman. Mining of massive datasets. Cambridge University Press, 2011.
 Narayanan, Arvind, and Vitaly Shmatikov. “How to break anonymity of the Netflix prize data set.” The University of Texas at Austin (2007).
 Chakrabarti, Deepayan, et al. “Mortal multi-armed bandits.” Advances in Neural Information Processing Systems. 2009.
 Feynman, Richard P., Robert B. Leighton, and Matthew Sands. Exercises for the Feynman lectures on physics. Basic Books, 2014.
 Goodhart, Charles Albert Eric. Monetary Theory and Practice: The UK Experience. Macmillan Publishers Limited, 1984.
 Snyder, Blake. “Save the cat.” The Last Book on Screenwriting You’ll Ever Need. 1st edition. Ingram Pub Services (2005).
 Hauge, Michael. Writing screenplays that sell. A&C Black, 2011.
 Paolucci, Mario, et al. “Towards a living earth simulator.” The European Physical Journal Special Topics 214.1 (2012): 77-108.