Someone Like You: How Big Should Big Data Get?

Deputy Features Editor Kyi Yeung Goh analyses the increasing role that ‘big data’ plays in public life, and the trade-off between certainty and free choice

Living with uncertainty is often uncomfortable. It is no surprise, then, that we invest an inordinate amount of time trying to establish whether there might be more absolute inevitabilities than death and taxes in our lifetimes. This endeavour has spawned whole industries across human history: astrology, fortune telling, insurance and political punditry, amongst others. Although little real progress has been made in this regard, we have become (or have convinced ourselves that we are) better equipped than ever to claim, probabilistically, that something will occur. Part of the reason, as Cathy O’Neil argues in her Weapons of Math Destruction, is the information revolution, which has made available a seemingly endless supply of data points with which to examine human behaviour.

Through your weekly grocery purchases, supermarkets can now offer personalised discount coupons for goods you actually buy regularly. Yet things start to feel a little strange when your Instagram feed shows an advertisement for a product you bought just a few minutes earlier. To O’Neil, the use of these algorithmic models can be highly dubious. At the heart of her concern lies the question of whether proxies should be used to guide large-scale and often life-changing social decisions. In an ideal world, anyone seeking to ascertain an individual’s behaviour would simply ask him or her about it directly. With limited resources and almost infinite needs, however, we often subcontract this task to models that, in theory, can approximate our preferences well enough. So instead of asking how you might behave, these models end up considering how a person like you would behave.
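To make that “person like you” logic concrete, here is a minimal sketch of a proxy recommender: it never asks the shopper anything, it simply suggests whatever the most similar baskets already contain. The shoppers, goods and similarity rule are all invented for illustration and are not drawn from any real system.

```python
# A minimal sketch of a "people like you" proxy: rather than asking a shopper
# what they want, recommend what their most similar shoppers already buy.
# All data and names here are invented for illustration.
from collections import Counter

# Each shopper is described by the set of goods they regularly buy.
purchases = {
    "you":   {"oat milk", "coffee", "rye bread"},
    "amira": {"oat milk", "coffee", "granola"},
    "ben":   {"cola", "crisps", "frozen pizza"},
    "chen":  {"coffee", "rye bread", "dark chocolate"},
}

def similarity(a, b):
    """Jaccard overlap between two baskets: shared items / all items."""
    return len(a & b) / len(a | b)

def recommend(target, k=2):
    """Suggest goods bought by the k shoppers most similar to `target`."""
    others = [(name, basket) for name, basket in purchases.items() if name != target]
    others.sort(key=lambda nb: similarity(purchases[target], nb[1]), reverse=True)
    suggestions = Counter()
    for _, basket in others[:k]:
        suggestions.update(basket - purchases[target])  # only items the target lacks
    return [item for item, _ in suggestions.most_common()]

print(recommend("you"))  # e.g. ['granola', 'dark chocolate']
```

The point of the toy is that the output depends entirely on who the model thinks your neighbours are, not on anything you were ever asked.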

At first glance, this all seems well and good. It is, on its face, a fair trade-off: you gain enormously in operational efficiency for a small price in accuracy. There should, however, be exceptions. Our alarm bells should start ringing when models become opaque, scalable and, most crucially, damaging to society. Take predictive policing, for example. Most people have little idea how these algorithms work, and there is little room for feedback to be processed by the system. Opacity? Check. Under budget constraints, there is increasing pressure on police forces to maximise the efficiency of their limited resources; in some cases, this has led to the use of algorithms to predict criminal activity across the United States. Scalability? Check. If not properly handled, “predictive policing” assigns greater resources to hotspots, which in turn could “lead to discovery of more minor crimes and a feedback loop which stigmatizes poor communities.” Damaging to society? Check. Essentially, there is a risk that such models evolve into large-scale and damaging “garbage in, garbage out” operations. The same problem can surface in many domains: human resources, insurance, work performance, university rankings and loan services. Proxies are not always available and, even when they are, they can lure us into a false sense of certainty.
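The feedback loop is easier to see with a toy example. The sketch below imagines two districts with identical underlying offending, where each patrol unit also records some minor offences and the allocation rule simply shifts patrols towards whichever district logged more crime last year. Every number is invented for illustration, and the rule is a caricature of a hotspot model, not any force’s actual system.

```python
# A toy feedback loop: two districts with the same true offending, but patrols
# "discover" extra minor offences, and next year's patrols follow this year's
# recorded counts. All figures are invented for illustration.

REPORTED = 50            # offences reported by residents, identical in both districts
DISCOVERED_PER_UNIT = 5  # extra minor offences each patrol unit records

patrols = {"district_a": 6, "district_b": 4}  # a small initial imbalance

for year in range(1, 6):
    recorded = {d: REPORTED + DISCOVERED_PER_UNIT * patrols[d] for d in patrols}
    # "Hotspot" rule: shift one unit from the quieter district to the busier one.
    hot, cold = sorted(patrols, key=lambda d: recorded[d], reverse=True)
    patrols[hot] += 1
    patrols[cold] = max(patrols[cold] - 1, 0)
    print(f"year {year}: recorded={recorded}, patrols next year={patrols}")
```

Run it and the recorded gap between the two districts widens every year, even though the underlying behaviour never changes: the model ends up measuring its own attention.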

At an LSE lecture titled Politics after Brexit and Trump, Professor Richard Pildes reminded the audience that politics is often shaped by what he termed ‘invisible choices’ or, in academic jargon, intervening variables that throw ‘predictions’ off course. One can attempt to proxy voting patterns, but there is always a chance that things go awry. Just ask Ada, the computer algorithm used by Hillary Clinton’s 2016 presidential election campaign. Although it ran some 400,000 daily simulations of how a race against Trump would pan out, it failed to pick up the importance of “Rust Belt” states such as Wisconsin and Michigan until it was too late.

There are clear signs that governments, shareholders and interest groups are increasingly alarmed by the power these models wield. Where should our worries begin? For starters, there is unregulated political influence. Michael Brand of Monash University points to the “covert meddling” Facebook carried out during the 2012 and 2014 election seasons, which involved the manipulation of over 1.9 million users’ newsfeeds. In the 2010 Congressional elections, Facebook collaborated with the University of California, San Diego to study whether publicising the act of voting would increase voter turnout. This involved a “voter megaphone”, through which users reported to their friends that they had voted via an “I Voted” button. In the end, the button’s call to action increased the total vote count by an estimated 340,000 votes. Then there are cases of unwitting intervention, as in the 2016 US presidential election and in Ukraine.

Where do we go from here? First, we should make better use of existing legal infrastructure. This could mean employing competition law to prevent monopolies on consumer information, or ensuring that criminal misuse of data is actively and heavily punished. Second, governments and interest groups alike need to demand greater transparency from these organisations. That may take the form of open access to anonymised information and the ability to subject algorithms to some form of oversight.

Therein lies the challenge of our times: to minimise uncertainty without ceding yet more of our already limited ability to choose.
