Vesselin Popov: "What kind of world would we all like to live in?"

Vesselin Popov is Business Development Director for the University of Cambridge Psychometrics Centre, a multidisciplinary research institute specialising in online behaviour and psychological assessment. He is coming to Open Innovations to take part in the big data session on the third day of the Forum. Exclusively for the Forum’s website, Vesselin has shared his views on the prospects for and development of big data.

"Artificial intelligence and machine learning are transforming many industries at a staggering rate. The data needed to drive innovations in education, health, finance, law and other areas already exists and is already being mined to automate certain functions and predict a range of outcomes. However, what is largely lacking from these efforts is a sensitivity to the sociological and ethical implications of placing an excessive reliance on ‘black box’ systems. By this I mean algorithms whose decision-making process is not transparent or easily auditable and which are therefore at risk of perpetuating historical prejudices contained in the data on which the models were trained. We need a far more multidisciplinary approach to AI, incorporating techniques and knowledge from psychology, philosophy, history and other social sciences, in order to rid our technologies of machine bias and ensure that everyone can share equally in the exciting breakthroughs that lie before us.

If we use Big Data and AI responsibly and inclusively, everyone stands to benefit. The increasing ability to automate menial and non-creative tasks, while certainly threatening to many jobs, could also provide us with more leisure time to spend with our families and care for our ageing populations. Predictive health could help us optimise treatments, discover cures faster and save lives. Smart cities and smart grids could help us live more environmentally sustainable lifestyles and improve the quality of life for future generations. All these opportunities and more are attainable using Big Data and predictive technologies, but they cannot be achieved by technology alone. For society to benefit from Big Data we need to be having more inclusive and more intelligent debates about the kind of world we would all like to live in – about what ought to be digitized and what should remain private, about where power lies between organisations, governments and individuals, and many other issues. We need continuing research, interdisciplinary collaboration and well-reasoned laws and policies to get the most out of Big Data. Events such as the Open Innovations Forum are therefore indispensable in contributing to and enabling this open debate.

One of the pitfalls of Big Data that I think we ought to talk more about is the evaluation of the quality of training data. It can be tempting to analyse and find correlations in large unstructured datasets without necessarily investigating what biases might be present in the environment, whether it be an occupational, educational, clinical or other complex setting. In most of these cases, the data in question will relate to human behaviour of some kind, and we know that human behaviour (while predictable in some cases) can also be highly irrational, emotional or unfair. An algorithm that seeks to minimise the error of its predictions and to accurately replicate the outcomes of a human system can end up concealing or even perpetuating certain social or individual prejudices that ought to be dealt with rather than being automated. There have been many examples of this: Google search showing women adverts for lower-paying jobs; the Princeton Review tutoring site showing Asian parents higher prices than non-Asian parents; minority ethnic households in the UK being charged up to £450 a year more for car insurance due to indirect discrimination in a pricing algorithm; searches for black-sounding names being more likely to serve up ads suggestive of a criminal record than searches for white-sounding names; and predictive policing software in the United States disproportionately scoring minority ethnic convicts with higher risks of re-offending. Clearly, most if not all of these indirect effects are unintended by the developers of the software, but they reveal the risk that an enthusiasm for using Big Data techniques can at times distract from the real problem at hand. These deeper societal challenges can certainly be tackled with a Big Data mindset, but we should be wary of claiming this technology to be a silver-bullet solution. As Samuel Taylor Coleridge wrote in 1831, “If men could learn from history, what lessons it might teach us! But passion and party blind our eyes, and the light which experience gives us is a lantern on the stern, which shines only on the waves behind us.” Perhaps he was right, but it is not too late to learn from and improve upon our past."