From the course: Ethics and Law in Data Analytics

Descriptive analytics and identity

- By now, you have heard both terms, descriptive and predictive analytics. I want to focus now on descriptive analytics. Predictive and descriptive analytics are similar in that they both depend on massive sets of aggregated data. The difference is in what they do with it. Predictive analytics takes that aggregated data and then uses statistical models to forecast, or predict, future behavior. Descriptive analytics takes that aggregated data and then feeds it to machine learning algorithms to form a picture of the identity of the person or thing it is trying to describe. So, one tries to describe the present or past and one tries to predict the future, both based on massive amounts of aggregated data.

We have already looked at predictive analytics, so what about descriptive analytics? I find it's sometimes helpful when talking about ethical issues to make them as personal as possible. So, here are some questions you will want answered about your own data: What data is being aggregated about me? Is it painting an accurate picture of who I am? And how is the data going to be used?

The knee-jerk response many people have is to demand privacy and insist that no one should have access to their data at all. We'll talk about that problem shortly; it's actually a pretty tricky one. But let's say this: at least if they're going to find out a lot about me, I hope whatever they conclude is accurate. This anxiety, pulling in two directions at once, is nicely captured by Kate Crawford with an analogy: we worry that our data is like a harsh fluorescent light that shows both too much of us and not enough at the very same time, and therefore displays a false and unflattering picture of us. For one thing, many crumbs of our digital trail depend deeply on context for their meaning.
For example, what if you posted a highly sarcastic comment on a social media site that got a quick laugh from your friends but might make you seem like a menacing person if an algorithm took it literally? I mean, do we know how to make algorithms that understand sarcasm?

But what if we give the algorithm something a little less open to interpretation? In a highly publicized 2013 study conducted at the University of Cambridge, researchers were able to train algorithms to discern a user's race, gender, religion, political affiliation, and other personal characteristics with a high degree of accuracy simply by correlating their likes on Facebook. That's kind of cool, and it has obvious uses for businesses looking to develop new products and politicians trying to craft their message. But when I say high degree of accuracy, I'm not talking about 100%. I'm talking about 90%, 80%, 70%, or worse. And that should give us pause, because as a matter of fact, there are already many proprietary systems out there trying to figure out who the bad guys are by combing through these massive datasets. Now, I'm all for cops catching bad guys, but you have to wonder: is there some part of your digital profile that is being misinterpreted and putting you on some kind of bad-guy list?

In response to this, you might say that at least you should have a right to see what descriptive analytics thinks it knows about who you are, and then you should have a chance to challenge those findings if you believe they are inaccurate. Actually, that's also complicated, because let me tell you what the response will be. The users of descriptive analytics will say that access to your digital trail actually lets them know you better, in some ways, than you know yourself. Of course, you know things about yourself such as religion and sexual orientation, because you're the one who defines those things about yourself. But consider this example: maybe you don't think of yourself as a materialistic person.
In fact, you go on rants about the evils of materialism and people spending money on things they don't need, and maybe you even post a blog about it. But then an algorithm could tally how much time you spend online looking at luxury goods and what your past purchases are, and discover that you actually are on the materialistic side. Maybe that surprises you; maybe you are much more materialistic than you realize. So, what users of proprietary descriptive analytic algorithms would say is that allowing you to go back and challenge the characteristics they have discovered about you is going in the wrong direction: descriptive analytics tells us things about you that you might not know.

So, do we have identity rights in this new world? Will governments have a standard of due process similar to the one they have for the physical world? As Eva has been pointing out, there are currently no laws governing this space. So, we'll have to see.
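To make the Cambridge-style study discussed above a little more concrete, here is a minimal sketch of the underlying idea: treat each user as a binary vector of page likes and fit a simple classifier to predict a personal trait. Everything here is invented toy data for illustration only; the actual study used millions of real Facebook likes and more sophisticated models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 hypothetical users, 30 hypothetical pages.
# X[i, j] = 1 means user i "liked" page j.
n_users, n_pages = 200, 30
X = rng.integers(0, 2, size=(n_users, n_pages)).astype(float)

# Assume (purely for this sketch) that the trait we want to predict
# correlates with liking pages 0-4, plus some noise.
true_w = np.zeros(n_pages)
true_w[:5] = 2.0
y = (X @ true_w + rng.normal(0, 0.5, n_users) > 5.0).astype(float)

# Plain gradient-descent logistic regression -- no ML library needed.
w = np.zeros(n_pages)
b = 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    grad_w = X.T @ (p - y) / n_users         # gradient of the log-loss
    grad_b = np.mean(p - y)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
accuracy = np.mean(pred == y)
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is not the model itself but how little it needs: a matrix of likes and a handful of labeled examples are enough to recover a trait the user never stated anywhere, which is exactly the capability the lecture is worried about.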
