When Analytics Does Not Work


Unedited Google Recorder transcription from audio

Hi everyone. I'm Stephen Downes. Welcome to another episode of Ethics, Analytics and the Duty of Care. We're in module three, looking at ethical issues in analytics, and today's talk is on when analytics does not work. Just a preliminary before we get into this: I know that producing a whole bunch of videos and calling it the course isn't ideal pedagogy.

It's important to understand that these videos are not the course. These videos are things that I'm doing to create some content that we can work around. As well, I'm recording these videos in order to create an audio track, which I can convert into text, which gives me a body of textual material to work from, to eventually create a book out of this.

But the course itself is the graph that we've produced on the website that you've been working on. And it's important to understand that what makes this course isn't this content; what makes this course is your activities around this content. I will say, and I'm the first to recognize, this could be the world's most boring course, because in module one we went through a whole bunch of applications of AI and analytics, in module two and now module three we're going through a whole bunch of issues, and in module four we're going to do ethical code after ethical code.

I might change up how I deliver the videos, but still, it's pretty dry content. The idea here is to give ourselves the necessary layers of, I won't say knowledge, but layers of data or information that we can draw inferences from. One of the problems that I've found in the work being done on ethics and analytics is that this basic work isn't being done.

People are beginning from the point where their intuitions leave off about what ethical issues there are and what applications there are. As a result, they're getting a very narrow intersection of applications, issues, and therefore potential solutions. We're drawing this map, and drawing maps is boring, sorry, but if we don't do this work, we're not able to do the really interesting work that takes place in the second half of the course.

Anyhow, having said that, let's continue with when analytics does not work. So let's face it: learning analytics is complex. We're working with complex technologies, new technologies that we don't have a whole lot of experience working with, and it's difficult to master. It would be difficult to master even without the technologies, because it's based on the aggregation of big data, and statistics, and all the rest of it. If you're going to build these systems from scratch, you need advanced mathematics; who knew that that would be useful? And not necessarily the mathematics that they taught you in school, either. There's an awful lot of work done with matrices in analytics, and I don't know about you, but they never taught us about matrices at all in 13 years of mathematics education when I went to school.
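To give a sense of the kind of matrix work involved, here is a minimal sketch; the student data, the weights, and the idea of a single "layer" scoring students are all invented for illustration, not taken from any real system:

```python
# A minimal sketch of matrix work in analytics (all numbers invented).
import numpy as np

# Rows = students, columns = input measures (hours studied, quizzes taken)
X = np.array([[5.0, 3.0],
              [1.0, 8.0],
              [4.0, 4.0]])

# A made-up weight matrix mapping the two inputs to two internal features
W = np.array([[0.7, -0.2],
              [0.1,  0.9]])

features = X @ W  # one matrix multiplication scores every student at once
print(features)
```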

There are many sources of error, many sources of omission, and any of these is a place where analytics can go wrong and create an ethical issue. Here's a quick overall look at some of the things that can happen: difficulties with the training data, difficulties with the model, the collection of links, the structure of the network that's produced as a result, attacks on real-world data, stealing of models, etc. There are all kinds of places where analytics can fail.

So let's look at some of them. Error: just simple plain error is probably the first, the least talked about, but the most significant source of failure in learning analytics.

There are all kinds of ways you can make errors. The data might simply be wrong, and there are many ways data can be wrong: it can be mistranscribed, people might put in 2004 instead of 2014, they might spell people's names wrong, dates might be wrong, addresses wrong, all kinds of ways it can be mistranscribed. In this course I'm using automated transcription to create text.
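As a sketch of how a transcription error like that can be caught, here is a minimal range check; the field names and the valid range are hypothetical:

```python
# A minimal sketch of a data sanity check (hypothetical field names).
# A typo like 2004 for 2014 will survive unless values are range-checked.
records = [
    {"name": "Alice", "enrolment_year": 2014},
    {"name": "Bob",   "enrolment_year": 2004},  # mistranscribed year
]

VALID_YEARS = range(2010, 2025)  # assumed valid range for this dataset

for record in records:
    if record["enrolment_year"] not in VALID_YEARS:
        print(f"Suspect record: {record}")  # flag for human review
```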

Automated transcription is a source of error, and if I use that text as input to an analytics engine, I'm introducing error. Predictions may be incorrect. Most analytics, or at least most predictive analytics, is based on regression, which means drawing a line through the data, and we should know that these lines do not always predict accurately.
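Here is a minimal illustration of that point; the study-hours numbers are invented, and the lesson is only that a fitted line extrapolates blindly:

```python
# A minimal regression sketch (made-up numbers): fit a line, then watch it
# mispredict when pushed beyond the range of the observed data.
import numpy as np

hours  = np.array([1, 2, 3, 4, 5], dtype=float)        # hours studied
scores = np.array([52, 58, 66, 71, 74], dtype=float)   # quiz scores

slope, intercept = np.polyfit(hours, scores, 1)  # draw a line through the data

predicted = slope * 40 + intercept  # extrapolate far outside the data
print(f"Predicted score for 40 hours: {predicted:.0f}")  # well over 100
```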

There may be poor implementation: the wrong algorithms are used, the wrong kinds of algorithms are used, or they're organized in the wrong way. Even more, there are different meanings of "correct". What is the correct interpretation of a line on a graph? What is the correct data to use in order to make the sort of inference that you want? When errors are made, who's accountable? Is it the person who collected the data, the person who wrote the software, the person who implements it in practice? And, this will come up again, how do you correct errors in an analytics system? There's often no way to retract or change errors that have been made, because you don't know where they were made.

That's just one kind of ethical issue. All of these can raise ethical issues, because if you take any of these errors and then apply them to a real-world situation, you may well be causing much more harm than good. The second aspect in which analytics can fail is with respect to reliability. Again, we can look at the data that goes into a learning analytics system: right off the bat, you want reliable data, as opposed to, for example, suspicion, rumor, gossip, and unreliable evidence.

Imagine we took the collected speeches of Donald Trump as input for our analytics system. What kind of result would that produce? If you have reliability, you're protecting your system from accidental inconsistency, some data saying the temperature is five degrees and, at the same time, other data saying the temperature is eight degrees, and you're also protecting it from deliberate manipulation.

There are different kinds of reliability, and I've illustrated them here on the slide. One type is inter-rater reliability: when two people do the same thing, you get the same data. Test-retest reliability: if one person does something at one point and does the same thing at a later point, you get the same data.

That's a type of reliability. Parallel forms reliability is very similar: if you do something once and you do it a second time, the results of version A and version B are the same. And then there's internal consistency reliability: the overall approach is such that if you have one attribute, and you have another attribute, and another attribute, and they're all produced in the same way, then they're all the same attribute.
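As a rough illustration of one of these, here is a test-retest check sketched in a few lines; the scores are invented, and a real reliability analysis would be more careful than a single correlation:

```python
# A minimal sketch of a test-retest reliability check (made-up scores).
# A correlation near 1.0 across two sittings suggests consistent measurement.
import numpy as np

first_sitting  = np.array([70, 65, 80, 55, 90], dtype=float)
second_sitting = np.array([72, 63, 79, 58, 88], dtype=float)

r = np.corrcoef(first_sitting, second_sitting)[0, 1]
print(f"Test-retest correlation: {r:.2f}")
```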

So you can see, there are many different ways that analytics can go wrong. It just takes a simple coding error, and believe me, I know: you have two forms for input data, because you're getting input data from different sources, and you code one slightly differently than you code the other, and that creates a reliability issue. Because now, even if it's the same data, it's coming in in different forms; it might look different in the database, and you've built in a source of error, as in the sketch below.
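Here is a minimal sketch of that two-forms problem; the forms, the grade coding schemes, and the mapping table are all hypothetical:

```python
# A minimal sketch of the two-forms coding problem (hypothetical schemes).
# Two intake forms record the same grade differently, so identical data
# looks different in the database until it is normalized.
from_form_a = {"student": "alice", "grade": "A-"}   # letter grade
from_form_b = {"student": "alice", "grade": 3.7}    # grade points

LETTER_TO_POINTS = {"A": 4.0, "A-": 3.7, "B+": 3.3}  # assumed mapping

def normalize(record):
    """Coerce every record to grade points before analysis."""
    grade = record["grade"]
    if isinstance(grade, str):
        record["grade"] = LETTER_TO_POINTS[grade]
    return record

print(normalize(from_form_a) == normalize(from_form_b))  # True once coded alike
```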

Consistency failure. Consistency, basically, well, there are different ways to look at it. The definition here is when the state recorded by one part of the network is different from the state recorded by other parts of the network.

A lot of analytics systems are based on distributed systems. It's not one big central database; you have a variety of different databases. In a distributed system, one of the key challenges is making sure that all of these different databases end up reporting the same thing, so that you don't have one database saying Bob is 40 years old and another database saying Bob is 42 years old.

Those two differences have to be reconciled in some way. And that's a hard problem, like a really hard problem, for distributed systems. It's a hard problem even in databases generally, which is why an entire theory called database normalization was created, and why the concept of a single source of truth was created.
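Here is a minimal sketch of detecting that kind of disagreement between stores; the two databases and the records in them are hypothetical:

```python
# A minimal sketch of a cross-database consistency check (hypothetical data).
registry_db = {"bob": {"age": 40}}
billing_db  = {"bob": {"age": 42}}

def find_conflicts(store_a, store_b, field):
    """Report records where two stores disagree on the same field."""
    conflicts = []
    for key in store_a.keys() & store_b.keys():   # records present in both
        if store_a[key][field] != store_b[key][field]:
            conflicts.append((key, store_a[key][field], store_b[key][field]))
    return conflicts

print(find_conflicts(registry_db, billing_db, "age"))  # [('bob', 40, 42)]
```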

But sometimes you can't normalize a database. Sometimes you can't have a single source of truth, and in that case you have to be checking for consistency. And if you're not, well, there's your source of error. Bias is probably the most talked-about source of error in analytics. The problem of bias pervades AI and analytics.

It can be in the data itself, it can be in the collection of the data, the management of the data, the analysis or interpretation of the data, or in the application of that interpretation. We all know the story, and it'll come up again, of Tay, the Microsoft bot that became very racist because its input was racist.

We've heard of cases where bias in the sampling resulted in a bias in the enforcement of the law. For example, a sample saying that black people in a certain district are more likely to commit crimes, when it turns out that black people in that district were more likely to be policed, and more likely to be accused of crimes, because of racism on the part of the police force.

But now, once this gets taken as data into the system and then applied, your system is going to tell you that these black people in this place are criminals. Well, they're not; they're the victims of racism. And that's why bias is such a persistent and pervasive issue in analytics. We'll talk about it more through the course; here we're just flagging it as one of many sources of error in such systems.

Misinterpretation. I mentioned this a bit earlier: analytics engines don't know what they're watching. And so if they draw conclusions, for example if they identify entities in the data, there is a persistent possibility that they will misinterpret what that data is. I've put down here a famous illustration called the duck-rabbit. For those of you who can't see the image, it's a drawing, and when you look at it you can see, on the one hand, a duck, with the two long parts of its bill, and then its eye, and so on. Or, if you shift your perspective ever so slightly, it looks like a rabbit: the long things are actually its ears, and it's actually looking the other way from the duck. Now, we can shift back and forth between these perspectives fairly easily as humans. An AI can't.

And in tests, we found that the AI thought it was definitely, absolutely a duck, or maybe a goose, waterfowl of some sort; the possibility that it's a rabbit just never occurred to it. So that's an example of misinterpretation, and it's a persistent problem with artificial intelligence.

Distortion is another source of error in AI; we see it often in its effects. It's well known that people can be gradually led into supporting more and more extreme views. This is a well-known side effect of recommendation engines, and we talked about it in a previous section. And it's well known that when people have taken a position on an issue, they will, when questioned, entrench their views, interpret evidence in favor of their views, and see the world from the perspective of their views. This leads to a hardening of position, and sometimes a radicalization of position. Now, the same thing that can happen to a human in this way can happen to an artificial intelligence: once it decides that it's going to lean a certain way, it just keeps going that way.

That's what happened with the duck-rabbit thing, right? But it doesn't just lean that way. It creates a feedback loop, where now it begins to interpret everything as evidence for leaning that way, even if it's not good evidence for leaning that way. And so it goes further and further into that position, thereby not just misrepresenting the phenomenon but actually distorting it.
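Here is a toy sketch of that feedback loop; the numbers and the update rule are invented purely to show the drift, not drawn from any actual classifier:

```python
# A toy sketch of a distortion feedback loop (invented numbers, not a real model).
# The system's current lean biases how genuinely ambiguous evidence is read,
# and each biased reading then strengthens the lean.
lean = 0.55  # initial slight lean toward "duck" (0 = rabbit, 1 = duck)

for step in range(8):
    # Ambiguous evidence is perceived as slightly favoring the current lean
    perceived = 0.5 + 0.2 * (lean - 0.5)
    # Bayesian-style update using the biased perception as the likelihood
    lean = lean * perceived / (lean * perceived + (1 - lean) * (1 - perceived))
    print(f"step {step}: confidence it's a duck = {lean:.2f}")
```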

So this is distortion, and again, here is where we have the instance of Tay, which became a racist bot very quickly, simply by taking racist input and then expanding on it and building on it.

Bad pedagogy. This one is specific to learning analytics, but if you think about it, all the ways that an artificial intelligence can support pedagogy are also ways AIs can support bad pedagogy, if they're poorly applied. So, the good kinds of things, and I've got a list of them here: personalized learning, adaptive group formation, 24/7 response, virtual reality learning for all, new methods of teaching, assistance for teachers, or increasing the tech experience for students. These are all good things, but they can all be misapplied. Personalized and customized learning can descend into stereotypes.

There is, for example, right now a huge debate about whether learning styles are a thing. Well, you can use analytics to detect learning styles and then use that to shape the pedagogy. According to the people who say learning styles are a myth, this would create bad pedagogy.

Same with group formation. If you do it badly, you'll end up with groups of one person. If you do it badly, you'll end up with groups where the people all come from the same place, or all have the same background, etc., instead of the desired diversity in groups that you might like. New methods of teaching: again, garbage in, garbage out, right?

If the new methods of teaching that you're using the AI for are not good methods of teaching, the AI is simply going to amplify and implement those bad methods. And so on; we could go through this entire list.

Irrelevance. This is kind of an interesting one. Imagine the scenario in which AI produces no positive learning outcome, or only a minimal learning outcome, or perhaps outcomes of minimal value. Now the question becomes: is it ethical to spend all of this time and money developing AI solutions and applying AI solutions when you're not really getting any benefit? And this isn't just idle thinking; there's a reference here that shows that there are significant negative impacts of AI.

And if we look at this as measured against the UNESCO sustainable development goals, and if you look at education specifically, the ratio is roughly four to three in favor, which means for every four positive benefits there are three negative impacts. That's pretty much a saw-off. And the use of AI could cause considerable other harms; I won't say "negative benefits", that's a terrible way of putting it. Other people have talked about these: the use of electricity and so on, the environmental impact, the dehumanizing impact, all of these counting against the positive pedagogical output.

So those are the places where artificial intelligence can be in error. You may think of more, and if you do, feel free to submit them: just go to the All Issues page in module three, and you'll find a form where you can submit your own issue and categorize it as an instance where artificial intelligence does not work. But I think it should be clear.

Again, everybody talks about bias, and yes, bias is a source of error for artificial intelligence, a significant one, and the source of a lot of grief and heartache, but it's only one of many. There are many ways that analytics and artificial intelligence can introduce error into our application of analytics, and therefore cause harm, by misrepresenting the environment that they've been entrusted with not just representing, but understanding, and identifying best practices or good applications from.

So that's it for this video. I'm Stephen Downes. We've still got a few more videos in this module, but I hope you enjoyed this one as much as you can enjoy a list of errors, and I'll see you soon.
