
Bias

Category: When Analytics Does Not Work

The subject of bias in analytics is wide and deep. In one sense, it is merely a specific way analytics can be in error or unreliable. But more broadly, the problem of bias pervades analytics: it may be in the data, in the collection of the data, in the management of the data, in the analysis, and in the application of the analysis.

The outcome of bias is misrepresentation and prejudice. For example, "the AI system was more likely to associate European American names with pleasant words such as 'gift' or 'happy', while African American names were more commonly associated with unpleasant words." (Devlin, 2017) "The tales of bias are legion: online ads that show men higher-paying jobs; delivery services that skip poor neighborhoods; facial recognition systems that fail people of color; recruitment tools that invisibly filter out women" (Powles and Nissenbaum, 2018).

One cause of bias lies in the data being used to train analytical engines. "Machine learning algorithms are picking up deeply ingrained race and gender prejudices concealed within the patterns of language use, scientists say." (Devlin, 2017)
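The kind of association described here can be measured directly in word embeddings by comparing cosine similarities, in the style of the word-embedding association tests the research coverage refers to. The sketch below is a minimal illustration using tiny hand-made vectors (an assumption for clarity; real tests run over trained embeddings with hundreds of dimensions):

```python
import math

# Toy word vectors, hand-made for illustration only. Real embeddings
# (e.g. word2vec or GloVe) are learned from large text corpora, which
# is precisely where the ingrained prejudices come from.
vectors = {
    "flowers":    [0.9, 0.1, 0.2],
    "insects":    [0.1, 0.9, 0.2],
    "pleasant":   [0.8, 0.2, 0.3],
    "unpleasant": [0.2, 0.8, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def association(word):
    """How much more strongly a word associates with 'pleasant'
    than with 'unpleasant' (positive = leans pleasant)."""
    return (cosine(vectors[word], vectors["pleasant"])
            - cosine(vectors[word], vectors["unpleasant"]))

print(association("flowers"))  # positive: leans pleasant
print(association("insects"))  # negative: leans unpleasant
```

The same differential-association measure, applied to names instead of flowers and insects, is how the reported name-based disparities were quantified.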

Another cause is inadequate data. For example, Feast (2019) writes of 'omitted variable bias', which "occurs when an algorithm lacks sufficient input information to make a truly informed prediction about someone, and learns instead to rely on available but inadequate proxy variables." For instance, "if a system was asked to predict a person's future educational achievement, but lacked input information that captured their intelligence, studiousness, persistence, or access to supportive resources, it might learn to use their postal code as a proxy variable for these things. The results would be manifestly unfair to intelligent, studious, persistent people who happened to live in poorer areas" (Eckersley et al., 2017).
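The postal-code scenario can be sketched as a small simulation. All numbers, names, and distributions below are hypothetical, chosen only to show the mechanism: when the true driver of the outcome is unobserved, a model built on the available proxy systematically under-predicts capable people in the disadvantaged group.

```python
import random

random.seed(0)

# Hypothetical setup: achievement truly depends on studiousness, but
# the model never observes it. It only sees the postal area, which
# merely correlates with studiousness via access to resources.
def make_person():
    area = random.choice(["wealthy", "poorer"])
    # Resources shift the group's average, but individuals vary widely.
    studiousness = random.gauss(0.7 if area == "wealthy" else 0.5, 0.15)
    achievement = 100 * studiousness  # the true relationship
    return area, studiousness, achievement

people = [make_person() for _ in range(10_000)]

# "Model": with postal area as the only feature, the best it can do
# is predict each area's average achievement.
avg = {}
for area in ("wealthy", "poorer"):
    scores = [a for ar, _, a in people if ar == area]
    avg[area] = sum(scores) / len(scores)

# A highly studious person (0.9) in the poorer area is judged by the
# area average, not by their actual ability.
true_score = 100 * 0.9
predicted = avg["poorer"]
print(f"predicted {predicted:.0f}, true ability implies {true_score:.0f}")
```

The gap between `predicted` and `true_score` is exactly the unfairness the quote describes: the proxy variable punishes individuals for the group statistics of where they live.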

A third cause of bias is found in the use of labels in data collection and output. "The vast majority of commercial AI systems use supervised machine learning, meaning that the training data is labeled in order to teach the model how to behave. More often than not, humans come up with these labels" (Feast, 2019).

It may be argued that we have always faced the problem of bias. "A problematic self-righteousness surrounds these reports: Through quantification, of course we see the world we already inhabit" (Powles and Nissenbaum, 2018). It is true that discrimination and prejudice have a long history. However, applying analytics to them amplifies the problem. "AI is not only replicating existing patterns of bias, but also has the potential to significantly scale discrimination and to discriminate in unforeseen ways" (Fjeld et al., 2020: 48).

Examples and Articles

A Detroit community college professor is fighting Silicon Valley’s surveillance machine. People are listening.
"Far from academia’s elite institutions, Gilliard, 51, has emerged as an influential thinker on the relationship between trendy tech tools, privacy and race. From “digital redlining” to “luxury surveillance,” he has helped coin concepts that are reframing the debate around technology’s impacts and awakening recognition that seemingly apolitical products can harm marginalized groups. While some scholars confine their work to peer-reviewed journals, Gilliard posts prolifically on Twitter, wryly skewering consumer tech launches and flagging the latest example of what he sees as blinkered techno-optimism or surveillance creep. (Among his aphorisms: “Automating that racist thing is not going to make it less racist.”) It’s an irony of the world Silicon Valley has constructed that an otherwise obscure rhetoric and composition teacher with a Twitter habit could emerge as one of its sharpest foils. Among a growing chorus of critics taking on an industry that’s remolding the world in its image, Gilliard is not the most prominent or credentialed. Yet his outsider status is integral to a worldview that is finding an audience not only on social media but in the halls of academia, journalism and Washington."


AI for Good: Battling Bias Before it Becomes Irreversible
"Melvin Conway observed that how organizations were structured would have a strong impact on any systems they created. This has become known as Conway’s Law and it holds true for AI. The values of the people developing the systems are not just strongly entrenched, but also concentrated."


Bias on the Web
Good overview of a number of different sources of bias. Ricardo Baeza-Yates, Communications of the ACM, June 2018, Vol. 61, No. 6, pp. 54-61. DOI: 10.1145/3209581


