Big Data is big. That much is obvious. What might not be so obvious is just how powerful, and how error-prone, it can be. Susan Etlinger of Altimeter Group noted in her 2014 TED talk that “we can process exabytes of data at lightning speed, which also means we have the potential to make bad decisions far more quickly, efficiently, and with far greater impact than we did in the past.”

This growing capacity to make bad decisions rapidly by blindly accepting the data presented, especially when that data arrives without context or supporting material, or is judged against pre-existing demographics built on broad assumptions, is an obstacle that organisations will need to overcome. To do so, Etlinger believes that organisations using big data platforms like IDEA Data Analysis need to start adhering to certain ethical principles.

Specifically, Etlinger and colleague Jessica Groopman recommend paying particular attention to The Information Accountability Foundation’s (IAF) paper A Unified Ethical Frame for Big Data Analysis and its “Principles of Ethical Use”: that data use should be beneficial, progressive, sustainable, respectful, and fair.

1. Beneficial

“Data scientists, along with others in an organization, should be able to define the usefulness or merit that comes from solving the problem so it might be evaluated appropriately.” (IAF)

Before any data is compiled for analysis, there should be an expectation that the processing will deliver value to all concerned parties. Joshua Kanter, senior vice president of revenue acceleration at Caesars Entertainment, mentions, “Before conducting any new analysis, we ask ourselves whether it will bring benefit to customers in addition to the company. If it doesn’t, we won’t do it.”

2. Progressive

“If the anticipated improvements can be achieved in a less data-intensive manner, then less-intensive processing should be pursued.” (IAF)

According to Etlinger and Groopman, the value of progressiveness rests largely on the expectation of continuous improvement and innovation. In other words, what organisations learn from applying big data should help deliver better and more valuable results.

3. Sustainable

“Big-data insights, when placed into production, should provide value that is sustainable over a reasonable time frame.” (IAF)

Sustainability, according to the authors, breaks down into three categories: data, algorithmic, and device- and/or manufacturer-based.

  • Data sustainability: Sustaining value is closely related to what access organizations have to different social data sets. “While this is a fact of access and economics, it can wreak havoc when sets of data from public and private sources are combined,” mention Etlinger and Groopman. “The issue of sourcing also comes into play… Inconsistencies in sample sizes or methodologies affect the integrity of the data and the sustainability of the algorithm.”
  • Algorithmic sustainability: A critical element of sustainability is an algorithm’s longevity. The Altimeter report suggests longevity is affected by how the data is collected and analyzed.
  • Device- and/or manufacturer-based sustainability: A third consideration is the lifespan of the data being collected. “For example, if a company develops a wearable or other networked devices that collect and transmit data, what happens if that product is discontinued, or the company is sold, and the data is auctioned off to a third party?” ask Etlinger and Groopman.

4. Respectful

“Big-data analytics affect individuals to whom the data pertains, organizations that originate the data, organizations that aggregate the data, and those that might regulate the data in different ways.” (IAF)

In their report Etlinger and Groopman state, “The advent of social and device-generated data captured in real time decimates the norms for data analytics[…] As a result, even seemingly minor decisions can have tremendous downstream implications.”

As can be expected, the individuals who originated the data are affected most by big-data analysis, particularly when it makes private, semi-private, or even already public information more public.

5. Fair

The IAF paper states that “United States law prohibits discrimination based on gender, race, genetics, or age.” So does United Kingdom law. “Yet,” the paper continues, “big data processes can predict all of those characteristics without actually looking for fields labeled gender, race, or age.”
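
The point that a characteristic can be inferred without any field carrying its name can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the records are invented, and the field names, the function names, and the 0.1 margin do not come from the IAF paper or the Altimeter report. The sketch compares how well a protected attribute can be guessed blindly with how well it can be guessed from a single, seemingly neutral proxy field.

```python
from collections import Counter, defaultdict

# Invented toy records: (proxy_field, protected_attribute).
# A real data set would not store the protected attribute next to the proxy;
# it appears here only so that the leak can be measured.
RECORDS = [
    ("A", "group_x"), ("A", "group_x"), ("A", "group_x"), ("A", "group_y"),
    ("B", "group_y"), ("B", "group_y"), ("B", "group_y"), ("B", "group_x"),
]


def baseline_accuracy(records):
    """Accuracy of always guessing the most common protected value."""
    counts = Counter(protected for _, protected in records)
    return counts.most_common(1)[0][1] / len(records)


def proxy_accuracy(records):
    """In-sample accuracy of guessing the protected value from the proxy alone."""
    by_proxy = defaultdict(Counter)
    for proxy, protected in records:
        by_proxy[proxy][protected] += 1
    # For each proxy value, guess its most frequent protected value.
    correct = sum(c.most_common(1)[0][1] for c in by_proxy.values())
    return correct / len(records)


def leaks_protected_attribute(records, margin=0.1):
    """Flag the proxy if it beats the blind guess by more than `margin`."""
    return proxy_accuracy(records) - baseline_accuracy(records) > margin


if __name__ == "__main__":
    print(baseline_accuracy(RECORDS))          # 0.5
    print(proxy_accuracy(RECORDS))             # 0.75
    print(leaks_protected_attribute(RECORDS))  # True
```

In this toy data the proxy field raises the guess from 50% to 75% correct, even though no column is labeled with the protected attribute. A check of this kind is only a rough screen; it measures in-sample accuracy on one field and says nothing about combinations of fields.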

Etlinger and Groopman consider the ability to predict such characteristics at any level an invitation to what they call unintended consequences. As an example of how to counter them, the authors again point to Caesars Entertainment, writing:

“Caesars has a simple yet effective litmus test for fairness, which it calls the Sunshine Test—whether the issue can be discussed openly and the final decision disclosed without any sense of misgiving. Before deciding on a course of action that requires customer data, the company’s executives imagine how people would react if all of the details were out in the open, in the light of day. Would it strengthen or threaten customer relationships?”

Joshua Kanter adds, “If the initiative fails the Sunshine Test, we do not move forward.”
