The Value of Data: A Case for Decentralized AI

By Ivan Nonveiller and Léandre Larouche

A handful of companies have been able to collect and monetise large data sets since the highly lucrative digital data revolution started in the early 2000s. Decentralized AI computing has the potential to slow the aggregation of big data within FAMGA and other siloed networks, and to empower users.

Nearly three decades ago, Timothy Berners-Lee set out to help scientists share data across an obscure system of interconnected computer networks called the Internet. He released his source code for free, aiming for an open and democratic platform, and called it the World Wide Web.

Last week, almost 30 years later, Berners-Lee outlined a manifesto for a new and fairer internet in a Medium blog post called One Small Step for the Web…. He displayed an intensity and urgency rather uncharacteristic of an MIT academic.

As part of his project, Berners-Lee announced Inrupt, an MIT-fueled company, and Solid, an open-source web decentralization project. The objective underlying these projects is simple: to decentralize data ownership and control, empowering Internet users instead of a handful of large tech companies.

Berners-Lee has been calling for more Internet regulation for a while now. He said he was disturbed by the Cambridge Analytica scandal, and he pointed out how FAMGA (Facebook, Apple, Microsoft, Google and Amazon) collect gigantic amounts of data and misuse users’ personal information.

Berners-Lee’s initiative is not only ambitious but also well timed. A few days before he published his Medium post, Facebook revealed the details of a massive data breach affecting 30 million users.

FAMGA’s Market Cap

Today, FAMGA accounts for a remarkable $3.5 trillion of the $8.2 trillion total market cap of the top 100 NASDAQ companies. FAMGA is so dominant that its market cap is greater than that of the next 44 largest tech companies by market cap combined.
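
To put those two figures in proportion, here is a minimal Python sketch of the arithmetic. It uses only the $3.5 trillion and $8.2 trillion quoted above; the variable names are ours.

```python
# Figures quoted in the paragraph above, in trillions of US dollars.
famga_cap_trillions = 3.5
top100_cap_trillions = 8.2

# FAMGA's share of the combined market cap of the top 100 NASDAQ companies.
famga_share = famga_cap_trillions / top100_cap_trillions

print(f"FAMGA share of the top-100 market cap: {famga_share:.1%}")
```

In other words, five companies account for a little over two-fifths of the total.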

Such a massive discrepancy did not always exist, however. In 2006, the marketplace had an altogether different face. The top five companies were Exxon Mobil, Wal-Mart Stores, General Motors, Chevron and Ford Motor, none of which specializes in tech. From 2008 onward, all top-five corporations but Microsoft were replaced by tech companies.

Information-technology companies dominate the fastest-growing industries. Their growth rate is more than double the average industry growth rate of 6.8%. Several factors may be at play here: the acceleration of innovation, a mass audience shift from TV to the internet, and the five tech giants’ obsessively consumer-centred, data-driven attitude. However, the number one factor seems to be the capacity to monetise personal and behavioural data.

Tesla is worth more than Ford or GM even though it has less than 1% of their sales. When Ford or GM sells a car to a customer, its relationship with that customer is typically limited to servicing and maintenance. Tesla, on the other hand, collects terabytes of driving data from its customers, sometimes even including video data. The data Tesla collects is then put to use in automating and improving the self-driving features of its cars.

Tesla’s data collection is likely to translate into a huge advantage in making safe and effective self-driving cars, since this work depends on machine learning, which in turn requires reams of data to learn from. Experts estimate that Tesla’s new mass-market Model 3 sedan will be up to 10 times safer than the average car.


Big Data and Artificial Intelligence

A recent article published in The Economist argues that the world’s most valuable resource is no longer oil but data, and that this fuel is processed by machine learning and data analytics. FAMGA dominates the AI ecosystem, with Google spending more than $433 million a year on DeepMind. A 2018 KPMG report estimates that investment in AI, along with machine learning and robotic process automation (RPA) technology, is set to reach $232bn by 2025.

Experts estimate that the digital advertising data of an average US consumer is worth roughly $240 per year per media channel. But for advertising to work effectively, companies need to use large amounts of data and employ data scientists. Machine learning allows companies to use data to improve their customer experience as well as the effectiveness of their products and services. One striking example is Netflix: automated recommendations account for around 75% of what people watch on the platform. As for the online retail giant Amazon, more than a third of what people buy on the website is attributed to purchase recommendations.

Meanwhile, Facebook, which owns the popular app Instagram, uses machine learning to recognise the content of posts, photos and videos, to display relevant ones to users, and to filter out spam. While it used to rank posts chronologically, it now selects posts and ads by relevance, which keeps users more engaged.

Without machine learning, Facebook would never have been able to achieve its current scale, argues Joaquin Candela, head of Facebook’s applied AI group. “Companies that did not use AI in search, or that were late to do so, struggled. Such was the case of Yahoo and its search engine as well as Microsoft’s Bing.”

Berners-Lee, Solid and Data Empowerment

Companies like Facebook and Google provide us with their services for free in exchange for private user data. Berners-Lee believes that the current model, in which users have to hand over personal data to digital giants in exchange for perceived value, is far from a fair deal.

Indeed, it is increasingly clear that the trade between users and FAMGA is not in the users’ best interests. The corporations often undermine users’ privacy rights. The Solid platform, however, is guided by the principle of “personal empowerment through data,” which is fundamental to the success of the next era of the web. Data should empower each and every one of us.

Solid, which is built using the existing web, attempts to remedy the problems surrounding data collection. It gives every user the choice of where their data is stored, which specific people and groups can access select elements, and which apps they use. It allows users, as well as their family and colleagues, to link and to share data with anyone. It also allows people to look at the same data with different apps at the same time.
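
The idea of user-controlled access can be made concrete with a toy model. The Python sketch below is not the Solid API; the `Pod` class, its methods and the example names are all hypothetical, and they only illustrate the principle described above: the user picks where the data lives and decides, element by element, which people and apps may read it.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy personal data store (hypothetical; not the real Solid API)."""
    owner: str
    storage_location: str  # the user chooses where the data lives
    resources: dict = field(default_factory=dict)
    grants: dict = field(default_factory=dict)  # resource name -> allowed agents

    def put(self, name, value):
        """Store a piece of data; nobody but the owner can read it yet."""
        self.resources[name] = value
        self.grants.setdefault(name, set())

    def grant(self, name, agent):
        """Allow a person or an app to read one specific resource."""
        self.grants.setdefault(name, set()).add(agent)

    def revoke(self, name, agent):
        """Withdraw a previously granted permission."""
        self.grants.get(name, set()).discard(agent)

    def read(self, name, agent):
        """Return the resource if the agent is the owner or has a grant."""
        if agent != self.owner and agent not in self.grants.get(name, set()):
            raise PermissionError(f"{agent} may not read {name}")
        return self.resources[name]


# Assumed example data: one user, two resources, two apps.
pod = Pod(owner="alice", storage_location="https://alice.example/pod/")
pod.put("contacts", ["carol"])
pod.put("health", {"steps": 9000})

# Two different apps may look at the same data at the same time.
pod.grant("contacts", "calendar-app")
pod.grant("contacts", "mail-app")

print(pod.read("contacts", "calendar-app"))
print(pod.read("contacts", "mail-app"))
```

In this sketch, a call such as `pod.read("health", "calendar-app")` raises `PermissionError`, because Alice never granted that app access to her health data.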

Solid is how we develop the next era of the web in order to restore balance — by giving every one of us complete control over data, personal or not, in a revolutionary way. It unleashes incredible opportunities for creativity, problem-solving and commerce. It will empower individuals, developers and businesses with entirely new ways to conceive, build and find innovative, trusted and beneficial applications and services.

There are multiple market possibilities, including Solid apps and Solid data storage. Imagine if all your current apps talked to each other, collaborating and conceiving ways to enrich and streamline your personal life and business objectives? That’s the kind of innovation, intelligence and creativity Solid apps will generate. With Solid, you will have far more personal agency over data — you decide which apps can access it.


Orion Platform and Blockchain-based AI

Berners-Lee is not alone in making the case for a more decentralized web, nor is Solid the only platform pushing for data empowerment. George Gilder, the author of highly influential books on economics and most recently of Life After Google, predicts the fall of Big Data and the rise of blockchain technology in our day-to-day life. AI companies are also coordinating multiple efforts to promote decentralized AI computing.

Orion Platform is a blockchain-based, AI-focused, distributed supercomputing platform established to meet the growing demand for the computational power that complex AI computations require. The platform was developed by Nebula AI, a Montreal-based startup that collaborates with Concordia and McGill universities.

Orion’s computational supply is harnessed from distributed GPU-based computing resources owned by individuals as well as crypto-mining and gaming companies. Nebula AI is able to buy these owners’ compute power for more than they can earn by mining or by providing computing power for gaming. At the same time, Nebula AI provides the computing resources to AI companies for less than they pay traditional compute suppliers.
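
The pricing logic described above amounts to three conditions that must hold at once. The Python sketch below spells them out. The function name and every number in it are hypothetical and purely illustrative; none of them are actual Orion or Nebula AI prices.

```python
def deal_is_viable(mining_revenue, supplier_payout, buyer_price, traditional_price):
    """Check the three conditions from the paragraph above (all per GPU-hour).

    All parameters are hypothetical illustrations, not real Orion prices.
    """
    # GPU owners must earn more than they would by mining or gaming.
    supplier_better_off = supplier_payout > mining_revenue
    # AI companies must pay less than they pay traditional suppliers.
    buyer_better_off = buyer_price < traditional_price
    # The platform cannot pay suppliers more than it charges buyers.
    platform_not_losing = buyer_price >= supplier_payout
    return supplier_better_off and buyer_better_off and platform_not_losing


# Illustrative numbers only, in dollars per GPU-hour.
print(deal_is_viable(mining_revenue=0.10, supplier_payout=0.15,
                     buyer_price=0.25, traditional_price=0.40))
```

Under these assumptions, the model only works while traditional compute prices stay comfortably above what GPU owners can earn elsewhere; if that gap closes, no payout and price can satisfy all three conditions.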