The dictum ‘if you don’t pay, you are the product’ is more applicable than ever in the AI era. Data is as valuable as gold, and so are its security and the ability to exercise some control over where it is stored. There now seem to be two kinds of users of the internet, social media, cloud environments and other IT infrastructure: those who pay for control over their data and those who do not (or pay less). The latter group might well get the idea that their data is up for grabs.
Anyone who runs an enterprise IT environment knows there is no such thing as a free lunch. If you want your data stored securely, compliantly and away from prying eyes, you have to be willing to pay for it.
Yet, in recent years, a standard has crept into the broader IT landscape that says your data doesn’t belong to you at all unless you pay for it. Even when you do, it remains to be seen whether the processor of your data can always guarantee compliance with all legislation.
‘All information on the internet is freeware’
Mustafa Suleyman, head of Microsoft’s AI unit, whose portfolio includes the Edge browser, the Bing search engine and the much-discussed AI assistant Copilot, recently made a bold if not outright outrageous statement: all information on the internet since the 1990s is basically ‘freeware’. He said this during an on-stage interview with a CNBC host at a conference. In other words, everything on the internet is fair game for AI training and similar applications. Indeed, according to him, this has long been the accepted norm.
According to Suleyman’s entirely unique take on reality, only parties that have stated in advance that their content may not be used can expect to avoid having it harvested. And where ambiguity exists, ‘it’s going to work its way through the courts’. This approach is at odds with copyright law in many countries, but it is how Microsoft’s AI chief thinks.
Suleyman puts into words a mentality that is probably commonplace among large tech companies: users’ data does not belong to them unless those users pay to not have their data used—if that option exists at all. Social media companies, of course, have long been taking advantage of everything users post on their platforms. Making user data available to paying advertisers is their revenue model.
Selling user data
Recently in the news was the ‘pay or consent’ model used by Meta, the parent company of Facebook and Instagram. In short, it amounts to the company making users’ data available to advertisers unless those users pay to prevent that. Even though Meta says it complies with the law, the European Commission is now threatening a fine over these practices.
Armed with the recently introduced Digital Markets Act, the EU has opened hunting season on monopolistic practices by US Big Tech companies. The aim is to ensure consumer choice and keep the market open to smaller providers, but privacy, security and data residency concerns are often entangled with these themes.
Earning from user-generated content
And with good reason: forget advertisers, AI is now bringing in more money than those social media companies ever thought possible. Social platform Reddit, for example, now earns handsomely from the user-generated content on its platform, selling it to OpenAI and Google for AI training for tens of millions a year. It protects that revenue source to the point that it now turns away fledgling AI data collectors that haven’t signed a deal with Reddit.
Scraping data from the platform is only possible under a contract that is lucrative for Reddit and unaffordable for starting AI companies. Understandable from Reddit’s perspective, but it shows how little control users have over their own data. After all, where are the users’ voices in this story? Answer: those voices are being monetized.
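This gatekeeping is visible to anyone. Robots.txt, the decades-old convention by which websites tell crawlers what they may fetch, is also the opt-out mechanism that Suleyman’s ‘stated in advance’ standard implicitly leans on. The sketch below, a minimal illustration using only Python’s standard library, asks Reddit’s live robots.txt which crawlers are welcome; GPTBot (OpenAI) and CCBot (Common Crawl) are real, documented crawler identities, while FledglingAIBot stands in for a hypothetical newcomer without a deal.

    from urllib import robotparser

    # Fetch and parse Reddit's live robots.txt.
    rp = robotparser.RobotFileParser()
    rp.set_url("https://www.reddit.com/robots.txt")
    rp.read()

    # GPTBot and CCBot are documented crawler user agents;
    # FledglingAIBot is a hypothetical newcomer without a contract.
    for agent in ("GPTBot", "CCBot", "FledglingAIBot"):
        allowed = rp.can_fetch(agent, "https://www.reddit.com/r/all/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")

At the time of writing, Reddit’s robots.txt disallows virtually all automated access, so all three report ‘blocked’; crawlers with a signed contract are reportedly admitted through separate arrangements rather than via robots.txt.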
Paying for data residency
Another example involving a Big Tech representative concerns data residency: research by various Dutch IT organizations into Google Workspace for Education revealed that only the paid version offers the ability to store data in the European Union. The tone of the coverage is rather soothing: educational institutions can safely use those services, as long as they use the paid versions. That sounds reasonable, but the fact remains that data only stays in the EU for those who pay. Compliance is a premium service.
Even such a ‘promise-for-payment’ sometimes falls short. Choosing a local data center does not always guarantee ‘data residency’. A wry example is Police Scotland: after a freedom-of-information request by an IT specialist, it turned out that data residency for police evidence could not be guaranteed at all, at least not for data in transit.
Even though legislation required such sensitive data to remain in the UK, that did not happen in all cases. In fact, Microsoft reported in correspondence with Police Scotland that data travelling overseas would be ‘inherent’ in Azure’s architecture.
Buying logging capacity
In terms of security, too, a cheaper subscription sometimes means less oversight of your own data. Until last year, for example, Microsoft limited users’ ability to see logging data for Azure Active Directory applications. The data in question was readily available, but could only be accessed from more expensive accounts: an artificial security barrier, in other words.
Only Purview Audit Premium customers could detect an incident like last year’s, in which the Chinese hacker group Storm-0558 obtained a private MSA signing key. Meanwhile, pressure from the U.S. Cybersecurity & Infrastructure Security Agency (CISA) has ensured that Microsoft now offers the extended logging capabilities for free.
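For the curious, the sketch below shows roughly what reading such logs looks like: a minimal Python example, under the assumption that an OAuth2 access token with the AuditLog.Read.All permission has already been obtained, that pulls recent sign-in events from the Microsoft Graph API. Fittingly, even this endpoint is only available to tenants with a premium Entra ID (formerly Azure AD) licence.

    import requests

    # Assumption: ACCESS_TOKEN holds a valid OAuth2 token with the
    # AuditLog.Read.All permission, e.g. from the client-credentials flow.
    ACCESS_TOKEN = "<your-access-token>"

    # Request the ten most recent sign-in events from Microsoft Graph.
    resp = requests.get(
        "https://graph.microsoft.com/v1.0/auditLogs/signIns",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        params={"$top": 10},
        timeout=30,
    )
    resp.raise_for_status()

    # Print when, who and through which application each sign-in occurred.
    for entry in resp.json().get("value", []):
        print(entry["createdDateTime"],
              entry.get("userPrincipalName"),
              entry.get("appDisplayName"))

Whether the audit trail is complete enough to catch a Storm-0558-style intrusion, however, still depends on which events a given licence tier actually records.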
And still no guarantees
The examples above deal with different issues: data residency, security, consumer choice, advertising, AI training. What they have in common is that everyone seems to have a say about user data except the users themselves. The precedent now seems to be that only those who pay can choose not to have their data sold to advertisers, stay compliant, or gain insight into comprehensive security logging. And even then, there is no guarantee that the paid service will honour your wishes, or is even able to comply.
That is a problem, because it gives business users and ordinary consumers alike the idea that their digital property is slipping through their fingers like loose sand. When companies, lawmakers and advertisers bicker over user data while the users themselves feel they have lost control of it, it fuels scepticism about the desirability of making a ‘digital transition’ in the first place.