StrategicAI
Will AI kill the Internet? How big tech owns us all, how Elon is upending news, Google's medical AI, and the first commissioned Sora music video

May 10, 2024

Thought of the week: Will AI-generated content kill the Internet?

“Within ONE night, I’ve generated 782 blogs, enough to post for the next 2-3 years.”

This testimonial, for a Vietnamese app that helps companies automate AI spam content generation, is a great example of the present and future of content generation. A report by Europol, the EU’s crime agency, estimates that “as much as 90 percent of online content may be synthetically generated by 2026”. Further research highlights that the majority of internet traffic is in fact generated by automated bots that roam the web and scrape data to feed the insatiable demand for AI training data. In Ireland, for example, 71% of internet traffic is automated according to the same research.

All these stats lend weight to the “dead internet theory”: the idea that the fountain of knowledge that the internet was designed to be has been fundamentally polluted, first by Google incentivising companies to search-engine-optimise their content, then by social media firms encouraging pointless posts, and now by large language models (LLMs) like ChatGPT equipping anyone to do all of the above in any language, thousands of times over and in seconds.

So what does this new world of fully commoditised content mean for how we find, consume and write it?

Content fatigue. Many are already talking about the instant switch-off when reading synthetic content. This will only grow as people crave more authentic and human ways to communicate.
The search engine is dead; long live search agents. If everyone can use AI to generate search engine optimised content, then what’s the point of search engines? The concept of wading through poor content is looking increasingly outdated. AI-driven search agents such as Perplexity are increasingly eating into Google’s market share, and OpenAI is rumoured to be prepping a summer release of its own search engine.
Curation is key. People will happily pay someone else to cut through the AI-generated fog on their behalf (though this might eventually be AI agents trained on your viewing history and personal interests).
Trusted content will come at a cost. Ironically, AI content that has been created from stolen news content may be the one thing that saves journalism. As long as media organisations resist the urge to use AI (which I don’t think they will), paywalled, premium content will be increasingly valuable.

One possible outcome is the emergence of multi-tier internets. Those who value curated, balanced and well-researched premium content (a very small proportion of the global population) pay for it, whilst the vast majority continue to be swayed by AI-generated, echo-chamber fodder that tells them what they want to hear and serves whatever interests lie behind its generation. A third tier will be the increasing number of people who opt out of online content altogether.

How big tech owns us

I’m writing a chapter of my AI book on how the web of applications owned by big tech (i.e. Google, Microsoft, Amazon and Apple) gives these companies unprecedented levels of access to our personal data. Google ‘owns’ me in that it has every email I've ever written, all the files I’ve stored on Google Drive, what searches I’ve done (even via the misnomer of incognito browsing), what I’ve spent money on, where I’ve been and who my friends are.

Slap on a layer of today’s crude AI systems and you’ve got a 360-degree panopticon that knows who you are, what you’re likely to do and, worse, how to get you to do what it wants you to do (vote, consume, etc). In an interview with MIT Technology Review, Sam Altman clearly states that this is what OpenAI is working towards and that an AI agent would know everything about the person using it.

Given the huge costs of building AI models (Gemini Ultra is estimated to have cost $191 million, and OpenAI spends upwards of $700,000/day to let you play with ChatGPT), surely it’s only a matter of time before big tech mines our data to get a return?

Elon is trying to kill news

Elon Musk continues his quest to challenge the status quo on, well, everything. This week, he’s upending the way social media presents news. Premium subscribers on X will no longer get their news from traditional and trusted sources like newspapers and media outlets. Rather, X's AI chatbot, Grok, will analyse thousands of X posts to generate summaries of the news.

Let’s unpack this. The content for these news summaries (called 'Stories') will be drawn from posts and comments about news articles rather than from the articles themselves! In a similar vein to Gogglebox, the inane but hugely popular UK TV programme that films people commenting on TV, comments and posts will become the news.

As I discussed last week, media outlets have the stark choice of paywalling their content, selling out to big tech or facing obsolescence from apps summarising their content without paying for the content itself.

The battle for a medical ChatGPT hots up

Healthcare systems the world over are broken. They tend to be either non-existent or deadly (in most developing countries) or breaking down under the unprecedented and twin pressures of an increasingly geriatric population and falling public spending (in most developed countries). So, unsurprisingly, many are looking to AI as a low-cost, high-value solution to providing healthcare to the masses.

But we’ve been here before. Anyone remember IBM Watson? No? Exactly. Let me refresh your memory. IBM built an AI called Watson that managed to beat the two highest-ranked players in America's number one question-and-answer game show, 'Jeopardy!', in 2011. In the years that followed, on the back of the fanfare surrounding this victory, IBM massively oversold and overhyped Watson, a ChatGPT forerunner.

Watson, according to IBM, was a cutting-edge AI system that would enable hospitals and doctors to do way more with less: from diagnosing cases faster to helping find the ever-elusive cure for cancer.

The slight fly in the ointment was that Watson actually gave inaccurate and unsafe recommendations, leading high-profile hospitals to cancel their contracts. The consensus from the many Watson post-mortems is that IBM’s AI failed due to a squeamishness about data privacy. With GDPR freshly introduced, IBM wasn’t allowed to train their AI on real-world patient data. Instead, it was trained on hypothetical cases provided by a small group of doctors in a single hospital, which didn’t control for the doctors’ own biases and blind spots and wasn’t necessarily generalisable to all patient cases.

Fast forward to 2024, and billions of dollars are being poured into medical AI, and for a very good reason: LLMs are actually really good at drawing inferences from data. Unlike Watson, LLMs like ChatGPT have the dual advantage of being trained on millions of real patient records and also every single medical paper known to man and woman. But their killer feature is how well they handle the ‘Needle In A Haystack’ (NIAH) problem.

The NIAH test is one of many evaluations used by AI researchers to assess the efficacy of AI models. It typically involves giving a pretrained LLM, say, a million medical records and asking it to find the ‘needle’ (typically a single sentence). This kind of task would take a team of humans years to complete. A well-trained AI system can do it in seconds.
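For the curious, here is a minimal sketch of how a NIAH-style test can be structured. It is an illustration under loud assumptions, not how any particular lab runs its evaluations: the patient records are invented filler, the 'needle' sentence is made up, and toy_model_find is a hypothetical stand-in that does a plain keyword search where a real test would call an LLM.

```python
import random


def build_haystack(filler_records, needle, position):
    """Return a copy of the filler records with the needle inserted at `position`."""
    records = list(filler_records)
    records.insert(position, needle)
    return records


def toy_model_find(records, keyword):
    """Hypothetical stand-in for an LLM call: return the first record containing the keyword."""
    for record in records:
        if keyword.lower() in record.lower():
            return record
    return None


def run_niah_trials(model, num_records=1000, num_trials=20, seed=0):
    """Hide one needle at a random depth in each trial and report how often the model finds it."""
    rng = random.Random(seed)
    # Both the needle and the filler are invented for illustration; no real patient data.
    needle = "Patient 4471 has a documented allergy to penicillin."
    filler = [
        f"Patient {i} attended a routine check-up with no findings."
        for i in range(num_records)
    ]
    hits = 0
    for _ in range(num_trials):
        position = rng.randint(0, num_records)  # anywhere from the very start to the very end
        haystack = build_haystack(filler, needle, position)
        answer = model(haystack, "penicillin")
        hits += int(answer == needle)
    return hits / num_trials


if __name__ == "__main__":
    accuracy = run_niah_trials(toy_model_find)
    print(f"Needle retrieved in {accuracy:.0%} of trials")
```

In a real evaluation, the keyword search would be swapped for a model that is handed the whole haystack as text and asked a question in plain language, and the score would be tracked against how long the haystack is and how deep the needle is buried.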

This is no trivial feat. I remember the bureaucratic mountain that I had to climb a decade ago to get a copy of my medical records. This involved tracking down a tiny office tucked away in the bowels of my local Victorian-era hospital, and submitting, in person, all sorts of paperwork. I then had to wait weeks to receive a huge bundle of documents, some of which were illegible doctors' scribbles and none of which I could understand. Luckily, most health systems in developed countries now insist on digital record-keeping, which offers up the possibility of a Watson-like system that actually works. This would see reams of useful data being made available for almost instant analysis by any clinician or researcher. It would also massively reduce the cognitive load that doctors are under and enable them to make better evidence-based diagnoses.

So far so good, as long as you have good patient data and functional hospitals staffed with doctors. However, things are dire in developing countries where public health systems are so bad that they are seen as one step removed from the morgue. Unsurprisingly, African health indicators are terrible, with the average Nigerian not expected to make it past 53. Coupled with this, a chronic lack of health data means that most health ministries have no idea why people die or how to put in place policies to help reduce the huge number of preventable deaths.

Whilst AI is of limited use in a hospital with no doctors or electricity, it can at least help with digitising records and building embryonic national health data profiles that can show where scarce resources should be targeted.

More robots doing amazing things

Competition for headlines and online attention is so fierce that even student researchers claiming to have found an AI breakthrough can garner millions of views without that pesky thing called 'peer review'. Robotics research is even more prone to this syndrome, as the one thing that’s guaranteed to freak people out is a humanoid robot.

At the moment, there are lots of silly videos of very expensive and complex bits of kit being deployed to do really simple things such as fold a shirt, or put things in a box; tasks that a three-year-old can do without being wired up. So we have far to go before these robots are generally useful, especially whilst we have lots of excess human capacity.

However, what’s changed in 2024 is that all of these bots now come with an off-the-shelf brain. Gone are the strange-sounding metallic tin men who were more cuddly than scary. The cost of research is falling exponentially. There is massive demand for industrial robots and a seemingly unlimited pool of investment cash. Expect to see generally useful robots by the end of 2025.

Other interesting AI news this week

The first officially commissioned Sora music video!

Researchers predict that generative AI will soon be designing new drugs and inventing viable proteins and structures with designs that humans have yet to think of.
Nick Bostrom, former Director of Oxford's Future of Humanity Institute, says superintelligence could happen in timelines as short as a year and is the last invention we will ever need to make.
How generative AI is clouding the future of Google search.
Humans now share the web equally with bots, report warns amid fears of the ‘dead internet’.
Recruiters are going analogue to fight the AI application overload.

Tools that we're playing with this week

Explorer is a really cool way to visually display knowledge trees about people, places and things. Billed as a discovery engine, it creates a table of contents on a topic, and then visualises each section and concept with an optimal image or diagram. It hallucinates often, but the concept of visual knowledge graphs is awesome.

That's all for this week. Subscribe for the latest innovations and developments with AI.