Portable Large Language Models – not the iPhone 15 – are the future of the smartphone

Personal AI can redefine the handheld experience and perhaps preserve privacy too


Column Smartphone innovation has plateaued. The iPhone 15, launched overnight, has some nice additions. But my iPhone 13 will meet my needs for a while and I won't rush to replace it. My previous iPhone lasted four years.

Before that phone I could justify grabbing Cupertino’s annual upgrade. These days, what do we get? The iPhone 15 delivered USB-C, a better camera, and faster wireless charging. It's all nice, but not truly necessary for most users.

Yet smartphones are about to change for the better – thanks to the current wild streak of innovation around AI.

Pretty much everyone with a smartphone can already access the "Big Three" AI chatbots – OpenAI's ChatGPT, Microsoft's Bing Chat and Google's Bard – through an app or browser.

That works well enough. Yet alongside these "general purpose" AI chatbots, a subterranean effort – spearheaded by another of the behemoths of big tech – looks to be gaining the inside track.

Back in February, Meta AI Labs released LLaMA – a large language model scaled down both in its training data set and in its number of parameters. Our still-imperfect understanding of how large language models work equates a greater number of parameters with greater capacity – GPT-4, for example, is thought to have a trillion or more parameters, though OpenAI is tight-lipped about those numbers.

Meta's LLaMA gets away with a paltry 70 billion and, in one version, just seven billion.

So is the smallest LLaMA only about one 140th as good as GPT-4? This is where it gets very interesting. Although LLaMA has never beaten GPT-4 head-to-head in any benchmarking, it's not bad – and in many circumstances, it's more than good enough.

LLaMA is open source-y in a kinda sorta very Meta-ish way, enabling a field army of researchers to take the tools, the techniques and the training and improve them all, rapidly and dramatically. Within weeks, we saw Alpaca, Vicuna and a menagerie of other large language models, each tweaked to be better than LLaMA – all the while drawing closer to GPT-4 in benchmarking.

When Meta AI Labs released LLaMA2 in July – under a less Meta-centric license – thousands of AI coders set to work tuning it for a variety of use cases.

Not to be outdone, three weeks ago Meta AI Labs also did its own bit of fine tuning, releasing Code LLaMA – tuned to provide code completions inside an IDE, or simply to be fed code for analysis and repair. Within two days, a startup called Phind had fine-tuned Code LLaMA into a large language model that beat GPT-4 – albeit on a single benchmark.

That's a first – and a warning shot across the bow of OpenAI, Microsoft and Google. It seems these "tiny" large language models can be good enough, while also small enough that they don't have to run in an airplane-hangar-sized cloud computing facility where they consume vast resources of power and water. Instead, they can run on a laptop – even a smartphone.

That's not just theory. For months I've had the MLC Chat app running on my iPhone 13. It runs the seven-billion-parameter model of LLaMA2 without much trouble. That mini-model is noticeably less bright than the LLaMA2 model that employs 13 billion parameters (which sits in a sweet spot between size and capability) – but my smartphone doesn't have enough RAM to hold that one.

Nor does the iPhone 15 – although Apple's spec sheets omit details of RAM.
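A back-of-the-envelope sum shows why memory is the bottleneck. At roughly four bits per parameter – the sort of quantization these on-device runtimes lean on – a seven-billion-parameter model needs something like 3.5 to 4 GB just to hold its weights, while the 13-billion-parameter version wants closer to 7 GB. Teardowns put the iPhone 13 at about 4 GB of RAM, which is why the smaller model squeaks in and the bigger one doesn't.

On a laptop it's far less of a squeeze. As a minimal sketch of what running one of these models locally can look like – using the open source llama-cpp-python bindings rather than the MLC Chat app, and with a hypothetical filename for a quantized LLaMA2 download – it comes down to a handful of lines of Python:

# Minimal sketch: run a quantized seven-billion-parameter LLaMA2 chat model
# entirely on the local machine using llama-cpp-python. The model path and
# quantization level are illustrative assumptions, not the MLC Chat setup.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical local file, roughly 4GB of quantized weights
    n_ctx=2048,                                # context window, in tokens
)

# Nothing here leaves the device: prompt in, completion out, all local.
result = llm(
    "Q: What should I look for when choosing a new smartphone?\nA:",
    max_tokens=128,
    stop=["Q:"],  # stop before the model invents its own follow-up question
)

print(result["choices"][0]["text"].strip())

The trade-off is the one already described: the smaller and more heavily quantized the model, the less RAM it needs – and the less bright it is.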

These personal large language models – running privately, on device, all the time – will soon be core features of smartphone operating systems. They'll suck in all your browsing, activity, medical and even financial data – all the data that today we hand off to the cloud to be used against us – and they will continuously improve themselves to represent more accurately our states of mind, body, and finances.

They'll consult, they'll encourage – and they'll warn. They won't replace the massive general purpose models – but neither will they leak all our most personal data to the cloud. Most smartphones already have enough CPU and GPU to run these personal large language models, but they need more RAM – the better to think with. With a bit more memory, our smartphones can grow wildly smarter. ®
