Apple's Private Cloud Compute

June 11, 2024

Apple’s keynote yesterday was very exciting - I’m more excited to upgrade my iPhone and MacBook than I have been in years because of the AI capabilities Apple is building in.

Quill runs as much as possible locally because of all the advantages Apple pointed out - it’s secure, there’s a lot of context and power in the data and apps already on the device, and it’s the interface we all use every day to live our lives.

One of the biggest announcements, IMO, is the Private Cloud Compute (PCC) infrastructure. It represents a solution to scaling AI capabilities down to older and cheaper machines that only Apple could build. We do a lot of work in Quill to figure out the capabilities of users’ machines, load different models, and run parts of our pipeline at different frequencies and priorities in order to gracefully scale capabilities up and down. Apple is sidestepping all of that for people with older machines by outsourcing inference to a private cloud built on Apple Silicon.
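
To make that concrete, here is a minimal Swift sketch of the kind of capability tiering I mean. The tier names and memory thresholds are invented for illustration; this isn’t Quill’s actual code:

```swift
import Foundation

// Illustrative only: the tiers and thresholds below are made up.
// A real implementation would also consider chip family, Neural Engine
// availability, thermal state, and battery level.
enum ModelTier {
    case full      // newest chips: largest local model, all pipeline stages
    case reduced   // mid-range: smaller model, lower-frequency pipeline passes
    case minimal   // older / low-RAM machines: only the cheapest features locally
}

func selectModelTier() -> ModelTier {
    let bytes = ProcessInfo.processInfo.physicalMemory
    let gigabytes = Double(bytes) / 1_073_741_824

    switch gigabytes {
    case 16...:   return .full
    case 8..<16:  return .reduced
    default:      return .minimal
    }
}
```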

Because the servers run Apple Silicon, I suspect the same models will be able to run both on device and in the cloud, reducing development overhead. It also means server usage should decline over time: as consumer devices get more and more powerful, models can smoothly transition to running fully on device.

Outstanding questions

  • Is there a pre-defined set of models on the PCC servers, or can custom models be uploaded?
  • If custom models can be uploaded, when will that be available to 3rd-party developers? What will the costs be?
  • If it’s a defined set of models, will they all be available via Swift APIs? e.g. could we make a call to Apple’s own LLM instead of going to OpenAI, Anthropic, or Google? (A hypothetical sketch of what that might look like follows this list.)
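
On that last question, purely as speculation: no such API has been announced, and every type below is invented, but a third-party Swift call into Apple’s own LLM might look something like this:

```swift
import Foundation

// Entirely hypothetical: none of these types exist in any Apple SDK.
// This only sketches the shape such an API could take.
struct LLMRequest {
    var prompt: String
    var allowPrivateCloudCompute: Bool  // opt in to PCC when the device can't run the model
}

struct LLMResponse {
    var text: String
}

protocol SystemLanguageModel {
    func complete(_ request: LLMRequest) async throws -> LLMResponse
}

// Stub conformance so the sketch compiles; in this imagined world the real
// implementation would be provided by the OS, not the app.
struct StubSystemModel: SystemLanguageModel {
    func complete(_ request: LLMRequest) async throws -> LLMResponse {
        LLMResponse(text: "(system completion for: \(request.prompt))")
    }
}

func summarize(_ note: String, using model: some SystemLanguageModel) async throws -> String {
    try await model.complete(
        LLMRequest(prompt: "Summarize: \(note)", allowPrivateCloudCompute: true)
    ).text
}
```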

Personally, as a user, I would love to use apps whose models are compatible with Apple’s compute so that my data only ever goes to Apple and not to a huge number of potentially unknown startups.

As a developer, I think making this capability transparent to models that use MLX or CoreML would drive massive adoption of those frameworks.
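
For a sense of what “transparent” could mean: Core ML already lets you choose where on the device a model runs via MLModelConfiguration.computeUnits, so PCC could plausibly appear as one more destination. In this sketch, only the .privateCloudCompute case is hypothetical; the rest is existing Core ML API:

```swift
import CoreML

// Loads an already-compiled model (.mlmodelc), letting Core ML pick the best
// on-device compute units. The commented-out line imagines how a transparent
// PCC option might slot in; no such case exists today.
func loadModel(at compiledModelURL: URL) throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // CPU, GPU, and Neural Engine as available
    // config.computeUnits = .privateCloudCompute  // hypothetical
    return try MLModel(contentsOf: compiledModelURL, configuration: config)
}
```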

So, my guess is that Apple will use this infrastructure internally for a year, but eventually open it up to 3rd-party apps.

I’d imagine they will use a process equivalent to app notarization: trusted developers register models with Apple’s Private Cloud Compute before shipping them in their apps, and those models can then be trusted both on user devices and on PCC without each user having to upload potentially gigabytes of model weights. If we could also skip the CoreML compilation that happens the first time a model runs on a particular machine, that would be even more amazing. Currently, that step prevents Quill from using CoreML for some models, because the very first experience of using Quill has to run immediately.
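
For reference, that first-run step is Core ML compiling the model for the specific machine. You can pay the cost ahead of time, e.g. during onboarding or in a background task; here’s a sketch using Core ML’s compileModel(at:), with an illustrative cache location:

```swift
import CoreML

// Compiles a source model (.mlmodel / .mlpackage) ahead of its first use and
// caches the result, so the first real inference isn't blocked by compilation.
// The cache-directory scheme here is illustrative.
func precompiledModelURL(for sourceURL: URL, cacheDirectory: URL) async throws -> URL {
    let compiledURL = cacheDirectory.appendingPathComponent(
        sourceURL.deletingPathExtension().lastPathComponent + ".mlmodelc"
    )
    if FileManager.default.fileExists(atPath: compiledURL.path) {
        return compiledURL  // already compiled on a previous run
    }
    // compileModel(at:) writes the compiled model to a temporary location...
    let temporaryURL = try await MLModel.compileModel(at: sourceURL)
    // ...so move it to permanent storage before the temp directory is purged.
    try FileManager.default.moveItem(at: temporaryURL, to: compiledURL)
    return compiledURL
}
```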

Overall, I suspect that Apple’s moves here are going to set a gold standard for personalized AI assistance for the average person. Once again, Apple comes a little late, but makes bold UX moves enabled by their history of solid hardware & software integration.



m [at] mpdaugherty.com