Apple, a company practically synonymous with technological innovation, has once again positioned itself at the forefront of the AI revolution. The Cupertino, Calif.-based company recently announced significant strides in artificial intelligence research with two new papers that introduce techniques for creating animated 3D avatars and for running language models efficiently on memory-limited hardware. These advancements could enable more immersive visual experiences and allow complex AI systems to run on consumer devices such as the iPhone and iPad.
In the first research paper, Apple scientists propose HUGS (Human Gaussian Splats), a method for generating animated 3D avatars from short monocular videos (i.e., videos taken from a single camera). HUGS requires only a short video of 50 to 100 frames, and in about 30 minutes it automatically learns to separate the static background from a fully animatable human avatar.
HUGS represents both the human and the background scene using 3D Gaussian splatting, an efficient rendering technique. The human model is initialized from a statistical body shape model called SMPL, but HUGS allows the Gaussians to deviate from it, capturing details such as clothing and hair. A novel neural deformation module animates the Gaussians realistically using linear blend skinning, avoiding artifacts when the avatar is reposed. HUGS can generate new human poses and render novel views of both the human and the scene.
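For readers curious about what linear blend skinning does here, the minimal Python sketch below illustrates the core idea: each Gaussian center is transformed by every joint of the body, and the results are blended by per-Gaussian skinning weights. The function name, shapes and numbers are illustrative assumptions, not code or data from Apple's paper.

```python
# Illustrative sketch of linear blend skinning (LBS) applied to 3D Gaussian
# centers. Names and shapes are hypothetical, not Apple's implementation.
import numpy as np

def skin_gaussian_centers(centers, weights, rotations, translations):
    """Repose Gaussian centers with linear blend skinning.

    centers:      (N, 3) canonical-pose Gaussian center positions
    weights:      (N, J) skinning weights over J joints (each row sums to 1)
    rotations:    (J, 3, 3) per-joint rotation matrices for the target pose
    translations: (J, 3) per-joint translations for the target pose
    """
    # Transform every center by every joint's rigid transform: (J, N, 3)
    per_joint = np.einsum('jab,nb->jna', rotations, centers) + translations[:, None, :]
    # Blend the per-joint results using the skinning weights: (N, 3)
    return np.einsum('nj,jna->na', weights, per_joint)

# Toy usage: two Gaussians, two joints; an identity pose leaves centers unchanged.
centers = np.array([[0.0, 1.0, 0.0], [0.0, 0.5, 0.1]])
weights = np.array([[0.7, 0.3], [0.2, 0.8]])
rotations = np.stack([np.eye(3), np.eye(3)])
translations = np.zeros((2, 3))
print(skin_gaussian_centers(centers, weights, rotations, translations))
```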
Compared to previous methods, HUGS is up to 100 times faster in training and rendering. The researchers show photorealistic results after optimizing the system for just 30 minutes on a typical gaming GPU. HUGS also surpasses state-of-the-art techniques such as Vid2Avatar and NeuMan in 3D reconstruction quality.
The new technology allows digital characters, or "avatars," to be inserted into new scenes quickly using just one video of a person and the place, and it renders the result at 60 frames per second for smooth, realistic motion. This breakthrough could have applications in virtual try-on, telepresence and synthetic media, opening up the possibility of creating novel 3D scenes directly from an iPhone camera.
In the second paper, Apple researchers addressed a key challenge in deploying large language models (LLMs) on devices with limited memory. Modern LLMs such as GPT-4 are reported to contain hundreds of billions of parameters, making inference on consumer hardware prohibitively expensive.
The proposed system minimizes data transfer from flash storage into limited DRAM during inference. The researchers construct an inference cost model that reflects how flash memory behaves, which guides them to reduce the total volume of data read from flash and to read that data in larger, more contiguous chunks.
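As a rough illustration of why that matters, the toy cost model below treats each flash read as a fixed per-request overhead plus a throughput term, which is why fewer, larger contiguous reads come out cheaper. The formula and the numbers are assumptions made for illustration, not measurements or code from Apple's research.

```python
# Toy flash-read cost model: fixed overhead per read request plus a
# bandwidth-limited transfer term. All constants are assumed values.

PER_READ_OVERHEAD_S = 1e-4   # assumed fixed overhead per flash read request
THROUGHPUT_BYTES_S = 3e9     # assumed sustained flash read throughput

def flash_read_cost(total_bytes, chunk_bytes):
    """Estimated seconds to read `total_bytes` in chunks of `chunk_bytes`."""
    num_reads = -(-total_bytes // chunk_bytes)  # ceiling division
    return num_reads * PER_READ_OVERHEAD_S + total_bytes / THROUGHPUT_BYTES_S

weights = 2 * 1024**3  # pretend 2 GB of parameters are needed for one step
for chunk in (4 * 1024, 256 * 1024, 8 * 1024**2):
    print(f"{chunk:>10} B chunks -> {flash_read_cost(weights, chunk):.2f} s")
```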
Two main techniques are introduced: "windowing," which reuses activations from recently processed tokens so only a small amount of new parameter data has to be loaded, and "row-column bundling," which stores related rows and columns together so they can be read from flash in larger contiguous blocks. Together, these methods speed up inference significantly: by 4 to 5 times on an Apple M1 Max CPU and 20 to 25 times on its GPU compared with naively loading the full model.
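The sketch below gives a flavor of both ideas in Python, using hypothetical names and toy shapes rather than Apple's code: a sliding window tracks which feed-forward neurons recent tokens activated so only newly needed weights are fetched from flash, and the bundling step stores each up-projection column next to its matching down-projection row so one contiguous read retrieves both.

```python
# Hypothetical illustration of "windowing" and "row-column bundling";
# not code from the paper.
import numpy as np

WINDOW = 5  # number of recent tokens whose active neurons stay resident in DRAM

def neurons_to_load(active_per_token, cached):
    """Windowing: keep neurons used by the last WINDOW tokens in DRAM and
    fetch from flash only those not already cached."""
    needed = set().union(*active_per_token[-WINDOW:])
    return needed - cached, needed  # (neurons to read from flash, new cache)

def bundle_ffn_weights(w_up, w_down):
    """Row-column bundling: place the i-th column of the up projection next to
    the i-th row of the down projection so one contiguous read fetches both."""
    return np.concatenate([w_up.T, w_down], axis=1)  # shape (d_ff, 2 * d_model)

# Toy usage: a tiny feed-forward layer and a short history of active neurons.
d_model, d_ff = 4, 8
bundled = bundle_ffn_weights(np.zeros((d_model, d_ff)), np.zeros((d_ff, d_model)))
history = [{0, 2}, {2, 5}, {5, 7}]
to_read, cache = neurons_to_load(history, cached={0, 2, 5})
print(bundled.shape, to_read)  # (8, 8) {7}
```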
These advancements could soon enable advanced LLMs to run efficiently on iPhones, iPads, and other mobile devices, expanding their applicability and accessibility.
Both papers demonstrate Apple's growing leadership in AI research and applications. While these advancements are promising, experts caution that Apple will need to handle the technologies responsibly, weighing societal impacts such as privacy and the potential for misuse.
As Apple potentially integrates these innovations into its product lineup, it's clear the company is not just enhancing its devices but also anticipating future needs for AI-infused services. By enabling more complex AI models to run on devices with limited memory, Apple is setting the stage for a new class of applications and services that harness the power of LLMs in scenarios that were previously out of reach.
Publishing this research shows Apple’s commitment to the broader AI community, potentially spurring further advancements in the field. This move reflects Apple’s confidence in its position as a tech leader and its dedication to pushing the boundaries of what’s possible.
If applied judiciously, Apple’s latest innovations could take artificial intelligence to the next level. Photorealistic digital avatars and powerful AI assistants on portable devices once seemed like a distant possibility—but thanks to Apple’s scientists, the future is rapidly becoming a reality.