
Microsoft’s new hard sell: GenAI that lives inside laptops rather than in the cloud


Just ahead of its annual three-day ‘Build’ conference, held earlier this week in Seattle, Microsoft kicked off a campaign for its new ‘Copilot+ PCs’. Launched in partnership with ecosystem players such as Qualcomm, Intel and AMD, alongside downstream OEM partners including Dell, HP and Samsung, the OpenAI GPT-4o-powered PCs, Microsoft has promised, will be “20x more powerful and up to 100x as efficient for running AI workloads and deliver industry-leading AI acceleration,” and capable of “outperforming Apple’s MacBook Air 15-inch by up to 58 per cent in sustained multithreaded performance, all while delivering all-day battery life.”

Microsoft’s Windows-based PCs have been consistently edged out by Apple’s devices for years, a trend that has become more pronounced since the launch of the first M1 chip for MacBooks in 2020, which helped Apple devices claim superior battery life and optimised performance. But Apple seems to be seriously behind Microsoft in the AI race. With its new AI PCs (powered by Qualcomm’s new Snapdragon X Elite chip), the Redmond-based software major is confident that the new launch could finally tip the scales back in its favour.

AI inside a PC

AI PCs are essentially personal computers that incorporate specialised processors or accelerators called neural processing units, or NPUs, which are optimised to run AI apps locally on the device rather than relying on cloud-based services such as ChatGPT or Gemini. The AI PC push aims to capitalise on the sustained AI build-up of the last 24 months. More importantly, it comes at a time when the sharp ramp-up in post-Covid demand for new hardware, and accompanying software, is petering out after a nearly four-year run. Something new had to be on offer to drive the next PC wave, as most of the systems bought by consumers in the post-Covid phase are now due for an upgrade.

With the latest version of Windows, a dedicated Copilot key, and an NPU capable of over 40 trillion operations per second, consumers can run Microsoft Copilot locally on a personal device such as a laptop or a desktop. Unlike traditional GPUs (graphics processing units, such as those made by Nvidia) or traditional CPUs (central processing units), an NPU is almost entirely optimised for AI computation at the hardware level, improving both performance and energy efficiency.
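To put that figure of 40 trillion operations per second in perspective, here is a back-of-envelope sketch in Python. The model size and ops-per-token numbers are illustrative assumptions, not published Copilot+ specifications.

```python
# Back-of-envelope: what 40 trillion operations per second (40 TOPS) buys.
# All figures below are illustrative assumptions, not Copilot+ specs.

NPU_OPS_PER_SECOND = 40e12        # claimed NPU throughput
MODEL_PARAMS = 3e9                # hypothetical 3-billion-parameter on-device model
OPS_PER_TOKEN = 2 * MODEL_PARAMS  # roughly one multiply and one add per parameter

tokens_per_second = NPU_OPS_PER_SECOND / OPS_PER_TOKEN
print(f"Theoretical ceiling: ~{tokens_per_second:,.0f} tokens per second")
# Real-world throughput is far lower once memory bandwidth, precision and
# software overheads are accounted for, but the headroom is the point.
```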
According to Jensen Huang, the co-founder, president and CEO of Nvidia, the concept of ‘on-prem’ is becoming cool again. The ‘on-device’ model simply takes that a step further.

AI models: From training to inference, cloud to edge


The development comes at a time when there is talk of a slow pivot in machine learning (ML) and AI models from ‘training’, the first phase for an AI model, to what is called ‘inference’. Training entails trial and error, acquainting the AI model with examples of the desired inputs and outputs, or both. That is what foundational models such as OpenAI’s ChatGPT, Google’s Gemini and Meta’s Llama have done. Inference is the process that follows: a trained model is fine-tuned for practical, and sometimes more specific, use, and new applications are built on it.
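The training-versus-inference split can be seen in miniature in a few lines of scikit-learn; frontier models differ in scale, not in this basic division of labour.

```python
# Training vs inference, in miniature, using scikit-learn's digits dataset.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training: the compute-heavy phase, fitting the model to example
# inputs and their desired outputs.
model = LogisticRegression(max_iter=2000).fit(X_train, y_train)

# Inference: the comparatively cheap phase, applying the trained model
# to inputs it has never seen.
predictions = model.predict(X_test)
print(f"Accuracy on unseen data: {model.score(X_test, y_test):.2f}")
```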


Traditionally, training models ran only on powerful servers in the cloud, or a network of data centres, given the huge volume of data that has to be crunched simply to train the algorithm. For this, cloud infrastructure, or public cloud solutions, has provided the base for enterprise computing. Public cloud solutions are fully managed by third-party providers, relieving IT teams of the need to purchase, install, manage and upgrade technology on site. On-premise (or on-prem) infrastructure is more like a private cloud environment that is available for use by only one client. On-device ML is when the consumer performs inference with models directly on a device, using a mobile app or web browser. AI processing that happens right on a user’s computer is also sometimes called edge computing.

With edge computing, the ML model processes input data, such as images, text or audio, on the device, rather than sending that data to an external server. That is the new frontier that hardware and software makers in the personal computing space are pushing. Big improvements in compute power, across CPUs, GPUs and dedicated ML accelerator blocks such as NPUs, have brought on-device performance close to what could once be achieved only on dedicated servers.
The advantages of on-device ML include lower latency, since there is no round trip to a server and back, and much greater privacy, since the data processing happens on the user’s device itself. The main disadvantage is that on-device models have to be much smaller than their server counterparts.
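What on-device inference looks like in practice can be sketched with the ONNX Runtime library; the model file and input shape below are hypothetical placeholders, not any specific Copilot+ workload.

```python
# On-device inference, sketched with ONNX Runtime: the model weights sit on
# the laptop and every prediction runs locally, with no network round trip.
# "model.onnx" and the input shape are hypothetical placeholders.
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")   # load weights from local disk
input_name = session.get_inputs()[0].name

sample = np.random.rand(1, 3, 224, 224).astype(np.float32)  # e.g. one image

start = time.perf_counter()
outputs = session.run(None, {input_name: sample})
print(f"Local inference took {(time.perf_counter() - start) * 1000:.1f} ms")
# A cloud call would add network latency on top of this, and the input data
# would leave the device: exactly the two trade-offs described above.
```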

‘Total’ Recall

Microsoft is touting both the lower latency and the privacy gains in its Copilot+ PCs, which the Redmond-based company’s Executive Vice President and Consumer Chief Marketing Officer Yusuf Mehdi called “the fastest, most intelligent Windows PCs ever built”.

Three big positives from a user’s point of view: a function called Recall, which lets a user easily find and remember what they have seen on the PC; a function called Cocreator, which generates and refines AI images in near real time directly on the device; and a function called Live Captions, which bridges language barriers by translating audio from more than 40 languages into English.
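Live Captions itself is proprietary, but the same trick has an open-source analogue in OpenAI’s Whisper model, which transcribes speech in dozens of languages and can translate it into English entirely on a local machine; the audio filename below is a hypothetical placeholder.

```python
# An open-source analogue of on-device audio translation, using Whisper
# (pip install openai-whisper). Not Microsoft's Live Captions pipeline;
# "meeting_audio.mp3" is a hypothetical input file.
import whisper

model = whisper.load_model("base")  # small enough to run on a laptop CPU
result = model.transcribe("meeting_audio.mp3", task="translate")
print(result["text"])               # English translation of the spoken audio
```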

The Recall function is the big draw here: it works like a photographic memory, presenting the user with what they have previously seen or done on the PC. Recall is not keyword search but a comprehensive, even pictographic, search over a user’s entire history. Windows constantly takes screenshots of what is on a user’s screen whenever the device is operational, then uses a generative AI model running on the device, and the on-board NPU, to process all this data and make it searchable, down to specific details such as a dress in a photo or figures on a bar-graph slide.
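Microsoft has not published Recall’s internals, but the pattern it describes, capturing the screen, extracting text and indexing it locally, can be sketched in a few lines; every name below is illustrative, not Recall’s actual code.

```python
# A toy sketch of a Recall-like loop: screenshot, OCR, local full-text index.
# An illustration of the pattern only, not Microsoft's implementation.
import sqlite3
import time

import pytesseract            # pip install pytesseract (requires Tesseract OCR)
from PIL import ImageGrab     # pip install pillow

db = sqlite3.connect("recall_demo.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS snaps USING fts5(ts, text)")

def capture_once():
    """Grab the screen, OCR it, and index the text with a timestamp."""
    screenshot = ImageGrab.grab()
    text = pytesseract.image_to_string(screenshot)
    db.execute("INSERT INTO snaps VALUES (?, ?)", (time.ctime(), text))
    db.commit()

def search(query: str):
    """Full-text search over everything previously captured on screen."""
    return db.execute(
        "SELECT ts, snippet(snaps, 1, '[', ']', '…', 10) "
        "FROM snaps WHERE snaps MATCH ?",
        (query,),
    ).fetchall()

capture_once()
print(search("bar graph"))
```

Everything in this sketch lands in a local SQLite file and nothing leaves the machine, which is the shape of the privacy argument Microsoft is making.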

This function places a premium on privacy precisely because it is all done locally: a user can trust that all of this data stays on her computer, instead of being uploaded and then retrieved by an AI application running in the cloud.

The last word

Microsoft is clearly beginning to build an edge in AI. But Apple is unlikely to take this lying down. The company is set to host its annual Worldwide Developers Conference on June 10, where it is expected to unveil its own AI apps and hardware. This could include new M4 Pro and M4 Max chips, touted as a significant scale-up over its latest M4 chips, which could even outgun Qualcomm’s.

The AI twist could finally make the battle between Macs and Windows PCs interesting again.




 


