Sagence brings analog in-memory compute to redefine AI inference
Sagence AI has introduced an advanced analog in-memory compute architecture designed to address issues of power, cost, and scalability in AI inference.
Using an analog-based approach, the architecture offers improvements in energy efficiency and cost-effectiveness while delivering performance comparable to existing high-end GPU and CPU systems.
This bold step positions Sagence AI as a potential disruptor in a market dominated by Nvidia.
The Sagence architecture shows its benefits when processing large language models like Llama2-70B. With throughput normalized to 666,000 tokens per second, Sagence says its technology delivers the same results with 10 times lower power consumption, 20 times lower cost, and 20 times less rack space than leading GPU-based solutions.
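To make those relative claims concrete, the arithmetic below applies the stated 10x/20x/20x factors to a purely hypothetical GPU baseline. The baseline numbers are illustrative placeholders, not figures published by Sagence or any GPU vendor.

```python
# Hypothetical GPU-rack baseline at the normalized throughput of
# 666,000 Llama2-70B tokens per second. These numbers are made up
# purely to show how the claimed improvement factors scale.
gpu_baseline = {"power_kw": 100.0, "cost_usd": 1_000_000.0, "racks": 20.0}

# Improvement factors claimed by Sagence: 10x power, 20x cost, 20x space.
improvement = {"power_kw": 10, "cost_usd": 20, "racks": 20}

# Divide the baseline by each factor to get the implied Sagence footprint.
sagence = {k: gpu_baseline[k] / improvement[k] for k in gpu_baseline}
print(sagence)  # {'power_kw': 10.0, 'cost_usd': 50000.0, 'racks': 1.0}
```

Whatever the real baseline is, the point of normalizing to a fixed token rate is that the factors compose: the same work is claimed to fit in one rack instead of twenty.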
This design prioritizes the demands of inference over training, reflecting the shift in AI compute focus within data centers. With its efficiency and affordability, Sagence offers a solution to the growing challenge of ensuring return on investment (ROI) as AI applications expand to large-scale deployment.
At the heart of Sagence’s innovation is its analog in-memory computing technology, which merges storage and computation within memory cells. By eliminating the need for separate storage and scheduled multiply-accumulate circuits, this approach simplifies chip designs, reduces costs, and improves power efficiency.
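The core idea can be sketched in a few lines: in an in-memory compute array, the weights stay resident where they are stored, and each column accumulates its multiply-accumulate (MAC) result in place as activations stream across the rows, rather than being fetched into a separate scheduled compute unit. The pure-Python model below is a conceptual illustration only; real analog cells sum currents on shared bit lines, and this says nothing about Sagence's actual circuit design.

```python
# Conceptual sketch of in-memory matrix-vector multiply: the weight
# array never moves, and each column produces its dot product where
# the weights live, mimicking analog current summation on a bit line.
def in_memory_matvec(weight_array, activations):
    """Accumulate each column's products in place over the streamed
    activations, one result per column of the resident weight array."""
    n_rows = len(weight_array)
    n_cols = len(weight_array[0])
    return [
        sum(weight_array[r][c] * activations[r] for r in range(n_rows))
        for c in range(n_cols)
    ]

# Toy 2x2 weight array held "in memory" and a streamed activation vector.
weights = [[0.5, -1.0],
           [2.0,  0.25]]
acts = [1.0, 2.0]
print(in_memory_matvec(weights, acts))  # [4.5, -0.5]
```

The contrast with a conventional design is that a CPU or GPU would load those weights from memory into registers, schedule the MACs on shared arithmetic units, and write results back, which is the data movement the in-memory approach is designed to eliminate.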
Sagence also employs deep subthreshold computing in multi-level memory cells – an industry-first innovation – to achieve the efficiency gains required for scalable AI inference.
Traditional CPU and GPU-based systems rely on complex dynamic scheduling, which drives up hardware complexity, inefficiency, and power consumption. Sagence’s statically scheduled architecture eliminates that overhead, mirroring the fixed connectivity of biological neural networks.
The system is also designed to integrate with existing AI development frameworks like PyTorch, ONNX, and TensorFlow. Once a trained neural network is imported, Sagence’s architecture eliminates the need for further GPU-based processing, simplifying deployment and reducing costs.
“A fundamental advancement in AI inference hardware is vital to the future of AI. Use of large language models (LLMs) and Generative AI drives demand for rapid and massive change at the nucleus of computing, requiring an unprecedented combination of highest performance at lowest power and economics that match costs to the value created,” said Vishal Sarin, CEO & Founder, Sagence AI.
“The legacy computing devices today that are capable of extreme high-performance AI inferencing cost too much to be economically viable and consume too much energy to be environmentally sustainable. Our mission is to break those performance and economic limitations in an environmentally responsible way,” Sarin added.
Via IEEE Spectrum
Original Author: udinmwenefosa@gmail.com (Efosa Udinmwen) | Source: TechRadar