Nvidia's KV Cache Transform Coding (KVTC) compresses LLM key-value cache by 20x without model changes, cutting GPU memory costs and time-to-first-token by up to 8x for multi-turn AI applications.
Multiverse Computing S.L. said today it has raised $215 million in funding to accelerate the deployment of its quantum computing-inspired artificial intelligence model compression technology, which ...
Effective compression is about finding patterns to make data smaller without losing information. When an algorithm or model can accurately guess the next piece of data in a sequence, it shows it’s ...
Large language models (LLMs) such as GPT-4o and other modern state-of-the-art generative models like Anthropic’s Claude, Google's PaLM and Meta's Llama have been dominating the AI field recently.
San Sebastian, Spain – June 12, 2025: Multiverse Computing has developed CompactifAI, a compression technology capable of reducing the size of LLMs (Large Language Models) by up to 95 percent while ...
Intel has disclosed a maximum severity vulnerability in some versions of its Intel Neural Compressor software for AI model compression. The bug, designated as CVE-2024-22476, provides an ...
One of Europe’s most prominent AI startups has released two AI models that are so tiny, they have named them after a chicken’s brain and a fly’s brain. Multiverse Computing claims these are the ...
I see awful diminishing returns here. (Lossless) compression of today isn't really that much better than products from the 80s and early 90s - stacker (wasn't it?), pkzip, tar, gz. You get maybe a few ...