Multimodal Model - Search News

12h on MSN

Nvidia's Nemotron 3 Nano Omni model unifies vision, audio and language for agents

Nvidia (NVDA) has released its new Nemotron 3 Nano Omni model, which is designed to help developers build and deploy more ...

News.az

NVIDIA unveils Nemotron 3 Nano Omni model, enhancing AI agents’ efficiency by 9x

This best-in-class model gives enterprises and developers a production path for more efficient and accurate multimodal AI ...

Investing.com South Africa

NVIDIA launches Nemotron 3 Nano Omni multimodal AI model

NVIDIA (NASDAQ:NVDA) introduced Nemotron 3 Nano Omni on Tuesday, an open multimodal model that unifies vision, audio and ...

From GPT-5.5 to DeepSeek V4: How Developers Are Building Smarter AI Agents with Multi-Model Routing in 2026

SINGAPORE, SINGAPORE, SINGAPORE, April 26, 2026 /EINPresswire.com/ -- April 2026 was the most intense month in the ...

Hosted on MSN

Nvidia debuts multimodal AI as costs outpace labor savings

Nvidia has unveiled its Nemotron 3 Nano Omni model, combining vision, audio, and language capabilities to create more efficient AI agents, even as industry leaders warn AI remains costlier than human ...

11h

Xiaomi releases open-weight MiMo-V2.5 AI model, claims "frontier-level agentic capability"

This is a multimodal AI model that understands text, images, audio and video. It's available for download, online and as an ...

Neowin

Microsoft announces Phi-4-multimodal and Phi-4-mini small language models

Microsoft has unveiled two new additions to its Phi-4 family of small language models: Phi-4-multimodal, which integrates speech, vision, and text, and Phi-4-mini. In December 2024, Microsoft ...

Geeky Gadgets

AnyGPT any-to-any open source multimodal large language model (LLM)

AnyGPT is an innovative multimodal large language model (LLM) that is capable of understanding and generating content across various data types, including speech, text, images, and music. This model is ...

TechCrunch

Mistral releases Pixtral 12B, its first multimodal model

French AI startup Mistral has released its first model that can process images as well as text. Called Pixtral 12B, the 12-billion-parameter model is about 24GB in size. Parameters roughly correspond ...

SiliconANGLE

Microsoft releases new Phi models optimized for multimodal processing, efficiency

Microsoft Corp. today expanded its Phi line of open-source language models with two new algorithms optimized for multimodal processing and hardware efficiency. The first addition is the text-only ...
