MULTIMODAL
30 days · UTC
Google ships Gemini Embedding 2: one multimodal vector model for text, images, audio, video, and PDFs
Google released Gemini Embedding 2, a single multimodal embedding model that unifies text, image, audio, video, and PDF embeddings with flexible dimen...
OpenAI gpt-image-1-mini: cheaper image generation with text+image input
OpenAI released gpt-image-1-mini, a cost-efficient image model that accepts text and image inputs and returns images. Pricing is low per image ($0.005...
OpenAI rolls out GPT-5.2 with stronger code and data handling
OpenAI introduced GPT-5.2, saying it improves code generation, chart/graph understanding, factual accuracy, and long-context use. The model ships in "...
OpenAI’s GPT-4o API reportedly ending by Feb 2026; debate over "human-like" assistants
A community campaign (#Keep4o) says OpenAI will end GPT-4o API access by February 2026, sparking pushback from users who prefer 4o’s warmer, more cont...
Update: Google DeepMind AGI roadmap and agentic systems
In a new video, Demis Hassabis lays out the clearest public roadmap to AGI yet, explicitly centering on agentic systems that plan, use tools, and work...