Cainew - Curated AI news for developers

Model Releases

Qwen3.7-Max: The Agent Frontier

Qwen3.7-Max is an agent-focused model release aimed at more capable autonomous execution of complex tasks.

RSS

Stable Audio 3

Stable Audio 3 is a new audio generation model for creating high-quality audio content.

ArXiv

Tools & Products

HermannBjorgvin/Clawdmeter

ESP32 desk dashboard that shows Claude Code usage

GitHub

virgiliojr94/book-to-skill

Turn any technical book PDF into a Claude Code skill — ready to study, reference, and use while you work.

GitHub

tolibear/goalbuddy

A better /goal for Codex and Claude Code

GitHub

MFS9628/Deepseek-v4-pro-app

DeepSeek v4 Pro github Flash chat: API flash gemma 4 gemini qwen claude chatgpt 4 key pricing tier, open source weights, huggingface model repository, local execution ollama setup, context window token limit, coding benchmark leaderboard ranking, reasoning model architecture v4, visual studio code extension integration, cursor ai

GitHub

OpenAI Adopts Google's SynthID Watermark for AI Images with Verification Tool

OpenAI has adopted Google's SynthID watermarking technology to mark and verify AI-generated images, adding an authenticity verification tool. This implementation helps users identify and authenticate images created by AI systems.

OpenAI

StoreClaw: Grow your store profits with agents that know how to sell

StoreClaw is the first AI commerce platform with agents that know how to sell, so you can make more money with less effort and less stress. Connect StoreClaw to your existing store and it will study your numbers, current sales figures, and growth trajectory, and then offer proactive suggestions that it can execute on your behalf — once you give it your approval. Ask StoreClaw how your business is doing any time, anywhere. Sell more with less stress: StoreClaw.

ProductHunt

mailX by mailwarm: Email deliverability toolkit for humans and AI agents

Your emails go to spam. mailX shows you why, and how to fix it in seconds with clear answers and exact steps. Built for humans and AI agents. API and MCP ready.

ProductHunt

AtomicBot-ai/atomic-agent

Atomic Agent is an intelligent automation tool that performs complex tasks autonomously using AI-powered agents. It streamlines workflows and improves operational efficiency.

GitHub

Gemini Omni: Create anything from any input – starting with video

Create anything from anything, starting with video. Gemini Omni is where Gemini’s ability to reason meets the ability to create. It delivers a leap in world understanding, multimodality, and editing.

ProductHunt

Manus Scheduled Tasks 2.0: Run recurring Manus work inside the same task context

Manus now runs scheduled tasks inside the same task context, reuses Project setups, and adds recurring actions to Manus-built web apps. For knowledge workers and teams automating repeatable workflows in Manus.

ProductHunt

Remove-AI-Watermarks – CLI and library for removing AI watermarks from images

Remove-AI-Watermarks is a CLI tool and library designed to strip watermarks from AI-generated images. This utility enables users to remove identifying marks from images created by artificial intelligence systems.

GitHub

Infomaniak transitions to a foundation model to protect user data privacy

Infomaniak is transitioning to its own foundation model to enhance user data privacy protection. This move allows the company to reduce reliance on third-party AI systems and maintain greater control over user information.

RSS

Testing distributed systems with AI agents

AI agents are being applied to test and validate distributed systems, offering automated testing capabilities across complex architectures. This development enables more thorough and efficient testing of systems with multiple interconnected components.

GitHub

A new experiment brings better group meetings to Google Beam

See and hear your colleagues in true-to-life size and sound, making hybrid meetings feel more inclusive and connected.

RSS

Research Papers

Stage-adaptive Token Selection for Efficient Omni-modal LLMs

Omni-modal large language models (om-LLMs) achieve unified audio-visual understanding by encoding video and audio into temporally aligned token sequences interleaved at the window level. However, processing these dense non-textual tokens throughout the LLM incurs substantial computational overhead. Although training-free token selection can reduce this cost, existing methods either focus on visual-only inputs or prune om-LLM tokens only before the LLM with fixed per-modality ratios, failing to c...

HuggingFace
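
The baseline this abstract criticizes, pruning with a fixed ratio per modality before the LLM, can be sketched roughly as follows. This is not the paper's stage-adaptive method: the function name `select_tokens`, the importance scores, and the ratios are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List


def select_tokens(scores: Dict[str, List[float]],
                  keep_ratio: Dict[str, float]) -> Dict[str, List[int]]:
    """Keep the top-scoring tokens of each modality under a fixed ratio.

    Returns the kept token indices per modality, in original temporal order.
    """
    kept: Dict[str, List[int]] = {}
    for modality, s in scores.items():
        # Fixed budget per modality, never dropping a modality entirely.
        k = max(1, math.ceil(len(s) * keep_ratio[modality]))
        top = sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:k]
        kept[modality] = sorted(top)  # restore temporal order
    return kept


# Hypothetical importance scores, one per non-text token (assumption).
scores = {"video": [0.1, 0.9, 0.3, 0.7], "audio": [0.5, 0.2]}
kept = select_tokens(scores, keep_ratio={"video": 0.5, "audio": 0.5})
print(kept)  # {'video': [1, 3], 'audio': [0]}
```

Because the ratios are fixed up front, the budget cannot shift between audio and video when one modality matters more for a given input, which is the kind of limitation the abstract points at.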

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

Video generation is rapidly evolving from single-shot synthesis to complex multi-shot audio-video (MSAV) narratives to meet real-world demands. However, evaluating such frontier models remains a fundamental challenge. Existing benchmarks are limited in scope and data diversity, and rely on rigid evaluation pipelines, preventing systematic and reliable assessment of modern MSAV models. To bridge these gaps, we introduce MSAVBench, the first comprehensive benchmark and adaptive hybrid evaluation f...

HuggingFace

PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset

Text-to-Image (T2I) models have recently seen notable progress around 1K and 2K resolution. Driven by the desire for a better visual experience and the rapid development of imaging technology, demand for Ultra-High-Resolution (UHR) image generation has grown significantly. However, UHR image generation poses great challenges due to the scarcity and complexity of high-resolution content. In this paper, we first introduce PixVerve-95K, a high-quality, open-source UHR T2I dataset curated with ...

HuggingFace

Matérn Noise for Triangulation-Agnostic Flow Matching on Meshes

This paper tackles the task of learning to generate signals over triangle meshes in a triangulation-agnostic manner, meaning the trained model can be applied to different meshes and triangulations effectively. Practically, the paper adapts the flow matching (FM) paradigm to a mesh-based, triangulation-agnostic setting. Theoretically, it proposes a specific noise distribution which is triangulation agnostic, to be used inside the FM model's denoising process. While noise distributions are usually...

HuggingFace

Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

Speculative decoding (SD) accelerates large language model inference by leveraging a draft-then-verify paradigm. To maximize the acceptance rate, recent methods construct expansive draft trees, which unfortunately incur severe VRAM bandwidth and computational overheads that bottleneck end-to-end speedups. While dynamic-depth pruning can reduce this latency by removing marginal branches, it also discards potentially valid candidates, preventing the acceptance rate from reaching the upper bound of...

HuggingFace
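
For readers new to the draft-then-verify paradigm mentioned above, here is a minimal sketch of one speculative decoding step with greedy acceptance on a single draft chain. It does not implement the paper's hybrid tree construction. The toy models and the name `speculative_step` are hypothetical, and a real system would verify all draft positions in one batched forward pass.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_step(prefix: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """One draft-then-verify step with greedy acceptance.

    The draft model proposes k tokens; the target keeps the longest prefix
    it agrees with, then contributes one token of its own.
    """
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # Verify phase: called once per position here for clarity.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # the target's correction ends the step
            return accepted
        accepted.append(tok)

    # Every draft token was accepted, so the target adds one bonus token.
    accepted.append(target(prefix + accepted))
    return accepted


# Toy models (assumptions): the target counts upward; the draft agrees
# except that it emits 0 wherever the target would emit a multiple of 4.
def toy_target(ctx: List[Token]) -> Token:
    return ctx[-1] + 1


def toy_draft(ctx: List[Token]) -> Token:
    nxt = ctx[-1] + 1
    return 0 if nxt % 4 == 0 else nxt


step = speculative_step([1], toy_draft, toy_target, k=4)
print(step)  # [2, 3, 4]: two draft tokens accepted, then the target's correction
```

The output always matches what the target alone would have produced greedily. The speedup comes from how many draft tokens are accepted per target call, which is why the acceptance rate matters.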

optimize_anything: A Universal API for Optimizing any Text Parameter

Can a single LLM-based optimization system match specialized tools across fundamentally different domains? We show that when optimization problems are formulated as improving a text artifact evaluated by a scoring function, a single AI-based optimization system (supporting single-task search, multi-task search with cross-problem transfer, and generalization to unseen inputs) achieves state-of-the-art results across six diverse tasks. Our system discovers agent architectures that nearly triple Gemi...

HuggingFace
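
The framing in this abstract, improving a text artifact under a scoring function, can be illustrated with a generic greedy search loop. This is a sketch of the problem formulation only, not the paper's system or API. `optimize_text`, `toy_score`, and `toy_propose` are hypothetical names, and a real setup would presumably use an LLM to propose edits and a task-specific evaluator to score them.

```python
import random
from typing import Callable


def optimize_text(seed: str,
                  score: Callable[[str], float],
                  propose: Callable[[str, random.Random], str],
                  steps: int,
                  rng: random.Random) -> str:
    """Greedy search: keep a candidate only if it scores strictly higher."""
    best, best_score = seed, score(seed)
    for _ in range(steps):
        candidate = propose(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Hypothetical stand-ins for the evaluator and the edit proposer.
TARGET = "hello"


def toy_score(text: str) -> float:
    # Number of positions that already match the target string.
    return sum(a == b for a, b in zip(text, TARGET))


def toy_propose(text: str, rng: random.Random) -> str:
    # Replace one random character with a random letter from a small alphabet.
    i = rng.randrange(len(text))
    return text[:i] + rng.choice("ehlo") + text[i + 1:]


result = optimize_text("xxxxx", toy_score, toy_propose,
                       steps=500, rng=random.Random(0))
print(result, toy_score(result))
```

The loop only ever sees the artifact as text and the evaluator as a black box, which is what lets one optimizer be pointed at very different tasks.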

OpenComputer: Verifiable Software Worlds for Computer-Use Agents

We present OpenComputer, a verifier-grounded framework for constructing verifiable software worlds for computer-use agents. OpenComputer integrates four components: (1) app-specific state verifiers that expose structured inspection endpoints over real applications, (2) a self-evolving verification layer that improves verifier reliability using execution-grounded feedback, (3) a task-generation pipeline that synthesizes realistic and machine-checkable desktop tasks, and (4) an evaluation harness ...

HuggingFace

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

We present GoLongRL, a fully open-source, capability-oriented post-training recipe for long-context reinforcement learning with verifiable rewards (RLVR). Existing long-context RL methods often treat data construction as a matter of designing increasingly complex retrieval paths, leading to homogeneous task coverage and reward formulations that inadequately reflect practical long-context requirements. Our work offers two contributions. (1) Capability-oriented data construction with full open rel...

HuggingFace

CEPO: RLVR Self-Distillation using Contrastive Evidence Policy Optimization

When a model produces a correct solution under reinforcement learning with verifiable rewards (RLVR), every token receives the same reward signal regardless of whether it was a decisive reasoning step or a grammatical filler. A natural fix is to condition the model on the correct answer as a teacher, identifying tokens it would have generated differently had it known the answer. Prior work shows this either corrupts training by leaking the answer into the gradient, or produces a weak signal that...

HuggingFace
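
The credit-assignment problem described in the first sentence of this abstract can be made concrete with a small sketch: a sequence-level verifiable reward is broadcast unchanged to every token. This shows the uniform baseline only, not CEPO. The group-mean baseline and the name `token_advantages` are assumptions made for illustration.

```python
from typing import List


def token_advantages(token_counts: List[int],
                     rewards: List[float]) -> List[List[float]]:
    """Broadcast a sequence-level advantage to every token of each sample.

    The advantage is the reward minus the group mean reward (an assumed
    baseline choice, used here only for illustration).
    """
    baseline = sum(rewards) / len(rewards)
    return [[r - baseline] * n for n, r in zip(token_counts, rewards)]


# Two sampled solutions to the same prompt: the first is verified correct
# (reward 1.0), the second incorrect (reward 0.0).
advs = token_advantages(token_counts=[3, 2], rewards=[1.0, 0.0])
print(advs)  # [[0.5, 0.5, 0.5], [-0.5, -0.5]]
```

Every token in the correct solution gets the same 0.5, whether it was a decisive reasoning step or filler.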

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

4D mesh generation has recently emerged as a powerful paradigm for recovering dynamic 3D structure from videos, but existing methods remain slow, computationally expensive, and difficult to scale to longer sequences. We introduce a training-free approach that accelerates 4D mesh generation while improving temporal correspondence quality. Our key observation is that temporal correspondences emerge inside a 4D backbone long before its generated meshes become visually accurate. We exploit this with...

HuggingFace

CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition

Recent diffusion models achieve strong photorealism and fluency in video generation, yet remain fragile under abstract, sparse or complex conditions, leading to poor performance in professional production workflows such as storyboard sketches and clay render conditions. Existing video generation models either inject conditions through adapters or couple a generic vision-language model (VLM) within a diffusion backbone, leaving a capability gap and failing to produce the videos that align with t...

HuggingFace

TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization

Training 3D Gaussian Splatting (3DGS) at billion-primitive scale is fundamentally memory-bound: each Gaussian primitive carries a large attribute vector, and the aggregate parameter table quickly exceeds GPU capacity, limiting prior systems to tens of millions of Gaussians on commodity single-GPU hardware. We observe that 3DGS training is inherently sparse and trajectory-conditioned: each iteration activates only the Gaussians visible from the current camera batch, so GPU memory can serve as a w...

HuggingFace

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Automating scientific discovery requires more than generating papers from ideas. Real research is iterative: hypotheses are challenged from multiple perspectives, experiments fail and inform the next attempt, and lessons accumulate across cycles. Existing autonomous research systems often model this process as a linear pipeline: they rely on single-agent reasoning, stop when execution fails, and do not carry experience across runs. We present AutoResearchClaw, a multi-agent autonomous research p...

HuggingFace

Where Does Authorship Signal Emerge in Encoder-Based Language Models?

Authorship attribution models fine-tuned with the same pretrained encoder, data, and loss can differ four-fold in performance depending only on their scoring mechanism. We use mechanistic interpretability tools to explain this gap. Stylistic features such as word length, punctuation density, and function-word frequency are equally available at every layer in every model, including in an off-the-shelf control encoder, so the gap does not come from representation quality. Instead, causal intervent...

HuggingFace

PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

Large language model (LLM) agents increasingly operate over long and recurring external contexts, like document corpora and code repositories. Across invocations, existing approaches preserve either the agent's trajectory, passive access to raw material, or task-level strategies. None of them preserves what we argue is most needed for repeated same-context workloads: reusable orientation knowledge (e.g., what the context contains, how it is organized, and which entities, constants, and schemas h...

HuggingFace

Tutorials

Learnings from 100K lines of Rust with AI (2025)

A write-up of lessons learned from building 100K lines of Rust with AI assistance, covering where AI was effective and where it struggled on a large codebase.

RSS

Industry News

Mistral AI acquires Emmi AI

Mistral AI has acquired Emmi AI, expanding its capabilities and product portfolio. This acquisition strengthens Mistral AI's position in the competitive AI market.

RSS

OpenAI Is Preparing to File for an IPO Soon

OpenAI is preparing to file for an Initial Public Offering (IPO), marking a significant step toward becoming a publicly traded company. This move would make OpenAI's shares available to public investors.

RSS

The next phase of OpenAI’s Education for Countries

OpenAI advances Education for Countries, expanding AI adoption in schools with new partnerships, teacher training, and tools to improve global learning outcomes.

OpenAI

Introducing OpenAI for Singapore

OpenAI for Singapore launches a multi-year AI partnership to expand deployment, build local talent, and support businesses and public services with AI.

OpenAI