Cainew - Curated AI news for developers

Model Releases

DiffusionGemma: 4x Faster Text Generation

RSS

Tools & Products

diffusionstudio/lottie

Generate production-ready Lottie animations with Claude Code or Codex

GitHub

zarazhangrui/lark-coding-agent-bridge

Bot that bridges Feishu/Lark messenger with a local Claude Code or Codex CLI. Streaming cards, per-chat sessions, multiple workspaces

GitHub

duncatzat/vigils

A local control plane for AI agents — see what they do, approve what matters, keep secrets out. Rust + Tauri + Chrome MV3.

GitHub

garyqlin/gbase

GBase — Recursive Self-Improvement Agent Framework. Memory, evolution, quality gates, identity system, and 40+ auto-registered tools.

GitHub

pedrofariasx/qwenproxy

OpenAI-compatible proxy API that uses Playwright automation to route requests to Qwen models, with support for multiple accounts, tools, and persistent sessions.

GitHub

TypingMind: Pay per use, no subscription, 18 model providers supported

TypingMind is the most popular app to use LLM AI models with API keys. It brings you all the best models across 18 providers in one powerful AI workspace without having to pay for subscriptions to each one. TypingMind also provides the best AI experience ever with features focused on pro users like Projects, Fork/Parallel chat, Plugins/MCP/Skills, and a wide range of customizable options that you literally cannot find anywhere else! Go give it a try at www.typingmind.com :)

ProductHunt

borhen68/TokenTamer

A drop-in proxy that compresses bloated code context in real-time, cutting LLM API costs by 50–80% without losing what the model actually needs to know.

GitHub

openhackai/OpenHack

Open Source Agentic Security Scanner

GitHub

agentic-in/inferoa

Inference-native Tokenmaxxing Agent Harness for Loop Engineering

GitHub

Apache Burr: Build reliable AI agents and applications

RSS

iArt.ai: Turn ideas & designs into stunning video/animation.

A faster agent delivers promos/shorts, explainers, kinetic type and PRO motion graphics with audio. Ditch AE/PR/CapCut. Chat to refine and ship impact.

ProductHunt

Axol: Automate physical work with a powerful robot

Axol is a dual-arm robot designed for teams automating real work with physical AI. Easy data collection, long reach, and a high range of motion mean you can automate work that matters.

ProductHunt

SeaTicket: AI agent that resolves issues across all your channels

Software teams are drowning in a sea of fragmented issues across GitHub, Discourse and emails. Valuable feedback is often buried under noise. SeaTicket transforms community support by syncing these into a single workspace. What makes it different? Full Context: Bring related issues and documents when solving an issue. AI Agents: Built-in agents autonomously suggest resolutions using existing documents and previous issues. Stop digging for context. Start resolving.

ProductHunt

Hero Studio Photos: Snap one photo, get listing-ready shots from every angle

Snap one photo of whatever you're selling and Hero turns it into clean studio shots from every angle, including on-model and mannequin-style images for clothing. Live on iOS and Android, with an API and MCP server so developers and AI agents can build on it.

ProductHunt

Vibe coding my way to a healthy family: Introducing Gamow Labs

Gamow Labs introduces a new approach to health and wellness by combining coding with family-oriented practices to promote healthier lifestyles. The platform aims to make health tracking and improvement more engaging through gamification and development activities.

RSS

Research Papers

Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks

Researchers demonstrate ultrafast machine learning on FPGAs using Kolmogorov-Arnold Networks, achieving significant speedups in neural network inference on specialized hardware. This approach combines advanced network architectures with hardware acceleration for exceptional performance.

RSS

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Expressive continuous control policies, such as diffusion and flow models, form the backbone of recent advances in scaling imitation learning for simulated and real robot control. While they are known to scale stably in the supervised imitation learning setting, incorporating them into reinforcement learning (RL) pipelines for policy improvement has proven more difficult. It often requires specialized training objectives or backpropagating through denoising processes, which cause well-known issu...

HuggingFace

One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA

External memory effectively grounds large language models (LLMs) and vision-language models (VLMs)-based question answering (QA) in relevant multimodal evidence. However, existing memory paradigms represent each memory item in raw text and image forms, so retrieval-based systems must pass the retrieved text or images to the generation LLMs/VLMs, resulting in high token consumption and storage pressure, making it unaffordable for resource-constrained applications. We propose Latent Memory, a late...

HuggingFace

The Role of Feedback Alignment in Self-Distillation

Conditioning a language model on additional context, such as feedback on a previous attempt, typically improves its response. Self-distillation trains the model to retain this improvement when the context is not present. The method works by matching the model's output distribution under two settings: a student that sees only the question, and a self-teacher that also sees the context. What the model learns therefore depends on what context the self-teacher receives, yet the design of this contex...

HuggingFace

Dynamic Linear Attention

The scalability of Large Language Models (LLMs) to long contexts is fundamentally constrained by the quadratic complexity of standard attention, motivating the adoption of linear attention mechanisms with sub-quadratic cost. To improve representation capacity under long contexts, recent approaches organize memory in a multi-state manner. However, existing multi-state linear attention methods rely on fixed state merging policies that cannot adapt to dynamically varying token importance, irreversi...

HuggingFace

Next Forcing: Causal World Modeling with Multi-Chunk Prediction

Autoregressive video generation has emerged as a powerful paradigm for World Action Models (WAMs). However, existing approaches suffer from slow training convergence and limited converged accuracy, particularly at high frame rates, as the training supervision is confined to the current chunk without explicit signals about future dynamics; they also suffer from slow inference due to iterative video denoising. In this paper, we present Next Forcing, a multi-chunk prediction (MCP) framework for cau...

HuggingFace

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

Controlled character animation requires transferring motion from a driving sequence to a reference character. Prior works heavily rely on intermediate representations, including pose skeletons to represent motion or masked background to represent environment, which inevitably leads to information loss. To address this, we present SCAIL-2, a framework that bypasses those intermediates and achieves end-to-end character animation. By directly concatenating driving videos to the sequence, the model...

HuggingFace

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

Recent years have witnessed the rapid evolution of AI agents toward handling increasingly complex, real-world tasks. However, existing benchmarks rarely evaluate whether agents can operate graphical user interfaces to complete long-horizon, high-value professional workflows across diverse domains. Current GUI benchmarks still predominantly focus on general-purpose software, relatively simple applications, and short-horizon tasks, leaving it largely unknown whether modern agents can follow user i...

HuggingFace

How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs

Token-level credit assignment remains a key obstacle for reinforcement learning (RL) in large language models (LLMs), where RL recipes typically treat all tokens equally, failing to distinguish decisive reasoning steps from routine formatting or fluent filler. Recent attempts leverage model-internal signals to assign finer-grained credit, but these are often point-wise heuristics that ignore the global structure of information propagation. We propose FlowTracer, an RL framework that traces answe...

HuggingFace

Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories

Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature takes a newsroom team weeks: hunting for context, running statistics, choosing an angle, and designing visuals. Recent agents handle individual steps well: data-science agents close the analysis loop, while design agents synthesize beautiful websites. But can an agent serve as a data journalist end to end? We introduce Data Journalist Agent (...

HuggingFace

FadeMem: Distance-Aware Memory Consolidation for Autoregressive Video Diffusion

Autoregressive video generators synthesize long videos by generating successive temporal segments, but their historical KV cache grows with video length. Existing bounded-cache methods reduce this cost with local windows, sink tokens, or compressed memory states, yet they usually assign fixed roles to different parts of the history. We propose FadeMem, a distance-aware KV memory consolidation mechanism that organizes historical KV blocks into a temporal hierarchy under a fixed cache budget. This...

HuggingFace

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

This paper introduces ARM, a discrete representation-based AutoRegressive Model that unifies image understanding, generation, and editing within a next-token prediction framework. ARM is built on three efforts: first, we train a discrete semantic visual tokenizer that maps images into compact token sequences. Our tokenizer is supervised with multiple objectives that jointly promote semantic discriminability, language alignment and faithful reconstruction, thereby supporting diverse tasks in a sh...

HuggingFace

WorldOlympiad: Can Your World Model Survive a Triathlon?

We introduce WorldOlympiad, a benchmark for diagnosing video-based world models across physical faithfulness, geometric consistency, and interaction fidelity. While existing benchmarks often focus on visual quality, semantic alignment, or short-term temporal coherence, they provide limited insight into whether generated videos obey physical rules, preserve coherent 3D structure, and sustain controllable interactions over long horizons. To address this gap, WorldOlympiad decomposes world-model ev...

HuggingFace

UniPET: a universal network for high-quality PET image denoising across varied dose reduction factors

Most existing deep learning-based PET image denoising methods assume a fixed and known dose reduction factor (DRF) for low-dose PET images. However, these methods encounter significant performance degradation when the DRF varies beyond the assumed one in practical applications. To address the challenge posed by varied DRFs, several preliminary studies focus on the task of universal PET image denoising, aiming to train a universal model over low-dose data across DRFs. Nonetheless, these vanilla u...

HuggingFace

Decentralized Multi-Agent Systems with Shared Context

Multi-agent systems (MAS) can scale large language model reasoning at test time by decomposing complex problems into parallel subtasks. However, most existing MAS rely on centralized orchestration, where a main agent assigns work, collects outputs, and merges results. As the number of subtasks grows, this controller becomes a communication and integration bottleneck. We propose Decentralized Language Models (DeLM), a MAS framework that decentralizes coordination through parallel agents, a shared...

HuggingFace

Industry News

A €0.01 bank transfer could compromise a banking AI agent

A researcher demonstrated that a €0.01 bank transfer could compromise a banking AI agent, highlighting critical security vulnerabilities in financial AI systems. This exploit shows how small, seemingly inconsequential transactions can be weaponized to manipulate or break AI-powered financial applications.

RSS

German ruling declares Google liable for false answers in AI Overviews

A German court ruled that Google is liable for false answers provided by its AI Overviews feature, establishing important legal precedent for AI accountability. This decision emphasizes that companies remain responsible for the accuracy of AI-generated information presented to users.

RSS

AI misidentification results in wrongful arrest; man seeks justice

A man was wrongfully arrested due to AI misidentification in a facial recognition case, highlighting serious concerns about accuracy and bias in AI systems used by law enforcement. He is now seeking justice and compensation for the incident.

RSS

From data to decisions: how LSEG is scaling trusted AI

See how LSEG uses OpenAI to scale trusted AI across its global business, accelerating insights, shrinking release cycles, and empowering 4,000 employees.

OpenAI

Discussion

Rich Sutton on AI creativity and discovery

Twitter

Notes on DeepSeek

Twitter