Cainew - Curated AI news for developers

Model Releases

Eagle 3.1: Collaboration Between the EAGLE Team, vLLM Team, and TorchSpec Team

EAGLE 3.1 is a joint release from the EAGLE, vLLM, and TorchSpec teams. EAGLE is a speculative decoding method, and the collaboration aims to make language model inference faster and more efficient through integrated tools and frameworks.

RSS

Tools & Products

crafter-station/petdex

A public gallery of animated pets for Codex, Claude Code, OpenCode, and Gemini CLI.

GitHub

Nutlope/hallmark

Anti-AI-slop design skill for Claude Code, Cursor, and Codex.

GitHub

opensquilla/opensquilla

OpenSquilla: a token-efficient AI agent that aims for higher intelligence density on the same budget.

GitHub

Ontos-AI/knowhere

Knowhere extracts, parses, and outputs structured chunks ready for AI Agents and RAG.

GitHub

lthoangg/OpenAgentd

Self-hosted AI agent OS — streaming chat, tool use, persistent memory, and multi-agent teams. Runs entirely on your machine.

GitHub

sno-ai/llmix

Production LLM call layer for AI agents and tools: keep OpenAI/Anthropic/AI SDK/LiteLLM, hot-swap models with MDA presets, and add cache, retries, circuit breakers, key rotation, singleflight, and Python/TypeScript/Rust parity.

GitHub

Brew: Like Claude design for email marketing

Brew is the fastest way to design and send beautiful, on-brand emails and automations that render perfectly in every inbox. Describe a campaign or a multi-step automation in plain English, and Brew builds the whole thing in seconds: copy, design, audience, and logic. Works with any AI agent: paste our docs into OpenClaw, Viktor, Claude, or Lovable. No lock-in: send from Brew or export to your ESP. Free to get started.

ProductHunt

dylan-labs/nom-pet

A desktop pet that eats the AI tokens you burn through Claude Code.

GitHub

Bond: Outbound campaigns powered by real buying signals

Bond is your AI GTM Engineer. Tell it who you want to reach. It builds the audience, plans the campaign, writes the messaging, and executes it end to end. Every data provider and outreach tool you need, in one workflow. Build your first campaign in 15 minutes.

ProductHunt

Tomotsugu-dev/Hindsight

Local-first desktop activity tracker — see where your hours go, with on-device AI daily summaries and optional multi-device sync

GitHub

MrGray17/opentoken

Token-saving companion for OpenCode — 42 compression layers, zero risk, no caveman speak

GitHub

Rezonant: Talk, spec, ship: get your product ideas into production

Rezonant helps product teams turn messy ideas into code-ready specs, tickets, and engineering tasks. Collaborate with PMs, engineers, designers, and AI agents in one shared workspace. Ground decisions in your actual codebase, keep everyone aligned on the same version, and create work that humans and coding agents can confidently ship.

ProductHunt

PitAssociateDepict/Cursor-AI-IDE-Pro

Cursor AI IDE Pro 2026: an AI-first code editor built on VS Code with Claude/GPT integration and codebase chat. The repository description advertises a premium subscription unlocked for free, with full account access, all features, lifetime activation for 2026, no trial limit, and an included license key, for Windows 10/11.

GitHub

Parrot Speech-to-text API: Fast, accurate STT for production-grade voice agents

Introducing Parrot: Ringg’s speech-to-text model for production-grade voice agents. Capture Hindi-heavy and noisy real-world conversations with low-latency inference, stronger transcript quality, and Hindi validation built for downstream workflows.

ProductHunt

AVTR-1 Real-Time Open Weights Model: Generating uncanny AI avatars is now open source

The best real-time avatar model in the world is now open source with open weights. Take the model, tweak it, and use it at $0 cost. What's unique:

• Our model listens while you speak (full-duplex); the avatar reacts in real time, with minimal latency.
• Every frame is generated, avoiding annoying animation loops from pre-rendered playback.
• Full streaming infrastructure is included so you can get started right away.

ProductHunt

Research Papers

Language Models Need Sleep

Language models may require 'sleep' or downtime periods to optimize performance and consolidate learning, similar to biological systems. This suggests new approaches to improving model efficiency and capability development.

ArXiv

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation, physics reasoning, or embodied interaction still requires expensive post-hoc steps that break the feed-forward promise. This limitation is especially pronounced in pose-free settings, where scene st...

HuggingFace

InstructSAM: Segment Any Instance with Any Instructions

In this paper, we introduce InstructSAM, a unified and streamlined framework designed for multi-instance segmentation under arbitrary instructions. We formulate instruction-driven instance segmentation as a set-structured query prediction problem and propose an explicit reasoning-to-instance query interface that elegantly bridges a vision-language model (VLM) and SAM3. Specifically, a bank of learnable instance queries is injected into the VLM and contextualized with instruction and visual info...

HuggingFace

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Reinforcement Learning has become a standard paradigm for aligning Large Language Models with human intent and task requirements. While Group Relative Policy Optimization offers an efficient, value-model-free alternative to Proximal Policy Optimization, adapting it to real-world multi-reward settings remains challenging. Standard scalarization practices, such as Reward Combination and Advantage Combination, suffer from significant drawbacks: Reward Combination frequently generates advantages wit...
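As a rough illustration of the two scalarization practices named above, the sketch below computes group-relative advantages both ways. It assumes that Reward Combination means weighted-summing the rewards before group normalization, and that Advantage Combination means normalizing each reward within the group before summing. The function names, weights, and sample rewards are hypothetical, and this is not the paper's DVAO method.

```python
from statistics import mean, pstdev

def group_normalize(values, eps=1e-8):
    """GRPO-style advantage: center and scale scores within one group of rollouts."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]

def reward_combination(rewards, weights):
    """Assumed form: weighted-sum the rewards per rollout, then normalize once."""
    combined = [sum(w * r for w, r in zip(weights, row)) for row in rewards]
    return group_normalize(combined)

def advantage_combination(rewards, weights):
    """Assumed form: normalize each reward column separately, then weighted-sum."""
    columns = list(zip(*rewards))
    per_reward = [group_normalize(list(col)) for col in columns]
    return [sum(w * adv[i] for w, adv in zip(weights, per_reward))
            for i in range(len(rewards))]

# Hypothetical group of 4 rollouts, each scored by 2 rewards.
rewards = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weights = [1.0, 1.0]

adv_rc = reward_combination(rewards, weights)
adv_ac = advantage_combination(rewards, weights)
print(adv_rc)
print(adv_ac)
```

In this toy group the two routes give the same ordering of rollouts but different advantage magnitudes, which is the kind of difference a multi-reward method has to account for.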

HuggingFace

Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Automated pavement distress assessment requires more than image-level classification or coarse bounding box detection, demanding precise localization of thin, branching, and irregular cracks to achieve the geometric precision necessary for maintenance-relevant quantification. This paper presents a vision-based pavement distress analysis system based on Mask R-CNN instance segmentation and evaluates it on UWGB-StreetCrack, a custom field-collected roadway image dataset acquired with a vehicle-mou...

HuggingFace

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Reinforcement learning with verifiable rewards (RLVR) has driven breakthroughs in domains such as math, tool use, and software engineering, yet its extension to computer-use agents (CUAs) has been bottlenecked by the scarcity of scalable training data with deterministic rewards. Constructing such data for CUAs requires a consistent task instruction, an executable environment, and a verifiable reward. However, hand-curated benchmarks achieve high reward fidelity but cover few applications and LLM-as-jud...

HuggingFace

Channel-wise Vector Quantization

We present Channel-wise Vector Quantization (CVQ), a novel image tokenization paradigm that replaces patch-wise tokens with channel-wise tokens. Unlike conventional vector quantization, which assigns a discrete token to each patch feature vector, CVQ quantizes each channel of the feature map. This formulation represents an image as discrete levels of visual details, rather than as a grid of spatial patches. Based on CVQ, we introduce a new visual autoregressive framework with "next-channel predi...
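To make the contrast concrete, the toy sketch below tokenizes one small feature map both ways using nearest-neighbor codebook lookup. The shapes, codebooks, and helper names are hypothetical and chosen only for illustration. It assumes "quantizes each channel" means matching each channel's flattened spatial map against a codebook, which may differ from the paper's actual design.

```python
def nearest(vector, codebook):
    """Return the index of the codebook entry closest to vector (squared L2)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def patchwise_tokens(fmap, codebook):
    """Conventional VQ: one token per spatial position, from its C-dim vector."""
    num_positions = len(fmap[0])
    return [nearest([channel[p] for channel in fmap], codebook)
            for p in range(num_positions)]

def channelwise_tokens(fmap, codebook):
    """Channel-wise sketch: one token per channel, from its flattened spatial map."""
    return [nearest(channel, codebook) for channel in fmap]

# Hypothetical feature map: C=2 channels, each a flattened 2x2 grid (4 positions).
fmap = [
    [0.9, 1.1, 1.0, 0.8],   # channel 0
    [0.1, 0.0, 0.2, 0.1],   # channel 1
]
patch_codebook = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]            # entries are C-dim
channel_codebook = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]  # entries are HW-dim

patch_ids = patchwise_tokens(fmap, patch_codebook)
channel_ids = channelwise_tokens(fmap, channel_codebook)
print(len(patch_ids), patch_ids)
print(len(channel_ids), channel_ids)
```

The patch-wise route yields one token per spatial position (four here), while the channel-wise route yields one token per channel (two here), so the token sequence length follows the channel count rather than the spatial grid.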

HuggingFace

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

Large language model agents are increasingly envisioned as always-on personal assistants with access to anything relevant in the user's digital world. Yet current systems operate over only narrow slices of that world, limiting context-sensitive reasoning and effective assistance. Existing benchmarks similarly provide only partial user state and therefore fail to capture performance in such a broad, always-on setting. To address this gap, we introduce Claw-Anything, a benchmark that expands agent...

HuggingFace

ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement

Existing deep learning-based low-light enhancement methods are typically trained on limited datasets with single enhancement targets, which restricts their generalization ability and controllability in real-world applications. To overcome these limitations, we propose ControlLight, a controllable, consistent, and generalizable framework for low-light enhancement. We first construct a large-scale dataset of real-world degraded images with continuous illumination-strength supervision. To further e...

HuggingFace

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout distribution, whereas practical teachers may expose only prompt-conditioned completed videos and may differ in architecture, capacity, temporal design, and sampling schedule. This interface makes supervised fine-tuning off-policy, score-based distillation inapplicable...

HuggingFace

SemBridge: Language Transfer in Sparse Encoders via Multilingual Semantic Bridges

Sparse encoders offer high-precision retrieval by representing term importance within a vocabulary space, yet their English-centric structures pose a critical impediment to language transfer for non-English languages. To overcome this structural limitation, we propose SemBridge, a novel embedding initialization method designed for cross-lingual adaptation in sparse encoders by leveraging multilingual bridge models. SemBridge establishes semantic alignments between source and target vocabularies ...

HuggingFace

MetaphorVU: Towards Metaphorical Video Understanding

Metaphorical videos are prevalent across various real-world scenarios to convey complex ideas, and understanding them typically requires high-order cognitive capabilities. The lack of systematic studies on metaphorical video understanding not only constrains the real-world applicability of MLLMs but also impedes the thorough assessment of their high-order cognitive capabilities. To bridge this gap, we propose MetaphorVU-Bench, the first systematic and comprehensive benchmark dedicated to metapho...

HuggingFace

Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion

Generating complete digital twins from videos requires precise camera control, global scene coverage, and strict spatial-temporal consistency constraints that remain challenging for perspective video generators due to their limited field of view (FoV). Their narrow FoV forces long or multi-view trajectories, amplifying cross-view inconsistency and temporal drift. We argue that 360° video generation offers a natural solution: panoramic coverage simplifies trajectory design and provides a strong ...

HuggingFace

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging. We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a two-stage framework that unifies distribution matching distillation with reward-guided reinforcement learning for few-step flow generators. We show that minimizing the KL divergence to a reward-tilted teacher distribution naturally decomposes into a distribution matc...

HuggingFace

WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation

Interactive world models are advancing rapidly, yet existing benchmarks cover only part of the required competencies, leaving no unified standard for systematic evaluation. To fill this gap, we introduce WBench, a comprehensive multi-turn benchmark for interactive world model evaluation along five dimensions, namely video quality, setting adherence, interaction adherence, consistency, and physics compliance. WBench contains 289 test cases and 1,058 interaction turns, where each case specifies a ...

HuggingFace

Industry News

Norway's 2 petabytes of Huawei flash storage and LLM training

Norway has deployed 2 petabytes of Huawei flash storage infrastructure for large language model training operations. This significant data storage capacity represents substantial investment in computational resources for AI development.

RSS

Microsoft Copilot Cowork Exfiltrates Files

Microsoft Copilot Cowork has been found to exfiltrate files, raising serious security and privacy concerns for users. The vulnerability allows unauthorized data extraction, highlighting risks in AI-assisted development tools.

RSS

Uber president says AI spending is getting 'harder to justify'

Uber's president said the company's substantial AI spending is becoming harder to justify in terms of business returns. The remark adds to growing skepticism among major tech companies about the near-term economic value of large-scale AI investments.

RSS

Incident with Actions and Pages

An incident affecting Actions and Pages has been reported, disrupting user functionality in both services. The report points to system disruptions or security concerns that require investigation and remediation.

RSS

Discussion

Using AI to write better code more slowly

Recent research suggests that using AI assistance for code writing can improve quality when developers take time to review and refine generated code rather than deploying it immediately. This slower, more deliberate approach yields better long-term software outcomes.

RSS

Outsourcing plus local AI will soon become more economical vs. frontier labs

Combining outsourced AI services with locally deployed models is becoming more cost-effective than relying solely on expensive frontier AI labs. This shift could make AI adoption more accessible to organizations of various sizes.

RSS