Cainew - Curated AI news for developers

Tools & Products

opensquilla/opensquilla

OpenSquilla — Token-Efficient AI Agent with same budget, higher intelligence density

GitHub

study8677/awesome-architecture

🧭 Architecture-first system design: 26 bilingual tutorials, 25 architecture templates, and 6 end-to-end cases covering distributed systems, AI-native systems, RAG, coding Agents, and production trade-offs.

GitHub

Soul-AILab/SoulX-Transcriber

An end-to-end framework for multi-speaker transcription that jointly models who spoke, when, and what.

GitHub

huawei-csl/KVarN

KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.

GitHub

Liu-Ming-Yu/alpha-forge

Alpha Forge — an agentic AI operating system for systematic trading.

GitHub

AtomFlow-AI/MoleCode

MoleCode presents molecules as code and enables LLMs to operate on and reason about chemistry directly.

GitHub

vuejs-ai/vue-tui

The Vue framework for terminal UIs. SFC & JSX, Yoga flexbox, HMR, and testing out of the box.

GitHub

2417467487-hub/WorldCupROI

Sports sponsorship intelligence platform for World Cup match data, real-source text signals, ROI prediction, uncertainty analysis, and scenario recommendations.

GitHub

Empromptu AI: Train Fine Tuned Models With AI Apps You're Already Building

Most AI apps launch on someone else’s model and stay there forever. Empromptu AI turns live AI features into custom models you own. As your app runs, Empromptu AI captures real-world usage, human corrections, and edge cases from live AI workflows, then uses that signal to train a custom model you own. Improve accuracy, lower inference costs, and stop depending forever on rented intelligence from the same providers moving into your category.

ProductHunt

LastPrincipal/machine-learning-library-684

A hand-curated, topic-organized library of the best ML education — 923 docs (391 arXiv papers, 474 Stanford/MIT/Karpathy/fast.ai lectures, 58 explainer articles), normalized to Markdown with full provenance. Open it in Obsidian or point your agent at it. A clean ML corpus for learning, RAG & fine-tuning.

GitHub

voidduckcalm/machine-learning-library-784

GitHub

Retrodraseparator/vigils-734

A local control plane for AI agents — see what they do, approve what matters, keep secrets out. Rust + Tauri + Chrome MV3.

GitHub

VergeWarlord/KeyType-134

An open-source Cotypist with system-wide AI autocomplete for macOS.

GitHub

Build Club Campus: Virtual AI School: Upskill in AI and Become Great at it Fast

Build Club Campus is a fun, gamified and community-driven virtual AI school for learning AI by building with it. Static courses get old fast in AI, so Campus helps you stay current through bite-sized courses, real projects, role-based use cases and community templates that evolve with the tools. Earn certifications in OpenAI, Claude, Copilot and more, and become great at AI fast - for work, startups or side hustles. It’s 100% free, as part of our mission to enable anyone to build with AI.

ProductHunt

AppWizzy: Rent a private VM with Codex to build production apps

AppWizzy gives you a private VM with Codex installed where you build, run, and host production web apps by chatting with AI. Your code is yours, the workspace persists, and the app lives in the same environment where it was created. Pay only for AI usage, hosting days, and optional templates.

ProductHunt

Research Papers

The ways we contain Claude across products

This piece explains Anthropic's safety and containment strategies for deploying Claude across different products. It details the technical and operational measures implemented to ensure Claude operates within intended boundaries.

Anthropic

Gaussian Point Splatting

Gaussian Point Splatting is an advanced rendering technique that uses point-based representations for efficient 3D scene visualization and synthesis. The method enables faster real-time rendering than traditional approaches while maintaining high visual quality.

RSS

When AI Builds Itself: Our progress toward recursive self-improvement

The article discusses recent progress toward recursive self-improvement in AI systems, where AI models can autonomously enhance their own capabilities. It explores the technical challenges and implications of systems that can iteratively improve themselves.

Anthropic

Deep Embedded Multiplicative DMD for Algebra-Preserving Koopman Learning

Koopman theory turns nonlinear dynamics into a linear spectral problem. In computation, however, everything depends on a hard finite-dimensional choice: the observables must be expressive, nearly invariant under the dynamics, and, ideally, compatible with composition. Deep Koopman methods learn flexible coordinates, whereas structure-preserving methods enforce operator identities on fixed dictionaries. We combine these ideas by introducing Deep Embedded Multiplicative Dynamic Mode Decomposition ...

HuggingFace
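
The Koopman idea in the abstract above (nonlinear dynamics becoming linear once suitable observables are chosen) can be illustrated with a small textbook-style toy system. This is only a sketch, not the paper's method: the map, the constants `A`, `B`, `C`, and the helper names are all hypothetical. The lifted coordinate `x1 * x1` is a product of observables, which loosely hints at why closure under multiplication is useful.

```python
# Hypothetical toy map (an illustrative assumption, not taken from the paper):
#   x1 -> A * x1
#   x2 -> B * x2 + C * x1**2
A, B, C = 0.9, 0.5, 0.3


def step(x1, x2):
    """One step of the nonlinear map."""
    return A * x1, B * x2 + C * x1 ** 2


def lift(x1, x2):
    """Observables (x1, x2, x1^2); the last one is the product x1 * x1."""
    return [x1, x2, x1 * x1]


# Finite matrix that advances the lifted coordinates linearly.
K = [
    [A, 0.0, 0.0],
    [0.0, B, C],
    [0.0, 0.0, A * A],
]


def apply(matrix, vec):
    """Plain matrix-vector product on Python lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def max_lift_error(x1, x2, steps=5):
    """Largest gap between lifting the nonlinear trajectory and iterating K."""
    z = lift(x1, x2)
    worst = 0.0
    for _ in range(steps):
        x1, x2 = step(x1, x2)
        z = apply(K, z)
        worst = max(worst, max(abs(p - q) for p, q in zip(lift(x1, x2), z)))
    return worst
```

For this particular map the three observables are exactly invariant, so `max_lift_error(1.0, -2.0)` stays at floating-point rounding level. In general no such finite set is known in advance, which is the hard choice the abstract refers to.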

GRAIL: Gradient-Reweighted Advantages for Reinforcement Learning with Verifiable Rewards

Reinforcement learning with verifiable rewards (e.g. GRPO) is now a common way to improve mathematical reasoning in Large Language Models (LLMs). However, current methods usually broadcast one sequence-level advantage to all tokens, or use costly process reward models (PRMs) for step-level supervision. Uniform advantage distribution assumes that all tokens contribute equally to the final reward. This dilutes the gradient signal, since flawed reasoning steps and filler words are updated as strong...

HuggingFace
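
The uniform broadcast that the GRAIL abstract criticizes can be sketched in a few lines. This is a minimal illustration of a GRPO-style baseline, not the paper's proposed reweighting: the function names are hypothetical, and using the population standard deviation with a small `eps` is an assumption.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each sampled response's reward against its group.

    Assumption: population standard deviation plus a small eps for stability.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_tokens(advantages, token_counts):
    """Copy one sequence-level advantage onto every token of that sequence."""
    return [[a] * n for a, n in zip(advantages, token_counts)]


# Four sampled answers to one prompt, scored 1 (verified correct) or 0.
rewards = [1.0, 0.0, 0.0, 1.0]
token_counts = [5, 3, 7, 4]  # hypothetical response lengths in tokens
advantages = group_relative_advantages(rewards)
per_token = broadcast_to_tokens(advantages, token_counts)
```

Every token in a response receives the same value, so a filler word and a decisive reasoning step get identical credit. That is the dilution the abstract describes.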

Echo-Infinity: Learning Evolving Memory for Real-Time Infinite Video Generation

We present Echo Infinity, an autoregressive (AR) framework towards real-time infinite video generation that employs a learnable evolving memory to dynamically filter, abstract, and compress any-length history at constant cost. Existing methods mainly curate memory with predefined KV-cache schedules, fixed-ratio heuristic compression, or inference-time RoPE adaptation. These designs inevitably lose historical information and amplify compounding errors due to their limited cache window and ignoran...

HuggingFace

GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors

Scaling humanoid loco-manipulation requires robot-compatible demonstrations across diverse objects, whole-body motions, and scene geometries, but teleoperation and motion capture are difficult to scale because each collection depends on physical setups, instrumented actors, and robot operation. We present GRAIL, a digital generation pipeline that remains fully virtual until deployment: it composes 3D assets, simulator-ready scenes, and priors from video foundation models (VFMs) to synthesize int...

HuggingFace

Probing Outcome-Level Resemblance and Mechanism-Level Alignment in LLM Risk Decisions: Evidence from the St. Petersburg Game

LLMs can appear cautious in risk decision-making tasks, yet cautious-looking outputs do not necessarily indicate alignment with human decision-making mechanisms. We investigate this distinction using the St. Petersburg game as a controlled testbed, a classical paradox in which the expected payoff is infinite, yet humans typically report low, finite willingness to pay. We evaluate 28 LLMs with a structured prompt suite that includes the original game; controlled decision variants that perturb tru...

HuggingFace
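
The infinite expected payoff mentioned in the St. Petersburg abstract follows from simple arithmetic, sketched below. The payoff convention is an assumption (a fair coin is tossed until the first head, and a head on toss k pays 2^k), and the capped variant and function name are hypothetical.

```python
from fractions import Fraction


def truncated_expected_payoff(max_tosses):
    """Exact expected payoff when the game is capped at max_tosses tosses.

    Assumed convention: first head on toss k has probability 1/2**k and pays
    2**k; a run with no head within the cap pays nothing.
    """
    return sum(Fraction(1, 2 ** k) * 2 ** k for k in range(1, max_tosses + 1))


# Each toss contributes exactly 1 to the expectation, so the capped value
# equals the cap and grows without bound as the cap is lifted.
values = [truncated_expected_payoff(n) for n in (1, 10, 40)]
```

Under this convention `values` equals `[1, 10, 40]`, which shows the expectation diverging linearly in the cap, while the abstract notes that humans typically report a low, finite willingness to pay.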

Audio Interaction Model

Audio is an inherently interactive modality, yet today's Large Audio Language Models (LALMs) are offline, and streaming audio models each handle only a single task such as streaming ASR or voice chatting. It is time to unify them into one online LALM: a model that, through an always-on perceive-decide-respond loop, listens to sound, environment, and instructions in real time and reacts on the fly. We formalize this regime as the Audio Interaction Model, and realize it with Audio-Interaction, a u...

HuggingFace

ZipSplat: Fewer Gaussians, Better Splats

Feed-forward 3D Gaussian Splatting methods reconstruct a scene from posed or pose-free images in a single forward pass, yet current approaches predict one Gaussian per input pixel, tying the representation budget to camera resolution rather than scene complexity. A flat wall and a richly textured object thus produce equally many Gaussians despite very different geometric needs. We propose ZipSplat, a token-based feed-forward model that decouples Gaussian placement from the pixel grid. A multi-vi...

HuggingFace

MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation

Autoregressive mesh generation has gained attention by tokenizing meshes into sequences and training models in a language-modeling fashion. However, existing approaches suffer from two fundamental limitations: (i) low tokenization efficiency, which yields long token sequences and prevents scaling to high-poly meshes, and (ii) absence of geometry-aware guidance, as generation is conditioned only on global shape embeddings rather than local surface cues. We introduce MeshWeaver, an autoregressive ...

HuggingFace

MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation

Lane-level maps are critical infrastructure for autonomous driving and lane-level navigation, yet constructing and maintaining standardized lane networks for hundreds of cities remains highly labor-intensive. Recent end-to-end vectorized mapping methods can predict lane geometry and topology directly from sensor data, but they typically treat mapping specifications and traffic regulations as implicit, dataset-dependent supervision. Moreover, in complex scenes (e.g., worn or missing markings and ...

HuggingFace

Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases

Large language models (LLMs) are increasingly proposed as clinical agents, yet static, single-turn benchmarks cannot capture how a model dynamically delivers care across an encounter: gathering information, planning treatment, and adapting longitudinal management across successive patient states. Medical education has long addressed an analogous challenge through standardized patients (SPs): trained actors who consistently portray clinical cases, enabling realistic practice and objective, script...

HuggingFace

STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

Training Data Attribution (TDA) seeks to trace a model's predictions back to its training data. The gold standard for TDA relies on causal interventions, observing how a model changes when data is added or removed, but repeated retraining is computationally challenging for Large Language Models (LLMs). Consequently, most approaches approximate this effect in the parameter space using gradients. However, tracking gradients across billions of parameters is not only prohibitively expensive but reli...

HuggingFace

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Rubric-based reinforcement learning (RL) uses an LLM-as-a-Judge (LaaJ) to score model outputs according to rubrics as rewards. However, policy models may exploit latent biases in the judge, leading to reward hacking and ineffective or unsafe training outcomes. In real-world rubric-based RL, such hacking behaviors are often subtle and entangled with multiple judge biases, making them difficult to analyze, detect, and mitigate. In this paper, we introduce CHERRL, a controllable hacking environment...

HuggingFace

Discussion

They’re made out of weights

This article explores the fundamental nature of neural networks, examining how weights form the core computational basis of AI models. The piece likely discusses how these numerical parameters encode learned representations and drive model behavior.

RSS

I built a vulnerable app and spent $1,500 seeing if LLMs could hack it

A developer conducted a $1,500 experiment to assess whether large language models could successfully identify and exploit vulnerabilities in a deliberately vulnerable application. The study provides insights into the security capabilities and limitations of current LLMs.

RSS