Claude Fable behaves proactively in its interactions, taking initiative rather than only responding to user prompts. This sets it apart in engagement and helpfulness.
Model Releases
Kimi K2.7-Code is an open-source coding model that demonstrates improved token efficiency compared to existing alternatives. The model aims to make code generation more accessible and cost-effective.
Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF is a quantized large language model variant optimized for coding tasks, available in GGUF format for efficient deployment.
Tools & Products
Makes your AI agent think like the laziest senior dev in the room. The best code is the code you never wrote.
KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.
An open-source Cotypist with system-wide AI autocomplete for macOS.
macOS menu bar app pinning Claude Code & Codex quota, tokens, and cost to your screen. Local-only, zero API calls. HTML reports, 9 themes.
Code Modern. Code Legacy. Code Firmware. - open-source AI-native IDE with agentic coding, Power Mode, legacy modernization, and firmware development
The Vue framework for terminal UIs. SFC & JSX, Yoga flexbox, HMR, and testing out of the box.
[ICML 2026] Official implementation of Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
Turn any document or a whole zip into an interactive knowledge graph, using a self-hosted Qwen3.6-35B-A3B-MTP on a single NVIDIA L4
Apple Watch notifications for Claude Code and AI agent workflows.
Community model zoo + knowledge base for Apple Core AI (iOS/macOS 27): Qwen3.5 & Gemma 4 converted end-to-end, verified on-device (iPhone 17 Pro GPU/ANE), conversion gotchas, custom Metal kernels, Swift runner
I kept wasting AI tokens describing UI changes to agents that edited the wrong element. So I built Qursor. Point at any element, copy structured context (selectors, classes, styles, fonts, colors), paste into your AI agent. No vague screenshots. No burned credits. Features: inspect fonts, colors, and spacing; copy AI-ready element context; extract components as HTML/CSS/JSX; color picker and font detector; download assets from any page.
Pond is the market infrastructure for the new startup economy. Verified founders raise capital, acquire customers, and access 20,000+ contributors in one place. · Markets: Stripe-verified metrics + Vault-protected funding. One startup raised $150K in 3 minutes. · Bounties: 10,000+ submissions, $36K+ distributed. Ethereum Foundation, GPTZero, PhotoBase. · AI Growth Agent: CRM, pipeline, and GTM on autopilot. 400+ startups · 154 countries · Archetype & Coinbase Ventures.
One voice conversation. Free financial plan. We built Warren because financial planning was broken for anyone without a six-figure portfolio. IFAs charge £200/hr. Spreadsheets go stale. Generic apps tell you what you already know. Warren shows you two futures: what happens if you do nothing and what changes if you act. Then Warren gives you a set of next steps, tracks your progress, and monitors your plan against economic changes. Join 3K+ Brits already using Warren. 10 mins to start.
Bob's CLI runs on your own hardware with zero API costs, zero data leaving your machine. Bob lives in your terminal, sees your actual files, and writes code only with your explicit approval. What makes it different: auto-detect local AI models, behavioral DNA profiling that adapts to how YOU work, autonomous code review + auto-fix, conversation forking, deep dives, and SovereignLink — remote execution from any device while your code stays home. Free to start. Sovereign by design.
Basedash for Slack is your AI data analyst inside Slack — now in the official Slack Marketplace. Mention @Basedash in any channel and it queries your real data sources, thinks in the thread, and replies with an answer and a chart, right where your team is talking. Automations deliver scheduled reports to your channels, and insights surface anomalies automatically — charts included. Ask in Slack. Answered by your data.
Research Papers
Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representation space. In this paper, we present HYDRA-X, the first UMM that unifies image and video tokenization within a single Vision Transformer (ViT). Our design is driven by two core challenges: efficiently injecting spatiotemporal reconstruction capability into a native ViT, and embedding image- and video-level semantic awareness into the latent space. To address ...
Geometry is invariant to viewpoint, which makes any collection of images a redundant encoding of a single 3D state. Existing feed-forward reconstruction models fail to exploit this: per-view methods emit overlapping, unaligned pointmaps that grow linearly with input count, while global-latent methods commit to a fixed, low-resolution output. We introduce Surflo, which compresses a variable number of unposed RGB views into K latent tokens (one global state) and decodes oriented 3D surface points by...
Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundreds of thousands to millions of tokens, yet the quadratic cost of softmax attention makes this untenable at deployment scale. We introduce MiniMax Sparse Attention (MSA), a blockwise sparse attention built upon Grouped Query Attention (GQA). A lightweight Index Branch scores key-value blocks and inde...
Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of doing science remains largely outside their reach. AI can help read literature, generate hypotheses, and plan protocols, yet the execution of those protocols at the bench still requires a human operator. Vision-Language-Action (VLA) models provide one possible interface between written protocols and robot execution, but existing policies are trained mostly on household and tabletop demons...
The potential impacts of world models (WMs, i.e., learned simulators) on robotics are far-reaching -- policy evaluation, policy improvement, and test-time planning -- all with limited real-world interaction. To unlock these downstream capabilities, a WM needs to jointly satisfy three desiderata: (i) fidelity (i.e., producing simulated trajectories that correlate with reality), (ii) consistency (i.e., producing simulated trajectories that are coherent over long horizons), and (iii) efficiency (i....
Multimodal Large Language Models (MLLMs) have shown promising reasoning capabilities in general domains, yet their performance remains limited in specialized settings such as healthcare, especially in multilingual and low-resource scenarios. This gap is critical in regions like rural India, where patients often express complex medical queries in native Indic languages and rely on multimodal inputs such as medical images. Existing English-centric MLLMs struggle to support such use cases, limiting...
Latent chain-of-thought compresses reasoning by replacing visible reasoning traces with continuous hidden-state recurrence, but existing formulations are difficult to optimize with standard on-policy reinforcement learning (RL) and hard to interpret causally. Our key insight is that a single pair of explicit boundary tokens can address both issues at once: discrete entry and exit anchors make the latent block compatible with standard on-policy RL, and the same anchors offer a natural foothold fo...
Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dynamic, requiring agents to continually align their knowledge, skills, and behavior with changing environments and updated task conditions. To address this gap, we introduce EvoArena, a benchmark suite that models environment changes as sequences of progressive updates across terminal, software, and so...
LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific solutions, and have produced results that outperform human-designed approaches. As model capabilities continue to improve, we argue that the bottleneck for autonomous scientific discovery is shifting from prescribing agent workflows to designing agent environments: the resources, constraints, and interfac...
Interactive LLM agents are becoming part of daily work, but they do not reliably become easier to work with over time: a correction remembered in one session may still be violated in the next. We study this gap between preference access and preference compliance. In tasks derived from anonymized real-user friction cases, Mem0 memory still leaves 57.5% of applicable preference checks violated. We introduce Test-time Rule Acquisition and Compiled Enforcement (TRACE), a drop-in skill-layer pipeline...
We present MoVerse, a real-time video world model that creates an interactively navigable scene from a single narrow-field-of-view image. This setting is challenging because the input observes only a small fraction of the environment, while interactive roaming requires a complete surrounding world, persistent geometry, controllable camera motion, and temporally coherent high-fidelity observations. MoVerse addresses this problem by separating world construction from observation rendering. It firs...
Large language models are increasingly deployed as agents for long-horizon tasks, yet their performance is shaped not only by model capability and environment design, but also by the harness that mediates agent-environment interaction. Existing harnesses are largely manually engineered, making them difficult to scale as trajectories grow longer and interactions become more complex. In this work, we ask whether a harness can be generated by a learnable plug-in module that can be trained in an end-...
Search Agents -- large language models augmented with search tools -- have intensified the need for future-proof evaluation benchmarks. Existing benchmarks such as BrowseComp rely on static knowledge, making them vulnerable to test-set contamination and parametric memorization. Consequently, models can achieve high scores through fact recall rather than genuine retrieval, obscuring true browsing competence via reasoning shortcuts. In this paper, we introduce EvoBrowseComp, an evolving benchmar...
This work presents RepWAM, a representation-centric world action model (WAM) built on representation visual-action tokenizers. Existing WAMs typically inherit reconstruction-oriented video tokenizers from pretrained video generation models. Although these tokenizers preserve visual fidelity, pixel reconstruction alone provides limited guidance for learning instruction-following dynamics that connect future prediction with robot control. To address this, we explore a semantic visual-action latent...
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they cannot achieve interleaved generation (text-image sequence), which has crucial applications in visual narratives, guidance, and embodied manipulation. Even the latest open-source Unified Multimodal Models (UMMs) exhibit limited performance in this regard. In this paper, we introduce InterleaveThinker, th...
Tutorials
The article explores using computer vision to automate Instagram engagement but warns that such automation violates the platform's terms of service and results in account bans. It serves as a cautionary tale about the consequences of automation on social media.
Industry News
A departing Google employee announces their exit, criticizing Google management for losing its ethical direction and moral principles. The farewell post reflects broader concerns about corporate values in the tech industry.
Tesla's Full Self-Driving feature was shown using bicycle lanes in Denmark's official approval demonstration video. This raises concerns about the system's lane recognition and safety in real-world conditions.
TCS and Anthropic are partnering to bring Claude to regulated industries.
Discussion
An AI agent attempting to scan the DN42 network caused significant financial losses to its operator through unexpectedly high resource consumption. The incident highlights the risks of deploying autonomous agents without proper safeguards.
This piece discusses the proliferation of AI-generated or low-effort 'slop' applications built with Tailwind CSS. It examines the impact of easy app-building tools on software quality and market saturation.
The author created an AI-powered nuclear simulation game to explore strategic decision-making and conflict scenarios. This interactive experience demonstrates how AI can be used for educational simulation and game design.