
  • Zed
  • LLM

Zed makes the case for local AI models in its editor

Zed has published a new post arguing that local AI delivers stronger privacy guarantees, steadier costs, and less reliance on cloud policy changes. It says local model usage in Zed’s agent has tripled in 10 weeks, with setup tips for LM Studio, Ollama, and llama.cpp.
  • Kiro

Kiro launches web-based interface in preview

Kiro has just rolled out Kiro Web in preview for paid subscribers, letting teams write code across repos and open pull requests from a browser. It adds GitHub issue/PR workflows, steering files for conventions, and isolated cloud sandboxes. Web credits are 50% off until May 29, 2026.

Kilo Code v7 regains control with human-in-the-loop updates

Brendan O’Leary details how Kilo Code is bringing developer oversight back after its v7 VS Code rewrite. Recent updates keep reasoning visible, surface diffs before approval, unify the Changes panel, and improve shell output, permissions, and checkpoint restores.
  • Devin

Cognition adds Devin Auto-Triage to investigate alerts and open PRs

Cognition has just rolled out Devin Auto-Triage, bringing automated investigation to incoming alerts across Slack, GitHub, Linear, Sentry, Datadog, and more. It can summarize findings, route to the right owner, or even open a PR, with sandboxing to handle untrusted inputs.

Introducing the Augmenter Newsletter

Get a curated digest of AI developer news, tutorials, and tools — delivered to your inbox. Designed for developers who want concise, useful updates.


News and Insights on Agentic Coding, Vibe Coding and more

Augmenter is a human-curated collection of AI news, insights, and resources for developers. Content is written with AI, reviewed by humans, and designed to keep you up to date as technology moves forward.

Latest Articles

  • Claude

Anthropic lays out best practices for Claude Code at scale

Anthropic has published a new guide detailing how Claude Code performs in huge monorepos and legacy systems—and why results hinge as much on setup as on the model. It breaks down the “harness” teams use to keep navigation, context, and governance on track.
  • Cline

Cline SDK rebuilds agent runtime for cross-IDE sessions

Cline has just rolled out Cline SDK, rebuilding its agent runtime into a standalone, pluggable package that can run as a shared service across VS Code, JetBrains, and the CLI. The release adds plugins, teams/subagents, broader model providers, and fresh benchmark claims.

DeepSeek V4 Pro and Flash benchmarked against Claude Opus

A recent benchmark write-up by Darko at Kilo Blog tests DeepSeek V4 Pro and Flash against Claude Opus 4.7 and Kimi K2.6 using a heavier FlowGraph workflow. It digs into where each model shines, stumbles, and how pricing shifts the value equation.

Google launches Gemini 3.5 Flash with 4x faster coding

Google has just rolled out Gemini 3.5 Flash, touting "frontier-level" agentic and coding performance at 4x the speed and often under half the cost. It’s available now in the Gemini app, Search AI Mode, and developer tools, with mixed early reactions.

Andrej Karpathy joins Anthropic to return to R&D

Andrej Karpathy announced on X that he’s joined Anthropic, calling the next few years at the LLM frontier “especially formative.” The news sparked a wave of welcomes as the AI community weighed the talent shift.

Featured Videos

Deep dive videos for AI developers


Master Coding Agents Like a Pro (Anthropic’s Ultimate Playbook)

The opening of the talk defines “vibe coding” as more than just using AI to help write code. The speaker argues that true vibe coding means letting the model handle the implementation to the point that you “forget the code exists,” while you focus on the outcome. He explains why this matters: as AI systems get better, they will be able to handle larger and larger chunks of work, making it unrealistic for humans to stay in a tight line-by-line review loop forever.

He then frames the core challenge as how to use this approach safely in production. His answer is that engineers should stop obsessing over every implementation detail, but still stay accountable for the product’s behavior and quality. He compares this to managers or executives overseeing work they cannot personally execute in full detail: they succeed by verifying outcomes, requirements, and checkpoints rather than inspecting everything directly.

A key caveat in this early section is tech debt. He says that unlike product behavior, tech debt is still hard to validate without actually understanding the code. Because of that, he recommends using vibe coding mainly on leaf nodes of a codebase, meaning isolated features where problems are less likely to spread into the core architecture.

13:25

Ralph: Autonomous Coding Loops for Claude

Autonomous coding loops can move fast, but without visibility and control they can become hard to trust (and easy to run too long). This video walks through how Ralph Loop and the Ralph TUI add structure to long-running agent workflows, so you can track progress and intervene when needed.

Key takeaways
  • Covers what Ralph Loop is and how continuous iteration differs from a single-pass run in Claude Code.
  • Breaks down why a task tracker and TUI matter as projects grow, including live task status and output streaming.
  • Walks through setup: choosing a tracker (e.g., a local PRD JSON file), selecting an agent (Claude Code or OpenCode), and setting iteration limits.
  • Demonstrates generating a PRD, turning it into a task list, and running sub-agents with pause/resume and session persistence.
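
As a rough illustration of the loop pattern this video describes (not Ralph's actual implementation), here is a minimal Python sketch of continuous iteration over a PRD-style JSON task list with an iteration cap. The task schema, the `run_loop` function, and the `fake_agent` stub are all hypothetical; a real setup would call a coding agent where the stub is.

```python
import json
from typing import Callable


def run_loop(tasks: list[dict], run_agent: Callable[[dict], bool], max_iterations: int) -> int:
    """Keep handing the first unfinished task to the agent.

    Stops when every task is done or the iteration cap is reached,
    and returns the number of iterations used.
    """
    iterations = 0
    while iterations < max_iterations:
        pending = [task for task in tasks if not task["done"]]
        if not pending:
            break  # nothing left to do
        iterations += 1
        # run_agent is a stand-in for a coding-agent call (assumption);
        # it returns True when the task is considered complete.
        pending[0]["done"] = run_agent(pending[0])
    return iterations


# Hypothetical tracker contents, standing in for a local PRD JSON file.
prd = json.loads(
    '[{"id": 1, "title": "Add login form", "done": false},'
    ' {"id": 2, "title": "Write tests", "done": false}]'
)


def fake_agent(task: dict) -> bool:
    """Stub agent that always succeeds."""
    return True


used = run_loop(prd, fake_agent, max_iterations=5)
```

The cap matters because an agent that never completes a task would otherwise loop forever: with a stub that always fails, `run_loop` stops after exactly `max_iterations` attempts.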

14:45

OpenSource Kimi K2.5 just dropped

Open-source weights are back, but for professionals the real question is whether the latest drop meaningfully improves day-to-day coding, vision work, and agent workflows. This video walks through what Kimi K2.5 claims to deliver, where it benchmarks well, and what it looks like in hands-on demos.

  • Breaks down Kimi K2.5’s focus areas: coding, vision tasks, and “self-directed” agent swarms
  • Covers benchmark results across agentic, coding, and vision/video evaluations, plus cost vs. performance claims
  • Shows practical examples like generating front-end websites and recreating a site from screenshots (no code provided)
  • Demonstrates tool-using behavior, including a web-based price comparison and discussion of local runtime/VRAM needs

25:28

From Vibe Coding To Vibe Engineering

Frontend teams have always ridden hype cycles, but LLMs change the day-to-day work: you can “accept” code fast, and just as quickly land in the wrong abstraction. This talk reframes “vibe coding” into “vibe engineering,” focusing on how professionals can collaborate with AI without losing control of quality, context, and maintainability.

  • Breaks down what “vibe coding” means in practice and why the definition keeps shifting
  • Contrasts hands-off prompting with “vibe engineering” using agents, plus why you should stay skeptical of generated code
  • Shares tactics the speaker uses (e.g., voice-to-code, starting from solid primitives, and supplying rules/docs/memory)
  • Covers when vibing is appropriate (one-off scripts, simple features) and when it’s risky for teams and juniors

17:44

Researchers solved the Context Window Limit

Context windows cap what you can reliably ask an LLM to reason over, and as inputs grow, “context rot” can make quality drop fast. This video breaks down an MIT paper proposing recursive language models: a way to process arbitrarily long prompts at inference time without changing the core model.

Key takeaways
  • Covers why stuffing more tokens into a prompt can degrade retrieval and reasoning, even before hitting the physical limit.
  • Walks through the RLM setup: storing the long prompt in a Python/REPL environment and giving the model tools to search it.
  • Explains the “recursive” step: re-querying relevant sections to go deeper without summarization or compression.
  • Reviews how the approach is evaluated on long-context tasks (e.g., BrowseComp+, Oolong, code repository understanding) and what tradeoffs show up in cost variance.
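
To make the setup more concrete, here is a toy Python sketch of the idea as the video summary describes it: the long prompt lives in a variable instead of the model's context, a search tool pulls out relevant sections, and each section is re-queried on its own. This is not the paper's code. `search`, `recursive_answer`, and `fake_llm` are hypothetical names, the "model" is a regex stub, and the search keyword is hard-coded here, whereas in the described setup the model itself decides what to look for.

```python
import re
from typing import Callable


def search(document: str, keyword: str, window: int = 80) -> list[str]:
    """Return a snippet around each case-insensitive occurrence of keyword."""
    snippets = []
    lower_doc, lower_kw = document.lower(), keyword.lower()
    start = lower_doc.find(lower_kw)
    while start != -1:
        lo = max(0, start - window)
        hi = min(len(document), start + len(keyword) + window)
        snippets.append(document[lo:hi])
        start = lower_doc.find(lower_kw, start + 1)
    return snippets


def recursive_answer(document: str, question: str, keyword: str,
                     llm: Callable[[str], str], max_chars: int = 200,
                     depth: int = 0, max_depth: int = 3) -> str:
    """Answer a question over a document too long to send to llm in one call."""
    # Base case: the text fits, or we hit the recursion cap and truncate.
    if len(document) <= max_chars or depth >= max_depth:
        return llm(f"Context: {document[:max_chars]}\nQuestion: {question}")
    snippets = search(document, keyword)
    if not snippets:
        return "not found"
    # Recursive step: re-query each relevant section separately.
    partials = [
        recursive_answer(s, question, keyword, llm, max_chars, depth + 1, max_depth)
        for s in snippets
    ]
    return llm("Partial answers: " + " | ".join(partials) + f"\nQuestion: {question}")


def fake_llm(prompt: str) -> str:
    """Stub standing in for a model call (assumption): extracts one known pattern."""
    match = re.search(r"launch code is (\d+)", prompt)
    return f"launch code is {match.group(1)}" if match else "unknown"


long_doc = "filler " * 500 + "The launch code is 4217. " + "filler " * 500
answer = recursive_answer(long_doc, "What is the launch code?", "launch code", fake_llm)
```

The stub never sees more than a couple of hundred characters per call, even though `long_doc` is several thousand characters long, which is the property the approach is after.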
