
Feb 24: Anthropic accuses DeepSeek, Moonshot AI, and MiniMax of "industrial-scale distillation attacks"

Source: news.smol.ai

TL;DR: Anthropic accuses DeepSeek, Moonshot AI, and MiniMax of “industrial-scale distillation attacks”

Major Highlights:

  • Anthropic alleges mass API distillation of Claude

    • Claims DeepSeek, Moonshot AI, and MiniMax operated ~24,000 fraudulent accounts to generate over 16 million Claude exchanges, allegedly to extract capabilities and safety behavior for training their own models.
    • Frames the risk as both competitive (capability transfer including tool/agent behaviors) and geopolitical/safety (safeguard removal, potential downstream military/intelligence use).
    • Sparks industry-wide debate over whether large-scale API output harvesting is meaningfully different from web scraping used to train frontier models.
  • Security model shifts from “weights secrecy” to “API abuse resistance”

    • Conversation crystallizes a new moat: fraud detection, rate-limit hardening, device/IP fingerprinting, watermarking/behavioral fingerprints, and policy enforcement at API scale.
    • Raises the question of whether export controls can be effective if capabilities can be replicated via output distillation at scale. Timing coincides with expected DeepSeek V4 news and U.S.–China framing.
  • Coding agents: rapid adoption, visible failure modes, emerging playbooks

    • Real-world anecdotes show both productivity gains (Codex/Claude Code) and high-stakes failures (instruction loss leading to destructive actions like mass email deletion).
    • Simon Willison publishes “Agentic Engineering Patterns” for coding agents; micro-controversy warns against over-customized CLAUDE.md/AGENTS.md cargo-culting.
  • OpenClaw ecosystem expands; enterprise leans into evals/observability

    • NanoClaw (qwibitai/nanoclaw) offers a lighter, container-isolated OpenClaw-style assistant with WhatsApp I/O, swarms, and schedulers; Ollama 0.17 improves open-model integration.
    • monday.com and Exa emphasize observability (tokens, caching) and making evals “Day 0”; monday.com reports 8.7× faster feedback loops using LangSmith.
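The "API abuse resistance" moat described above starts with basics like per-account rate limiting. A minimal token-bucket limiter sketch (illustrative only; class and parameter names are my assumptions, not any provider's actual implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: each account may burst up to
    `capacity` requests, and tokens refill at `rate` per second."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def allow(self, now=None):
        """Return True if one request may proceed, consuming a token."""
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In practice this is only the first layer; detecting ~24,000 coordinated accounts requires cross-account signals (shared devices, IPs, prompt patterns) that no per-account limiter can see.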

Key Technical Details:

  • Alleged distillation scale: ~24,000 accounts, >16M Claude exchanges.
  • OpenAI Responses API adds WebSockets for long-running, tool-heavy agents:
    • Claimed 20–40% speedups for 20+ tool-call workflows via persistent connections and incremental context.
    • Early community tests: ~15% faster on simple tasks, ~39% on complex, best cases ~50%.
  • Benchmarks:
    • SWE-Bench Verified deprecated (contamination, flawed tests rejecting correct solutions); shift recommended to SWE-bench Pro.
    • NL2Repo-Bench: top models pass under 40% on full repo generation (failures in planning/repo coherence).
    • AlgoTune introduces $1-per-task budgeting to rank “capabilities per dollar.”
    • OCR brittleness persists on dense, historical newspapers; OlmOCR-Bench released on HF.
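The claimed WebSocket speedups are consistent with a simple back-of-envelope model (my sketch, not OpenAI's published methodology): if every plain-HTTP tool call pays a fixed per-request overhead for connection setup and context re-upload, while a persistent connection pays it roughly once, the fractional saving grows with the number of calls:

```python
def persistent_connection_speedup(n_calls, overhead, compute):
    """Fractional time saved by reusing one connection for n_calls tool calls.

    Toy model: plain HTTP costs n * (overhead + compute); a persistent
    WebSocket costs roughly overhead + n * compute, since connection setup
    and context re-upload are amortized across the session."""
    http_total = n_calls * (overhead + compute)
    ws_total = overhead + n_calls * compute
    return (http_total - ws_total) / http_total

# With 20 tool calls and per-call overhead at half the compute cost,
# the model lands near 32%, inside the claimed 20-40% range.
saving = persistent_connection_speedup(20, overhead=0.5, compute=1.0)
```

The model also predicts what the community tests observed: simple tasks (few calls, compute-dominated) see smaller gains than complex, tool-heavy workflows.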

Community Response/Impact:

  • A large contingent calls out "hypocrisy" (labs that trained on scraped internet data now object to copying); the counterargument is that API-scale distillation replicates tool use and safety policies, closer to model cloning than passive scraping.
  • Re-energizes debates on openness vs. proprietary control, and whether legal/technical export regimes matter if outputs can be harvested.

First Principles Analysis:

  • Output distillation converts a proprietary inference API into a high-value training signal approximating the teacher’s policy, including tool-calling and refusal behavior. If done at scale and with guardrail stripping, weight secrecy ceases to be a sufficient moat.
  • Defensible frontier operations now require layers: coordinated account fraud detection, rate-limit/identity hardening, watermarking and behavioral signatures, contractual controls, and possibly hardware/geo-fencing—plus export policies that account for output-based capability transfer.
  • Benchmark turmoil underscores that long-horizon, multi-step competence and repo-wide coherence—not just single-file patching—are the real frontier, and cost-normalized metrics will shape enterprise procurement.
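The "training signal approximating the teacher's policy" framing maps onto the classic distillation objective: minimize the KL divergence between the teacher's and student's temperature-softened output distributions. A minimal sketch (illustrative; in practice API distillation only sees sampled text, not logits, so real pipelines use a sequence-level analogue of this idealized logit-matching form):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened probability distribution over logits."""
    z = [x / temperature for x in logits]
    m = max(z)  # subtract max for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions: zero when the
    student exactly matches the teacher, positive otherwise."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
```

Driving this loss toward zero over millions of teacher outputs is what transfers not just answers but refusal and tool-calling behavior, which is why weight secrecy alone is not a sufficient moat.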