posts · 2026-06-12

▸ 50 items · updated 3m ago

browse by dayclear filter ✕

May 2026

MTWTFSS

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 2573 26105 27120 28142 29116 3064 3162

June 2026

MTWTFSS

1150 2157 3132 4117 5127 669 773 8141 9135 1084 1196 1288 1346 1434 1570 1682 1775 1886 1955 2027 2120 2274 2374 2468 2564 2640 2724 2837 2956 3083

July 2026

MTWTFSS

156 271 347 421 527 664 758 865 975 1050 1134 1228 1345 1484 1582 1683 1745 1818 1938 2051 2170 2265 2340 24 25 26 27 28293031

2026-06-12 · Fri

22:48

45d ago

AI HOT (Curated Pool)· aihot-apiZH22:48 · 06·12

→Oran Ge open-sources a writing skill to keep AI edits from losing the human voice

Oran Ge had Claude Fable 5 polish copy three times and noticed the edits got more refined but lost the human feel. After discussing it with the AI, he pinned the problem on 'presence'—a writer's specific position and cost that AI can't replicate. He built a skill to preserve that human texture when using AI to revise self-written or dictated drafts. The skill is open-source and free on GitHub.

#Oran Ge#Claude Fable 5#Open source

editor take

He turned 'AI polish kills voice' into a reusable skill file on GitHub—useful if you're revising your own drafts or dictations and want to keep the human texture.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

72

SCORE

H1·K1·R1

22:29

45d ago

Product Hunt · AI· rssEN22:29 · 06·12

→Firecrawl launches Prometheus: a forward-deployed agent for web data extraction

Firecrawl launched Prometheus on Product Hunt today, an experimental forward-deployed agent for web data. Describe the data you need, and it writes Firecrawl code to collect it, with optional hosting and automatic monitoring for page changes. This is Firecrawl's 8th launch, ranked #3 of the day with 201 upvotes. The post does not disclose supported page types, pricing details, or specific differences from similar tools like Browser Use.

#Code#Firecrawl#Product Hunt#Y Combinator

editor take

Firecrawl's 8th launch: tell Prometheus what web data you need, it writes the scraping code and can host + monitor changes.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

21:22

45d ago

Product Hunt · AI· rssEN21:22 · 06·12

→D-ID turns any video into an interactive AI avatar

D-ID launched Agentic Videos, which embed an expressive AI avatar into any video so viewers can pause, ask questions, and get real-time answers. The company says video should be responsive, not one-way. It works with existing footage—no need to re-record. The post doesn't disclose pricing, latency, or supported languages.

#D-ID

editor take

D-ID embeds an AI avatar into any video so viewers can pause and ask questions live. No pricing or latency disclosed yet.

HKR breakdown

hook ✓knowledge —resonance —

→ open source

55

SCORE

H1·K0·R0

21:00

45d ago

NVIDIA Blog· rssEN21:00 · 06·12

→NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark

NVIDIA claims its Blackwell platform leads the first dedicated benchmark for agentic AI infrastructure, released by Artificial Analysis. The post does not disclose specific performance numbers or comparison details, but highlights Blackwell's advantage in latency and throughput for agent workloads.

#Benchmarking#NVIDIA#Blackwell#Artificial Analysis

editor take

NVIDIA claims Blackwell leads the first agentic AI benchmark, but the post doesn't disclose scores.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

20:38

45d ago

AI HOT (Curated Pool)· aihot-apiZH20:38 · 06·12

→Google sues Chinese cybercrime group Outsider Enterprise for AI-powered scams targeting hundreds of thousands

Google sued a Chinese group called Outsider Enterprise, alleging they used AI to scam hundreds of thousands of people. The group sent 2.5 million text messages in two weeks with AI-generated scripts. The post doesn't specify which AI models or techniques were used, but the scale is notable.

#Google#Outsider Enterprise

editor take

Google sues Chinese group Outsider Enterprise for AI-generated scam texts—2.5M messages in two weeks.

HKR breakdown

hook ✓knowledge —resonance ✓

→ open source

55

SCORE

H1·K0·R1

20:34

45d ago

Hacker News Frontpage· rssEN20:34 · 06·12

→World of ClaudeCraft: a WoW-like MMORPG vibe-coded with Fable 5 and Claude

World of ClaudeCraft is a WoW-style MMORPG vibe-coded with Fable 5 and Claude. It supports online multiplayer or offline single-player, with 9 classes (Warrior, Mage, etc.) and classic WoW controls (WASD, hotbar, quest log). The source is on GitHub. The post doesn't specify which Claude model, server architecture, or concurrency limits — but the page is already playable.

#Code#Claude#Fable 5#World of ClaudeCraft

editor take

Someone vibe-coded a WoW-style MMORPG with Claude and Fable 5. It's open-source and playable now.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

65

SCORE

H1·K1·R0

20:33

45d ago

FEATUREDHacker News Frontpage· rssEN20:33 · 06·12

→A cross-vendor agent loop: Claude Fable 5 as architect, GPT-5.5 Codex as builder

Dan McInerney open-sourced a Claude Code skill that chains Claude Fable 5 and GPT-5.5 Codex into a division-of-labor loop. Claude plans and reviews, Codex writes code, and the repo acts as memory. The author claims an 80% reduction in Fable token usage, but the post doesn't include benchmarks or comparison data—just the README and code, so real-world results are unverified.

#Code#Anthropic#OpenAI#Dan McInerney

why featured

Featured · importance 72 · hook + knowledge + resonance

editor take

Claude plans, GPT writes code—claims 80% fewer Fable tokens, but no benchmarks provided.

sharp

This caught my eye because it turns cross-model division of labor into a reusable Claude Code skill. Dan McInerney has Claude Fable 5 act as architect—breaking down tasks and reviewing code—while GPT-5.5 Codex does the actual building, with the repo serving as shared memory. He claims this cuts Fable token usage by 80%, but the repo only has a README and code, no benchmarks or comparison data. I'd discount that 80% figure. It reads like personal experience, not a systematic eval. The cost logic makes sense—Fable 5 isn't cheap, and offloading most calls to Codex would save money—but only if Claude's planning and review are solid enough to avoid rework loops from lost context. That's exactly what we don't have data on. Treat this as a cost-saving template for narrow tasks, not a general solution. It probably works best when the task is well-defined, Codex can handle it independently, and Claude only needs light review. For complex refactors or tasks requiring deep codebase understanding, this split might add more back-and-forth than it saves.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

72

SCORE

H1·K1·R1

20:14

46d ago

FEATUREDHacker News Frontpage· rssEN20:14 · 06·12

→Can I Buy Your KV Cache?

This paper proposes letting publishers precompute a document's KV cache so AI agents can buy and load it, skipping the most compute-heavy step: prefill. On Qwen3-4B, reuse is 9–50x cheaper than prefill with zero accuracy loss—token outputs match exactly. Shipping the KV cache fails because it's nearly incompressible and egress costs more than the prefill saved. The fix: host it provider-side, like production prompt caching. Serving one 3,774-token document to 80M agents costs ~$1.5M to re-prefill but only ~$30K via reuse, a 49.7x gap. The paper frames this as an agent-native prefill CDN and leaves lossless KV compression and cross-party payments as open problems.

#Inference-opt#Luoyuan Zhang#Qwen3-4B

why featured

Featured · importance 78 · hook + knowledge + resonance

editor take

Precompute a document's KV cache and sell it to AI agents to skip redundant prefill—9–50x cheaper on Qwen3-4B with zero accuracy loss.

sharp

The idea is almost offensively simple: right now every AI agent reading the same document recomputes prefill from scratch, rebuilding an identical KV cache. The authors propose letting publishers precompute it once and sell access. On Qwen3-4B, reuse is 9–50x cheaper than prefill, and token outputs match exactly—zero accuracy cost. The part I found most useful is their math on where the cache lives. Shipping the KV file directly fails because it's nearly incompressible—egress costs more than the prefill you're trying to save. The fix is hosting it provider-side, exactly how production prompt caching works today. They run the numbers: one 3,774-token document accessed by 80 million agents costs ~$1.5M to re-prefill but only ~$30K via reuse, a 49.7x gap. Current API cache-read pricing at roughly 10% of full prefill sits comfortably inside that measured saving, so the 10x discount is a floor—the remaining gap is provider margin, millions per popular document. They frame this as an agent-native prefill CDN and leave lossless KV compression and cross-party payments as open problems. I'd read this as a clean engineering argument, not a product yet, but the direction is sharp: when agents read the same documents at scale, redundant prefill is just burning money.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

78

SCORE

H1·K1·R1

18:03

46d ago

AI HOT (Curated Pool)· aihot-apiZH18:03 · 06·12

→Ukraine's one-time test used fully autonomous drones to kill Russian soldiers

Ukraine conducted a one-time combat test where fully autonomous drones killed Russian soldiers. Full autonomy is rare, but Ukraine is now installing AI modules on drones and robots at scale. The post does not disclose the exact date, location, or drone model used in the test.

#Ukraine#Russia#Ars Technica

editor take

Ukraine's one-time combat test used fully autonomous drones to kill Russian soldiers—no date, location, or model disclosed.

HKR breakdown

hook —knowledge —resonance —

→ open source

39

SCORE

H0·K0·R0

17:38

46d ago

FEATUREDTechCrunch AI· rssEN17:38 · 06·12

→Mistral rumored to be raising €3B at €20B valuation

TechCrunch reports a rumor that Mistral is raising €3B at a ~€20B valuation, nearly double its Series C €11.7B. The post is an RSS snippet only—no lead investor, use of funds, or closing timeline disclosed. The valuation jump is steep, but it's still just a rumor with no official confirmation.

#Mistral#Funding

why featured

Featured · importance 78 · hook + knowledge + resonance

editor take

Mistral rumored to raise €3B at €20B valuation, nearly 2x its Series C, but it's an RSS snippet with no lead investor or close date.

sharp

The number that grabs you is the valuation: nearly doubling from €11.7B to €20B in one round. But the post is literally one sentence from an RSS feed—TechCrunch calls it a rumor themselves. No lead investor, no use of funds, no closing timeline, no official confirmation. I'd discount this until we see more. A raise this size usually leaks with more detail if it's close to closing. For now, it's a sentiment signal that European LLM money is still flowing, but whether the valuation holds up is an open question.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

78

SCORE

H1·K1·R1

17:34

46d ago

Hacker News Frontpage· rssEN17:34 · 06·12

→How to Set Up a Local Coding Agent on macOS

A hands-on guide for running a local coding agent on macOS, keeping code offline and private. The post doesn't specify which model or toolchain it uses.

editor take

A practical guide to running Gemma 4 locally on M1 Max at 72 tok/s with llama.cpp + MTP, beating MLX.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

17:26

46d ago

AI HOT (Curated Pool)· aihot-apiZH17:26 · 06·12

→Google sues Chinese cybercrime group that used AI to send 2.5M scam texts

Google filed a federal lawsuit in the Southern District of New York against a Chinese cybercrime group called 'Outsider Enterprise.' The complaint says the group used AI to generate scam texts, sending 2.5 million messages in two weeks and hitting hundreds of thousands of victims. They impersonated Google and others with fake investments and job offers, running the operation through 300+ Google Ads accounts and multiple Gmail accounts. Google is suing under RICO, trademark infringement, and breach of contract, seeking to shut the operation down. The post doesn't disclose specific financial losses or individual defendant identities.

#Google#Outsider Enterprise#Policy

editor take

Google sues a Chinese group that used AI to write scam texts, 2.5M messages in two weeks.

HKR breakdown

hook ✓knowledge —resonance ✓

→ open source

68

SCORE

H1·K0·R1

17:18

46d ago

AI HOT (Curated Pool)· aihot-apiZH17:18 · 06·12

→$130 billion in data center projects blocked by protests so far this year

Ars Technica reports that data center projects worth $130 billion have been blocked by protests in the first half of 2026. Local residents and environmental groups oppose land use, water consumption, and grid strain. Some communities now share playbooks for fighting these projects, and the article argues this momentum will make future approvals harder.

#Ars Technica#Policy

editor take

$130B in data center projects blocked by protests in H1 2026, with communities now sharing playbooks to fight land, water, and grid impacts.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

78

SCORE

H1·K1·R1

17:17

46d ago

The Verge · AI· rssEN17:17 · 06·12

→Siri is good now? Apple's new version tested

Apple released a new Siri version that actually works well. The Vergecast hosts share early impressions: not bleeding edge, but good enough for most tasks. The post doesn't detail specific features or release timeline.

#Apple#The Verge

editor take

The Vergecast hosts say the new Siri actually works for everyday tasks now. No feature details or release date yet, so keep expectations in check.

HKR breakdown

hook ✓knowledge —resonance ✓

→ open source

55

SCORE

H1·K0·R1

16:58

46d ago

Hacker News Frontpage· rssEN16:58 · 06·12

→BitBoard launches an analytics workspace where humans and agents share dashboards

YC P25 startup BitBoard launches dashboards where coding agents and humans collaborate on live reporting. Founders Connor and Ambar pivoted from healthcare admin agents after customers kept asking for help with scattered data and spreadsheets. The idea: humans and agents share the same data primitives but get tools suited to each. Agents write SQL or code; dashboards evolve from queries to full embedded apps. Every answer has provenance, same params return same number. Next step: long-running agents that detect metric drift or funnel leaks, produce datasets and traces, and wait for team sign-off. Built on DuckDB and Apache Arrow for columnar analysis. LLM spots problems, deterministic code automates fixes. Email required to sign up.

#BitBoard#YC P25#DuckDB

editor take

BitBoard gives coding agents and humans a shared dashboard with traceable data—practical for team reporting.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

62

SCORE

H1·K1·R0

16:43

46d ago

r/LocalLLaMA· rssEN16:43 · 06·12

→llama.cpp merges PWA support, web UI now installable as a native app

The llama-server web UI can now be installed to your desktop with a standalone window and proper icons. The merged PR makes the interface faster to reopen and more robust around updates and caching. The post doesn't specify which browsers or platforms are supported.

#llama.cpp#ggml-org

editor take

llama.cpp web UI now installs as a desktop app with its own window and icon, faster to reopen.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

16:14

46d ago

AI HOT (Curated Pool)· aihot-apiZH16:14 · 06·12

→Anthropic's first public survey: nearly half of Americans want AI to cure diseases, over 60% fear job loss

Anthropic ran an online survey of ~52,000 Americans via YouGov in Nov–Dec 2025, weighted to census benchmarks. 48% ranked curing diseases like cancer as the top hope; 36% want AI to assist people with disabilities. On the worry side: 64% fear job losses, 56% worry about cognitive dependence, 52% about misinformation. Over 70% support government regulation, with privacy (56%), child safety (52%), and accountability (49%) as top concerns. Only 15% trust AI companies to make decisions on their own. Partisan and regional splits are small on most issues. The post doesn't share the full questionnaire or crosstab details.

#Anthropic#YouGov

editor take

Anthropic's own survey of 52K Americans: curing disease tops hopes, job loss tops fears, 70%+ want regulation, only 15% trust AI companies to self-govern.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

72

SCORE

H1·K1·R1

16:00

46d ago

AI HOT (Curated Pool)· aihot-apiZH16:00 · 06·12

→OpenRouter explains how its model routing picks models, providers, and handles failover

OpenRouter published a technical post breaking its routing into two layers: model routing decides which model answers, and provider routing decides which provider serves that model. By default, traffic is distributed with inverse-square price weighting, so cheaper providers get more requests. You can override provider order, set a price ceiling, or use :nitro and :floor suffixes to control latency and cost. Failover uses a models array to try the next model if one errors. The Auto Router mode lets OpenRouter pick the model for you. The post also admits OpenRouter isn't a fit for teams that need local deployment or full control over the inference environment.

#OpenRouter#Anthropic#OpenAI

editor take

OpenRouter breaks routing into model selection and provider selection, with price-weighted traffic by default.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

68

SCORE

H0·K1·R0

16:00

46d ago

AI HOT (Curated Pool)· aihot-apiZH16:00 · 06·12

→How to Use Hermes Agent with OpenRouter: Setup, Models & Routing

OpenRouter published a tutorial on connecting Hermes Agent to their API gateway. Hermes Agent is Nous Research's open-source CLI agent, not the Hermes 3 or 4 models—a common confusion. With OpenRouter, one API key gives access to 400+ models from 60+ providers with automatic failover. Default model is Claude Sonnet, but you can swap it. Config lives in ~/.hermes/config.yaml; you can offload side tasks like titling or vision to cheaper models. The agent is MIT-licensed; you only pay for token usage. The post doesn't disclose specific pricing—check openrouter.ai/pricing.

#Agent#OpenRouter#Nous Research#Hermes Agent

editor take

OpenRouter's tutorial shows how to hook Hermes Agent to its gateway: one key for 400+ models with auto-failover.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

16:00

46d ago

AI HOT (Curated Pool)· aihot-apiZH16:00 · 06·12

→How to Get the Lowest-Cost LLM Inference on OpenRouter

OpenRouter published an official guide on minimizing LLM inference costs. The key trick: append `:floor` to your model slug to automatically route to the cheapest provider. For Llama 3.3 70B, input prices range from $0.10 to over $1.00 per million tokens across providers; `:floor` picks the lowest. Use `max_price` for a hard budget cap—requests fail if no provider qualifies. Start with free models: 50 requests/day on a free account, 1,000/day after adding $10 in credits. Caveat: the cheapest price may be a quantized endpoint; filter with `quantizations` if precision matters.

#OpenRouter#Llama 3.3 70B

editor take

OpenRouter's official guide: append `:floor` to auto-route to the cheapest provider—Llama 3.3 70B input prices vary 10x across providers.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

55

SCORE

H0·K1·R0

15:56

46d ago

FEATUREDHugging Face Blog· rssEN15:56 · 06·12

→Ai2 releases olmo-eval: an evaluation workbench for the model development loop

Ai2 built olmo-eval on top of OLMES to handle evaluation during active model development, not just final scoring. You can add benchmarks, run them across checkpoints, and analyze results prompt by prompt as you tweak data, architecture, or hyperparameters. It supports multi-turn and agentic eval as a first-class use case, and includes analysis tools to tell whether a 2.4pp change is real or noise. Code is open on GitHub.

#Benchmarking#Agent#Ai2#OLMES

why featured

Featured · importance 72 · hook + knowledge

editor take

Ai2 turned the messy dev-loop eval into a reproducible workbench with native multi-turn and agent support.

sharp

The reason this is worth a look: anyone training models knows the pain of re-running evals after every data tweak or hyperparameter change, then trying to figure out if a 2.4pp shift is real. olmo-eval turns that into a pipeline—add benchmarks, run them across checkpoints, analyze per-prompt, with multi-turn and agent eval built in from the start. It's built on OLMES and open on GitHub. I'd treat this as dev-loop infrastructure, not another leaderboard tool.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

72

SCORE

H1·K1·R0

15:56

46d ago

FEATUREDAI HOT (Curated Pool)· aihot-apiZH15:56 · 06·12

→olmo-eval: an evaluation workbench for the model development loop

Allen AI released olmo-eval, an evaluation workbench built on OLMES and designed for repeated testing during model development. It treats agentic and multi-turn evaluation as first-class use cases, supports lightweight or containerized runs, and uses a modular design where models, tools, and environments are independently swappable. Results include scores, standard errors, and minimum detectable effects so you can tell real improvement from noise. Unlike Harbor, which focuses on release, olmo-eval targets fast iteration during development and lets you compare checkpoint outputs question by question.

#Benchmarking#Allen AI#OLMES#Harbor

why featured

Featured · importance 72 · hook + knowledge

editor take

Allen AI turned repeated model evaluation into a modular workbench where agentic and multi-turn tests are first-class.

sharp

This one's worth opening because it tackles a pain every model builder knows: you're running evals constantly during development, but existing tools are either for final benchmark scores on finished models or for sandboxed agent tasks — nothing handles the messy middle where you're swapping checkpoints, tools, and prompts every few hours. olmo-eval breaks everything into swappable modules — models, tools, environments, helper models — so you can run lightweight scores or containerized complex scenarios. Results come with standard errors and minimum detectable effects, so you can tell real improvement from noise. The division of labor with Harbor is clean: Harbor handles release-day evaluation, olmo-eval handles fast dev-loop iteration, including question-by-question diffs between checkpoints. Code and docs are on GitHub now; Allen AI used it internally for OLMo and Tulu. Whether the community builds enough task and tool plugins to make the modular design pay off is still an open question.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

72

SCORE

H1·K1·R0

15:50

46d ago

● P1TechCrunch AI· rssEN15:50 · 06·12

→MANGOS replaces FAANG as major AI companies plan summer IPO push

This TechCrunch podcast episode covers the IPO market heating up with a new acronym: MANGOS — Meta (or Microsoft), Anthropic, Nvidia, Google, OpenAI, and SpaceX. Half of that group is heading to public markets in the same window, testing investor appetite and valuations. The post is an RSS snippet and doesn't disclose specific timelines or valuation ranges.

#Meta#Microsoft#Anthropic#Funding

why featured

Featured · importance 88 · hook + knowledge + resonance

editor take

TechCrunch coined 'MANGOS' for a potential IPO wave this summer — SpaceX, Anthropic, OpenAI, and others. No valuations or timelines yet, so treat this as a narrative signal, not a confirmed calendar.

sharp

TechCrunch dropped two headlines packaging SpaceX, Anthropic, OpenAI, and others into a 'MANGOS' acronym, pointing to a hot IPO summer for AI and space companies. Both headlines come from the same outlet — not multiple independent confirmations — so the breadth-of-coverage signal is weak here. The MANGOS label is clearly riding the FAANG memory hook, but the companies inside it are wildly different. SpaceX builds rockets; Anthropic and OpenAI sell API access to foundation models. Their revenue models, capital needs, and regulatory exposure don't line up neatly. This feels more like a media coinage than an organic industry category. What's missing: no S-1 filings confirmed, no valuation ranges disclosed, no specific windows beyond 'this summer.' I'd read this as narrative preheating, not a locked IPO calendar.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

88

SCORE

H1·K1·R1

15:42

46d ago

Hacker News Frontpage· rssEN15:42 · 06·12

→Keygen.music: a site that generates keys from music

Keygen.music turns a music clip into a software license key. Play a melody, get an activation code. The post doesn't disclose the algorithm or supported formats, but 32 HN upvotes suggest the community finds it clever.

editor take

Play a melody, get a license key. Clever demo, but the post doesn't explain the algorithm or supported formats.

HKR breakdown

hook ✓knowledge —resonance —

→ open source

55

SCORE

H1·K0·R0

15:33

46d ago

AI HOT (Curated Pool)· aihot-apiZH15:33 · 06·12

→ByteDance Doubao adds Task Mode for scheduled execution, web and PPT generation

Doubao now bakes Agent capabilities directly into the app: scheduled task execution, no-code web page generation, one-click PPT creation, and data visualization. The former Thinking Mode is upgraded to Expert Mode, running on Doubao Model 2.0 Pro for deeper reasoning. The app top bar now shows three modes: Quick, Expert, Task. Basic features are free; paid tiers start at ¥68/month for Standard, ¥200/month for Enhanced, and ¥500/month for Professional. The post does not disclose task-mode latency, success rates, or benchmarks for Expert Mode.

#Code#ByteDance#Doubao

editor take

Doubao bakes Agent capabilities into the app's top bar—scheduled tasks, no-code web pages, PPTs—but the post doesn't disclose task success rates or latency.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

72

SCORE

H1·K1·R0

15:31

46d ago

Hacker News Frontpage· rssEN15:31 · 06·12

→Macs can finally power on remotely without pressing the power button

macOS 26.5 adds an 'Always' option for 'Start up when power is connected,' letting Macs boot automatically after power loss. Jeff Geerling tested it on an M4 Mac mini: shutdown, then toggled a smart outlet, and the Mac booted in under 2 seconds. Supported on Mac mini (2024+), Mac Studio (2025+), and iMac (2024+). Caveats: FileVault requires SSH login first; a bug prevents boot if the Mac was shut down from the login screen.

#Apple#Jeff Geerling#M4 Mac mini

editor take

macOS 26.5 finally lets you power on a Mac remotely via a smart outlet—boots in under 2 seconds.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

55

SCORE

H1·K1·R0

15:26

46d ago

Hacker News Frontpage· rssEN15:26 · 06·12

→WebAssembly gets a GPU API: WASI WebGPU proposal lands

The WASI WebGPU proposal lets WebAssembly modules talk directly to the GPU for compute and rendering. The repo only has interface definitions so far — no word on which backends (Vulkan/Metal/DX12) are supported or any benchmarks. For AI practitioners, this could mean running inference in browser or edge devices directly on GPU without a JS bridge.

#WebAssembly#WASI

editor take

WASI WebGPU lets Wasm talk directly to GPU — could cut JS bridge overhead for browser inference.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

62

SCORE

H0·K1·R0

15:26

46d ago

Hacker News Frontpage· rssEN15:26 · 06·12

→StackScope crawled 40k+ indie launches to reveal what stacks people actually ship with

Jonathan built StackScope, a crawler that watches new launches on Product Hunt, Show HN, and PeerPush, then inspects each public site for hosting, frameworks, analytics, DNS, security headers, legal pages, and AI-builder signals. Unlike broad web scanners, it focuses on what indie makers choose at launch. It runs on .NET with Playwright for rendered pages, uses a first-party fingerprint catalogue, respects robots.txt, and identifies itself. A current pain point: Cloudflare hasn't granted verified bot status yet, blocking about 10% of sites. A private readiness check lets you paste a URL and get a report with no signup. The post doesn't disclose the time range of the 40k launches or any aggregate stack-distribution numbers—only the title and feature set are confirmed so far.

#StackScope#Product Hunt#Hacker News (Show HN)

editor take

StackScope crawled 40k+ indie launches to see what tech they actually ship with—more focused than broad web scanners.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

68

SCORE

H1·K1·R0

15:08

46d ago

Hacker News Frontpage· rssEN15:08 · 06·12

→Bulk delete Claude chats? This script does what the UI won't

Claude's web UI lacks a bulk-delete button—you have to scroll, select, and delete manually, which breaks with many chats. Matteo Leonesi built a script to automate it. Conversations disappear slowly over minutes, and you must keep the tab open. The post doesn't spell out the license or whether it works with other models.

#Matteo Leonesi

editor take

Claude web UI has no bulk delete — this script automates it, but it's slow and the tab must stay open.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

45

SCORE

H1·K1·R0

14:55

46d ago

FEATUREDr/LocalLLaMA· rssEN14:55 · 06·12

→MiniMax open-sources MSA, a sparse attention method that cuts attention compute by 28.4× at 1M tokens on a 109B model

MiniMax published a paper introducing MSA, a blockwise sparse attention built on GQA. A lightweight index branch scores KV blocks and picks a top-k subset per GQA group, then the main branch runs exact attention only on those blocks. With a co-designed GPU kernel, a 109B-parameter multimodal model achieves 14.2× prefill and 7.6× decoding wall-clock speedups on H800 at 1M context, matching full GQA quality. Code and inference kernel are open-sourced, along with a model called MiniMax-M3. The Reddit poster is curious whether the 109B model can run on consumer GPUs; the post doesn't say if weights will be released.

#Inference-opt#MiniMax#MiniMax-M3

why featured

Featured · importance 72 · hook + knowledge

editor take

MiniMax's block-sparse attention hits 14× prefill speedup at 1M context on a 109B model; code is open, weights are unconfirmed.

sharp

This caught my eye because someone finally attacked 1M-context inference at the attention level—not via MoE or quantization. MiniMax added a lightweight index branch on top of GQA: it scores KV blocks, picks a top-k subset per query group, then runs exact attention only on those. With a custom GPU kernel, their 109B multimodal model hits 14.2× prefill and 7.6× decoding speedups on H800 at 1M context, matching full GQA quality. I'd discount this in two ways. One, the post is a single Reddit thread and the source link returns a 403, so I can't verify the paper details or benchmarks directly. Two, those speedups are on H800—the poster asks whether this runs on consumer GPUs, and the post doesn't answer. A 109B model is heavy regardless, and sparse kernel behavior on consumer cards is an open question. The concrete part: code and inference kernel are open-sourced, along with a model called MiniMax-M3. If weights drop too, this stops being a paper and becomes something you can actually try.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

72

SCORE

H1·K1·R0

14:53

46d ago

r/LocalLLaMA· rssEN14:53 · 06·12

→MiniMax M3 lands on HuggingChat with Artifacts support

MiniMax M3 is now available on HuggingChat, with Artifacts support for code and web output. The post doesn't disclose model specs, open-source status, or benchmark comparisons—just the launch and feature. Worth a try if you want a chat model that can generate runnable code or pages.

#Code#MiniMax#HuggingChat#Open source

editor take

MiniMax M3 lands on HuggingChat with Artifacts support, but the post is 403'd—no specs or open-source status disclosed.

HKR breakdown

hook ✓knowledge —resonance —

→ open source

62

SCORE

H1·K0·R0

14:48

46d ago

Hacker News Frontpage· rssEN14:48 · 06·12

→A practical guide to reducing slop in AI-generated frontend code

A blog post offers hands-on tips to clean up the slop in AI-generated frontend code: remove unnecessary wrapper divs, drop over-abstracted CSS classes, and verify that logic is actually used. The post doesn't recommend specific tools or plugins, but the advice is practical for developers using AI to write UI.

editor take

One dev's trick to cut AI frontend slop: ask the agent to make it look like a Qt app.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

55

SCORE

H0·K1·R0

14:11

46d ago

FEATUREDAI HOT (Curated Pool)· aihot-apiZH14:11 · 06·12

→MiniMax open-sources M3: 428B total params, 23B active, 1M-token context window

MiniMax uploaded M3 weights to HuggingFace, with the tech report and full weights expected in about 10 days. It's a 428B-total-param, 23B-active-param hybrid model using MiniMax sparse attention to push the context window to 1M tokens, plus native multimodal support. Coding and agent scores: SWE-Bench Pro 59.0%, Terminal Bench 2.1 66.0%, SWE-fficiency 34.8%, KernelBench Hard 28.8%, MCP Atlas 74.2%. MiniMax Code tool and API platform launched alongside. The post doesn't disclose training data, inference cost, or license terms — I'd hold off on usability judgments until the report drops.

#Code#Agent#Multimodal#MiniMax

why featured

Featured · importance 82 · hook + knowledge + resonance

editor take

MiniMax dropped M3 weights on HuggingFace — 428B total, 23B active params — but the tech report is still ~10 days out.

sharp

The reason to click: MiniMax finally open-weighted M3. The architecture is unusual — 428B total params but only 23B active per forward pass, with their own sparse attention pushing context to 1M tokens, plus native multimodal. Scores look solid for code and agent work: 59.0% on SWE-Bench Pro, 66.0% on Terminal Bench 2.1, 74.2% on MCP Atlas. But the post doesn't disclose training data, inference cost, or license terms, and the tech report won't land for another ~10 days. I'd hold off on any usability judgment until then. What's real right now: the weights are on HuggingFace, you can download them, but whether you can use them commercially or how expensive inference gets — nobody knows yet.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

82

SCORE

H1·K1·R1

14:11

46d ago

r/LocalLLaMA· rssEN14:11 · 06·12

→Two-shot with Hermes Qwen3.6-35B on RTX 3060 12GB

Reddit user yes2matt ran Qwen3.6-35B (4-bit quantized) on an RTX 3060 12GB via llama.cpp, generating a boombox-style spectrum analyzer GIF in just two prompts. The first prompt asked for a Python FFT script outputting a 15fps 320px GIF; the second refined it to skip the first 200ms, show only low frequencies, and apply a log transform. The model executed both correctly. The post doesn't disclose inference speed or VRAM usage, but running a 35B model on 12GB is a practical data point.

#Code#Qwen3.6-35B#Hermes#RTX 3060

editor take

35B quantized to fit 12GB VRAM, wrote a correct spectrum GIF script in two prompts—but the post doesn't disclose inference speed.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

62

SCORE

H1·K1·R0

14:07

46d ago

FEATUREDr/LocalLLaMA· rssEN14:07 · 06·12

→MiniMax-M3 open-sourced: a 428B MoE model with 23B activated parameters

MiniMax released MiniMax-M3 weights on Hugging Face. It's a mixture-of-experts model with ~428B total parameters and ~23B activated per inference. The post doesn't disclose training data, benchmarks, or minimum VRAM for local runs.

#MiniMax#Open source

why featured

Featured · importance 72 · hook + knowledge + resonance

editor take

MiniMax dropped M3 weights: 428B total, 23B active MoE. Post body is 403 — no benchmarks or hardware reqs.

sharp

The headline is that MiniMax put M3 weights on Hugging Face — 428B total params, ~23B active per forward pass. That MoE ratio sounds compute-friendly, but the Reddit post itself is 403-blocked, so all we have is the title and summary. No training data, no benchmarks, no minimum VRAM. If you're thinking about running it locally, I'd wait until someone posts real memory numbers and token/s before downloading.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

72

SCORE

H1·K1·R1

13:49

46d ago

Hacker News Frontpage· rssEN13:49 · 06·12

→Meta services including Facebook and Instagram are down

Facebook and Instagram are currently down. Meta's official status page at metastatus.com shows no outage, which may indicate a wider disruption than the page reflects. The post doesn't specify affected regions or an ETA for recovery.

#Meta#Facebook#Instagram#Incident

editor take

Facebook and Instagram are down, but Meta's status page shows nothing—likely a bigger outage.

HKR breakdown

hook ✓knowledge —resonance ✓

→ open source

55

SCORE

H1·K0·R1

13:11

46d ago

r/LocalLLaMA· rssEN13:11 · 06·12

→Open Dungeon: local roleplay with Gemma 4 QAT and inline Uncen-FLUX images at 256K context under 8GB RAM

The author built a fully local AI Dungeon clone using Gemma 4 12B (QAT Q4) via Ollama for narration and FLUX for on-device image generation—no APIs, no cloud. The 12B model runs at full 256K context while staying around 7.7 GB RAM because Gemma 4's KV cache barely grows. Scenes that scroll out of context get folded into a running summary so the narrator remembers chapter one. It supports Do/Say/Story modes, Continue, Retry, Erase, and line editing; the UI shows RAM cost before you pick a model. Mac one-click build is available, MIT license.

#Gemma 4#Ollama#FLUX#Open source

editor take

Gemma 4 12B Q4 runs full 256K context at ~7.7GB RAM—fully local AI Dungeon with inline image gen, Mac one-click build available.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

72

SCORE

H1·K1·R0

12:55

46d ago

r/LocalLLaMA· rssEN12:55 · 06·12

→Qwen 3.6 27B + Openclaw on 16 GB VRAM: a working setup

A user runs Qwen 3.6 27B (4bpw GGUF) with Openclaw on a 5070 Ti with 16GB VRAM. The 35B version had tool-calling loops; 27B works. They close all apps before loading to free ~15.2GB, leaving 800MB free. Context window is 100K, tool calls work, but only 2 hours of testing. The post doesn't specify inference speed or Openclaw version.

#Qwen#Openclaw#NVIDIA GeForce RTX 5070 Ti

editor take

Qwen 3.6 27B + Openclaw fits 16GB VRAM with 800MB to spare, tool calls stable—but don't open a browser.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

55

SCORE

H1·K1·R0

12:51

46d ago

r/LocalLLaMA· rssEN12:51 · 06·12

→ContextSpy: Profile your LLM context like a CPU profiler

ContextSpy is a local proxy that sits between your coding agent and the LLM API, recording every request and breaking down where input tokens go — system prompt, tool definitions, file contents, conversation history. Inspired by PyCon, the author aims to optimize token usage by profiling context rather than brute-force compression. Early stage; the post doesn't specify supported models or performance overhead.

#ContextSpy#PyCon

editor take

ContextSpy sits between your coding agent and LLM API, profiling where tokens go — like a profiler for context.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

62

SCORE

H0·K1·R0

12:28

46d ago

r/LocalLLaMA· rssEN12:28 · 06·12

→Supra Title 350M: A Tiny Model Built Just for Chat Titles

SupraLabs released a 350M-parameter model that does one thing: generate titles for chat conversations. It's fine-tuned from LFM2.5-350M, needs no system prompt—just feed it the user message and get a title back. Available in GGUF format from 177 MB to 711 MB; Q8_0 or Q6_K recommended. Still experimental; the team plans to expand the SFT dataset and apply preference optimization. The post doesn't disclose inference speed or latency, but at this size it should run fast locally.

#Fine-tuning#SupraLabs#LFM2.5-350M

editor take

SupraLabs released a 350M model that only generates chat titles—no system prompt needed, just feed it the message.

HKR breakdown

hook —knowledge ✓resonance —

→ open source

55

SCORE

H0·K1·R0

12:00

46d ago

Hacker News Frontpage· rssEN12:00 · 06·12

→Maxproof: A New Method to Make AI Reasoning Verifiable

Maxproof is a new paper that proposes making models output a verifiable 'proof' alongside their reasoning, not just the answer. The post doesn't spell out technical details or experimental results, but the title points to a key direction: solving the 'black box' problem in AI reasoning so outputs can be independently checked. Worth a look for people working on interpretability and safety alignment.

#Reasoning#Interpretability

editor take

MaxProof makes models output verifiable proofs alongside answers—scored 35/42 on IMO 2025, above human gold threshold.

HKR breakdown

hook —knowledge —resonance —

→ open source

62

SCORE

H0·K0·R0

11:56

46d ago

Product Hunt · AI· rssEN11:56 · 06·12

→MindReader v1: Simulated fMRI reads your brain, open-source

MindReader v1 simulates brain region responses to content using Meta FAIR's TRIBE v2 and 35 years of neuro research. It outputs neuro-metrics for sales evals and dataset analysis. Fully open-source, but the post doesn't specify simulation accuracy or real-world validation.

#Meta FAIR#Product Hunt#Open source

editor take

Type in copy, get simulated brain-region responses. Open-source, but no accuracy data — take it as a toy for now.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

55

SCORE

H1·K1·R0

11:15

46d ago

FEATUREDAI HOT (Curated Pool)· aihot-apiZH11:15 · 06·12

→Pokémon Go players' data was used to train AI for military drones

Niantic used real-world scans from Pokémon Go players to train a geospatial AI model later applied to military drone projects. Players contributed training data without knowing. The model can recognize and navigate physical spaces; military partners include the U.S. Army. The post doesn't spell out data volume, contract value, or whether players can retroactively opt out.

#Niantic#Pokémon Go#U.S. Army

why featured

Featured · importance 78 · hook + knowledge + resonance

editor take

Niantic trained a geospatial AI on Pokémon Go player scans, then applied it to U.S. Army drone projects without telling players.

sharp

The reason this story lands hard: it connects a casual game directly to military drone navigation. Niantic had players scan real-world locations—parks, landmarks, streets—and used that data to train a geospatial AI that understands physical spaces. That model later ended up in drone projects with the U.S. Army and other defense partners, helping machines recognize and navigate real environments. Players never knew their scans could feed military applications. The post doesn't spell out how much player data was used, the contract value, or whether anyone can retroactively opt out. I'd watch two gaps: what Niantic's privacy policy actually said at the time, and the scope of those military contracts—both are blank right now.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

78

SCORE

H1·K1·R1

11:01

46d ago

AI HOT (Curated Pool)· aihot-apiZH11:01 · 06·12

→DeepMind launches robotics accelerator with 15 startups

Google DeepMind has kicked off its robotics accelerator with 15 startups. The three-month program offers AI stack, Gemini Robotics models, and hands-on team support to push physical AI in Europe. The post doesn't spell out each startup's focus or technical details.

#Google DeepMind

editor take

DeepMind's robotics accelerator picked 15 European startups, offering Gemini Robotics models and team support. The post doesn't name their focus areas.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

10:42

46d ago

● P1Hacker News Frontpage· rssEN10:42 · 06·12

→Moonshot AI open-sources Kimi K2.7-Code coding model with improved token efficiency

Moonshot AI released Kimi K2.7-Code on Hugging Face, claiming better token efficiency than peers. The model card is the only source—no technical report, no benchmarks, no architecture details or parameter count disclosed. 42 points and 4 comments on HN so far. I'd hold off: there's too little to evaluate without third-party benchmarks.

#Code#Moonshot AI#Kimi#Open source

why featured

Featured · importance 96 · hook + resonance

editor take

Moonshot AI open-sourced Kimi K2.7-Code. Right now it's just a Hugging Face model card and one Chinese media report — no technical paper or benchmark comparisons yet.

sharp

Moonshot AI dropped Kimi K2.7-Code on Hugging Face today. Two sources picked it up: one Chinese AI outlet and a Reddit post on r/LocalLLaMA that got blocked, so we can't see the community reaction. I'd take this with a grain of salt for now. The model card likely has parameter count, context window, and supported languages, but neither source dug into actual performance numbers. No technical report, no side-by-side with DeepSeek-Coder, Code Llama, or Qwen-Coder. The "significant performance improvement" claim is just in the headline — no numbers to back it yet. If you're evaluating code models, don't switch just yet. Wait for benchmarks or community evals on HumanEval and MBPP before making a call.

HKR breakdown

hook ✓knowledge —resonance ✓

→ open source

96

SCORE

H1·K0·R1

10:16

46d ago

FEATUREDAI HOT (Curated Pool)· aihot-apiZH10:16 · 06·12

→Kimi releases and open-sources Kimi-K2.7-Code

Kimi open-sourced K2.7-Code, scoring 11%–31.5% higher than K2.6 on three in-house benchmarks. Inference token usage dropped 30%, and long-coding-task instruction-following and end-to-end success rate both improved. A 6x speed mode is coming; the model is available now via Kimi API and Kimi Code. The post doesn't disclose parameter count, training data, or the open-source license.

#Code#Reasoning#Kimi (Moonshot AI)#Open source

why featured

Featured · importance 82 · hook + knowledge + resonance

editor take

Kimi open-sourced K2.7-Code with 11%–31.5% gains on in-house benchmarks and 30% fewer inference tokens, but no param count or license disclosed.

sharp

Kimi's moving fast on code models — K2.6 barely landed and K2.7 is already here with solid jumps on their own benchmarks. The 30% token reduction is the part I'd pay attention to: same task, fewer tokens, lower API cost. That's real if it holds. I'd discount the benchmark numbers a bit. All three are in-house — Kimi Code Bench v2, Program Bench, MLS Bench Lite — and we don't have external baselines to compare against. No parameter count, no training data details, no license. The 6x speed mode is just a teaser for now. If you're already on Kimi API or Kimi Code, try it. The token savings might show up on your bill. If you're shopping for an open-source code model, wait for the license and some third-party evals.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

82

SCORE

H1·K1·R1

10:14

46d ago

FEATUREDr/LocalLLaMA· rssEN10:14 · 06·12

→MTP speculative decoding with Gemma 4: assistant model choice makes or breaks speed gains

A user tested MTP speculative decoding with Gemma 4 Heretic models in llama.cpp and found assistant model selection is everything. A 26B Q8 jumped from 30 t/s to 62 t/s; a 12B Q4 went from 12 t/s to 54 t/s. Two GGUFs with the same name aren't always identical. Unquantized assistants consistently beat Q4/Q8 assistants by roughly 10 t/s. Draft count of 1 gave the best results across the board. Always check logs to confirm MTP actually initialized—otherwise you're benchmarking the base model by accident.

#llama.cpp#Gemma 4#Google

why featured

Featured · importance 72 · hook + knowledge + resonance

editor take

MTP speculative decoding speedup depends entirely on assistant model choice: same-name GGUFs aren't always identical, and unquantized assistants beat Q4/Q8 by ~10 t/s.

sharp

This one's worth opening because it nails a specific MTP speculative decoding trap: pick the wrong assistant model and your speedup goes from 2x to basically nothing. The author ran Gemma 4 Heretic in llama.cpp. A 26B Q8 jumped from 30 t/s to 62 t/s; a 12B Q4 went from 12 t/s to 54 t/s. The useful bit: two GGUFs with the same filename aren't necessarily the same file, unquantized assistants consistently beat Q4/Q8 by about 10 t/s, and a draft count of 1 gave the best results across the board. One practical tip: always check the logs to confirm MTP actually initialized. If it didn't, you're benchmarking the base model by accident. The post body returned a 403, so I can't see the exact test setup or model sources, but the takeaways are solid for anyone running local MTP.

HKR breakdown

hook ✓knowledge ✓resonance ✓

→ open source

72

SCORE

H1·K1·R1

10:00

46d ago

OpenAI Blog· rssEN10:00 · 06·12

→OpenAI launches three new Academy courses to help teams build repeatable AI workflows

OpenAI released three new Academy courses today: AI Foundations, Applied AI Foundations, and Agents and Workflows. They start with prompting and output review, move to turning one-off uses into repeatable workflows, and end with directing agent-assisted tasks. Partners include BCG, Accenture, and BBVA. Each course offers a completion certificate. The post does not disclose course duration or pricing.

#OpenAI#BCG#Accenture

editor take

OpenAI launched three courses covering prompting to agent workflows, with BCG and Accenture as partners, but no word on duration or pricing.

HKR breakdown

hook —knowledge —resonance —

→ open source

55

SCORE

H0·K0·R0

10:00

46d ago

AI HOT (Curated Pool)· aihot-apiZH10:00 · 06·12

→OpenAI launches three free courses on practical AI agents for work

OpenAI released three Academy courses for workers who want to apply AI on the job. They cover building repeatable workflows and using AI agents. The post doesn't name the courses, duration, or price—only says they teach practical AI skills.

#OpenAI

editor take

OpenAI dropped three free Academy courses covering prompting to agent workflows, but doesn't name the courses or duration.

HKR breakdown

hook —knowledge —resonance —

→ open source

45

SCORE

H0·K0·R0

09:21

46d ago

r/LocalLLaMA· rssEN09:21 · 06·12

→A browser-use agent that runs entirely in WASM at zero cost

A developer built a fully self-contained browser-use agent using Snapdom, WASM, WebGPU, and the ShowUi-2b model—no server needed. It can type, click links, change dropdowns, and handle multi-step actions (click input → type → submit) with ~50% success. The author notes browser automation is very hard; only a limited set of actions is supported and the code is super early alpha. Tests used Mind2Web and MiniWob to improve accuracy, and a click-offset bug in Snapdom was fixed.

#Snapdom#WASM#WebGPU#Open source

editor take

A fully client-side browser agent runs in WASM with no server, but ~50% success and super early alpha—don't get excited yet.

HKR breakdown

hook ✓knowledge ✓resonance —

→ open source

62

SCORE

H1·K1·R0

more

✕

feeds

hot events daily column all posts podcasts curated X monitor saved sources newsletter agent access

admin

usage system newsletter curation iterations users