hot events · 2026-05-04

▸ 19 signals · updated 3m ago

live · 217 today · policy v2

LATENT SPACE · Anthropic pulls Fable and Mythos after US e… · 96
LATENT SPACE · Anthropic launches Claude Fable 5, its firs… · 88
HACKER NEWS FRONTPAG · Did Anthropic ask for its own export contro… · 82
HACKER NEWS FRONTPAG · Anthropic flies senior technical staff to D… · 82
AI HOT (CURATED POOL · WSJ: OpenAI weighs steep price cuts and pla… · 82
HACKER NEWS FRONTPAG · Bram Cohen: Claude is turning into an assho… · 78
R/LOCALLLAMA · Xiaomi serves MiMo V2.5 at 1000–3000 tps wi… · 78
IMPORT AI (JACK CLAR · AI learns to game society's rules, and Anth… · 78
MIT TECHNOLOGY REVIE · Google DeepMind is worried about what happe… · 78
DWARKESH PATEL · The sample efficiency black hole: AI models… · 78
LATENT SPACE · Cognition launches FrontierCode: a coding b… · 78
HACKER NEWS FRONTPAG · Gabriel Weinberg argues with data that “eve… · 78


2026-05-04 · Mon

FEATURED · Bloomberg Technology · rss · EN · 22:56 · 05·04

→Meta Taps Morgan Stanley, JPMorgan for El Paso Data Center Deal

Meta is arranging financing for an El Paso, Texas data center, totaling about $13 billion. Morgan Stanley and JPMorgan are involved; the post does not disclose debt structure, tenor, or rates. The deal shows Big Tech using debt for AI infrastructure spend.

#Meta #Morgan Stanley #JPMorgan #Funding

why featured

Bloomberg reports Meta is preparing about $13B in financing for its El Paso data center, enough to pass HKR-H/K/R. It is not a model or product launch, and debt structure, tenor, and rates are not disclosed, so it stays in the lower featured band.

editor take

Meta is lining up $13B for an El Paso data center; AI capex has moved from budgets into Wall Street’s debt machine.

sharp

Meta’s $13B El Paso financing is the cleanest signal that AI infrastructure has left normal capex planning. Morgan Stanley and JPMorgan are not decoration here; they are turning one Texas data center into a financeable asset. The article gives the size, site, and banks, but not structure, tenor, or rates. Those missing terms decide whether this is plain project debt or a template for packaging GPU hunger into market paper. I don’t buy the lazy “Big Tech has enough cash” read anymore. Meta can fund plenty from ads, but a single El Paso build reaching $13B says the unit economics are now too large for spreadsheet comfort. Microsoft, OpenAI, and CoreWeave already pushed AI compute into structured financing. Meta is now walking the same road, with a cleaner balance sheet and a much larger ad engine.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · r/LocalLLaMA · rss · EN · 21:58 · 05·04

→Benching Local Qwen as a Codex Validator, Co-agent, and Challenger

robert896r1 tested Qwen3.6 27B GGUF beside Codex as a coding validator and released a reproducible eval suite. The runs covered Bartowski, Unsloth, 65k/128k context, and q8/f16 KV cache; three 128k profiles tied for best, with no measured q8 KV accuracy loss in this suite. The useful signal is the sidecar eval: missed directives, overbuilding, UI judgment, and long-context misses, not a universal leaderboard.

#Agent #Code #Benchmarking #Qwen

why featured

HKR-H/K/R all pass: a reproducible sidecar eval with concrete Qwen/Codex conditions beats a normal Reddit tip. Source authority and event scale keep it in the 72–77 band, not a same-day must-write.

editor take

This is the right job for local models: stop trying to beat Codex, and catch missed directives, overbuilds, and long-context slips.

sharp

Local Qwen3.6 27B looks useful here because it is being used as an engineering checker, not sold as a Codex replacement. robert896r1 put GGUF builds beside Codex and tested Bartowski, Unsloth, 65k/128k context, and q8/f16 KV cache. Three 128k profiles tied for best, and q8 KV showed no accuracy loss in this suite. I like the setup because the eval targets the failure modes teams actually feel: missed directives, overbuilding, UI judgment, and long-context omissions. SWE-bench tells you whether a model can fix benchmark issues; this is closer to a grumpy reviewer sitting next to the coding agent. The caveat is hard: the Reddit body is blocked with 403, so sample size, task source, and grading rules are not visible. Treat it as a useful sidecar-eval pattern, not a Qwen3.6 27B leaderboard.
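The test conditions named above can be read as a small configuration grid. The sketch below enumerates it, assuming the suite ran the full cross product of the three axes; the post may have run only a subset, and the dictionary keys are illustrative names, not the suite's own.

```python
from itertools import product

# Axes named in the summary; the labels are taken as-is from the post.
GGUF_SOURCES = ["Bartowski", "Unsloth"]
CONTEXTS = ["65k", "128k"]
KV_CACHE_TYPES = ["q8", "f16"]


def build_profiles():
    """Enumerate every combination of the three axes (hypothetical full grid)."""
    return [
        {"gguf": g, "context": c, "kv_cache": k}
        for g, c, k in product(GGUF_SOURCES, CONTEXTS, KV_CACHE_TYPES)
    ]


profiles = build_profiles()
long_context = [p for p in profiles if p["context"] == "128k"]
print(len(profiles), "profiles,", len(long_context), "at 128k")
```

Under that assumption the grid has eight profiles, four of them at 128k, which gives some context for "three 128k profiles tied for best."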

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · TechCrunch AI · rss · EN · 21:53 · 05·04

→OpenAI’s cozy partner Cerebras is on track for a blockbuster IPO

Cerebras is moving toward an IPO at a valuation of $26.6 billion or more. The snippet says its OpenAI relationship is deep, but does not disclose ownership, revenue, or timing. The key signal is OpenAI-linked supply-chain valuation, not just AI chips.

#Inference-opt #Cerebras #OpenAI #Funding

why featured

HKR-H/K/R all pass: OpenAI partner, $26.6B valuation, and an IPO angle tied to AI compute supply. Lack of revenue, ownership, and timetable keeps it below must-write model-release territory.

editor take

Cerebras chasing a $26.6B IPO is selling OpenAI proximity, not just wafers; without revenue or order detail, the pricing deserves suspicion.

sharp

Cerebras at a $26.6B-plus IPO valuation looks less like a hardware victory than an OpenAI halo trade. TechCrunch gives two hard hooks: the target valuation and a “deep” OpenAI relationship. It gives no revenue, gross margin, contracted orders, or listing timeline. For a chip company, those are not minor blanks. I don’t buy the easy “AI chip breakout” framing yet. Nvidia’s premium comes from CUDA, supply control, customer lock-in, and visible data-center revenue. Cerebras has a bold wafer-scale architecture, and inference demand is real. But public investors will ask the boring question: is OpenAI a durable buyer, a technical partner, or just the anchor name in the deck? If it is mostly the anchor, $26.6B is a rich price.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · r/LocalLLaMA · rss · EN · 21:38 · 05·04

→FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8

FastDMS released an MIT implementation that cuts KV memory to 1/5–1/8 of vLLM BF16 at 8K context. A Llama-3.2-1B replication reports PPL 9.200 with 6.4x compression; Qwen3-8B c=1 drops KV from 1.406 GiB to 0.184 GiB. The key detail is physical reclamation of evicted slots, not just nominal KV-byte reduction.

#Inference-opt #NVIDIA #University of Warsaw #University of Edinburgh

why featured

HKR-H/K/R all pass: the hook is counterintuitive, with compression, PPL, KV GiB deltas, and physical slot reclamation. Reddit/open-source sourcing keeps it in 78–84, below P1.

editor take

If FastDMS really reclaims evicted slots, KV compression hits serving economics, not paper math; the Reddit post returns 403, so don’t treat 6.4x as production proof yet.

sharp

FastDMS is sharp because it claims physical reclamation of evicted KV slots, not just smaller tensors. The supplied numbers are strong: at 8K context, KV memory falls to 1/5–1/8 of vLLM BF16; Qwen3-8B at concurrency 1 drops from 1.406 GiB to 0.184 GiB; Llama-3.2-1B reports PPL 9.200 at 6.4x compression. That hits the actual serving bottleneck for long-context workloads: resident KV, not model weights. But the Reddit body is 403, so I can’t verify throughput setup, batch size, prefill/decode split, or quality regression. Against vLLM FP8, those missing conditions matter. Treat the speed claim as a promising replication lead, not a deployment result.
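As a sanity check on the quoted Qwen3-8B figures, the arithmetic can be sketched as follows. It uses only the two GiB values reported above; the function names are illustrative, and nothing here measures FastDMS itself.

```python
# KV-cache footprint for Qwen3-8B at concurrency 1, in GiB, as quoted above.
KV_BASELINE_GIB = 1.406  # vLLM BF16, as reported
KV_FASTDMS_GIB = 0.184   # FastDMS, as reported


def compression_ratio(baseline: float, compressed: float) -> float:
    """How many times smaller the compressed footprint is."""
    return baseline / compressed


def fraction_of_baseline(baseline: float, compressed: float) -> float:
    """Compressed footprint as a fraction of the baseline."""
    return compressed / baseline


ratio = compression_ratio(KV_BASELINE_GIB, KV_FASTDMS_GIB)
frac = fraction_of_baseline(KV_BASELINE_GIB, KV_FASTDMS_GIB)
print(f"Qwen3-8B KV: {ratio:.2f}x smaller ({frac:.1%} of BF16)")
```

The result, roughly 7.6x, sits inside the claimed 1/5 to 1/8 range. It is a different data point from the 6.4x figure, which is reported for the Llama-3.2-1B replication.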

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

● P1 · Financial Times · Technology · rss · EN · 21:17 · 05·04

→OpenAI president defends motives in for-profit restructuring as he reveals $30bn stake

OpenAI’s president defended its for-profit restructuring and disclosed a $30bn stake. Elon Musk’s lawsuit says executives sold out the charity mission for personal gain. The post does not disclose the president’s name, equity structure, or restructuring terms.

#OpenAI #Elon Musk #Policy #Incident

why featured

All three HKR axes pass: OpenAI’s for-profit shift, a $30bn stake, and Musk’s lawsuit make it same-day material. Missing name, equity structure, and restructuring terms keep it below the 95+ band.

editor take

A $30bn personal stake turns OpenAI’s mission defense into a compensation story; every safety claim now gets read through ownership.

sharp

OpenAI’s problem here is not the for-profit turn; it is defending motive purity after a disclosed $30bn presidential stake. The title gives the $30bn figure and Musk’s lawsuit, but the body gives no president name, ownership mechanics, or restructuring terms. Those are exactly the facts needed to judge conflict, control, and upside caps. I don’t buy the clean “mission remains intact” framing without the paperwork. Once one executive’s paper stake reaches sovereign-fund scale, governance stops being philosophy and becomes board rights, payout limits, and exit language. Anthropic has at least kept its PBC and long-term benefit trust story visible. OpenAI is now explaining its structure through litigation pressure and paywalled fragments, which is a bad posture for a company asking everyone else to trust its safety governance.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

● P1 · Bloomberg Technology · rss · EN · 20:14 · 05·04

→GameStop Makes $56 Billion Takeover Bid for eBay

GameStop made a $56B bid for eBay, a company four times its size. Cerebras seeks up to $3.5B in its IPO, and OpenAI raised over $4B for an enterprise AI joint venture. The post does not disclose deal terms, IPO valuation, or JV structure.

#GameStop #eBay #Cerebras #Funding

why featured

HKR-H/K/R pass, but this is a Bloomberg Tech video roundup with AI details limited to financing figures. Cerebras valuation, OpenAI JV structure, and deal terms are not disclosed, so it stays in the generic-reporting band.

editor take

GameStop bidding $56B for eBay at four times its own size smells less like commerce strategy and more like meme-era financial engineering with a takeover wrapper.

sharp

Eight items line up tightly: Bloomberg starts with “preparing a bid,” while FT and HN frame it as a $55.5B/$56B unsolicited offer. The only real differences are rounding and the Ryan Cohen payday angle, so this reads like one central deal leak, not eight independent confirmations. I don’t buy the industrial logic yet. GameStop trying to swallow eBay at roughly four times its own size is the tell; that is a capital-structure bet wearing a marketplace story. eBay is a mature marketplace, while GameStop is cash, brand residue, and retail-investor optionality. For AI operators, the pattern is familiar: when the product flywheel is weak, companies reach for distribution assets and narrative leverage before proving operating leverage.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · Hacker News Frontpage · rss · EN · 19:18 · 05·04

→White House Considers Vetting AI Models Before Release

The White House is considering vetting AI models before release; only that policy direction is disclosed. The RSS body lists the URL, 44 Hacker News points, and 21 comments, but does not disclose criteria, covered models, timeline, or enforcing agency.

#Safety #White House #Policy #Safety/alignment

why featured

HKR-H and HKR-R pass: White House pre-release vetting directly affects model launches and compliance planning. HKR-K fails because criteria, scope, agency, and timeline are not disclosed.

editor take

If model releases need pre-review, big closed labs adapt first; open-source teams and startups eat the delay. Safety will grow into a moat fast.

sharp

Two sources carry the same headline, and both trace back to the New York Times chain: the White House is discussing an executive order, an AI working group, and a formal review process before new AI models ship. This is not a routine safety-eval comeback; Washington is pulling model-release timing back onto the policy table. The concrete hook is Anthropic’s Mythos release: officials briefed Anthropic, Google, and OpenAI executives last week, and the U.K.-style multi-agency safety process is named as a model. The irony is sharp: Trump rolled back Biden-era reporting and safety-evaluation rules for high-risk models last year. If review becomes a gate, OpenAI and Google can absorb it with legal teams, government affairs, and red-team binders. Small labs and open-source release crews do not have that shock absorber.

HKR breakdown

hook ✓ · knowledge ✗ · resonance ✓

SCORE

H1·K0·R1

● P1 · TechCrunch AI · rss · EN · 15:59 · 05·04

→Anthropic and OpenAI Each Launch Joint Ventures for Enterprise AI Services

Anthropic and OpenAI will each launch joint ventures for enterprise AI services. Both partnered with asset managers to market enterprise AI products more aggressively. The RSS snippet does not disclose partner names, equity terms, pricing, or launch dates.

#Anthropic #OpenAI #Partnership #Product update

why featured

HKR-H and HKR-R are strong because two frontier labs mirror the same enterprise JV move. HKR-K is limited to the sales-vehicle mechanism; names, equity, pricing, and launch timing are not disclosed.

editor take

Two model companies, same day, same playbook: joint ventures with asset managers to push enterprise AI. Not a coincidence: same pressure, same move.

sharp

Anthropic and OpenAI were both reported on the same day to be setting up joint ventures with asset managers: Anthropic with Apollo, OpenAI with BlackRock, per TechCrunch. Latent Space flagged it as part of a broader “services” push. There are two sources, but both trace back to the same TechCrunch report. Neither AI company has made an official announcement yet, so I'm treating this as a solid leak, not confirmed structure. The real story is not the JV structure; it is the distribution problem these model companies are trying to solve. Apollo and BlackRock manage trillions in assets and sit on top of insurance firms, pension funds, and banks. Those are the buyers who need enterprise AI that is auditable, compliant, and integrated into existing workflows. A joint venture with them is basically a pre-warmed sales channel. What's missing: equity splits, pricing, and whether these JVs are selling custom models or managed deployment. If official announcements drop, those are the numbers to watch.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

● P1 · Hacker News Frontpage · rss · EN · 15:51 · 05·04

→Sierra Raises $950 Million at $15 Billion Valuation

Sierra raised $950M at a $15B valuation. The RSS snippet does not disclose investors, round type, use of funds, or product metrics. The signal is customer-agent valuation, not a model update.

#Agent #Sierra #Funding

why featured

HKR-H/K/R all pass: the $950M and $15B figures make this a strong agent-market story. Limited sourcing on investors, round, product metrics, and use of funds keeps it in the 78–84 band.

editor take

Sierra raised $950M at a $15B valuation; investors are buying enterprise distribution, not chatbots. $150M ARR makes that multiple brutal.

sharp

Both sources center on the $950M raise and $15B valuation; TechCrunch frames it as an enterprise-AI land grab, while HN points to Sierra’s own post, so the fact chain is mostly company-sourced. The hard hooks are 40%+ of the Fortune 50, $150M ARR, Nordstrom’s voice agent in five weeks, Singtel in 10 weeks, and a 70%+ resolution rate. I don’t read this as another chatbot funding round. Investors are pricing Sierra like a control point for enterprise customer operations. The problem is the math: $15B on $150M ARR is roughly 100x ARR, so Sierra has to expand far beyond support into sales, retention, claims, lending, and revenue-cycle work. Bret Taylor’s Salesforce credibility gets meetings; regulated workflow depth decides whether this becomes ServiceNow-scale software or an expensive contact-center wrapper.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · r/LocalLLaMA · rss · EN · 14:17 · 05·04

→M3 Ultra + DGX Spark = M5 Ultra-lite?

A Reddit user benchmarked DGX Spark against M3 Ultra in llama.cpp at pp16384, with Spark 1.4× to 3.4× faster across 4 models. Qwen 27B hit 778 t/s vs 340 t/s, while Mistral 128B hit 241 t/s vs 72 t/s. The concrete tuning note is mmap=0: loading fell from minutes to about 20 seconds.

#Inference-opt #Tools #NVIDIA #Apple

why featured

Single Reddit sourcing keeps the score low, but HKR-H/K/R all pass through a concrete local-inference benchmark. The pp16384 setup and 4-model speedups justify featured at the lower edge.

editor take

Only the summary has data: DGX Spark beats M3 Ultra by 1.4–3.4× at pp16384, but Reddit 403 blocks verification. I buy the direction, not the verdict.

sharp

DGX Spark’s useful signal is not that it beats M3 Ultra. It shows how fast Apple’s unified-memory workstation loses ground on long-prompt prefill once the box is tuned for inference. The summary gives pp16384 numbers: Qwen 27B at 778 t/s versus 340 t/s (about 2.3×), and Mistral 128B at 241 t/s versus 72 t/s (about 3.3×). Across the four models tested the gap runs from 1.4× to 3.4×, and it tracks with the boring truth: bandwidth, kernels, and runtime path beat “the model fits in memory.” I would not treat this as a clean benchmark. The Reddit body is blocked by 403, so quantization, batch, llama.cpp commit, power, and price are missing. The mmap=0 note is the more actionable bit: load time reportedly drops from minutes to about 20 seconds. Apple still wins for quiet local workstations; DGX Spark wins when you pay for the NVIDIA path.
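For the two models with quoted pp16384 throughput, the per-model speedup can be worked out directly. This is a sketch on the reported numbers only; the dictionary layout is illustrative, and the other two models behind the 1.4× to 3.4× range are not named in the summary.

```python
# Prefill throughput at pp16384 in tokens/s, as quoted in the summary.
reported = {
    "Qwen 27B": {"dgx_spark": 778, "m3_ultra": 340},
    "Mistral 128B": {"dgx_spark": 241, "m3_ultra": 72},
}

# Speedup of DGX Spark over M3 Ultra for each quoted model.
speedups = {
    model: rates["dgx_spark"] / rates["m3_ultra"]
    for model, rates in reported.items()
}

for model, speedup in speedups.items():
    print(f"{model}: {speedup:.2f}x")
```

Neither quoted pair sits near the 1.4× lower bound, so that end of the range has to come from one of the unnamed models.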

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · Financial Times · Technology · rss · EN · 13:58 · 05·04

→Blackstone and Goldman among backers for $1.5bn JV with Anthropic

Blackstone and Goldman are among backers of a $1.5bn joint venture with Anthropic. The consulting firm will advise Wall Street firms on AI deployment across portfolios; the post does not disclose ownership, products, or timeline.

#Agent #Blackstone #Goldman Sachs #Anthropic

why featured

HKR-H/K/R all pass: a $1.5bn Anthropic-linked JV backed by Blackstone and Goldman is a strong commercialization signal. Missing equity structure, product details, and timeline keep it below 85.

editor take

A $1.5bn Anthropic JV sounds huge, but the missing ownership and product details make it look like Wall Street channel packaging.

sharp

This $1.5bn Anthropic joint venture reads like distribution engineering, not model progress. Blackstone and Goldman can push Claude into banks, asset managers, and portfolio companies where procurement, compliance, and data controls block normal SaaS adoption. The only hard number here is $1.5bn; ownership, product scope, delivery dates, and Claude bundling are not disclosed. I don’t buy the “consulting firm” wrapper at face value. Accenture, BCG, and Deloitte already sold the first wave of GenAI advisory work, and Wall Street does not lack slide decks. It lacks accountable deployment paths through risk, audit, and restricted data. If this is Anthropic buying channel access through Blackstone and Goldman, the key term is committed internal adoption. If there are exclusive model, compute, or data rights, the summary does not show them.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

● P1 · Import AI (Jack Clark) · rss · EN · 12:32 · 05·04

→Import AI 455: Automating AI Research

Jack Clark argues that no-human-involved AI R&D has a 60%+ chance of arriving by the end of 2028, citing SWE-Bench gains from Claude 2 at about 2% to Claude Mythos Preview at 93.9%, plus METR task horizons rising from 30 seconds in 2022 to 12 hours in 2026.

#Agent #Code #Benchmarking #Jack Clark

why featured

HKR-H/K/R all pass: Jack Clark anchors a >60% end-2028 automated-AI-R&D claim in SWE-Bench and METR numbers. This fits the 85–94 band for a notable figure’s AI-timeline essay, below model-release magnitude.

editor take

Jack Clark puts no-human AI R&D at 60%+ by end-2028; I buy the direction, but SWE-Bench 93.9% is not research automation.

sharp

Clark’s 2028 call has weight, but the evidence jumps too cleanly from engineering automation to research automation. SWE-Bench moving from Claude 2 at about 2% to Claude Mythos Preview at 93.9% shows real GitHub issues are nearly saturated. METR’s horizon moving from 30 seconds in 2022 to 12 hours with Opus 4.6 in 2026 also explains why agentic coding suddenly feels usable inside labs. I get stuck on “build its own successor.” Writing code, testing, cleaning data, and launching runs are not the same as finding a new scaling recipe or diagnosing failed frontier training. Clark admits frontier models are much costlier and involve many humans; that caveat carries the piece. A non-frontier successor proof-of-concept by 2027 or 2028 is plausible. Calling that no-human AI R&D uses a very wide definition.
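To see what the quoted METR endpoints imply, here is a rough doubling-time calculation. It assumes a smooth exponential between the two points and exactly four years between them; that is a back-of-envelope reading, not METR's own methodology.

```python
import math

# Endpoints as quoted above.
START_HORIZON_S = 30          # 30 seconds in 2022
END_HORIZON_S = 12 * 3600     # 12 hours in 2026
ELAPSED_MONTHS = (2026 - 2022) * 12  # assumes exactly four years apart

# Number of doublings needed to get from the start to the end horizon.
doublings = math.log2(END_HORIZON_S / START_HORIZON_S)
months_per_doubling = ELAPSED_MONTHS / doublings

print(f"{doublings:.1f} doublings, one every {months_per_doubling:.1f} months")
```

That works out to about ten and a half doublings, or one roughly every four and a half months. It says how fast the task horizon grew, not whether the tasks involved resemble research.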

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · r/LocalLLaMA · rss · EN · 11:09 · 05·04

→Deep research report with Hermes Agent and qwen3.6-35b-a3b Q6_K

A Reddit user used Hermes Agent and qwen3.6-35b-a3b Q6_K to produce a 21-page research report. The run took 6 loops and over 5 hours on an RTX 4060, at about 28 tokens/s. The repo includes prompts, scripts, intermediate artifacts, and the final report.

#Agent #Tools #Code #Hermes Agent

why featured

HKR-H/K/R all pass: this is a local-agent experiment with hardware, runtime, speed, and artifacts. Reddit source limits reach, so it stays in the 72–77 featured-threshold band.

editor take

A 5-hour, 21-page run on an RTX 4060 is not a toy demo; it pressures closed Deep Research on reproducibility, not polish.

sharp

Hermes Agent’s sharp point here is not the “McKinsey-style” label; it is the exposed workflow. The summary gives 6 loops, 5+ hours, an RTX 4060, about 28 tokens/s, and a 21-page report. The repo also includes prompts, scripts, intermediate artifacts, and the final output. That is closer to engineering evidence than a polished PDF screenshot. I don’t buy the implied “local model replaces consultants” flex. qwen3.6-35b-a3b Q6_K slowly completing this on consumer hardware says cheap agentic research is usable now. But the Reddit body is blocked by 403, so I can’t inspect evaluation criteria, citation quality, or failure cases. Against OpenAI or Perplexity Deep Research, this wins on auditability and loses on quality guarantees.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

● P1 · r/LocalLLaMA · rss · EN · 04:09 · 05·04

→Mistral Medium 3.5 128B and Qwen 3.5 122B performance benchmarked on consumer GPUs

A Reddit user benchmarked Mistral Medium 3.5 128B and Qwen 3.5 122B A10B on 4x RTX 3080 20GB. llama.cpp tensor split raised Mistral tg128 from 10.37 to 21.59 t/s, but Qwen MoE fell from 60.08 to 53.49 t/s. vLLM served Qwen GPTQ-Int4 at 187.04 tok/s; the key signal is MoE sensitivity to parallel strategy.

#Inference-opt #Benchmarking #Mistral #Qwen

why featured

HKR-H/K/R all pass: the 4×RTX 3080 setup is a strong hook, and the post gives concrete llama.cpp/vLLM throughput deltas. Reddit single-run sourcing keeps it in the 72–77 band.

editor take

Two LocalLLaMA titles, body blocked by 403; I read this as a home-lab feasibility signal, not proof Mistral beats Qwen.

sharp

Two LocalLLaMA posts center on running Mistral Medium 3.5 128B and Qwen 3.5 122B A10B on consumer multi-GPU rigs, and the angles align through one source chain. The titles give hard constraints: 3x3090 with 72GB VRAM, and 4x RTX 3080 20GB. The body is blocked by 403, so tokens/sec, quantization loss, context length, and prompt setup are not verifiable. My read: the signal is not model quality; it is that 128B-class dense/MoE models have entered the used-GPU budget conversation. Q3_K_M “runs” does not mean it serves well, especially once PCIe bandwidth, KV cache growth, and multi-user throughput hit a 4x3080 box. Treat this as a reproducibility breadcrumb, not a benchmark against Qwen or Mistral.
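Taking the llama.cpp tg128 figures in the summary at face value (they are single-run and unverified here), the effect of tensor split on each model can be sketched as a percentage change. The helper name is illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# tg128 throughput in tokens/s, as quoted in the summary.
mistral_delta = pct_change(10.37, 21.59)  # dense Mistral Medium 3.5 128B
qwen_delta = pct_change(60.08, 53.49)     # MoE Qwen 3.5 122B A10B

print(f"Mistral with tensor split: {mistral_delta:+.1f}%")
print(f"Qwen MoE with tensor split: {qwen_delta:+.1f}%")
```

If the numbers hold, the same setting roughly doubles dense-model decode speed and costs the MoE model about a tenth of its throughput, which is the "MoE sensitivity to parallel strategy" signal in concrete terms.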

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · Synced (机器之心) · WeChat · rss · ZH · 04:06 · 05·04

→ACL 2026: PolyU Open-Sources SignThought for Gloss-Free Sign Language Translation

PolyU and Sichuan University introduced SignThought, accepted to ACL 2026 Main and slated for oral recommendation. It uses latent thoughts, plan-then-ground, and dual-stream decoding, reaching top gloss-free BLEU-4 on five SLT benchmarks. The team also built LC-HKSLT with 1,311 hours, 432K clips, and 14 signers.

#Multimodal #Reasoning #Vision #Hong Kong Polytechnic University

why featured

ACL 2026 Main, an open model, and a new dataset satisfy HKR-H/K/R, with concrete mechanisms and five benchmarks. The niche sign-language focus keeps it below broader model or developer-tool releases.

editor take

SignThought’s hard asset is the 1,311-hour corpus, not the “thinking” label; 14 signers keeps the generalization claim on a short leash.

sharp

SignThought should be read as a data-scale move before a reasoning-model story. The disclosed hook is concrete: top gloss-free BLEU-4 on five sign-language translation benchmarks, using latent thoughts, plan-then-ground, and dual-stream decoding. The bigger asset is LC-HKSLT: 1,311 hours, 432K clips, and 14 signers. That last number cuts both ways. SLT has been stuck between expensive gloss annotation and poor cross-signer transfer; gloss-free training avoids one bottleneck but leans harder on video diversity. Fourteen signers is still a narrow base for broad Hong Kong Sign Language generalization. I like the direction, but I don’t buy any “for the deaf community” victory lap until the license, code, evaluation splits, and signer leakage checks are visible. The WeChat body is blocked by CAPTCHA, so those details are not disclosed here.

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · AI Era (新智元) · WeChat · rss · ZH · 04:04 · 05·04

→Top AI wrote dozens of pages of derivation before reviewers found the problem was wrong

Xinzhiyuan says Google DeepMind used Aletheia on 700 Erdős problems and got 13 original answers. The pipeline had Gemini Deep Think produce 200 candidates, then a verifier reduced them to 63. The post says Erdős-75 had a wrong premise, yet Aletheia wrote dozens of proof pages.

#Reasoning #Benchmarking #Safety #Google DeepMind

why featured

HKR-H/K/R all pass: the mistaken Erdős-75 setup gives a sharp hook, while the 700/13/200/63 pipeline adds substance. This is strong research coverage, not a GPT-scale product release, so it fits 78–84.

editor take

Aletheia’s 13 answers are impressive; the Erdős-75 miss is the tell: long proofs still don’t imply premise skepticism.

sharp

Aletheia’s loudest signal is not the 13 original answers; it is the Erdős-75 failure. The summary gives a serious pipeline: 700 Erdős problems, 200 candidates from Gemini Deep Think, 63 after verifier filtering, then review by math experts. That says DeepMind is not selling raw model output as research. It is building a generate-filter-review loop for math discovery. The wrong-premise episode still hurts. Olympiad-style reasoning and open mathematics differ on one brutal point: the system must notice when the problem statement is dirty. More symbolic pages do not fix that. AlphaGeometry-style settings have cleaner boundaries; Erdős problems carry history, variants, and citation traps. The WeChat body is CAPTCHA-blocked here, and I have not checked the arXiv paper yet, so the 13-answer claim rests on the disclosed review process.
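The generate-filter-review loop described above can be read as a funnel. The sketch below computes stage-to-stage survival from the four quoted counts, assuming each count is a strict subset of the previous one; the summary does not confirm that, and the stage names are my labels.

```python
# Stage counts as quoted in the summary; the stage names are illustrative.
FUNNEL = [
    ("Erdős problems attempted", 700),
    ("candidates from Gemini Deep Think", 200),
    ("kept by the verifier", 63),
    ("original answers after expert review", 13),
]


def stage_rates(funnel):
    """Return (stage name, fraction surviving from the previous stage)."""
    return [
        (name, count / prev_count)
        for (_, prev_count), (name, count) in zip(funnel, funnel[1:])
    ]


rates = stage_rates(FUNNEL)
overall = FUNNEL[-1][1] / FUNNEL[0][1]
for name, rate in rates:
    print(f"{name}: {rate:.1%}")
print(f"end to end: {overall:.2%}")
```

On this reading under 2% of attempted problems end as accepted answers, and the verifier still passes candidates that expert review later rejects.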

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · AI Era (新智元) · WeChat · rss · ZH · 04:04 · 05·04

→Claude token rankings: Disney employee hits 460,000 calls in 9 days; Meta burns 60T monthly

Xinzhiyuan says Disney tracks Claude use via an AI Adoption Dashboard, with one employee making about 460,000 calls in 9 workdays. It also says Meta used 60 trillion tokens in 30 days, worth about $9B by public API pricing; the post does not show raw tables. The key issue is that input rankings are not outcomes.

#Code #Agent #Tools #Anthropic

why featured

HKR-H/K/R all pass: the hook is concrete usage shock, the post gives dashboard mechanics and token figures, and the nerve is enterprise Claude cost control. Kept at 74 because the data is secondhand and no raw table is disclosed.

editor take

Only the summary is usable; 460K calls and 60T tokens read like a spend leaderboard, not proof of enterprise AI value.

sharp

This needs cold water: Disney’s one employee making 460,000 Claude calls in 9 workdays and Meta burning 60 trillion tokens in 30 days prove low friction, not good work. The WeChat page is blocked by verification, and the summary says the figures come from media and internal-tool retellings. No raw table, model version, input/output split, or deduping rule is shown. The $9B Meta estimate also looks shaky. A company at Meta’s scale will have enterprise deals, self-hosted inference, caching, and batch paths; multiplying public API list price by token volume turns accounting into theater. If an enterprise AI dashboard ranks usage alone, it will reward scripts and leaderboard gaming. The useful dashboard ties Claude activity to merged PRs, customer-resolution rate, asset turnaround time, or another measurable outcome.
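The headline figures can be unpacked with simple division. This sketch derives the blended price implied by the $9B estimate and the call rate implied by the Disney number; the 8-hour workday is my assumption, and none of the derived values appear in the post.

```python
# Figures as quoted in the summary.
META_TOKENS = 60e12        # tokens in 30 days
META_ESTIMATE_USD = 9e9    # "about $9B by public API pricing"
DISNEY_CALLS = 460_000
DISNEY_WORKDAYS = 9

# Blended list price the $9B figure implies, in USD per million tokens.
implied_usd_per_mtok = META_ESTIMATE_USD / (META_TOKENS / 1e6)

# Call rate for the single Disney employee.
calls_per_day = DISNEY_CALLS / DISNEY_WORKDAYS
ASSUMED_WORKDAY_HOURS = 8  # assumption: not stated in the post
calls_per_second = calls_per_day / (ASSUMED_WORKDAY_HOURS * 3600)

print(f"implied price: ${implied_usd_per_mtok:.0f} per million tokens")
print(f"{calls_per_day:,.0f} calls/day, about {calls_per_second:.1f} per second")
```

A sustained rate above one call per second across a working day points to automation, not a person typing prompts, which is why a usage-only ranking rewards scripts.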

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · r/LocalLLaMA · rss · EN · 03:48 · 05·04

→450M On-Board VLM Wildfire Detection Pipeline with Sentinel-2 and LFM2.5-VL

PauLabartaBajo shared a wildfire detection PoC using 450M LFM2.5-VL on Sentinel-2 imagery. It pairs RGB and SWIR tiles, simulates orbit with SimSat, and covers 22 fire-prone sites. The key constraint is bandwidth: on-board inference downlinks only a JSON risk profile.

#Vision #Multimodal #Inference-opt #PauLabartaBajo

why featured

HKR-H/K/R all pass: the story has a counterintuitive edge-VLM hook and concrete numbers. Single-source Reddit PoC and a narrow wildfire-use case keep it below the 78+ band.

editor take

A 450M VLM on a satellite is not a model story; it’s a bandwidth story, and edge AI keeps punishing leaderboard brains.

sharp

This PoC lands because it treats satellite AI as a downlink problem, not a frontier-model problem. The pipeline pairs Sentinel-2 RGB B4-B3-B2 with SWIR B12-B8-B4, simulates orbit in SimSat, runs LFM2.5-VL-450M locally, and sends back a JSON risk_level instead of raw multispectral tiles. I like the systems instinct, but “wildfire prevention” is doing too much work. The demo covers 22 fire-prone locations, and the hard parts—labeling, evals, and fine-tuning—are explicitly deferred. Compared with dumping imagery to the ground and running a bigger VLM, the 450M model makes sense only because bandwidth is the scarce resource. Flight hardware still adds radiation, power, latency, and false-positive costs; the post has architecture, not proof of operational detection.
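To make the downlink argument concrete, here is a rough size comparison between one raw multispectral tile and a JSON risk profile. The tile geometry and sample width are hypothetical, and only the `risk_level` field comes from the post; the other fields are placeholders.

```python
import json

# Hypothetical tile geometry: the post does not give these values.
TILE_SIDE_PX = 512
BYTES_PER_SAMPLE = 2  # assumes 16-bit samples
# Distinct bands across the two composites named in the post.
BANDS = sorted({"B4", "B3", "B2"} | {"B12", "B8", "B4"})

raw_tile_bytes = TILE_SIDE_PX * TILE_SIDE_PX * len(BANDS) * BYTES_PER_SAMPLE

# Only "risk_level" comes from the post; the other fields are illustrative.
risk_profile = {
    "site_id": "site-07",
    "risk_level": "high",
    "confidence": 0.82,
}
payload = json.dumps(risk_profile, separators=(",", ":")).encode("utf-8")

print(f"raw tile: {raw_tile_bytes} bytes, JSON: {len(payload)} bytes")
print(f"downlink reduction: about {raw_tile_bytes // len(payload)}x")
```

Under these assumptions the payload shrinks by several orders of magnitude. The exact factor depends entirely on the assumed tile size, but the direction is what justifies a 450M model on board.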

HKR breakdown

hook ✓ · knowledge ✓ · resonance ✓

SCORE

H1·K1·R1

FEATURED · OpenAI Blog · rss · EN · 00:00 · 05·04

→OpenAI Details Low-Latency Voice AI Delivery at Global Scale

OpenAI rebuilt its WebRTC stack to support low-latency Voice AI at global scale. The RSS snippet mentions real-time Voice AI and turn-taking, but the post does not disclose latency metrics, architecture details, or deployment size.

#Audio #Inference-opt #OpenAI #Product update

why featured

HKR-H and HKR-R pass because OpenAI voice latency is a strong practitioner hook. HKR-K fails: the post names WebRTC work but omits latency, architecture, and deployment numbers, so this stays in the 60–71 band.

editor take

OpenAI is framing voice AI as WebRTC infrastructure, not model magic; 900M WAU is the tell that latency is now a platform moat.

sharp

Two sources carry the same headline, but the chain is basically OpenAI’s engineering post amplified by HN. The shared facts come from one official source: 900M weekly active users, WebRTC, ICE/DTLS, a relay plus transceiver design, and geo-steered signaling. I buy the direction here. Voice agents have been sold for a year as model progress, while the ugly failures are first-hop latency, jitter, barge-in, and NAT traversal. OpenAI does not give P50/P95 latency, pricing, or a model name; it instead talks about why one-port-per-session media termination breaks against Kubernetes. That is a production-platform signal, not a demo story. Compared with vendors selling STT/TTS endpoints, OpenAI is pulling the real-time network stack inside its moat.

HKR breakdown

hook ✓ · knowledge ✗ · resonance ✓

SCORE

H1·K0·R1
