arXiv · cs.CL · atom · EN · 23:13 · 04·04
CURE: Circuit-Aware Unlearning for LLM-based Recommendation
The paper introduces CURE for LLM recommendation unlearning by splitting circuits by function and selectively updating parameters to reduce gradient conflicts between forget and retain objectives. It groups modules into forget-specific, retain-specific, and task-shared sets; the post does not disclose dataset names, metrics, or gain size. The key point is a more interpretable unlearning path, not another uniform weighting scheme.
#Fine-tuning #Interpretability #Alignment #Research release
why featured
HKR-K passes because the paper adds a concrete mechanism: selective updates over forget-only, retain-only, and shared circuits. HKR-H/R are weak because no datasets, gains, or reproduction numbers are disclosed here, and LLM recommendation unlearning is a niche audience fit.
editor take
CURE splits LLM recommendation unlearning into three module types, and I buy that direction; uniform loss weighting has been guesswork in privacy-sensitive setups.
sharp
CURE splits unlearning into 3 module classes with different update rules, and that alone pushes the discussion one step past the usual black-box recipe. My take is simple: if the full paper’s experiments hold up, the value here is less about recommendation and more about moving machine unlearning from loss-weight tuning toward mechanism-level intervention. Too much of the current unlearning literature still boils down to balancing forget loss and retain loss, then updating everything at once. That usually ends in one of two failures: the target signal is still recoverable, or general utility gets trashed. A circuit-aware method that explicitly tries to reduce gradient conflict is a more serious answer than yet another weighting heuristic.
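To make the "three module classes with different update rules" idea concrete, here is a minimal sketch in plain Python. It is not CURE's actual procedure, which the snippet does not describe. The norm-ratio rule in `classify_module`, the `ratio` threshold, and every function name are assumptions made for illustration, and the conflict handling for shared modules swaps in a projection in the style of gradient-surgery methods such as PCGrad. The inputs are hypothetical per-module update directions, one preferred by the forget objective and one by the retain objective.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def classify_module(g_forget, g_retain, ratio=5.0):
    """Hypothetical rule: label a module by which objective dominates its update norm."""
    nf, nr = norm(g_forget), norm(g_retain)
    if nf > ratio * nr:
        return "forget"
    if nr > ratio * nf:
        return "retain"
    return "shared"


def project_out_conflict(g_forget, g_retain):
    """If the two directions conflict (negative dot product), remove the
    component of g_forget that lies along g_retain (PCGrad-style stand-in)."""
    d = dot(g_forget, g_retain)
    if d >= 0:
        return list(g_forget)
    scale = d / dot(g_retain, g_retain)  # g_retain is non-zero whenever d < 0
    return [gf - scale * gr for gf, gr in zip(g_forget, g_retain)]


def selective_update(grads_forget, grads_retain, ratio=5.0):
    """Return {module: (label, update)} with a different rule per module class."""
    updates = {}
    for name in grads_forget:
        gf, gr = grads_forget[name], grads_retain[name]
        kind = classify_module(gf, gr, ratio)
        if kind == "forget":
            upd = list(gf)  # forget-specific: follow the forget direction only
        elif kind == "retain":
            upd = list(gr)  # retain-specific: follow the retain direction only
        else:
            # shared: de-conflict the forget direction, then combine with retain
            gf_safe = project_out_conflict(gf, gr)
            upd = [a + b for a, b in zip(gf_safe, gr)]
        updates[name] = (kind, upd)
    return updates
```

The point of the sketch is the control flow, not the thresholds: a uniform weighting scheme would apply one combined update everywhere, while a selective scheme only has to resolve conflicts in the modules that both objectives actually touch.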
I’m still skeptical on the evidence. The snippet says “real-world datasets” and claims better unlearning than baselines, but it does not disclose the dataset names, metrics, effect size, deletion ratio, or whether the target is instance-level, user-level, or behavior-level removal. Those details matter a lot. Unlearning in recommendation is harder than in many generic LLM settings because user preference, item semantics, and collaborative signal are tightly entangled. Deleting one user is not like deleting one isolated fact; it is more like perturbing a dense preference graph. If the evaluation does not report privacy leakage tests alongside ranking quality and retention quality, I would not trust a “more effective unlearning” claim very far.
There is a clear contrast with the past year’s mainstream approaches. A lot of unlearning work, from data-partition ideas in the SISA family to approximate forgetting with LoRA-style edits or gradient ascent variants, has focused on cutting retraining cost. Much less of it explains which parameters actually carry the behavior that should be removed. CURE borrows from the mechanistic interpretability instinct that has shown up more often in frontier-model discourse: identify functional subgraphs first, then intervene selectively. That is the part I like.
But I also have a pushback. “Circuit” is a strong word, and in recommendation it may be much less stable than the paper’s framing suggests. I have not verified the full PDF yet, so maybe they address this, but the snippet does not say whether these module groupings transfer across datasets, survive backbone changes, or remain stable under distribution shift. Recommendation workloads drift fast. A forget-specific module discovered on one catalog or one user cohort may stop looking forget-specific once the item space changes.
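One simple way to probe that stability concern is to rerun the grouping on two datasets, seeds, or backbones and measure how much the module sets overlap. The sketch below uses Jaccard overlap per label. The function names and the dict-of-labels input format are assumptions for illustration; nothing here comes from the paper.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def grouping_stability(run_a, run_b):
    """run_a, run_b: dict mapping module name -> label
    (e.g. "forget", "retain", "shared"). Returns per-label Jaccard overlap."""
    labels = sorted(set(run_a.values()) | set(run_b.values()))
    overlap = {}
    for label in labels:
        mods_a = {m for m, lab in run_a.items() if lab == label}
        mods_b = {m for m, lab in run_b.items() if lab == label}
        overlap[label] = jaccard(mods_a, mods_b)
    return overlap
```

Low overlap for the forget-specific set across cohorts or catalogs would be a sign that the grouping describes one dataset rather than a stable property of the model.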
So for now I’d file this under “good direction, incomplete proof.” I’d want three things before taking the claim seriously: a proper forget-retain Pareto comparison against standard baselines, robustness under different deletion rates, and evidence that the circuit split is reproducible rather than a one-off artifact. Without that, circuit-aware unlearning risks becoming a nicer label for a still-fragile editing trick.
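For the first of those three asks, a forget-retain Pareto comparison can be as simple as the sketch below. It assumes each method is summarized by two higher-is-better scores, one for forgetting quality and one for retained utility; the function name, the method names, and the numbers in the usage example are hypothetical placeholders, not results from this or any paper.

```python
def pareto_front(points):
    """points: dict mapping method name -> (forget_quality, retain_quality),
    both higher-is-better. Returns the sorted names of non-dominated methods."""
    front = []
    for name, (f, r) in points.items():
        dominated = any(
            f2 >= f and r2 >= r and (f2 > f or r2 > r)
            for other, (f2, r2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Placeholder scores for illustration only.
example = {
    "method_a": (0.9, 0.6),
    "method_b": (0.7, 0.8),
    "method_c": (0.6, 0.7),
}
print(pareto_front(example))  # ['method_a', 'method_b']
```

A method that only wins on one axis at a fixed deletion rate tells you little; what matters is whether it stays on the front as the deletion rate changes.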
HKR breakdown
hook — · knowledge ✓ · resonance —
68
SCORE
H0·K1·R0