[Tokyo Tech Translated] forgetting, forgiving, and vram limits

today’s selection clusters around a single theme: llms forget things, and that’s both a bug and a feature. one paper digs into why sft causes forgetting. another finds that too much memory makes agents less cooperative. a third grumbles about vram.

@itarutomy, sft hallucination root cause

a paper analyzing the root cause of llm hallucinations in sft at the internal-representation level. when llms like chatgpt learn new facts through sft, they forget existing knowledge in exchange, and that trade is one of the main causes of hallucinations. the authors observe an average 15% degradation of existing knowledge.

the most interesting part is the uuid experiment. training on blended real place names like “bergamo + pasadena = bergadena” causes 38-41% forgetting, but with random uuid-style names like “loc_fcfb46ee”, even 1 million samples cause only about 4%. changing the country name on the answer side produced the same pattern. the only factor that mattered was how close the entity names sit in representation space.
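a quick way to see what “close in representation space” means: compare the blended name against the uuid-style one. a minimal sketch, assuming one of the qwen2.5 checkpoints the thread mentions and using mean input-token embeddings as a proxy; the paper’s actual similarity measure over internal representations may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-1.5B"  # one of the models named in the thread
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
emb = model.get_input_embeddings().weight  # (vocab_size, d_model)

def mean_embedding(text: str) -> torch.Tensor:
    # average the input embeddings of the name's subword tokens
    ids = tok(text, add_special_tokens=False).input_ids
    return emb[ids].mean(dim=0)

anchor = mean_embedding("bergamo")
for name in ["pasadena", "bergadena", "loc_fcfb46ee"]:
    sim = torch.cosine_similarity(anchor, mean_embedding(name), dim=0)
    print(f"{name:>14}  cos-sim to 'bergamo': {sim.item():.3f}")
```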

looking inside the ffn layers, the gate+up pathway is what writes facts into the residual stream: this pathway alone achieves a fact plasticity of 0.242, while updating only one of the q/k/v/o attention projections suppresses plasticity to 0.005-0.006, with nearly zero forgetting.
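if you wanted to reproduce that localization probe, the obvious knob is requires_grad. a minimal sketch assuming llama/qwen-style module names (gate_proj, up_proj, q_proj, ...) in huggingface transformers; the paper’s exact training setup isn’t specified in the thread.

```python
def freeze_except_gate_up(model):
    # train only the ffn gate+up pathway that writes facts into the residual stream
    for name, param in model.named_parameters():
        param.requires_grad = ("gate_proj" in name) or ("up_proj" in name)

def train_only(model, needle: str):
    # converse probe: update a single projection, e.g. needle="q_proj",
    # which the thread reports drops plasticity to ~0.005-0.006
    for name, param in model.named_parameters():
        param.requires_grad = needle in name
```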

the proposed fix is self-distillation: freeze the model after 1 epoch of sft as a teacher, then regularize subsequent training with a kl-divergence term against it. there’s no need to constrain the entire vocabulary. protecting the top 1% of the teacher’s probability distribution (about 76 tokens per position) achieves the same effect, while 76 random tokens had none, suggesting the teacher preserves the relationships between competing entities as probability distributions, and protecting that implicit knowledge is the essence of interference suppression.
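the regularizer itself is easy to sketch. this is a hedged reconstruction from the description above: kl divergence against the frozen 1-epoch teacher, restricted to the teacher’s top ~76 tokens per position. renormalizing over the top-k slice (rather than masking the full vocabulary) and the loss weight beta are my simplifications, not the paper’s.

```python
import torch.nn.functional as F

def topk_kl_loss(student_logits, teacher_logits, k=76):
    # logits: (batch, seq, vocab); teacher is the frozen 1-epoch sft checkpoint
    top_vals, top_idx = teacher_logits.topk(k, dim=-1)   # teacher's top-k per position
    t_logp = F.log_softmax(top_vals, dim=-1)             # renormalized over the slice
    s_logp = F.log_softmax(student_logits.gather(-1, top_idx), dim=-1)
    # KL(teacher || student) on the protected top-k distribution
    return F.kl_div(s_logp, t_logp, log_target=True, reduction="batchmean")

def total_loss(ce_loss, student_logits, teacher_logits, beta=1.0):
    # ordinary sft cross-entropy plus the protection term
    return ce_loss + beta * topk_kl_loss(student_logits, teacher_logits)
```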

on qwen2.5-1.5b this compressed forgetting from 15% to 3%; llama-3.1-8b and qwen2.5-7b showed roughly 80% reductions. maybe hallucination countermeasures get reframed from a data-quality problem into a structural continual-learning problem.

source: https://x.com/itarutomy/status/2053965509092385215

@ai_database, curse of memory in agents

researchers think ai needs a mechanism to forgive and forget (let bygones be bygones).

a report from carnegie mellon, harvard, and others identifies a phenomenon they call the curse of memory in llm agents: given long conversation histories, they become less cooperative.

they ran experiments with bargaining games and collected massive logs. the longer the history, the less the agents cooperated.

as records of past betrayals pile up, the ability to think ahead fades and agents get dragged into defensive, retaliatory thinking. making them think carefully before answering makes this worse.
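to make the setup concrete, here is a hypothetical harness in the spirit of the experiment: the same agent plays an iterated bargaining game with either its full history or a sliding window that lets old defections fall away. call_llm, the prompt, and the game itself are all stand-ins, not the paper’s protocol, and the authors’ actual remedy is the fine-tuning described next.

```python
def play_round(call_llm, history, window=None):
    # show either the full history or only the last `window` moves
    visible = history if window is None else history[-window:]
    transcript = "\n".join(f"{who}: {move}" for who, move in visible)
    prompt = (
        "You are negotiating a repeated split-the-pot game.\n"
        f"Past rounds:\n{transcript}\n"
        "Reply with exactly COOPERATE or DEFECT."
    )
    return call_llm(prompt)

def cooperation_rate(call_llm, rounds=50, window=None):
    history, coop = [], 0
    for _ in range(rounds):
        move = play_round(call_llm, history, window)
        coop += move.strip().upper().startswith("COOP")
        history.append(("you", move))
        history.append(("opponent", "DEFECT"))  # adversarial partner piles up "betrayals"
    return coop / rounds  # the report says this drops as history grows
```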

then they fine-tuned agents on future-oriented thinking patterns, and cooperative behavior recovered.

source: https://x.com/ai_database/status/2054389293762912415

@currnya, mtp vram pain

on llm mtp (multi-token prediction): 24gb-gpu users are already tight on vram, so the extra memory mtp needs hurts. shrinking the model to make room is counterproductive. maybe making the moe smarter is the better path.
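for a sense of scale, a back-of-the-envelope estimate, assuming an mtp module costs roughly one extra transformer layer plus an output head at fp16 (2 bytes/param). every number here is an assumption; real mtp designs vary.

```python
def mtp_overhead_gb(d_model=4096, n_vocab=128_000, bytes_per_param=2):
    ffn = 3 * d_model * (4 * d_model)   # gate/up/down projections, 4x hidden
    attn = 4 * d_model * d_model        # q/k/v/o projections
    head = d_model * n_vocab            # extra output projection for the mtp head
    return (ffn + attn + head) * bytes_per_param / 1e9

print(f"~{mtp_overhead_gb():.1f} GB extra at fp16")  # ~1.6 GB for these toy numbers
```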

source: https://x.com/currnya/status/2054399700292481088

together these tweets suggest japanese tech discourse is circling a paradox. forgetting is bad for knowledge retention but good for cooperation. memory is a curse for agents but necessary for reasoning. and everyone is hitting vram walls. the research direction is clear: stop treating forgetting as a bug to eliminate and start designing it as a feature to control.

more at falsifylab.substack.com



Originally published on FalsifyLab Substack.

— research and educational content. not investment, legal, or tax advice. do your own research. positions and views may change without notice.

