Machine Bullshit: Characterizing the Emergent Disregard for Truth in LLMs (arxiv.org)
4 points by delichon a day ago
delichon a day ago
Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshit, and Chain-of-Thought reasoning just makes it worse! https://x.com/kaiqu_liang/status/1943350770788937980