Svelte Hacker News logo
  • top
  • new
  • show
  • ask
  • jobs
  • about

Machine Bullshit: Characterizing the Emergent Disregard for Truth in LLMs

arxiv.org

4 points by delichon a day ago

delichon a day ago

  Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshit—and Chain-of-Thought reasoning just makes it worse!
https://x.com/kaiqu_liang/status/1943350770788937980