Tuesday 20 June 2023

reward hacking (10. 823)

A step below paraphrasing, we are introduced to the term and practise of rogeting—that is, the methods that catch-penny academia uses to spin articles and lure researchers and advertisers to pay-walled content with the promise of good sources, only to be sorely disappointed in the obvious spamdexing. Select tortured phrases, usually ones for no other tenable substitute exists, would be systematically replaced with some stock synonyms though would evade simple plagiarism-detectors posing as original content. Large language models and generative chat pose the possibility of saturating the internet with such content, making the screening process even more fraught and maybe less transparently fake, presenting a perfect example of Goodhart’s Law, in its corollary: risk models collapse on themselves when used for regulation or policing, or that in the gauge of citation impact, that when a feature becomes an indicator, its liable to be gamed.