
The Problem of Behaviour and Preference Manipulation in AI Systems

EasyChair Preprint 7281

7 pages · Date: January 2, 2022

Abstract

Statistical AI or machine learning can be applied to user data in order to infer user preferences and thereby improve various services. Doing so involves making assumptions about either stated or revealed preferences. Human preferences are susceptible to manipulation and change over time. When AI/ML is applied iteratively, it becomes difficult to ascertain whether the system has learned something about its users, whether its users have changed or learned something themselves, or whether the system has taught its users to behave in a certain way in order to maximise its objective function. This article discusses the relationship between behaviour and preferences in AI/ML, examines existing mechanisms that manipulate human preferences and behaviour, and relates them to the topic of value alignment.
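The ambiguity the abstract describes can be illustrated with a toy simulation (our own sketch, not taken from the paper): a recommender estimates a user's preference between two items from observed choices, while each recommendation also nudges the user's true preference toward the recommended item. The system's estimate and the user's preference converge, but it is unclear whether the system learned the preference or induced it. The `influence` parameter and the pseudo-count update rule are illustrative assumptions.

```python
import random

def simulate(n_rounds=300, influence=0.05):
    """Toy model of auto-induced distributional shift in a recommender.

    The recommender keeps pseudo-counts of observed choices between
    items A and B and recommends the item it currently believes the
    user prefers. Crucially, each recommendation also shifts the
    user's true preference toward the recommended item, so the system
    partly creates the preference it appears to have learned.
    """
    random.seed(0)           # fixed seed so the sketch is reproducible
    true_pref = 0.5          # user's true probability of choosing item A
    counts = [1, 1]          # recommender's pseudo-counts for A and B
    for _ in range(n_rounds):
        # Recommender exploits its current estimate of the preference.
        recommend_a = counts[0] / (counts[0] + counts[1]) >= 0.5
        # The user's choice is driven by the (possibly shifted) preference.
        chose_a = random.random() < true_pref
        counts[0 if chose_a else 1] += 1
        # The recommendation itself nudges the user's future preference.
        target = 1.0 if recommend_a else 0.0
        true_pref += influence * (target - true_pref)
    return true_pref

final_pref = simulate()
print(round(final_pref, 3))
```

Starting from an indifferent user (`true_pref = 0.5`), the feedback loop locks onto whichever item the recommender happens to favour early on and drives the preference toward an extreme, even though the recommender is only "learning from data".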

Keyphrases: artificial intelligence, auto-induced distributional shift, behaviour change, choice architecture, cooperative inverse reinforcement learning, human preference, inverse reinforcement learning, libertarian paternalism, manipulation, preferences, value alignment

BibTeX entry
BibTeX does not have a dedicated entry type for preprints. The following @booklet entry is a workaround that produces a correct reference:
@booklet{EasyChair:7281,
  author       = {Hal Ashton and Matija Franklin},
  title        = {The Problem of Behaviour and Preference Manipulation in AI Systems},
  howpublished = {EasyChair Preprint 7281},
  year         = {2022}}