6 Comments
Balint

I'm definitely not an expert, but I'm not sure the conclusion is quite right here - my sense is that LLMs will generally give you the median position of news sources and educated elites (like you suggest) — UNLESS it thinks you might believe something else, in which case it tries to agree with you. This paper from Anthropic shows this kind of 'sycophancy' clearly (a result of RLHF trying to maximise human preference scores): https://www-cdn.anthropic.com/e4f69aacd8c0905030172bc6eb480c252ea7d6ad/model-written-evals.pdf#page=28 (see section 4 especially). Agree that from your prompts alone it is surprising how vehement its responses were, though! Perhaps the initial prompt/slightly leading follow-ups were enough, or perhaps it has remembered other info about you from other chats?

Balint

Looks like this problem of sycophancy took off in more extreme directions with (undisclosed) updates more recently, which they've since reversed (why don't they ever report updates??). I'm really worried about alignment and the future, given the degree of susceptibility some people have to AI-generated suggestions. There's a slightly mad write-up of some of these cases from Rolling Stone: https://www.rollingstone.com/culture/culture-features/ai-spiritual-delusions-destroying-human-relationships-1235330175/.

PS. Just to be clear, I'm not saying I disagree with any of ChatGPT's points in your post above, just with the interpretation that this is what the model's reproducible viewpoint actually is...

Hari Sood

Love this additional insight Balint, thank you! It's really helpful to consider the context of sycophancy here, especially in terms of reproducibility and whether answers are standardised across the board, rather than tailored to specific people.

Ultimately I haven't really used ChatGPT all that much, and the times I have, it's been primarily for fun (e.g. 'create this poem about my friend') rather than anything serious or political, and I definitely hadn't discussed Palestine before. There may be some hidden pattern work that I'm not noticing immediately myself; I guess this can become reductive and blurry quite quickly.

I guess this then leads to whether the initial prompts were enough to send it down this route, which (if this is the primary reason we ended up with these responses) raises an interesting question. The prompts were about liberty and freedom, which I would argue are fundamental human endeavours. If, prompted to think through the lens of liberation, these responses arise, is that enough of an argument to say 'cut out all the other noise: when it comes to what really matters on a human level, there is only one way to think about what is happening in Palestine'? Yes, there are endlessly two sides on many metrics, but if you want to focus on freedom for people, is there only one?

Balint

Well said Hari!

Varut Subchareon

Your first thought was also mine exactly: that ChatGPT is basically echoing what most of us think and feel. My immediate concern was also that they would somehow alter it. Very interesting experiment though. It really got my head spinning, thinking about where we are in history. Sometimes I find it hard to believe that we live in a time where we have technology like this.

Hari Sood

Indeed Varut - I have taken screenshots/screen recordings so it is documented in case it changes!
