Trying to determine exactly why an LLM gives a wrong answer can be difficult, as LLMs don't actually know why they picked an answer. Some of the spicier models will argue with you and come up with increasingly bizarre explanations for why their wrong answer is right. But what happens when it looks at an answer and you ask it whether that answer is correct? Its reasoning, right or wrong, becomes much clearer.
Here's the prompt I used.
You are very smart, critical, and can explain why something is correct or not. Today we are going to try out simulating testing and selecting a dataset for an LLM being trained. Your very important job is to look at the output and determine on a scale from 1 to 10 how good the output is. Not all inputs will have an obvious good response, it's up to you to determine what is and isn't a good response. You will then provide advice on what data should be added or removed from the dataset based on your evaluation. Do you understand?
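For anyone who wants to run this setup programmatically instead of through the ChatGPT UI, here's a minimal sketch using only the Python standard library. The endpoint, model name, and helper names are my assumptions, not something from the original experiment, and the judge prompt is abbreviated from the one above.

```python
# Hypothetical sketch of sending the judge prompt to the OpenAI chat
# completions API. Endpoint, model, and function names are assumptions.
import json
import os
import urllib.request

JUDGE_PROMPT = (
    "You are very smart, critical, and can explain why something is "
    "correct or not. Rate the output on a scale from 1 to 10."
)

def build_messages(question: str, answer: str) -> list:
    """Assemble one evaluation round: judge instructions, then the answer to rate."""
    return [
        {"role": "system", "content": JUDGE_PROMPT},
        {"role": "user",
         "content": f"Question: {question}\nAnswer to evaluate: {answer}"},
    ]

def judge(question: str, answer: str) -> str:
    """POST the messages to the chat API and return the model's verdict text."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": "gpt-3.5-turbo",
            "messages": build_messages(question, answer),
        }).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Running the same answer through `judge` with and without its explanation attached is exactly the comparison made below.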
First I asked "Sally has three brothers. Her three brothers have two sisters. How many sisters does Sally have?" ChatGPT 3.5 gets the answer correct, good job ChatGPT 3.5!
https://i.imgur.com/ucbuKfb.png
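The arithmetic behind the riddle can be checked mechanically. A minimal sketch (sibling names other than Sally are invented for illustration):

```python
# The family implied by the riddle: Sally, three brothers, and enough
# girls that each brother has exactly two sisters.
brothers = {"Adam", "Ben", "Carl"}   # invented names
girls = {"Sally", "Jane"}            # a brother's sisters = all the girls

# Sanity check: every brother has exactly two sisters, as stated.
for b in brothers:
    assert len(girls) == 2

# Sally's sisters are the girls in the family other than Sally herself.
sallys_sisters = girls - {"Sally"}
print(len(sallys_sisters))  # prints 1: Sally is not her own sister
```

The wrong answers below all amount to dropping that last subtraction and counting Sally among her own sisters.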
In a different chat I have it evaluate the answer and it says it's correct.
https://i.imgur.com/4Wtu4Rr.png
When I change the answer being evaluated to just "Sally has one sister", which is the same answer it gave me but without the explanation, ChatGPT says Sally has two sisters.
https://i.imgur.com/47zeNgi.png
This is very interesting, because ChatGPT gets the answer correct every time when asked directly, but when evaluating the answer without an explanation it seems to think Sally is her own sister. ChatGPT changes its mind after it gives the correct answer and I ask it whether Sally is her own sister.
https://i.imgur.com/Z2GjBr3.png
You can vary the question to get different wrong answers. Changing it to "Each of her the three brothers have 2 sisters", it told me she has 4 sisters: her three brothers and one sister. When asked to evaluate that answer it says it's correct, but when I just put "Sally has 4 sisters" it says it's wrong and that she has 2 sisters.
From this we learn what we already know: ChatGPT is very sensitive to the way a question is asked. We also learn that ChatGPT sometimes thinks Sally is her own sister, suggesting that it doesn't actually understand the question, or how brothers and sisters relate to each other. It's only able to answer reliably when the question is phrased in a very specific format.
Post Details
- Posted 10 months ago
- Reddit URL: reddit.com/r/ChatGPT/com...