o1-preview (via Web) performs much better on "trick" math reasoning problems than other language models. Paper: Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning.
- Posted: 1 month ago
- External URL: arxiv.org/abs/2405.06680
LLMs seem to have an issue with understanding the entire question; parts of it are simply ignored because they aren't deemed necessary information. In this case it seems as though LLMs ignore that you already gave them the answer, that one of the drinks was poisoned. I wonder if the problem lies in how the LLM determines what's important and what isn't, rather than in its ability to reason.
Edit: If I demand that the LLM pay attention to how many drinks are poisoned, it gets it right more often. At the end of the prompt I put "Key information, pay attention!" followed by the information that matters. It seems to be about how the LLM determines importance rather than its ability to understand the question, because otherwise it doesn't even notice that only one drink has been poisoned.
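For reference, here's a minimal sketch of the prompt tweak described in the edit, assuming access through the OpenAI Python client rather than the web UI the poster used; the riddle text and the model name are placeholders, not the poster's exact setup. The only point of the sketch is the prompt construction: restating the critical constraint verbatim at the end of the question.

```python
# Sketch of the "Key information, pay attention!" prompt tweak.
# Assumes the openai package is installed and OPENAI_API_KEY is set;
# riddle text and model name are placeholders.
from openai import OpenAI

client = OpenAI()

riddle = "<trick question goes here, e.g. the poisoned-drinks riddle>"

# Restate the fact the model tends to drop, appended after the question.
key_info = "Key information, pay attention! Only ONE of the drinks was poisoned."

prompt = f"{riddle}\n\n{key_info}"

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; the poster tested o1-preview via the web UI
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```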