ChatGPT is memorizing questions from IQ tests, not analyzing them
An LLM can only find the closest example in its training data and use that example as a template for generating an answer.
How can this be fixed? Formalize and analyze the problem the way humans do: with equations (math), code (programs), predicates (logic), and so on.
Four People on a Rickety Bridge // original question
Question: Four people need to cross a rickety bridge at night. Unfortunately, they have one torch and the bridge is too dangerous to cross without a torch. The bridge can support only two people at a time. All the people don’t take the same time to cross the bridge. Time for each person: 1 min, 2 mins, 7 mins, and 10 mins. What is the shortest time needed for all four of them to cross the bridge?
Answer = 17 mins
Solution:
1 and 2 cross the bridge and move to the other side.
Now 2 comes back with the torch from the other side.
7 and 10 cross the bridge, and 2 stays on this side.
Now 1 comes back with the torch from the other side.
Finally, 1 and 2 cross the bridge and we are done.
Total time taken = 2 + 2 + 10 + 1 + 2 = 17 mins [source]
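The 17-minute answer can be checked by formalizing the puzzle as a search over states instead of recalling a known solution. The following is a minimal sketch, not a definitive implementation: the function name `shortest_crossing` and the state encoding (who is still on the starting side, plus where the torch is) are assumptions made for illustration. It runs Dijkstra's algorithm over all legal moves of one or two people.

```python
import heapq
from itertools import combinations, count


def shortest_crossing(times):
    """Minimum total time for everyone to cross (Dijkstra over puzzle states)."""
    everyone = frozenset(range(len(times)))
    # State: (people still on the starting side, is the torch on the starting side?)
    start = (everyone, True)
    best = {start: 0}
    tie = count()  # unique tie-breaker so the heap never compares states
    queue = [(0, next(tie), start)]
    while queue:
        elapsed, _, state = heapq.heappop(queue)
        near, torch_near = state
        if not near:
            return elapsed  # nobody is left on the starting side
        if elapsed > best[state]:
            continue  # stale queue entry
        torch_side = near if torch_near else everyone - near
        for size in (1, 2):  # the bridge holds one or two people
            for group in combinations(torch_side, size):
                moved = frozenset(group)
                new_near = near - moved if torch_near else near | moved
                new_state = (new_near, not torch_near)
                # A group walks at the pace of its slowest member.
                new_elapsed = elapsed + max(times[i] for i in group)
                if new_elapsed < best.get(new_state, float("inf")):
                    best[new_state] = new_elapsed
                    heapq.heappush(queue, (new_elapsed, next(tie), new_state))
    return None


print(shortest_crossing([1, 2, 7, 10]))  # 17
```

Because the search explores every legal move, changing the crossing times only changes the input list, not the reasoning.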
Now let's change some conditions:
Question: Four people need to cross a rickety bridge at night. They have only one flashlight and the bridge can only hold two people at a time. The four people walk at different speeds: one can cross the bridge in 1 minute, another in 2 minutes, the third in 5 minutes, and the slowest in 10 minutes. When two people cross the bridge together, they must go at the slower person’s pace. What is the longest time needed for all four of them to cross the bridge?
We changed "shortest" to "longest", but ChatGPT answers as if it were still the original question.
If we ask it to analyze the question, we get:
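Let's find the solution for the longest time now, assuming the same five crossings as before (with an unlimited number of crossings there would be no longest time). The times used here are those of the original question: 1, 2, 7, and 10.
Solution, written as (already crossed),(still waiting), starting from (),(1,2,7,10):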
1+10: (1,10),(2,7), 10 min
-10: (1),(2,7,10), 10 min
2+10: (1,2,10),(7), 10 min
-10: (1,2),(7,10), 10 min
7+10: (1,2,7,10),(), 10 min
Total time taken = 10 + 10 + 10 + 10 + 10 = 50 mins
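The 50-minute result can be confirmed by brute force. The sketch below rests on an assumption that the puzzle text does not state explicitly: every forward trip carries two people and every return trip carries one, which gives the same five crossings as the schedule above. Without such a limit the total could grow without bound. The function name `longest_crossing` is hypothetical.

```python
from itertools import combinations


def longest_crossing(times):
    """Maximum total time when pairs walk forward and one person walks back."""
    everyone = frozenset(range(len(times)))

    def explore(near, torch_near):
        if not near:
            return 0  # everyone has crossed
        if torch_near:
            # Forward trip: two people cross (or the last person alone).
            size = min(2, len(near))
            return max(
                max(times[i] for i in group) + explore(near - set(group), False)
                for group in combinations(near, size)
            )
        # Return trip: one person brings the torch back.
        far = everyone - near
        return max(times[i] + explore(near | {i}, True) for i in far)

    return explore(everyone, True)


print(longest_crossing([1, 2, 7, 10]))  # 50
print(longest_crossing([1, 2, 5, 10]))  # 50
```

Both sets of times give 50, because the slowest person can take part in all five crossings.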
ChatGPT can't figure this out, and prompt engineering doesn't help either…
Google Gemini fails too. Its answer:
The longest possible time it would take for all four people to cross the bridge is 22 minutes. Here’s the scenario that leads to this longest possible time:
The slowest person (10 minutes) and the fastest person (1 minute) cross together. This takes 10 minutes.
The fastest person returns alone with the flashlight. This takes another 1 minute.
The third fastest person (5 minutes) and the second fastest person (2 minutes) cross together. This takes 5 minutes.
The second fastest person returns alone with the flashlight. This takes another 2 minutes.
The slowest person and the fastest person cross together again. This takes another 10 minutes.
This scenario involves the slowest person crossing the bridge three times (twice with the flashlight and once without), resulting in the longest possible total time of 10 + 1 + 5 + 2 + 10 = 28 minutes.
However, there’s a slight catch: the bridge can only hold two people at a time, and carrying the flashlight doesn’t count as a person. Therefore, the slowest person cannot cross the bridge alone with the flashlight and then return. This means the longest possible time is actually 22 minutes, achieved by simply modifying step 4:
The second fastest person and the fastest person cross together. This still takes 2 minutes.
This way, the slowest person only crosses the bridge twice (once with the flashlight and once without), resulting in the final answer of 22 minutes.
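A formal model also makes it possible to check a proposed schedule mechanically. The sketch below is an illustration under stated assumptions: the helper `simulate` is hypothetical, and people are identified by their (distinct) crossing times. It replays a list of crossings and rejects any move that uses someone who is not on the torch's side.

```python
def simulate(times, moves):
    """Replay crossings; return the total time or raise ValueError on an illegal move."""
    near, far = set(times), set()  # people are identified by their crossing time
    torch_near = True
    total = 0
    for step, group in enumerate(moves, start=1):
        src, dst = (near, far) if torch_near else (far, near)
        if not 1 <= len(group) <= 2:
            raise ValueError(f"step {step}: the bridge holds one or two people")
        missing = set(group) - src
        if missing:
            raise ValueError(f"step {step}: {sorted(missing)} not on the torch side")
        src.difference_update(group)
        dst.update(group)
        total += max(group)  # a group walks at the slowest member's pace
        torch_near = not torch_near
    if near:
        raise ValueError(f"{sorted(near)} never crossed")
    return total


# The 17-minute solution of the original question replays without errors.
print(simulate([1, 2, 7, 10], [(1, 2), (2,), (7, 10), (1,), (1, 2)]))  # 17

# The five steps of the answer quoted above fail at step 5:
# person 10 has been on the far side since step 1.
try:
    simulate([1, 2, 5, 10], [(10, 1), (1,), (5, 2), (2,), (10, 1)])
except ValueError as error:
    print(error)  # step 5: [10] not on the torch side
```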