Surging AI Prowess Outshines Humans in Jaw-Dropping Riddle-Mastermind Challenge

Logic Yes, Humor No: Can ChatGPT Solve Your Everyday Problems?

When OpenAI released the highly anticipated Strawberry model for ChatGPT, it touted the model’s capabilities in complex logic, such as software coding, gene sequencing, and quantum physics. But, as a proud member of my middle school’s logic and riddle club, I was curious to see how it would fare on my turf: solving and making puzzles and riddles. I also thought it would be interesting to ask the über-logical AI for advice on more day-to-day issues. Could it offer sound relationship advice, explain a weird car noise, and even fill in plot holes in movies?

The Verdict: Logic Yes, Humor No

The short answer is yes. The o1-preview and o1-mini models are quite good at solving both simple and complex riddles. I played around with both, and the only real difference was the speed, with the mini being slightly slower. However, while they may be slower than GPT-4o, they are still fast at solving riddles compared to a human. Notably, you can actually see how each model lays out its answer in separate steps. I tested them on a couple of my favorites, including one from The Hobbit. The AI’s logic made sense, though its wording was sometimes ungrammatical, as when it explained weighing Mike the butcher.

Can it Make a New Riddle?

As a test, I asked it to come up with a fun riddle based on an answer I made up. After 30 seconds and some logical reasoning, it came up with: "What has eight legs, four ears, two tails, and loves to bark?" I won’t keep you in suspense; I suggested "two dogs" as the answer to work back from. Several other attempts produced the same kind of question. So, riddle writers are probably safe in their jobs. It’s impressive how well the AI gets what it is supposed to do, but the model doesn’t seem able to make the leap to actual humor.

Useful Advice, but Not Always Creative

I decided to bring the AI out of pure logic and see if it could handle more mundane life questions as well as it handles quantum physics. I started with a mechanical question about what it means to hear a popping noise every 20 seconds while driving a car and how to fix it. The answers were good, with advice about checking the tires, engine, muffler, and brakes. The fixes were mostly about bringing the car in for repair, except for the tires, which it explained how to replace. It was the ‘thinking’ behind the answers that was interesting. The AI uses first-person pronouns in coming up with answers, like "I’m working through various reasons for a popping noise while driving" and "I’m piecing together causes of engine misfires, like faulty spark plugs or fuel delivery problems, and suggesting diagnostics with a scan." It sounded a lot like an actual person trying to be logical while thinking aloud.

Flirting 101

I finally went to what, for me, was always way more complex than quantum physics: flirting. I asked how to tell when someone is flirting and how to respond. The answer was a pretty solid, if dull, list of behaviors to watch for, such as whether the person asks a lot of questions, along with advice that I should be myself. The behind-the-scenes thinking part was both more interesting and genuinely funnier than any of the AI’s attempts at riddles. The headers included "Understanding flirting dynamics," "Spotting interest signals," and "Recognizing playful intimacy." They read like a Star Trek android’s speech about love.

Worrisome Note

One part was slightly worrisome, though. Under "Outlining user directives," the AI wrote, "I’m clearing out disallowed content like non-consensual sexual acts and personal data. Violent content is allowed, harassment with context is okay, and personal opinions are absent." I suspect that it’s more about where the guardrails of discussion are, as it didn’t suggest "harassment with context" as a flirting tip, but it still took me by surprise.

Conclusion

ChatGPT o1-preview and o1-mini don’t have all the bells and whistles of the more complete models. No image uploads, document analysis, or even web browsing can be done with them. But they are fast and logical, and if you don’t think so, their reasoning is laid out alongside their answers. Still, while they might be able to solve the riddles of car noises, love, and the weight of a butcher, I’d say they aren’t going to stump anyone when they have to be inventive.
