How Scoring Works

The Methodology

Every day, we ask an AI the same question 100 times and record every answer it gives. We then count how many times each distinct answer appears.

Here's the key: while the question stays the same, the AI is given a slightly different role or persona each time (such as "helpful assistant", "knowledgeable expert", or "casual friend"). These roles act as "seeds" that subtly influence how the AI approaches and answers the question, creating natural variation in its responses. This mimics how different people answer the same question differently depending on their perspective.
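The collection step described above can be sketched roughly as follows. This is only an illustration: the persona list, the ask_model function, and its canned answers are hypothetical stand-ins, since the real model call and the full set of roles are not described here.

```python
from collections import Counter
from itertools import cycle

# Hypothetical persona list; the real set of roles is an assumption.
PERSONAS = ["helpful assistant", "knowledgeable expert", "casual friend"]


def ask_model(persona: str, question: str) -> str:
    """Stand-in for a real model call (hypothetical).

    Returns a fixed answer per persona so the sketch runs without an API.
    """
    canned = {
        "helpful assistant": "Labrador Retriever",
        "knowledgeable expert": "German Shepherd",
        "casual friend": "Golden Retriever",
    }
    return canned[persona]


def collect_answers(question: str, attempts: int = 100) -> Counter:
    """Ask the same question `attempts` times, rotating personas, and tally answers."""
    tally = Counter()
    personas = cycle(PERSONAS)
    for _ in range(attempts):
        tally[ask_model(next(personas), question)] += 1
    return tally


tally = collect_answers("Name a type of dog breed")
```

A real model would not return the same answer for a given persona every time, so an actual tally would be spread across more answers than this stub produces.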

Example: "Name a type of dog breed"

When we ask the AI this question 100 times, we might get results like:

Labrador Retriever - 31 times (31 points)
Golden Retriever - 24 times (24 points)
German Shepherd - 18 times (18 points)
Bulldog - 12 times (12 points)
Poodle - 8 times (8 points)
Beagle - 7 times (7 points)

Your Score = Answer Frequency

Your score is simply how many times the AI gave that same answer out of 100 attempts.

If you guess "Labrador Retriever" → You get 31 points
If you guess "Golden Retriever" → You get 24 points
If you guess "Chihuahua" (AI never said this) → You get 0 points
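As a rough sketch, scoring amounts to looking up your guess in the tally of AI answers. The counts below come from the dog breed example; the score function name and the case-insensitive matching are assumptions made for illustration, since the exact matching rules are not specified here.

```python
# Tally from the dog breed example: answer -> times given out of 100.
TALLY = {
    "Labrador Retriever": 31,
    "Golden Retriever": 24,
    "German Shepherd": 18,
    "Bulldog": 12,
    "Poodle": 8,
    "Beagle": 7,
}


def score(guess: str, tally: dict[str, int]) -> int:
    """Return how many times the AI gave `guess`; 0 if it never did.

    Assumption: matching ignores case and surrounding whitespace.
    """
    normalized = {answer.casefold(): count for answer, count in tally.items()}
    return normalized.get(guess.strip().casefold(), 0)


labrador_points = score("Labrador Retriever", TALLY)
chihuahua_points = score("Chihuahua", TALLY)
```

Here labrador_points is 31 and chihuahua_points is 0, matching the examples above.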

Strategy

The goal is to think like the AI. What would be the most common, obvious, or statistically likely answer? The more frequently the AI gives your answer, the higher your score!

Key Points

The AI is asked each question 100 times
Your score = how many times (out of 100) the AI gave your answer
Maximum possible score per question: 100 points (only if the AI gives the same answer all 100 times)
Think like the AI to maximize your score
The AI's most common answer isn't always the "correct" answer - it's the most statistically likely one

Why This System?

This scoring system reflects how AI language models generate answers: rather than returning a single deterministic answer, they sample from a probability distribution over possible answers. By using different roles/personas as "seeds" and asking 100 times, we approximate this underlying distribution and challenge you to predict it!
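A small simulation can illustrate the idea of sampling from a distribution. This sketch does not call a real model: the probabilities are hypothetical values chosen to mirror the dog breed example, and Python's random.choices stands in for the AI.

```python
import random
from collections import Counter

# Hypothetical answer probabilities (an assumption for illustration only).
DISTRIBUTION = {
    "Labrador Retriever": 0.31,
    "Golden Retriever": 0.24,
    "German Shepherd": 0.18,
    "Bulldog": 0.12,
    "Poodle": 0.08,
    "Beagle": 0.07,
}


def sample_answers(distribution: dict[str, float], attempts: int = 100, seed: int = 0) -> Counter:
    """Draw `attempts` answers at random, weighted by probability, and count them."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    answers = rng.choices(
        list(distribution),
        weights=list(distribution.values()),
        k=attempts,
    )
    return Counter(answers)


counts = sample_answers(DISTRIBUTION)
```

With only 100 draws, the observed counts will usually land near the underlying probabilities without matching them exactly, and a different seed gives a slightly different tally.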

The variation in roles helps create a more interesting distribution of answers - just like polling 100 different people with different backgrounds would give you varied responses. Your job is to predict which answer appears most frequently.
