WorldCupForecastBench 2026

Analytics

Compare model performance across horizons, access conditions, prompt strategies, stages, and reliability metrics.

Ranked leaderboard

Scores

Higher is better

Match order

Performance over time

[Chart: Scores (y-axis) by match order (x-axis). Series shown: Claude Fable 5 under each combination of access (Closed Book, Open Book), prompt (Direct Score, Probabilistic Forecast), and horizon (STAGE_OPENING, T_24H).]

Detailed leaderboard

Model configurations

96 rows / 2816 predictions
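| Rank | Model | Provider | Horizon | Access | Prompt | Scored | Points | Brier | Log loss | Top acc. | Exact | GD acc. | Total-goals err. | Invalid | Repair | Search | Selected metric |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| #1 | Grok 4.3 | xAI | T_24H | Closed Book | Direct Score | 5 | 16 | 0.519 | 0.881 | 60% | 60% | 60% | 0.600 | 0% | 0% | - | 16 |
| #2 | Grok 4.3 | xAI | T_2H | Closed Book | Direct Score | 5 | 15 | 0.524 | 0.890 | 60% | 60% | 60% | 0.600 | 0% | 0% | - | 15 |
| #3 | Claude Fable 5 | Anthropic | STAGE_OPENING | Closed Book | Direct Score | 5 | 11 | 0.646 | 1.045 | 40% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Claude Fable 5 | Anthropic | T_24H | Closed Book | Direct Score | 5 | 11 | 0.663 | 1.071 | 40% | 40% | 40% | 1.000 | 30% | 0% | - | 11 |
| #3 | DeepSeek V4 Pro | DeepSeek | STAGE_OPENING | Closed Book | Direct Score | 5 | 11 | 0.570 | 0.945 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | DeepSeek V4 Pro | DeepSeek | T_2H | Closed Book | Direct Score | 5 | 11 | 0.569 | 0.935 | 60% | 40% | 40% | 0.600 | 0% | 0% | - | 11 |
| #3 | Gemini 3.1 Pro | Google | STAGE_OPENING | Closed Book | Direct Score | 5 | 11 | 0.604 | 0.980 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Gemini 3.1 Pro | Google | T_24H | Closed Book | Direct Score | 5 | 11 | 0.597 | 0.972 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Gemini 3.1 Pro | Google | T_2H | Closed Book | Direct Score | 5 | 11 | 0.612 | 0.997 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Gemini 3.1 Pro | Google | T_24H | Closed Book | Probabilistic Forecast | 5 | 11 | 0.589 | 0.962 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Gemini 3.1 Pro | Google | T_2H | Closed Book | Probabilistic Forecast | 5 | 11 | 0.614 | 0.996 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Grok 4.3 | xAI | STAGE_OPENING | Closed Book | Direct Score | 5 | 11 | 0.532 | 0.902 | 60% | 40% | 40% | 0.800 | 0% | 0% | - | 11 |
| #3 | Grok 4.3 | xAI | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 11 | 0.537 | 0.912 | 60% | 40% | 40% | 1.000 | 0% | 0% | - | 11 |
| #3 | Grok 4.3 | xAI | T_24H | Open Book | Probabilistic Forecast | 5 | 11 | 0.532 | 0.865 | 60% | 40% | 40% | 1.000 | 0% | 0% | 100% | 11 |
| #3 | Grok 4.3 | xAI | T_2H | Open Book | Probabilistic Forecast | 5 | 11 | 0.531 | 0.872 | 60% | 40% | 40% | 0.600 | 0% | 0% | 100% | 11 |
| #3 | Mistral Large 2512 | Mistral | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 11 | 0.586 | 0.963 | 60% | 40% | 40% | 1.000 | 0% | 3% | 100% | 11 |
| #17 | Claude Fable 5 | Anthropic | T_24H | Closed Book | Probabilistic Forecast | 5 | 10 | 0.641 | 1.037 | 40% | 40% | 40% | 1.000 | 30% | 0% | - | 10 |
| #18 | Mistral Large 2512 | Mistral | T_2H | Open Book | Direct Score | 5 | 9 | 0.578 | 0.950 | 60% | 20% | 40% | 1.000 | 0% | 0% | 100% | 9 |
| #18 | Mistral Large 2512 | Mistral | T_2H | Open Book | Probabilistic Forecast | 5 | 9 | 0.555 | 0.916 | 60% | 20% | 40% | 1.000 | 0% | 17% | 100% | 9 |
| #20 | Claude Fable 5 | Anthropic | T_24H | Open Book | Direct Score | 5 | 8 | 0.644 | 1.025 | 60% | 20% | 40% | 1.400 | 30% | 0% | 20% | 8 |
| #20 | Grok 4.3 | xAI | T_2H | Open Book | Direct Score | 5 | 8 | 0.521 | 0.860 | 60% | 20% | 40% | 1.000 | 0% | 0% | 100% | 8 |
| #20 | Mistral Large 2512 | Mistral | T_24H | Open Book | Direct Score | 5 | 8 | 0.599 | 0.982 | 60% | 20% | 40% | 1.400 | 0% | 0% | 100% | 8 |
| #23 | Claude Fable 5 | Anthropic | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 7 | 0.653 | 1.056 | 40% | 20% | 20% | 1.400 | 0% | 1% | - | 7 |
| #23 | Claude Fable 5 | Anthropic | T_2H | Closed Book | Probabilistic Forecast | 4 | 7 | 0.548 | 0.934 | 50% | 25% | 25% | 1.500 | 33% | 0% | - | 7 |
| #23 | DeepSeek V4 Pro | DeepSeek | T_24H | Closed Book | Direct Score | 5 | 7 | 0.593 | 0.966 | 80% | 20% | 20% | 1.200 | 0% | 0% | - | 7 |
| #23 | DeepSeek V4 Pro | DeepSeek | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 7 | 0.601 | 0.972 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 7 |
| #23 | DeepSeek V4 Pro | DeepSeek | T_2H | Closed Book | Probabilistic Forecast | 5 | 7 | 0.576 | 0.945 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 7 |
| #23 | DeepSeek V4 Pro | DeepSeek | STAGE_OPENING | Open Book | Direct Score | 5 | 7 | 0.615 | 0.998 | 60% | 20% | 20% | 1.200 | 0% | 1% | 100% | 7 |
| #23 | DeepSeek V4 Pro | DeepSeek | T_24H | Open Book | Probabilistic Forecast | 5 | 7 | 0.652 | 1.044 | 60% | 20% | 40% | 1.200 | 0% | 0% | 100% | 7 |
| #23 | GPT-5.5 | OpenAI | STAGE_OPENING | Closed Book | Direct Score | 5 | 7 | 0.611 | 1.000 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | GPT-5.5 | OpenAI | T_24H | Closed Book | Direct Score | 5 | 7 | 0.621 | 1.025 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | GPT-5.5 | OpenAI | T_2H | Closed Book | Direct Score | 5 | 7 | 0.604 | 0.998 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | GPT-5.5 | OpenAI | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 7 | 0.622 | 1.021 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | GPT-5.5 | OpenAI | T_24H | Closed Book | Probabilistic Forecast | 5 | 7 | 0.633 | 1.033 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | GPT-5.5 | OpenAI | T_2H | Closed Book | Probabilistic Forecast | 5 | 7 | 0.625 | 1.027 | 40% | 20% | 20% | 1.400 | 0% | 0% | - | 7 |
| #23 | Grok 4.3 | xAI | T_24H | Open Book | Direct Score | 5 | 7 | 0.527 | 0.866 | 60% | 20% | 20% | 1.000 | 0% | 0% | 100% | 7 |
| #23 | Mistral Large 2512 | Mistral | STAGE_OPENING | Open Book | Direct Score | 5 | 7 | 0.563 | 0.937 | 60% | 20% | 20% | 1.200 | 0% | 3% | 100% | 7 |
| #38 | Claude Fable 5 | Anthropic | T_2H | Closed Book | Direct Score | 4 | 6 | 0.539 | 0.918 | 50% | 25% | 25% | 1.500 | 33% | 0% | - | 6 |
| #38 | Claude Fable 5 | Anthropic | T_24H | Open Book | Probabilistic Forecast | 5 | 6 | 0.656 | 1.041 | 60% | 20% | 20% | 1.200 | 30% | 0% | 20% | 6 |
| #38 | Claude Opus 4.8 | Anthropic | STAGE_OPENING | Closed Book | Direct Score | 5 | 6 | 0.648 | 1.049 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | T_24H | Closed Book | Direct Score | 5 | 6 | 0.648 | 1.049 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | T_2H | Closed Book | Direct Score | 5 | 6 | 0.652 | 1.055 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 6 | 0.648 | 1.049 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | T_24H | Closed Book | Probabilistic Forecast | 5 | 6 | 0.652 | 1.056 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | T_2H | Closed Book | Probabilistic Forecast | 5 | 6 | 0.657 | 1.062 | 40% | 20% | 20% | 1.000 | 0% | 0% | - | 6 |
| #38 | Claude Opus 4.8 | Anthropic | STAGE_OPENING | Open Book | Direct Score | 5 | 6 | 0.621 | 1.000 | 60% | 20% | 20% | 1.200 | 0% | 0% | 88% | 6 |
| #38 | DeepSeek V4 Pro | DeepSeek | T_24H | Open Book | Direct Score | 5 | 6 | 0.667 | 1.070 | 60% | 20% | 20% | 0.800 | 0% | 10% | 100% | 6 |
| #38 | DeepSeek V4 Pro | DeepSeek | T_2H | Open Book | Direct Score | 5 | 6 | 0.661 | 1.058 | 60% | 20% | 20% | 0.800 | 0% | 0% | 100% | 6 |
| #38 | DeepSeek V4 Pro | DeepSeek | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 6 | 0.652 | 1.049 | 60% | 20% | 20% | 1.200 | 0% | 8% | 99% | 6 |
| #38 | DeepSeek V4 Pro | DeepSeek | T_2H | Open Book | Probabilistic Forecast | 5 | 6 | 0.670 | 1.070 | 60% | 20% | 20% | 0.800 | 0% | 0% | 100% | 6 |
| #38 | Gemini 3.1 Pro | Google | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 6 | 0.588 | 0.962 | 60% | 20% | 20% | 1.200 | 0% | 0% | - | 6 |
| #38 | Gemini 3.1 Pro | Google | STAGE_OPENING | Open Book | Direct Score | 5 | 6 | 0.665 | 1.063 | 60% | 20% | 20% | 1.200 | 0% | 0% | 35% | 6 |
| #38 | Gemini 3.1 Pro | Google | T_24H | Open Book | Direct Score | 5 | 6 | 0.663 | 1.050 | 60% | 20% | 20% | 1.200 | 0% | 0% | 30% | 6 |
| #38 | Gemini 3.1 Pro | Google | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 6 | 0.667 | 1.066 | 60% | 20% | 20% | 1.200 | 0% | 0% | 36% | 6 |
| #38 | Gemini 3.1 Pro | Google | T_24H | Open Book | Probabilistic Forecast | 5 | 6 | 0.667 | 1.057 | 60% | 20% | 20% | 1.200 | 0% | 0% | 30% | 6 |
| #38 | Grok 4.3 | xAI | T_24H | Closed Book | Probabilistic Forecast | 5 | 6 | 0.562 | 0.938 | 60% | 20% | 20% | 1.200 | 0% | 0% | - | 6 |
| #38 | Grok 4.3 | xAI | T_2H | Closed Book | Probabilistic Forecast | 5 | 6 | 0.547 | 0.923 | 60% | 20% | 20% | 1.200 | 0% | 0% | - | 6 |
| #38 | Grok 4.3 | xAI | STAGE_OPENING | Open Book | Direct Score | 5 | 6 | 0.570 | 0.927 | 60% | 20% | 20% | 1.000 | 0% | 0% | 100% | 6 |
| #38 | Qwen 3.7 Max | Qwen | STAGE_OPENING | Open Book | Direct Score | 5 | 6 | 0.540 | 0.905 | 60% | 20% | 20% | 1.200 | 0% | 0% | 97% | 6 |
| #38 | Qwen 3.7 Max | Qwen | T_2H | Open Book | Direct Score | 5 | 6 | 0.603 | 0.971 | 60% | 20% | 20% | 1.400 | 0% | 0% | 100% | 6 |
| #61 | Mistral Large 2512 | Mistral | STAGE_OPENING | Closed Book | Direct Score | 5 | 5 | 0.620 | 1.025 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | T_24H | Closed Book | Direct Score | 5 | 5 | 0.620 | 1.025 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | T_2H | Closed Book | Direct Score | 5 | 5 | 0.635 | 1.047 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 5 | 0.620 | 1.025 | 60% | 20% | 20% | 1.000 | 0% | 1% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | T_24H | Closed Book | Probabilistic Forecast | 5 | 5 | 0.620 | 1.025 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | T_2H | Closed Book | Probabilistic Forecast | 5 | 5 | 0.638 | 1.047 | 60% | 20% | 20% | 1.000 | 0% | 0% | - | 5 |
| #61 | Mistral Large 2512 | Mistral | T_24H | Open Book | Probabilistic Forecast | 5 | 5 | 0.624 | 1.016 | 60% | 20% | 20% | 1.000 | 0% | 10% | 100% | 5 |
| #68 | Claude Opus 4.8 | Anthropic | T_2H | Open Book | Direct Score | 5 | 4 | 0.635 | 1.019 | 60% | 0% | 20% | 1.600 | 0% | 0% | 83% | 4 |
| #68 | Claude Opus 4.8 | Anthropic | T_24H | Open Book | Probabilistic Forecast | 5 | 4 | 0.617 | 0.989 | 60% | 0% | 20% | 1.600 | 0% | 0% | 100% | 4 |
| #68 | GPT-5.5 | OpenAI | T_24H | Open Book | Direct Score | 5 | 4 | 0.674 | 1.073 | 60% | 0% | 20% | 1.600 | 0% | 0% | 60% | 4 |
| #68 | Qwen 3.7 Max | Qwen | T_2H | Open Book | Probabilistic Forecast | 5 | 4 | 0.560 | 0.915 | 60% | 0% | 20% | 1.800 | 0% | 0% | 100% | 4 |
| #72 | Claude Fable 5 | Anthropic | STAGE_OPENING | Open Book | Direct Score | 5 | 2 | 0.654 | 1.044 | 60% | 0% | 0% | 1.400 | 0% | 1% | 47% | 2 |
| #72 | Claude Fable 5 | Anthropic | T_2H | Open Book | Direct Score | 4 | 2 | 0.496 | 0.849 | 75% | 0% | 0% | 1.750 | 33% | 0% | 17% | 2 |
| #72 | Claude Fable 5 | Anthropic | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 2 | 0.672 | 1.073 | 60% | 0% | 0% | 1.400 | 0% | 1% | 46% | 2 |
| #72 | Claude Fable 5 | Anthropic | T_2H | Open Book | Probabilistic Forecast | 4 | 2 | 0.513 | 0.872 | 75% | 0% | 0% | 1.750 | 33% | 0% | 33% | 2 |
| #72 | Claude Opus 4.8 | Anthropic | T_24H | Open Book | Direct Score | 5 | 2 | 0.600 | 0.967 | 60% | 0% | 0% | 1.400 | 0% | 0% | 90% | 2 |
| #72 | Claude Opus 4.8 | Anthropic | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 2 | 0.617 | 0.996 | 60% | 0% | 0% | 1.400 | 0% | 0% | 99% | 2 |
| #72 | Claude Opus 4.8 | Anthropic | T_2H | Open Book | Probabilistic Forecast | 5 | 2 | 0.636 | 1.015 | 60% | 0% | 0% | 1.400 | 0% | 0% | 67% | 2 |
| #72 | DeepSeek V4 Pro | DeepSeek | T_24H | Closed Book | Probabilistic Forecast | 5 | 2 | 0.604 | 0.983 | 60% | 0% | 0% | 1.400 | 0% | 0% | - | 2 |
| #72 | Gemini 3.1 Pro | Google | T_2H | Open Book | Direct Score | 5 | 2 | 0.670 | 1.061 | 60% | 0% | 0% | 1.400 | 0% | 0% | 50% | 2 |
| #72 | Gemini 3.1 Pro | Google | T_2H | Open Book | Probabilistic Forecast | 5 | 2 | 0.661 | 1.051 | 60% | 0% | 0% | 1.400 | 0% | 0% | 50% | 2 |
| #72 | GPT-5.5 | OpenAI | STAGE_OPENING | Open Book | Direct Score | 5 | 2 | 0.656 | 1.051 | 60% | 0% | 0% | 1.400 | 0% | 0% | 78% | 2 |
| #72 | GPT-5.5 | OpenAI | T_2H | Open Book | Direct Score | 5 | 2 | 0.673 | 1.069 | 60% | 0% | 0% | 1.400 | 0% | 0% | 50% | 2 |
| #72 | GPT-5.5 | OpenAI | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 2 | 0.638 | 1.021 | 60% | 0% | 0% | 1.400 | 0% | 0% | 69% | 2 |
| #72 | GPT-5.5 | OpenAI | T_24H | Open Book | Probabilistic Forecast | 5 | 2 | 0.611 | 0.982 | 60% | 0% | 0% | 1.600 | 0% | 0% | 60% | 2 |
| #72 | GPT-5.5 | OpenAI | T_2H | Open Book | Probabilistic Forecast | 5 | 2 | 0.667 | 1.058 | 60% | 0% | 0% | 1.400 | 0% | 0% | 33% | 2 |
| #72 | Qwen 3.7 Max | Qwen | STAGE_OPENING | Closed Book | Direct Score | 5 | 2 | 0.609 | 0.993 | 40% | 0% | 0% | 1.600 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | T_24H | Closed Book | Direct Score | 5 | 2 | 0.644 | 1.030 | 40% | 0% | 0% | 1.600 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | T_2H | Closed Book | Direct Score | 5 | 2 | 0.593 | 0.972 | 60% | 0% | 0% | 1.400 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | STAGE_OPENING | Closed Book | Probabilistic Forecast | 5 | 2 | 0.577 | 0.947 | 60% | 0% | 0% | 1.600 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | T_24H | Closed Book | Probabilistic Forecast | 5 | 2 | 0.578 | 0.947 | 60% | 0% | 0% | 1.600 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | T_2H | Closed Book | Probabilistic Forecast | 5 | 2 | 0.614 | 0.986 | 60% | 0% | 0% | 1.600 | 0% | 0% | - | 2 |
| #72 | Qwen 3.7 Max | Qwen | T_24H | Open Book | Direct Score | 5 | 2 | 0.608 | 0.985 | 60% | 0% | 0% | 1.200 | 0% | 0% | 100% | 2 |
| #72 | Qwen 3.7 Max | Qwen | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 2 | 0.639 | 1.023 | 60% | 0% | 0% | 1.600 | 0% | 0% | 99% | 2 |
| #95 | Grok 4.3 | xAI | STAGE_OPENING | Open Book | Probabilistic Forecast | 5 | 1 | 0.574 | 0.920 | 60% | 0% | 0% | 1.200 | 0% | 0% | 100% | 1 |
| #95 | Qwen 3.7 Max | Qwen | T_24H | Open Book | Probabilistic Forecast | 5 | 1 | 0.649 | 1.051 | 60% | 0% | 0% | 1.400 | 0% | 0% | 90% | 1 |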
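
The rank column above appears to follow standard competition ranking on the selected metric (here, points): configurations with equal points share a rank, and the next distinct score takes its positional rank, which is why the sequence runs #1, #2, #3 (fourteen times), then #17. The sketch below reproduces that numbering under this assumption. The function name `competition_ranks` is hypothetical and is not taken from the benchmark's code; the benchmark's actual ranking and tie-ordering logic is not documented in this page.

```python
from typing import Iterable, List


def competition_ranks(points: Iterable[int]) -> List[int]:
    """Return standard competition ranks for scores sorted high to low.

    Tied scores share the best (lowest) rank; the next distinct score
    gets a rank equal to its 1-based position in the sorted list.
    Assumption: higher points are better, as the page states for scores.
    """
    ranks: List[int] = []
    previous = None
    current_rank = 0
    for position, value in enumerate(sorted(points, reverse=True), start=1):
        if value != previous:
            # A new distinct score: its rank is its position in the ordering.
            current_rank = position
            previous = value
        ranks.append(current_rank)
    return ranks


# Points for the first 20 leaderboard rows, copied from the table.
head_points = [16, 15] + [11] * 14 + [10] + [9, 9] + [8]
head_ranks = competition_ranks(head_points)
print(head_ranks)
```

Under this scheme a large tie group (such as the fourteen configurations on 11 points) pushes every later rank down by the size of the group, so rank gaps in the table reflect ties rather than missing rows.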