Superfacts Hallucination Benchmark
Superfacts is the first claim-level hallucination benchmark for top AI models. Model outputs are scored using Google DeepMind's FACTS benchmark and Superficial's auditing model, Pro 1. When FACTS marks a model's response "inaccurate", we apply a single-pass ("one-shot") enhancement based on Superficial's audit, then independently re-score the revised response with FACTS to measure the factual-accuracy gains from Superficial.
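The evaluation loop can be summarized as: grade the raw response with FACTS; if it is graded "inaccurate", run a Pro 1 audit, apply a single one-shot revision, and re-grade. The sketch below illustrates this flow. Note that `facts_grade`, `pro1_audit`, and `apply_audit` are hypothetical stand-ins, not real APIs of FACTS or Superficial; they are assumptions made purely for illustration.

```python
# Minimal sketch of the Superfacts evaluation loop described above.
# All three helper functions are hypothetical placeholders for the
# FACTS grader and Superficial's Pro 1 auditor -- not real APIs.

from dataclasses import dataclass

@dataclass
class Grade:
    label: str    # "accurate" or "inaccurate"
    score: float  # FACTS score

def facts_grade(prompt: str, response: str) -> Grade:
    """Hypothetical wrapper around the FACTS grader."""
    raise NotImplementedError

def pro1_audit(prompt: str, response: str) -> list[str]:
    """Hypothetical Pro 1 call returning claim-level corrections."""
    raise NotImplementedError

def apply_audit(response: str, corrections: list[str]) -> str:
    """Hypothetical single-pass ('one-shot') rewrite using the audit."""
    raise NotImplementedError

def evaluate(prompt: str, response: str) -> tuple[float, float]:
    """Return (baseline FACTS score, FACTS score after enhancement).

    Responses FACTS already grades accurate are left unchanged, so
    both scores coincide in that case.
    """
    baseline = facts_grade(prompt, response)
    if baseline.label != "inaccurate":
        return baseline.score, baseline.score
    revised = apply_audit(response, pro1_audit(prompt, response))
    return baseline.score, facts_grade(prompt, revised).score
```

The key design point reflected here is that the re-score uses the same independent FACTS grader as the baseline, so any score difference is attributable to the one-shot enhancement rather than a change in grading.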
Leaderboard
| Rank | Model Name | Superfacts Score | FACTS Score | FACTS Score (with Superficial) |
|---|---|---|---|---|
Claim Accuracy (All Models)