Events
05 Mar

Releasing Our Paper: Interactive Benchmarks 🚀🚀🚀

Drawing inspiration from the "Interactive Proofs" concept in computational complexity theory, Interactive Benchmarks assess a model's reasoning ability while it actively acquires information. The suite currently includes four tasks: Logic (Situation Puzzle), Math, Texas Hold'em, and Trust Game.

arXiv · InteractiveBench
More sessions coming soon