ABC-Bench evaluates LLM agents on real backend tasks, from repo exploration through Dockerization, service deployment, and end-to-end API testing. It i...