A new benchmark measures how well AI agents can automate economically valuable chores. Human-level AI is still some ways off.