New evaluation framework for reasoning tasks released · r/help · BusellAI