Aim: This study aims to identify key determinants and strategies for effectively implementing a reflection method to support adequate use of the ‘Informal Care’ guideline within community nursing. The ...
Abstract: In this study, we investigated the effects of self-reflection in large language models (LLMs) on problem-solving performance. We instructed nine popular LLMs to answer a series of ...
Use the following command to collect rollouts of a model on five datasets: AIME2024, AIME2025, AMC, MATH500, and Olympiad Bench. First, specify which rollouts to run extraction on at line 407 of infer_llm.py.