Users can “Show thinking” in the Gemini app to see the train of thought ... It also scores 18.8% on Humanity’s Last Exam, a state-of-the-art result among models without tool use; the exam is a dataset designed ...