Anthropic experiments with AI introspection
Checking your intentions Anthropic researchers wanted to know if Claude could accurately describe his internal state based solely on internal information. This required the researchers to compare Claude’s self-reported “thoughts” with internal processes, something like connecting a human to a brain monitor, asking questions, and then analyzing the scan to map the thoughts to the…