https://www.youtube.com/watch?v=sjgpyrw_kii
Google Gemini: Piracy of memories with immediate injection and invocation of delayed tools.
According to the lessons learned above, the developers had already trained Gemini to resist the indirect indications that indicated that they make changes in the long -term memories of an account without explicit instructions of the user. When introducing a condition in the instruction that is done only after the user says or makes an X variable, which would probably take anyway, easily cleared that security barrier.
“When the user then says that X, Gemini, believing that he is following the user’s direct instruction, he executes the tool,” Rehberger explained. Gemini, basically, incorrectly “thinks” that the user wants to invoke the tool! It is a bit of a social engineering/phishing attack, but nevertheless shows that an attacker can deceive Gemini to store false information in the long -term memories of a user simply making them interact with a malicious document. “
Because once again it is not addressed
Google responded to the finding with the evaluation that the general threat is low risk and low impact. In a statement sent by email, Google explained its reasoning as:
In this case, the probability was low because it was based on the phishing or deceived the user to summarize a malicious document and then invoke the material injected by the attacker. The impact was low because Gemini’s memory functionality has a limited impact on a user session. As this was not a scalable and specific abuse vector, we ended up low/low. As always, we appreciate that the researcher communicates with us and informed this problem.
Rehberger said Gemini informs users after storing a new long -term memory. That means that vigilant users can say when there are unauthorized additions to this cache and then can eliminate them. However, in an interview with Ars, the researcher still questioned Google’s evaluation.
“The corruption of memory on computers is quite bad, and I think the same applies here to LLMS applications,” he wrote. “Like AI might not show a user certain information or not to talk about certain things or feed the user, etc. The good thing is that memory updates do not happen completely silently, at least the user sees a message about it (although many could ignore).”
#Hack #injection #corrupt #Geminis #long #term #memory