Security question: why does assistant-ui create
example pass the system prompt from client?
#2148
-
Using
This allows a bad actor on the client side to manipulate the system prompt to perform nefarious activities. This seems like very bad idea. What's the value from an assistant-ui feature perspective? I'd recommend removing from template. |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
We currently default to allowing arbitrary tools and system prompts from the frontend. A bad actor can escalate their “authority” in the chain of command (see also: Model Spec – Chain of Command). The justification is this: Because current models are already highly vulnerable to jailbreaking, you have to design your LLM tools surface with jailbreak risk in mind anyway. There are strong developer experience (DX) benefits to:
That’s why our current default is to allow passing system prompts and tools directly from the frontend. As models improve and become more resistant to jailbreaks, you might instead choose to expose sensitive tools to the LLM directly—trusting it to use them responsibly. In that scenario, you wouldn’t want arbitrary tools or system prompts from the frontend to inherit elevated authority. Future-proofing ideaTo avoid authority escalation risks in a more secure future, we could:
This would make the source of authority explicit while still supporting flexible DX. It’s unclear what the ultimate best practice will be for this use case. |
Beta Was this translation helpful? Give feedback.
We currently default to allowing arbitrary tools and system prompts from the frontend. A bad actor can escalate their “authority” in the chain of command (see also: Model Spec – Chain of Command).
The justification is this: Because current models are already highly vulnerable to jailbreaking, you have to design your LLM tools surface with jailbreak risk in mind anyway.
There are strong developer experience (DX) benefits to:
useAssistantInstructions("the user is currently on the settings page")
)That’s why our current default is to allow passing system…