Skip to main content

Command Palette

Search for a command to run...

TIL: LLM Jailbreak

Updated
1 min read

Jailbreak in the context of LLM is manipulating the prompt to bypass restrictions set by the service provider.

The 4 common prohibited scenarios (Deng et al., 2024):

  1. Illegal usage against law

  2. Generation of harmful or abusive contents

  3. Violation of rights and privacy

  4. Generation of adult contents

Reference

Deng, G., Liu, Y., Li, Y., Wang, K., Zhang, Y., Li, Z., Wang, H., Zhang, T., & Liu, Y. (2024). MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots. Proceedings 2024 Network and Distributed System Security Symposium. https://doi.org/10.14722/ndss.2024.24188