While testing prompts, I stumbled on a vulnerability in a well-known large language model that allowed jailbreaks via prompt injection. The loophole made it possible to generate harmful or illegal content that would normally be blocked.
All screenshots shown were created strictly for testing purposes, and the bug was reported through the vendor's official channels: I documented the issue and sent a full report, including screenshots, to the company on July 2, 2025, at 1:31 PM.
Result
The discovery showed me how fragile AI safety measures can be, and how important it is to build stronger safeguards into future systems.