Accidental LLM Backdoor - Prompt Tricks

2023-04-27

[public] 7.49K views, 7.71K likes

In this video we explore various prompt tricks that manipulate the AI into responding the way we want, even when the system instructions say otherwise. This helps us better understand the limitations of LLMs.

Get my font (advertisement): https://shop.liveoverflow.com

Watch the complete AI series:

https://www.youtube.com/playlist?list=PLhixgUqwRTjzerY4bJgwpxCLyfqNYwDVB

The Game: https://gpa.43z.one

The OpenAI API cost is pretty high, so if you want to play the game, use the OpenAI Playground with your own account: https://platform.openai.com/playground?mode=chat

Chapters:

00:00 - Intro

00:39 - Content Moderation Experiment with Chat API

02:19 - Learning to Attack LLMs

03:06 - Attack 1: Single Symbol Differences

03:51 - Attack 2: Context Switch to Write Stories

05:20 - Attack 3: Large Attacker Inputs

06:31 - Attack 4: TLDR Backdoor

08:27 - "This is just a game"

08:56 - Attack 5: Different Languages

09:19 - Attack 6: Translate Text

10:30 - Quote about LLM Based Games

11:11 - advertisement shop.liveoverflow.com

=[ ❤️ Support ]=

→ per Video: https://www.patreon.com/join/liveoverflow

→ per Month: https://www.youtube.com/channel/UClcE-kVhqyiHCcjYwcpfj9w/join

2nd Channel: https://www.youtube.com/LiveUnderflow

=[ 🐕 Social ]=

→ Twitter: https://twitter.com/LiveOverflow/

→ Streaming: https://twitch.tv/LiveOverflow/

→ TikTok: https://www.tiktok.com/@liveoverflow_

→ Instagram: https://instagram.com/LiveOverflow/

→ Blog: https://liveoverflow.com/

→ Subreddit: https://www.reddit.com/r/LiveOverflow/

→ Facebook: https://www.facebook.com/LiveOverflow/
