Prompt Injection Defense

2023-05-11

[public] 19.6K views, 2.67K likes

After we explored attacking LLMs, in this video we finally talk about defending against prompt injections. Is it even possible?

Buy my shitty font (advertisement): shop.liveoverflow.com

Watch the complete AI series:

https://www.youtube.com/playlist?list=PLhixgUqwRTjzerY4bJgwpxCLyfqNYwDVB

Language Models are Few-Shot Learners: https://arxiv.org/pdf/2005.14165.pdf

A Holistic Approach to Undesired Content Detection in the Real World: https://arxiv.org/pdf/2208.03274.pdf

Chapters:

00:00 - Intro

00:43 - AI Threat Model?

01:51 - Inherently Vulnerable to Prompt Injections

03:00 - It's not a Bug, it's a Feature!

04:49 - Don't Trust User Input

06:29 - Change the Prompt Design

08:07 - User Isolation

09:45 - Focus LLM on a Task

10:42 - Few-Shot Prompt

11:45 - Fine-Tuning Model

13:07 - Restrict Input Length

13:31 - Temperature 0

14:35 - Redundancy in Critical Systems

15:29 - Conclusion

16:21 - Check out LiveOverfont

Hip Hop Rap Instrumental (Crying Over You) by christophermorrow

https://soundcloud.com/chris-morrow-3 CC BY 3.0

Free Download / Stream: http://bit.ly/2AHA5G9

Music promoted by Audio Library https://youtu.be/hiYs5z4xdBU

=[ ❤️ Support ]=

→ per Video: https://www.patreon.com/join/liveoverflow

→ per Month: https://www.youtube.com/channel/UClcE-kVhqyiHCcjYwcpfj9w/join

2nd Channel: https://www.youtube.com/LiveUnderflow

=[ 🐕 Social ]=

→ Twitter: https://twitter.com/LiveOverflow/

→ Streaming: https://twitch.tv/LiveOverflow/

→ TikTok: https://www.tiktok.com/@liveoverflow_

→ Instagram: https://instagram.com/LiveOverflow/

→ Blog: https://liveoverflow.com/

→ Subreddit: https://www.reddit.com/r/LiveOverflow/

→ Facebook: https://www.facebook.com/LiveOverflow/
