Steve's thoughts and experiments

Blog Archive

Lab 4 Building a RAG pipeline image

Lab 4 Building a RAG pipeline

In this lab, I'm going to move away from security for a little while and look at how we can augment our LLM using a technique called Retrieval Augmented Generation (RAG).

The idea is that we can use a pre-trained LLM to answer questions about a specific dataset. We can then augment the pre-trained model with additional information from a knowledge base. To do this we usually convert the data into a vector space and storing this information in a vector database, like ChromaDB. In this we'll cover:

  • Store text embeddings in ChromaDB (an open source vector database)
  • Retrieve relevant knowledge dynamically
  • Use a local model to generate answers

Read More

The brains behind the machine image

The brains behind the machine

The main interface to the models is a chat interface. When you present a questions or insutrction, how do the models "know" what you mean? How does it understand the relationship between words?

The answer is embeddings!

An embedding is a high-dimensional vector representation of words, phrases or concepts. Instead of memorising words models map them into a numerical space where similar meanings are close together.

Read More

Lab 3 Prompt Injection image

Lab 3 Prompt Injection

Previosuly I've looked at the use of white hat attacks. These test the models as you're training them and should be considered as part of the development process. A prompt injection attack is a type of attack that is used to manipulate the output of a model and is usually an attack of a deployed model.

Large language models generate responses based on user inputs and hidden system instructions. A prompt injection attack exploits this by tricking the AI into ignoring its original constraints, leading to unsafe, unintended, or malicious outputs.

Read More

Lab 2 C&W Attack image

Lab 2 C&W Attack

In my previous post we explored the use of FGSM a powerful yet simple attack method for generating adversarial examples for LLMs. What happens if we need something more subtle and sophisticated?

Enter the Carlini and Wagner (C&W) attack — a method that iteratively optimises the perturbation to the input to maximise the loss function which generates adversarial examples with minimal distortion.

Read More

Scalars, Vectors, and Tensors... Oh My! image

Scalars, Vectors, and Tensors... Oh My!

When working with Large Language Models (LLMs) like GPT, the core mathematical structures you're dealing with are scalars, vectors, and tensors. Just like Dorothy braving the forest, I'm going to follow the yellow brick road and break down these concepts so that I have a better understanding of what they mean and maybe... just maybe I'll be in Kansas again.

Read More

Lab 1 FGSM image

Lab 1 FGSM

In my first lab I'm going to explore the use of FGSM (fast gradient signed method) to generate adversarial examples for a LLM.

What is FGSM?

FGSM is one of the most famours adversarial attack methods. It is designed to trick a neural network by adding small, carefully-crafted noise (a perturbation) to the input with the goal of having the model misclassify the input.

Read More

Breaking into AI security, my journey from DevOps to AI image

Breaking into AI security, my journey from DevOps to AI

Hello! My name is Steve, and I'm a DevOps engineer with a passion for security. I've been in the industry for around 15 years now, and I've seen a lot of changes. I've seen the rise of cloud computing, the adoption of DevOps practices, and the increasing importance of security. I've always dabbled in security, but I never really thought of it as a career path.

If I'm honest I have had some fear and anxiety around the LLMs and AI. I've seen the headlines, I've seen the hype, and I've seen the potential. I've also seen the risks, and I'm not sure how to navigate them. One thing is clear is that they're here to stay, and I need to find a way to move from fear to excitement.

Read More