What your model really needs is more JPEG!

When machine learning models get deployed into production, the people who trained the model lose some amount of control over inputs that go into the model. There is a large body of literature on all the natural ways in which the data a model sees at inference time might be …


Adversarial training: or, poisoning your model on purpose

So far, we have been looking at different ways adversarial machine learning can be applied to attack a machine learning model. We've seen different adversary goals, applied under different threat models, that resulted in giant sunglasses, weird t-shirts, and forehead stickers.

But what if you are the person with a …


Anti-adversarial patches

In the papers that we have discussed about adversarial patches so far, the motivation has principally involved looking at the security or safety of machine learning models that have been deployed to production. So, these papers typically reference an explicit threat model where some adversary is trying to change the …


Getting catfished by ChatGPT

At the AIVillage at DEFCON 2022, Justin Hutchens, a cybersecurity expert at Set Solutions, gave a presentation on using dating apps as an attack vector. The idea goes like this: let's say you want to gain access to a system that normally requires secure logins. Some systems provide a …


I asked galactica to write a blog post and the results weren't great

A few weeks ago, Meta AI announced Galactica, a large language model (LLM) built for scientific workflows. The architecture for Galactica is a fairly vanilla transformer model, but with three interesting modifications to the training process.

First, the training corpus itself consists of scientific documents. These are mostly …


Adversarial patch attacks on self-driving cars

In the last post, we talked about one potential security risk created by adversarial machine learning, which was related to identity recognition. We saw that you could use an adversarial patch to trick a face recognition system into thinking that you are not yourself, or that you are someone else …


Faceoff : using stickers to fool face ID

We've spent the last few months talking about data poisoning attacks, mostly because they are really cool. If you missed these, you should check out Smiling is all you need : fooling identity recognition by having emotions, which was the most popular post in that series.

There are two more …


Spy GANs : using adversarial watermarks to send secret messages

In the recent posts where we have been discussing data poisoning, we have mostly been focused on one of two things:

  1. an availability attack, where we degrade the accuracy of a model if it gets trained on any data that we generated; or,
  2. a backdoor attack, where the model performance …


When reality is your adversary: failure modes of image recognition

In the typical machine learning threat model, there is some person or company who is using machine learning to accomplish a task, and there is some other person or company (the adversary) who wants to disrupt that task. Maybe the task is authentication, maybe the method is identity recognition based on …


Is it illegal to hack a machine learning model?

Maybe. It depends a little bit on how you do it and a lot of bit on why.

I am not a lawyer, and this blog post should not be construed as legal advice. But other people who are lawyers or judges have written about this, so we can review …

