blog.neater-hut - ml category

I asked Galactica to write a blog post and the results weren't great

A few weeks ago, Meta AI announced Galactica¹, a large language model (LLM) built for scientific workflows. The architecture for Galactica is a fairly vanilla transformer model, but with three interesting modifications to the training process.

First, the training corpus itself is composed of scientific documents. These are mostly …

Adversarial patch attacks on self-driving cars

In the last post, we talked about one potential security risk created by adversarial machine learning, which was related to identity recognition. We saw that you could use an adversarial patch to trick a face recognition system into thinking that you are not yourself, or that you are someone else …

Faceoff : using stickers to fool face ID

We've spent the last few months talking about data poisoning attacks, mostly because they are really cool. If you missed these, you should check out Smiling is all you need : fooling identity recognition by having emotions, which was the most popular post in that series.¹

There are two more …

Spy GANs : using adversarial watermarks to send secret messages

In the recent posts where we have been discussing data poisoning, we have mostly been focused on one of two things:

When reality is your adversary: failure modes of image recognition

In the typical machine learning threat model, there is some person or company who is using machine learning to accomplish a task, and there is some other person or company (the adversary) who wants to disrupt that task. Maybe the task is authentication, maybe the method is identity recognition based on …

Is it illegal to hack a machine learning model?

Maybe. It depends a little bit on how you do it and a lot of bit on why.

I am not a lawyer, and this blog post should not be construed as legal advice. But other people who are lawyers or judges have written about this, so we can review …

We're not so different, you and I -- adversarial attacks are poisonous

I spent a lot of time thinking about the title for this post. Way more than usual! So I hope you'll indulge me in quickly sharing two runners up:

The real data poisons were the adversarial examples we found along the way
Your case and my case are the same …

How to tell if someone trained a model on your data

The last three papers that we've read, backdoor attacks, wear sunglasses, and smile more, all used some variety of an image watermark in order to control the behavior of a model. These authors showed us that you could take some pattern (like a funky pair of sunglasses), and overlay it …

Smiling is all you need: fooling identity recognition by having emotions

In "Wear your sunglasses at night", we saw that you could use an accessory, like a pair of sunglasses, to cause machine learning models to misbehave. Specifically, if you have access to images that might be used to train an identity recognition model, you can superimpose barely-visible watermarks of sunglasses …

Wear your sunglasses at night : fooling identity recognition with physical accessories

In "A faster way to generate backdoor attacks", we saw how we could replace computationally expensive methods for generating poisoned data samples with simpler heuristic approaches. One of these involved doing some data alignment in feature space. The other, simpler approach was applying a low-opacity watermark. In both cases, the …
