Security

Pakka treats pak entry names as attacker-controlled data and runs every extract, add, and verify through the same fail-fast validator. This document is the byte-level reference for what the validator rejects and why; the README has the short summary.

The validator (pakka_unsafe_entry_name) is format-independent — it inspects the entry-name byte content, not the format-specific field width — so a name that’s legal in the on-disk directory can still be rejected at extract or add time. The same rules apply to the entry-name side of pakka_add_file / --as, so a pak built with pakka can be re-extracted with pakka without surprises.

Rejected entry names

Validation runs before any mkdir, fread, or fopen, so a malicious pak fails fast with no partial writes. Both POSIX and Windows traversal forms are checked regardless of host OS, since pak archives are portable.
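To illustrate what checking both traversal forms on any host can look like, here is a minimal sketch. The function name `sketch_unsafe_entry_name`, the helper `is_sep`, and the specific rules shown (empty name, leading separator, drive prefix, `..` component) are assumptions made for this example; this is not the source of pakka_unsafe_entry_name and may not match its full rule set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: treat both '/' and '\\' as separators on every host. */
static bool is_sep(char c) { return c == '/' || c == '\\'; }

/* Illustrative only: returns true when `name` should be rejected. */
bool sketch_unsafe_entry_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return true;                        /* empty name */
    if (is_sep(name[0]))
        return true;                        /* absolute: "/x" or "\\x" */
    if (((name[0] >= 'A' && name[0] <= 'Z') ||
         (name[0] >= 'a' && name[0] <= 'z')) && name[1] == ':')
        return true;                        /* drive-qualified: "C:\\x", "C:x" */

    /* Walk components split on either separator and reject any "..". */
    const char *p = name;
    while (*p != '\0') {
        size_t len = 0;
        while (p[len] != '\0' && !is_sep(p[len]))
            len++;
        if (len == 2 && p[0] == '.' && p[1] == '.')
            return true;
        p += len;
        if (*p != '\0')
            p++;                            /* skip the separator */
    }
    return false;
}
```

Because the check is a pure function of the name bytes, it can run before any filesystem call, which is what makes the fail-fast ordering possible.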

Extraction-time protections

Three checks run beyond the entry-name validator when bytes would touch disk:

Normalized-collision preflight

Before any byte is written, pakka sorts the selected entries’ normalized names — case-folded, with \ mapped to /, and trailing dot or space stripped — and rejects the whole extraction if two entries would materialize to the same path on Windows or HFS+.

Without this check, two entries that differ only in case or in a trailing dot would silently overwrite each other on case-insensitive filesystems.
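The preflight described above can be sketched as normalize, sort, then compare neighbours. The names `sketch_normalize` and `sketch_has_collision` are hypothetical, the case folding here is ASCII-only, and only the end of the whole name is stripped; pakka's actual normalization may differ on each of those points.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Normalize in place: ASCII case-fold, '\\' -> '/', strip trailing dots/spaces. */
static void sketch_normalize(char *s)
{
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++) {
        if (s[i] == '\\')
            s[i] = '/';
        else
            s[i] = (char)tolower((unsigned char)s[i]);
    }
    while (n > 0 && (s[n - 1] == '.' || s[n - 1] == ' '))
        s[--n] = '\0';
}

static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Returns true if any two names collide after normalization.
   Also returns true on allocation failure, so the caller fails closed. */
bool sketch_has_collision(const char *const *names, size_t count)
{
    bool collision = false;
    char **norm = calloc(count ? count : 1, sizeof *norm);
    if (norm == NULL)
        return true;

    size_t made = 0;
    for (; made < count; made++) {
        size_t len = strlen(names[made]) + 1;
        norm[made] = malloc(len);
        if (norm[made] == NULL) { collision = true; break; }
        memcpy(norm[made], names[made], len);
        sketch_normalize(norm[made]);
    }
    if (!collision) {
        /* After sorting, equal normalized names are adjacent. */
        qsort(norm, count, sizeof *norm, cmp_str);
        for (size_t i = 1; i < count; i++)
            if (strcmp(norm[i - 1], norm[i]) == 0) { collision = true; break; }
    }
    for (size_t i = 0; i < made; i++)
        free(norm[i]);
    free(norm);
    return collision;
}
```

Sorting first keeps the check at O(n log n) and means the whole selection is judged before a single output file is created.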

Every directory descent into the -C destination tree, and every leaf open, refuses to cross a symlink or reparse point, so a planted symlink inside the destination directory cannot redirect a write outside it.

When pakka -a recurses into a directory tree, any symlink or reparse point found inside is reported and skipped rather than silently followed. This mirrors the extraction-time protection: the archive can never gain entries read through a redirected path.

Decompression bombs (PK3 / PK4 / Daikatana)

For archives with compressed entries, pakka_set_max_decompressed_size(archive, max_bytes, err) caps the number of bytes any single pakka_open_entry or pakka_read_entry_alloc call will inflate. The default is 64 MiB; pass 0 to disable the cap. This refuses zip-bomb-style high-ratio entries before they hit RAM.
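The cap semantics (a per-entry byte budget, with 0 meaning unlimited) can be modelled with a small counter that a decompression loop consults before accepting more output. The type `sketch_inflate_budget` and the function `sketch_budget_consume` are hypothetical names for this sketch and are not part of the pakka API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-entry budget; max_bytes == 0 means "no limit". */
typedef struct {
    uint64_t max_bytes;
    uint64_t produced;
} sketch_inflate_budget;

/* Account for `n` more decompressed bytes. Returns false when accepting
   them would exceed the cap; the caller should then abort the entry. */
bool sketch_budget_consume(sketch_inflate_budget *b, uint64_t n)
{
    if (b->max_bytes == 0)
        return true;                            /* cap disabled */
    /* produced <= max_bytes always holds, so this cannot underflow. */
    if (n > b->max_bytes - b->produced)
        return false;
    b->produced += n;
    return true;
}
```

Checking the remaining budget by subtraction, rather than adding first, avoids an integer overflow when a hostile archive declares an enormous size.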

Auditing without extracting

pakka --verify runs the entry-name validator and the normalized-collision preflight against every entry, streams every payload to confirm the directory’s offset / length point at readable bytes, and exits non-zero on any error-level finding. None of those checks touch the destination filesystem, so an archive can be audited safely before extraction.
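A cheap structural precheck before streaming a payload is to confirm that the directory's offset and length fall inside the archive file. The sketch below shows an overflow-safe form of that test; `sketch_range_in_file` is a hypothetical name, and whether pakka performs exactly this check before streaming is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when [offset, offset + length) lies entirely within a file of
   `file_size` bytes. Written to avoid computing offset + length, which
   could wrap around for attacker-chosen values. */
bool sketch_range_in_file(uint64_t offset, uint64_t length, uint64_t file_size)
{
    if (offset > file_size)
        return false;
    return length <= file_size - offset;
}
```

A range that passes this test can still be unreadable in practice, which is why streaming the payload remains the authoritative check.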

Library callers can pass PAKKA_VERIFY_DEEP to pakka_verify for per-entry CRC32 and decompression checks on top of the structural walk: CRC32 for PK3 / PK4, and a byte-count check for Daikatana, whose custom codec has no CRC.
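For reference, the CRC-32 stored in zip-style containers such as PK3 and PK4 is the standard reflected CRC with polynomial 0xEDB88320. The bitwise sketch below computes it; the function name `sketch_crc32` is hypothetical, and a production implementation would normally use a table-driven or library routine instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
   final XOR of 0xFFFFFFFF), as used by the zip format. */
uint32_t sketch_crc32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, otherwise zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789"; a deep verify would compare the value computed over the decompressed payload against the one recorded in the directory.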