At PH-Neutral, I recently presented a bunch of information about how no two PDF readers will see a PDF file in the same way. Which is useful if you're trying to sneak an exploit past a smart A/V scanner. [Unfortunately, most A/V scanners are not even smart enough to find an exploit sitting in easy-to-read plaintext at the top of a well-formed file.]
So, I'll explain how this works, for the benefit of everyone who wasn't there at the time&hellip
You can take any other PDF data type and give it a number by wrapping it in "
obj" and "
endobj". Then later on, when you want to use that chunk of data, you can reference it, by number, with the "
R" operator. (See Figure 1.)
These two examples are equivalent to Acrobat…
Acrobat, and most other PDF readers I assume, expand the references at time of use, not at the time of parsing. So if you define object number one to be a number two "
2", and then try to use it like this: "
1 0 R 0 R" it doesn't work. That is not equivalent to "
2 0 R". If anyone knows of a PDF reader that actually does this, that would be really neat, because that parser would unintentionally make PDF equivalent to Lisp.
Think about it, you could write Ω = (λx. x x) (λx. x x) as "
1 0 obj 1 0 R 1 0 R endobj"
- All Page data
- All Whitespace, except for End-Of-Line after comments
- The version number part of
- And thus also
- Most Object
%PDF-anything, but if the file is too confusing for Acrobat, you need at least the first number. Like
/Rootdictionary for the Catalog
/Pagesdictionary, but this can be empty, just as long as it's a dictionary type.
Using Didier Steven's well-formed PDF example, for this example…
1 0 obj
/Outlines 2 0 R
/Pages 3 0 R
7 0 R
2 0 obj
3 0 obj
/Kids [4 0 R]
4 0 obj
/Parent 3 0 R
/MediaBox [0 0 612 792]
/Contents 5 0 R
/ProcSet [/PDF /Text]
/Font << /F1 6 0 R >>
5 0 obj
<< /Length 56 >>
6 0 obj
7 0 obj
0000000000 65535 f
0000000010 00000 n
0000000098 00000 n
0000000147 00000 n
0000000208 00000 n
0000000400 00000 n
0000000507 00000 n
0000000621 00000 n
1 0 R
And remember, all objects references can be replaced by their contents. And so we get…
Note: I've only tested this in Acrobat v9.1.3, from what I've been told, Acrobat v8 will throw an error on this file.
Without any pages, Acrobat kinda sits there for a while going duh, a scroll or a click event will make it realize there's an error.
/OpenAction gets to happen before the above error, but if you use the "Will Close" (
Tavis Ormandy pointed out that you can terminate the "%PDF-" with a NULL "
\0" byte, which saves two bytes (compared to "
Ryan MacArthur pointed out that you can use a "Will Close" "Additional Action" rather than an "OpenAction", which saves quite a few bytes, but with a null page object, Acrobat won't actually perform the Will Close action until some other action is performed by the user. Such as clicking or scrolling on the non-existent page. Immediately Closing or Quitting after opening the document won't trigger the "Will Close" action.
00000000 25 50 44 46 2d 00 74 72 61 69 6c 65 72 3c 3c 2f |%PDF-.trailer<</|
00000010 52 6f 6f 74 3c 3c 2f 50 61 67 65 73 3c 3c 3e 3e |Root<</Pages<<>>|
00000020 2f 41 41 3c 3c 2f 57 43 3c 3c 2f 53 2f 2f 4a 53 |/AA<</WC<</S//JS|
00000030 28 29 3e 3e 3e 3e 3e 3e 3e 3e |()>>>>>>>>|
Julia Wolf @ FireEye Malware Intelligence Lab
Questions/Comments to research [@] fireeye [.] com