World’s Smallest PDF


About That PDF Thing

At PH-Neutral, I recently presented a bunch of information about how no two PDF readers see a PDF file in quite the same way, which is useful if you’re trying to sneak an exploit past a smart A/V scanner. [Unfortunately, most A/V scanners are not even smart enough to find an exploit sitting in easy-to-read plaintext at the top of a well-formed file.]

Someone took a picture of one of my slides, and judging by the number of retweets and views, it has been quite popular.

So, I’ll explain how this works, for the benefit of everyone who wasn’t there at the time…




How Object References Work in PDF

You can take any other PDF data type and give it a number by wrapping it in “obj” and “endobj”. (The second number in “2 0 obj” is the generation number, which is almost always 0.) Then later on, when you want to use that chunk of data, you can reference it by number with the “R” operator. (See Figure 1.)

As far as Acrobat is concerned, these two examples are equivalent…

2 0 obj
(Hello World)
endobj
3 0 obj
<<
/Example 2 0 R
>>
endobj

Figure 1
3 0 obj
<<
/Example (Hello World)
>>
endobj

Figure 2
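
To make the equivalence of Figure 1 and Figure 2 concrete, here is a minimal Python sketch of "replace every reference with the thing it points at". It is a toy model only: the `Ref` class, the `inline` function, and the dictionary-of-objects layout are my own illustration, not how Acrobat or any real parser is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An indirect reference such as `2 0 R` (object number, generation)."""
    num: int
    gen: int = 0


def inline(value, objects):
    """Recursively replace every Ref inside `value` with the object it names."""
    if isinstance(value, Ref):
        return inline(objects[(value.num, value.gen)], objects)
    if isinstance(value, dict):
        return {key: inline(item, objects) for key, item in value.items()}
    if isinstance(value, list):
        return [inline(item, objects) for item in value]
    return value


# Figure 1: object 3 points at object 2 by number.
figure1 = {
    (2, 0): "Hello World",
    (3, 0): {"/Example": Ref(2)},
}

# Figure 2: the same dictionary with the string written inline.
figure2 = {
    (3, 0): {"/Example": "Hello World"},
}

resolved = inline(figure1[(3, 0)], figure1)
print(resolved == figure2[(3, 0)])
```

The toy has no cycle check, so an object that refers to itself would recurse until Python gives up.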


Acrobat, and most other PDF readers I assume, expand the references at time of use, not at the time of parsing. So if you define object number one to be the number two, “2”, and then try to use it like this: “1 0 R 0 R”, it doesn’t work. That is not equivalent to “2 0 R”. If anyone knows of a PDF reader that actually does this, that would be really neat, because that parser would unintentionally make PDF equivalent to Lisp.

Think about it: you could write Ω = (λx. x x) (λx. x x) as “1 0 obj 1 0 R 1 0 R endobj”.
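
One way to picture why “1 0 R 0 R” fails is that the parser only builds a reference out of two literal integers, and never looks inside an existing reference while it is still parsing. The sketch below is a toy stack-based tokenizer written under that assumption; `parse_operands` and `Ref` are hypothetical names for illustration, and real readers may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An unexpanded indirect reference (object number, generation)."""
    num: int
    gen: int


def parse_operands(text):
    """Push integers on a stack; `R` pops two literal integers and pushes a Ref."""
    stack = []
    for token in text.split():
        if token == "R":
            # The two operands must be plain integers, not an earlier Ref.
            if len(stack) < 2 or not all(type(x) is int for x in stack[-2:]):
                raise ValueError("R needs two literal integers before it")
            gen = stack.pop()
            num = stack.pop()
            stack.append(Ref(num, gen))
        else:
            stack.append(int(token))
    return stack


print(parse_operands("2 0 R"))

try:
    parse_operands("1 0 R 0 R")
    rejected = False
except ValueError:
    # "1 0 R" became a Ref, so the second R sees a Ref and an integer.
    rejected = True
print(rejected)
```

A parser that instead expanded `1 0 R` into `2` on the spot would accept the second input, which is exactly the Lisp-like behaviour described above.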


What You Can Leave Out

  • All Page data
  • All Whitespace, except for End-Of-Line after comments
  • The version number part of %PDF-1.1
  • The %%EOF
  • The xref table
  • And thus also startxref
  • Most Object /Types


So what’s actually required?

  • %PDF-anything, but if the file is too confusing for Acrobat, you need at least the first number. Like %PDF-1.
  • A trailer with a /Root dictionary for the Catalog
  • A /Pages dictionary, but this can be empty, just as long as it’s a dictionary type.
  • An /OpenAction if you want to launch your Javascript upon file open.
  • The Javascript Action.


Example

Using Didier Stevens’s well-formed PDF example as the starting point…

%PDF-1.1
1 0 obj
<<
/Type /Catalog
/Outlines 2 0 R
/Pages 3 0 R
/OpenAction 7 0 R
>>
endobj
2 0 obj
<<
/Type /Outlines
/Count 0
>>
endobj
3 0 obj
<<
/Type /Pages
/Kids [4 0 R]
/Count 1
>>
endobj
4 0 obj
<<
/Type /Page
/Parent 3 0 R
/MediaBox [0 0 612 792]
/Contents 5 0 R
/Resources <<
/ProcSet [/PDF /Text]
/Font << /F1 6 0 R >>
>>
>>
endobj
5 0 obj
<< /Length 56 >>
stream
BT /F1 12 Tf 100 700 Td 15 TL (JavaScript example) Tj ET
endstream
endobj
6 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F1
/BaseFont /Helvetica
/Encoding /MacRomanEncoding
>>
endobj
7 0 obj
<<
 /Type /Action
/S /JavaScript
/JS (app.alert({cMsg: 'Hello from PDF JavaScript', cTitle: 'Testing PDF JavaScript', nIcon: 3});)
>>
endobj
xref
0 8
0000000000 65535 f
0000000010 00000 n
0000000098 00000 n
0000000147 00000 n
0000000208 00000 n
0000000400 00000 n
0000000507 00000 n
0000000621 00000 n 
trailer
<<
 /Size 8
/Root 1 0 R
>>
startxref
773
%%EOF

And remember, all object references can be replaced by their contents. And so we get…


The Tiny PDF

Note: I’ve only tested this in Acrobat v9.1.3. From what I’ve been told, Acrobat v8 will throw an error on this file.

%PDF-1.

trailer<</Root<</Pages<<>>/OpenAction<</S/JavaScript/JS(app.alert({cMsg:'Stuff Goes Here'});)>>>>>>

Only 71 bytes are required, aside from the Javascript code. With the improvements below, that drops to 58 bytes.
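
The 71-byte figure can be checked by assembling the file in Python. This sketch assumes that the header line ends in a single newline (the blank line in the listing above being layout rather than file content) and that the Javascript string is empty. The `tiny_pdf` helper is my own name, not part of any tool.

```python
def tiny_pdf(js: bytes = b"") -> bytes:
    """Assemble the tiny OpenAction PDF around an optional Javascript payload."""
    header = b"%PDF-1.\n"  # assumed: exactly one newline after the header
    body = (
        b"trailer<</Root<</Pages<<>>"
        b"/OpenAction<</S/JavaScript/JS(" + js + b")>>>>>>"
    )
    return header + body


empty = tiny_pdf()
print(len(empty))  # 71

with_payload = tiny_pdf(b"app.alert({cMsg:'Stuff Goes Here'});")
print(len(with_payload) - len(empty))  # size of the Javascript itself
```

The helper does no escaping, so a payload with unbalanced parentheses or backslashes would need to be escaped before being placed inside the PDF string.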

Without any pages, Acrobat kinda sits there for a while going “duh”. A scroll or a click event will make it realize there’s an error, at which point it shows a dialog box which says:

There was a problem reading this document (14).

An /OpenAction gets to happen before the above error, but if you use the “Will Close” (/WC) Action mentioned below, the Javascript execution happens after this error.


Improvements

Tavis Ormandy pointed out that you can terminate the “%PDF-” with a NULL “\0” byte, which saves two bytes (compared to “1.\n”).

Ryan MacArthur pointed out that you can use a “Will Close” “Additional Action” rather than an “OpenAction”, which saves quite a few bytes. With a null page object, though, Acrobat won’t actually perform the Will Close action until the user performs some other action, such as clicking or scrolling on the non-existent page. Immediately closing or quitting after opening the document won’t trigger the “Will Close” action.


Conclusion

00000000  25 50 44 46 2d 00 74 72  61 69 6c 65 72 3c 3c 2f  |%PDF-.trailer<</|
00000010  52 6f 6f 74 3c 3c 2f 50  61 67 65 73 3c 3c 3e 3e  |Root<</Pages<<>>|
00000020  2f 41 41 3c 3c 2f 57 43  3c 3c 2f 53 2f 2f 4a 53  |/AA<</WC<</S//JS|
00000030  28 29 3e 3e 3e 3e 3e 3e  3e 3e                    |()>>>>>>>>|
0000003a
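
If you want to play with the file in that dump, the hex column can be turned back into bytes with a few lines of Python. This is just a convenience sketch; the hex digits are copied from the dump above with the offset's trailing line and the ASCII column left out.

```python
# Offset column plus hex bytes, copied from the dump (ASCII column dropped).
hexdump = """\
00000000  25 50 44 46 2d 00 74 72  61 69 6c 65 72 3c 3c 2f
00000010  52 6f 6f 74 3c 3c 2f 50  61 67 65 73 3c 3c 3e 3e
00000020  2f 41 41 3c 3c 2f 57 43  3c 3c 2f 53 2f 2f 4a 53
00000030  28 29 3e 3e 3e 3e 3e 3e  3e 3e
"""

# Drop the offset (first field) on each line and decode the remaining hex pairs.
data = b"".join(
    bytes.fromhex("".join(line.split()[1:])) for line in hexdump.splitlines()
)

print(len(data))  # 58, matching the final offset 0x3a
print(data)
```

Note that, as dumped, the bytes after `/S` read `//JS`, so the name following `/S` is empty in this listing. That is how the file comes out at 58 bytes.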



Julia Wolf @ FireEye Malware Intelligence Lab

Questions/Comments to research [@] fireeye [.] com

6 thoughts on “World’s Smallest PDF”

  1. evince in Debian fails to read the 71 byte version:
    Error: PDF file is damaged – attempting to reconstruct xref table…
    Error: Couldn’t find trailer dictionary
    Error: Couldn’t read xref table
    Error: PDF file is damaged – attempting to reconstruct xref table…
    Error: Couldn’t find trailer dictionary
    Error: Couldn’t read xref table

  2. Same for okular on gentoo.
    Error: PDF file is damaged – attempting to reconstruct xref table…
    Error: Couldn’t find trailer dictionary
    Error: Couldn’t read xref table

  3. Yeah, I expect that *everything* except for Adobe Acrobat 9.1.3 is going to error out. This PDF file is way-way-way out of spec. Older versions of Acrobat won’t even read this.

  4. > Do you want to make a post / analysis about this http://seclists.org/fulldisclosure/2010/Jul/7 case?
    I suppose I could, though it’s not terribly new or interesting. There is a ton of this sort of activity every day, and has been for years. This particular spam campaign was also pretending to be from wordpress.com, also saying that you’d just signed up for an account, with all links leading to that infectious PDF. (Which used one of three possible exploits, all at least a year old.)
    Using the message attached to that FD post, this particular article of spam was sent from 202.133.62.5. Observe:
    [...]
    > Received: from TUHWJATY (unknown [202.133.62.5])
    > by stg.iki.fi (Postfix) with ESMTP id 26EC819D5C
    > for ; Thu, 1 Jul 2010 13:25:28 +0300 (EEST)
    > Received: from 202.133.62.5 (port=0267 helo=[swaraj])
    > by mail.ragoarts.com with asmtp
    > id 981EFE-000841-91
    > for ______hack.fi; Thu, 1 Jul 2010 15:56:02 +0530
    > Someone from the IP address 202.133.62.5 has registered the account “fgeek” with [...]
    The IP address of the spam drone is included in the body text, as well as the recipient username. Every single message is like this, just with the corresponding values filled in.
    If it wasn’t 2:30am, and I didn’t have something else to be finishing right now, I’d probably look up the name of whichever particular spam bot this is.
    (I’ve redacted the email address of the recipient, just to avoid that much more spam sent to them.)
    I think you’ll find this illuminating:
    http://www.google.com/search?q=http%3A%2F%2Fchipsnchils.com%2Fwordpress.html
