You may have heard something in the news about PDF recently… By the power of Google!
- http://www.heise.de/security/meldung/27C3-Brandgefaehrliche-PDF-Dokumente-Update-1162122.html (The one that started it all)
- http://www.h-online.com/security/news/item/27C3-danger-lurks-in-PDF-documents-Update-1162166.html (… auf Englisch)
- http://it.slashdot.org/story/11/01/02/0231242/Detailing-the-Security-Risks-In-PDF-Standard (Basically the Heise article)
- http://www.daemonforums.org/showthread.php?t=5507 (Links to Heise, like Slashdot)
- http://18.104.22.168/2011/01/10/c3.html (A blog)
- http://frutosdelcaos.blogspot.com/2010/12/idiosincrasias-de-pdf.html (Un blog en español)
- http://blog.scrt.ch/2011/01/07/27c3-we-come-in-peace/ (Un blog en français)
- http://trackback.fritz.de/2011/01/08/trb-210-27-chaos-communication-congress/ (A Podcast)
- http://www.tablix.org/~avian/blog/archives/2010/12/27c3_wrap_up/ (A mention on another blog)
… and fifty zillion ever-so-slightly modified copies of the Heise Online news article. Some with creeping errors.
It is now abundantly clear that the major source for news is other news— negativland
I recently gave this presentation at the 27th Chaos Computer Congress in Berlin.
For some reason, the slides never made it from Pentabarf to the Fahrplan.
(They should be here: http://events.ccc.de/congress/2010/Fahrplan/attachments/1796_27C3_Julia_Wolf_OMG-WTF-PDF.pdf Curerntly 404, not by intent.) So first order of business, here are the long sought after slides:
(I have had so many requests for these.)
The talk was approximately a twenty minute summary of about 2,500 pages of dense technical documentation, twenty minutes of
I wonder what Acrobat does if I feed it this., and a little bit of (incomplete) stuff about A/V programs —
because every talk like this has to mention A/V because everyone asks about it. Speaking of which: if anyone would like to perform a rigorous test of how well A/V detects obfuscated PDFs, please go right ahead; I have neither the time nor interest.
I was inspired by the 2009 work of Meredith L. Patterson, Len Sassaman, Moxie Marlinspike, Dan Kaminsky, Sergey Bratus, and any one else I may have forgotten, on ambiguities in ASN.1 and X.509 parsing.
As I read through the PDF ISO specification two years ago, certain oddities jumped out at me. I can’t possibly be the first person to have read the specification and noticed this stuff, right? Right?
Can you really just execute arbitrary programs from a PDF? [Answer: Yes (And more than one PDF reader does this!)]
While reading the ISO 32000-1 [PDF] document – or really any technical specification – what you really need to pay the most attention to, is what is not said. Not only is ISO 32000-1 absent of any formal language definition (BNF, etc.) but many possible glosses which can be formed that are not defined. (As it says right at the very beginning of the ISO 32000-1, there’s nothing in this document that defines whether or not a PDF file is well-formed or not.
It’s called Adobe Acrobat because it’ll bend over backwards!
Since I started presenting this, several unrelated people have told me that they were using foo-trick, or bar-trick for years, but I seem to be the first person to stand in front of a room, and tell people about it for an hour. (Well, that and the first to make hybridized PDF files.)
Considering the existence of PDF/A, a committee somewhere must have had the same ideas about tightening up the PDF specification.
Part of the original brainstorm for this talk, was to create a chart of features/quirks across several different PDF readers and versions.
That could have taken way more time, than I could have invested at the time.
If anyone would like to actually perform these tests, I encourage you to please do so, and publish your results. I’ll be posting the test files I used for my talk soon.
I’m speaking at the
iSEC Open Forum Bay Area next week. This is the info I have on it:
iSEC Open Forum Bay Area
- Thursday, February 3, 2011
- Intuit Building 9, Cook Conference Room
2600 Casey Ave
Mountain View, CA 94043
Please visit http://www.meetup.com/iSECOpenForums/ or RSVP to rsvp ＠ isecpartners.com if you wish to attend!
***technical managers and engineers only please***
***food and beverage provided***
I’m also speaking at TROOPERS in Heidelberg, Germany on Mar 28-Apr 1, 2011. It will be about PDF, but not the same stuff as 27C3.
I wrote the bulk of this talk back in May for PH-Neutral 0x7DA. Adobe Acrobat X
hadn’t even been announced yet. The CCC submission deadline was three months ago.
A month before my talk, Adobe Reader X is released.
So, I should have updated my talk to mention Acrobat X more prominently, rather than in passing at the end. However, I’ve done no testing with it at all.
So, I did not say that you can scan a network from a printer with a PDF. I was speculating about just how much of the PDF spec a printer may implement.
If it did implement the whole thing, then crazy stuff like scanning a network via a printer would be possible, and I’d be very surprised if any printer maker would ever do that.
Paul Baccas has a pretty good summary of most public research on this kind of thing here: http://nakedsecurity.sophos.com/2011/01/24/review-omg-wtf-pdf/
He’s quite possibly the only other person on Earth who’s actually read through my test files, and actually found an error with one.
Most of my tests were written in one day. I think I spent maybe five minutes on the duplicate object tests, so messing up the
startxref is no surprise.
This means that Slide 123, and Slide 124 in the
27C3 [PDF] talk
are the exact opposite of what they should say. The first Object will be used if
xref points to it, otherwise if ref is broken, the last Object defined is used.
(This actually seems much more consitant with Acrobat’s other behaviors.)
PDF syntax seems to have been influenced by TeX
Office, OpenOffice, and iWork documents are ZIP files; Java ARchives are ZIP files. Use your imagination.
I can barely fit the information I’ve got into 50 minutes. Acrobat has a lot of code, I mean A LOT, A LOT. Load it in
gdb sometime and just list the symbols. It’ll take about half an hour on a 2GHz Macbook.
That said, I should have at least mentioned PDF/A, and it’s cousin PDF/X.
and tighter requirements on the PDF syntax itself. For example, the first byte of “%PDF-” must be at file offset zero. I don’t know how many readers enforce these things in practice.
The official ISO documentation is available for sale here: http://www.iso.org/iso/catalogue_detail?csnumber=38920
Currently, it costs about US$125 if you’d like to buy a copy. There are actually several documents, but I think this is the main one. (I haven’t read it.)
PDF/UA is for Universal Accessibility, screen readers and the like. It should also make it possible to easily convert a PDF into a plain text document.
PDF/E PDF for Engineers, just read the Wiki page on it.
While Googling around to write this blog post, I discovered this: http://hul.harvard.edu/jhove/pdf-hul.html It’s a PDF Syntax Validator. I don’t know anything else about it other than what it says on that web page.
I’ve got this dialogue going on in my head about many of the comments I’ve heard…
The Firſt Dialogue
Salviati, Sagredo, and Simplicio.
Sagredo: It has been evidently demonﬆrated that PDF sekurite is not good; What may be done in time to come? And forgive me if I continue in Modern English in order to save time.
Simplicio: You will be safe long as you keep patching your PDF software. And besides it will take a long time to learn how to use a new PDF reader.
Simplicio: Users should only open documents from people they trust.
Salviati: Most attacks performed via PDFs are
drive-by attacks, that is, a malicious web page (sometimes a modified legitimate one, or a paid advertisement) instructed your web browser to open a PDF file in Acrobat. You don’t have a choice in opening the malicious PDF file.
Simplicio: Everyone should switch to using PDF/A!
Salviati: Yes, let’s have all the malware authors use PDF/A from now on!
This doesn’t help mitigate vulnerabilities like integer overflows in embedded image handlers.
You could get everyone to reject old-style PDF documents, and only accept PDF/A…
You’d need to convert every single old PDF document on earth, but keep in mind that documents that accept form input won’t function as a PDF/A. And you need everyone to throw out their old vulnerable PDF readers.
Sagredo: What’s a good PDF reader to use that’s not Adobe Acrobat?
This is a list of every version of my talk presented to more than 20 people at a time.
It’s mostly the same talk each time, except with some new stuff added from the last time I presented it.
May 29, 2010 ph-neutral 0x7da PH-Neutral2010.pdf Sep 09, 2010 SEC-T 2010 Sec-T_2010_Julia_Wolf_final.pdf [Video] Julia Wolf - PDF.mp4 Oct 24, 2010 ToorCon 12 San Diego Julia_Wolf_ToorCon12_OMG_WTF.pdf Oct 27, 2010 SecTor 2010 Julia_Wolf_SecTor_OMG_WTF.pdf [Video] SecTor 2010 - Julia Wolf - OMG-WTF-PDF.wmv Dec 11, 2010 BayThreat 2010 BayThreat2010.pdf Dec 29, 2010 (BerlinSides was almost identical to the 27C3 talk.) Dec 30, 2010 27th Chaos Communication Congress 27C3_Julia_Wolf_OMG-WTF-PDF.pdf [Video] http://media.ccc.de/browse/congress/2010/27c3-4221-en-omg_wtf_pdf.html [Direct Link] http://ftp.ccc.de/congress/2010/mp4-h264-HQ/27c3-4221-en-omg_wtf_pdf.mp4 [Part 1/4] http://www.youtube.com/watch?v=4F2xMw3987I [Part 2/4] http://www.youtube.com/watch?v=2zbWTUkrfJM [Part 3/4] http://www.youtube.com/watch?v=fvUB3xcMavw [Part 4/4] http://www.youtube.com/watch?v=k9gyaYebjyQ [Live Recording] http://achtbaan.nikhef.nl/27c3-stream/releases/mkv/%5b4221%5d%20OMG%20WTF%20PDF/ [Matroska]
On Sep 9, 2010, just after my presentation at Sec-T, I was interviewed by a reporter. This was the result:
Included here by permission of the copyright holder.
Adobe Reader is a tremendous risk
by Frida Sundkvist
Do you have Adobe Reader installed? Then neither your documents nor passwords are protected completely. Expert advice is to replace the PDF reader while Adobe solves the problems.
The last year has highlighted many security problems with PDF files. Often it’s enough that a user has a PDF reader installed to be targeted by an attack.
– PDF is the biggest threat today. You can do almost anything without being detected. Adobe Reader is a tremendous risk, says Julia Wolf, security researcher at Fire Eye.
Julia Wolf is in Stockholm to speak at the Sec-T Security Conference. She says that despite that, it is good that people are discovering the problems with Adobe Reader or other PDF readers.
– My advice is to uninstall Adobe Reader and select another PDF reader,
There are major weaknesses in the pdf format which means that infected
files are difficult to detect and there are also vulnerabilities in Adobe
Reader. Since PDF reader is by far the most popular, hackers have put most
of their resources into targeting Adobe.
A pdf-attack often goes unnoticed. If the user has a PDF reader installed, it may be enough to accidentally go to the wrong website. Seconds later, your personal documents uploaded to a waiting third party. Your keystrokes are recorded and thus more people than you may know your password.
– I have heard many different amounts on how much money companies lose. The conclusion is that there is a lot, she says.
In May Julia Wolf experimented with rewriting a pdf file to see if antivirus software detected the infected file. Only nine of 41 anti-virus detected it. The remaining 32 anti-virus programs, didn’t detect it. No surprise, says Julia Wolf.
Thomas Kristensen, security expert at Secunia, is on the same track. If users in a company do not need very advanced PDF files, the company should
consider changing pdf readers.
– If you are a criminal and want to cause as much harm as possible, which route do you choose? In almost all cases, you choose Adobe Reader before for example Foxit, he says.
Per Hellqvist, security expert at Symantec, does not think that the solution is to uninstall Adobe Reader, but he understands that line of thinking. He argues instead that it’s most important to patch, that is to say to update the security settings in Reader.
– All PDF readers have security holes and it can take a long time to train users to use a new program, he says.
The only way to be protected from PDF attacks is to not have a PDF reader installed. Towards the end of the year, comes the next update for Adobe
Reader. This version will have isolated the program itself within a “sandbox” (see article above).
Photo Caption: [Some kind of Swedish idiom about poking fun at security, or saws, or something] Problems with Adobe Reader have been known for a
long time. “You can do almost anything without being detected,” said Julia Wolf, security researcher at Fire Eye, which calls on companies to use a
different PDF reader until Adobe sorts out its program.
I can read French better than Google Translate, so this is the relevant excerpt from http://blog.scrt.ch/2011/01/07/27c3-we-come-in-peace/.
The vulnerabilities of PDF
This last decade, the Portable Document Format, or PDF, is now clearly the most common format for the publication of electronic documents.
It is used in the publishing industry as a reference format as much for printing as for the publication of ebooks.
And the PDF reference implementation is still and always the popular Adobe Acrobat, in both versions, Writer or simply “Reader”.
Unfortunately, as Julia Wolf has mentioned in her presentation, PDF is a standard that has more or less been agglomerated over the needs of it’s original version, and Acrobat is the result of this organic evolution: more than 15 million lines of code, Acrobat is bigger than Mozilla firefox, bigger than the Linux kernel and the majority of the code was written in the ’90s, in Adobe’s secret laboratories.
The result is that it’s possible to easily fool Acrobat.
The syntax of PDF and along with other reasons that the standard has no explicitly described method for validation of PDF documents allows the creation of PDF documents that are at the same time executables, which contain malicious code or that pose as any of a number of data formats.
In a PDF, one can call system commands, execute arbitrary programs, form documents that display different contents depending on the PDF reader program used, and so on.
Julia Wolf is focused on Acrobat, because it is still today the most common PDF client, but the standard on on which these programs are based is sufficiently confusing and complex that the exploitation of vulnerabilities is possible with each independent implementation.
Stuff that should be here, but isn’t:
- Testing Samples
- PDF Portmanteau technical explanation
- PDF Quine, with explanation
- Misc. stuff, like that forwards-backwards-pages PDF file (It’s in the Test Samples)
Once I’ve organized it all, I’ll publish it on this blog. But I’m rushing to get this one [the blog post you're now reading] published.