Win32 API Shellcode Hash Algorithm


1. A Modest Proposal



Daylight Saving Time


Allegedly, the purpose of Daylight Saving Time is to save energy by manipulating a unit of measurement.


Mileage Saving Time


I have a similar proposal for how to save on gasoline usage. If we redefine the mile to be 4,800 feet during Summer — when people drive the most. Then everyone will drive 10% more miles per gallon of gas. So for example, during the winter, if your car gets 30MPG, then during Mileage Saving Time, you'd be getting 33MPG!


(Actually, it's more like redefining the distance between San Francisco, and Sacramento from 90 miles to 80 miles. That way the two cities are closer together, reducing the amount of time and energy spent traveling between them.)


2. Something Technical



Simple Hash Function(s)


I occasionally spend time reverse engineering shellcode used in various attacks. And, someday, should you find yourself in a similar situation, the following information might be useful…

The Last Stage of Delerium research group, back in 2002, published a technique for doing Win32 API RVA lookups using only the hash of a string — the name of the API function — rather than storing, and performing a full compare on the very long string. (Which some shellcode still does anyway.)


In their paper, they proposed the use of a simple shift and add style integer hash. (Just like the ones taught in a CompSci 101 class.) I don't believe that this hash has a more proper name. In the paper, and accompanying [hard to find] PoC code, they used h=((h<<5)|(h>>27))+c to calculate the hash. (c being each letter of the string, and h the running total.)

Reference: (§2.1 Page 9)
(wasn't linked to on the site, hard to find) is down at the moment, let me know if you really need a copy of these files and can't find them elsewhere. ('Cause I found copies.)

Matt Miller's 2003 paper Understanding Windows Shellcode §3.3.1 [PDF] uses almost exactly the same technique, except that the bit shift length is different in his examples. It's essentially: h=((h<<19)|(h>>13))+c

For some reason — probably widespread code availability and copy pasta — this is the only hash algorithm ever used in any shellcode I've seen. Little things like this make it possible to construct a phylogenetic tree of shellcode evolution/authorship over time.




I got tired of asking, Wait, what was 0xEC0E4E8E again? So, I made up a table of every hash, for most of the common Win32 API functions. Hopefully this gets indexed by Google, as I have a habit of using that to search, rather then remembering where I left my list.



A Perl script to generate that table



#!/usr/bin/perl -w

use strict;

sub ror {

my $number = shift;

my $bitshift = shift;

return ($number >> $bitshift) | ($number << (32 - $bitshift));


sub hash {

my @name = unpack("C*",shift);

my $hash = 0;

for(my $i=0; $i<=$#name; $i++) {

$hash = ror($hash, 0xD) + $name[$i];


return $hash;


while(<>) {


printf ("%08x\t%s\n", hash($_),$_); #modify this line if you want SQL or HTML, &c.




Except from Last Stage of Delerium's PoC


This is the code that almost nobody ever saw, and is never used by malware.


xor eax,eax ; index

i3: mov esi,[4*eax+ebx] ; ptr to symbol name

add esi,edx ; rva2va

push ebx ; hash: h=((h<<5)|(h>>27))+c

push eax

xor ebx,ebx

i4: xor eax,eax ; hash loop


rol ebx,5

add ebx,eax

cmp eax,0

jnz i4

ror ebx,5

cmp ebx,ecx ; hash compare

pop eax

pop ebx

je i5 ; same: symbol found

inc eax

jmp i3 ; different: go to the next symbol


Or in actual bytes…


00000985 53 push ebx

00000986 50 push eax

00000987 33DB xor ebx,ebx ; hash loop

00000989 33C0 xor eax,eax

0000098B AC lodsb

0000098C C1C305 rol ebx,0x5

0000098F 03D8 add ebx,eax

00000991 83F800 cmp eax,byte +0x0

00000994 75F3 jnz 0x989

00000996 C1CB05 ror ebx,0x5

00000999 3BD9 cmp ebx,ecx ; hash compare

0000099B 58 pop eax

0000099C 5B pop ebx

0000099D 7403 jz 0x9a2

0000099F 40 inc eax

000009A0 EBDE jmp short 0x980



The Table (For Google's Sake)


This is just to make is easy to Google for any of these DWORDs. (As big and little endian, DWORDs and BYTEs in various formats.








Julia Wolf @ FireEye Malware Intelligence Lab


Questions/Comments to research [@] fireeye [.] com