Evolving Analytics for Execution Trace Data

Five years ago, Mandiant released a proof of concept tool named ShimCacheParser, along with a blog post titled “Leveraging the Application Compatibility Cache in Forensic Investigations”. Since then, ShimCache metadata has become increasingly popular as a source of forensic evidence, both for standalone analysis and enterprise intrusion investigations.

While five years may seem like a long time, few community efforts have focused on leveraging ShimCache metadata at an enterprise scale. Today, we intend to fix that with the release of a new tool called AppCompatProcessor.

AppCompatProcessor started as a simple tool to automate ShimCacheParser execution on enterprise-wide investigations, leveraging Python’s multiprocessing module to speed up the parsing process. Later, it evolved to support faster regular expression searching and has continued to evolve ever since. Today, it handles both AppCompat and AmCache artifacts, has modules for processing more than 11 different formats, and contains some novel analytics to redefine the way we look at execution trace artifacts.

Available ‘Modules’ in the Initial Release

Upon execution, AppCompatProcessor (ACP) enumerates all the available commands we refer to as ‘modules’, as seen in Figure 1.

Figure 1: List of commands and modules

Investigators will likely use ‘load’, ‘status’, ‘list’ and ‘dump’ first. These will enable you to ingest data and enumerate loaded hosts or dump the data for a specific host in the database.

Once you have ingested your data, the ‘search’ module will enable you to search against a list of known-bad regular expressions as part of your triaging or hunting methodology. The module will also enable you to perform enterprise-wide string literal or regex searches from the CLI.

Regular expression searching leverages multiprocessing “under the hood”. However, for simple literal string searches, the ‘fsearch’ module will automatically use the available indexes to provide investigators with virtually immediate results.
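To make the difference between the two search styles concrete, here is a minimal Python sketch. It is not ACP's implementation: the row layout (host, file path), the function names and the in-memory dictionary index are all assumptions made for illustration, and the regex scan runs in a single process for brevity. It only shows why a regex search has to test every row, while a literal file name lookup can be answered directly from a pre-built index.

```python
import re
from collections import defaultdict


def regex_search(rows, patterns):
    """Full scan: test every row's path against every known-bad pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [row for row in rows if any(rx.search(row[1]) for rx in compiled)]


def build_name_index(rows):
    """Hypothetical index: lower-cased file name -> rows containing it."""
    index = defaultdict(list)
    for host, path in rows:
        file_name = path.replace("/", "\\").rsplit("\\", 1)[-1].lower()
        index[file_name].append((host, path))
    return index


def literal_search(index, file_name):
    """Literal lookup: a single dictionary access, no scan required."""
    return index.get(file_name.lower(), [])


# Made-up sample data, for illustration only.
rows = [
    ("host1", r"C:\Windows\System32\svchost.exe"),
    ("host2", r"C:\Users\Public\psexesvc.exe"),
    ("host3", r"C:\Windows\Temp\a.exe"),
]
known_bad = [r"\\Users\\Public\\", r"\\Temp\\[a-z]\.exe$"]

regex_hits = regex_search(rows, known_bad)
index = build_name_index(rows)
literal_hits = literal_search(index, "PsExeSvc.exe")
print(regex_hits)
print(literal_hits)
```

The regex path costs time proportional to rows times patterns, which is why spreading it across processes pays off; the literal path costs roughly the same no matter how many hosts are loaded.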

The ‘search’ and ‘fsearch’ modules empower an investigator to search across the enterprise and much more. Perhaps you have found a dropper and need to know which other systems it was executed on. Perhaps an attacker is randomizing the version information of their binaries (stored in AmCache) in a predictable way. Perhaps the attacker has only been active during specific timeframes. These objectives, and many more, are achievable with AppCompatProcessor. Due to the number of analysis features offered by this tool, we will only discuss a subset of them. Check the GitHub page for a more detailed description of each feature.

‘leven’: Based on the Levenshtein distance algorithm, which measures the ‘edit distance’ between two strings, the module will identify small deviations from any known legitimate filename present in the Windows\System32 folder on your dataset. ‘svchosts.exe’, ‘svch0st.exe’, ‘scvhost.exe’ – how many of those do we have to search for? None. ‘leven’ will spot all of them, as well as any other possible variation, and will do so for all legitimate file names and conveniently report them for you to investigate. You can also run the ‘leven’ module with a user-supplied filename. In that case, ACP will report all small deviations in your dataset, regardless of the folder those files are found in. This is an effective technique for spotting an attacker’s typos, whether intentional or accidental, during an investigation.
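The idea can be sketched in a few lines of Python. This is an illustration only, not ACP's code: the function names and the `max_distance` threshold are assumptions. Note that with the plain Levenshtein distance a transposition such as ‘scvhost.exe’ counts as two edits, so the threshold has to be at least 2 to catch all three examples above.

```python
def levenshtein(a, b):
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # drop ch_a
                               current[j - 1] + 1,       # drop ch_b
                               previous[j - 1] + cost))  # substitute / match
        previous = current
    return previous[-1]


def near_misses(observed, legitimate, max_distance=2):
    """Report names that are close to, but not equal to, a legitimate name."""
    legit = sorted({name.lower() for name in legitimate})
    hits = []
    for name in sorted({n.lower() for n in observed}):
        if name in legit:
            continue
        for good in legit:
            distance = levenshtein(name, good)
            if 0 < distance <= max_distance:
                hits.append((name, good, distance))
    return hits


# Made-up sample data, for illustration only.
legitimate = ["svchost.exe", "notepad.exe"]
observed = ["svchost.exe", "svchosts.exe", "svch0st.exe",
            "scvhost.exe", "notepad.exe"]
hits = near_misses(observed, legitimate)
for hit in hits:
    print(hit)
```

A low threshold keeps noise down, but very short file names will produce false positives at any threshold, so the output still needs analyst review.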

Figure 2: High level explanation of how temporal execution strength is calculated

‘tcorr’: This module is based on a Temporal Correlation execution engine, as shown in Figure 2. You supply a file name to it, and it will determine what other files were executed before and after it (within a user-configurable window). It calculates a correlation ‘strength’ index, which it uses to display the list of files that present the strongest temporal execution correlation with your file name of interest. It will also automatically determine whether the correlation is mutual by calculating the inverse correlation index, which it will report as an ‘Inverse Bond’. ‘tcorr’ is a great way to triage suspicious files, or to easily find additional attacker files as the investigation progresses. As a simple example, Figure 3 illustrates the results generated by ‘tcorr’ for “net.exe”. The results provide evidence of the well-known fact that “net.exe” will always execute “net1.exe”, resulting in a very strong temporal execution correlation between both.

Figure 3: Explained output of “tcorr net.exe” command
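The exact weighting ACP uses for its strength index is not described in this post, so the following Python sketch substitutes a deliberately simple stand-in: the ‘strength’ of a candidate is the fraction of occurrences of the file of interest that have the candidate within `window` positions before or after it, and the inverse is the same measure computed from the candidate's point of view. The input layout (a per-host list of file names in recorded order) and all function names are assumptions.

```python
from collections import Counter


def bond(hosts, anchor, window):
    """Fraction of `anchor` occurrences that have each other file nearby."""
    anchor_count = 0
    hits = Counter()
    for entries in hosts.values():
        names = [entry.lower() for entry in entries]
        for i, name in enumerate(names):
            if name != anchor:
                continue
            anchor_count += 1
            neighbours = set(names[max(0, i - window): i + window + 1])
            neighbours.discard(anchor)
            hits.update(neighbours)  # count each neighbour once per occurrence
    if anchor_count == 0:
        return {}
    return {name: count / anchor_count for name, count in hits.items()}


def tcorr(hosts, target, window=2):
    """Return (file, strength, inverse) tuples, strongest correlation first."""
    target = target.lower()
    results = []
    for name, strength in bond(hosts, target, window).items():
        inverse = bond(hosts, name, window).get(target, 0.0)
        results.append((name, strength, inverse))
    results.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return results


# Made-up sample data, for illustration only.
hosts = {
    "host1": ["cmd.exe", "net.exe", "net1.exe", "ipconfig.exe"],
    "host2": ["net.exe", "net1.exe", "notepad.exe", "calc.exe", "cmd.exe"],
    "host3": ["explorer.exe", "cmd.exe", "calc.exe", "net1.exe", "net.exe"],
}
results = tcorr(hosts, "net.exe", window=1)
for row in results:
    print(row)
```

In this toy dataset “net1.exe” sits next to “net.exe” on every host, and the reverse also holds, so both the strength and the inverse value are 1.0. “cmd.exe” shows up nearby only once and scores much lower in both directions.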

‘tstack’: Investigators often question if they’ve overlooked some detail as a result of an attacker deviating from observed TTPs. Time stacking analytics have been designed to provide you with the list of file names that have predominantly executed within, or close to, the supplied time frame of attacker activity, as seen in Figure 4. As the technique is inherently most effective for narrow time intervals, investigators will want to focus on the time frames associated with attacker lateral movement.

Figure 4: Time Stacking calculation
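A rough way to picture time stacking in Python is shown below. The scoring used here (the share of a file name's occurrences that fall inside the supplied time frame) is an assumption chosen for illustration, as are the function name and the (file name, timestamp) input layout; ACP's actual calculation may differ.

```python
from collections import Counter
from datetime import datetime


def time_stack(entries, start, end):
    """Rank file names by how much of their activity falls in [start, end].

    Returns (file_name, inside_count, total_count, ratio) tuples.
    """
    inside = Counter()
    total = Counter()
    for name, timestamp in entries:
        name = name.lower()
        total[name] += 1
        if start <= timestamp <= end:
            inside[name] += 1
    rows = [(name, inside[name], total[name], inside[name] / total[name])
            for name in inside]
    rows.sort(key=lambda r: (-r[3], -r[1], r[0]))
    return rows


# Made-up sample data, for illustration only.
entries = [
    ("svchost.exe", datetime(2017, 3, 1, 9, 0)),
    ("svchost.exe", datetime(2017, 3, 2, 2, 10)),
    ("svchost.exe", datetime(2017, 3, 5, 14, 0)),
    ("svchost.exe", datetime(2017, 3, 9, 8, 30)),
    ("wce.exe", datetime(2017, 3, 2, 2, 5)),
    ("wce.exe", datetime(2017, 3, 2, 2, 40)),
    ("psexesvc.exe", datetime(2017, 3, 2, 2, 15)),
    ("psexesvc.exe", datetime(2017, 3, 7, 11, 0)),
]
stacked = time_stack(entries,
                     start=datetime(2017, 3, 2, 2, 0),
                     end=datetime(2017, 3, 2, 3, 0))
for row in stacked:
    print(row)
```

Files that run all the time, such as “svchost.exe” in this made-up data, score low because most of their activity falls outside the window, while files seen only during the attacker's session rise to the top. This is also why a narrow window works best: the wider the interval, the more routine activity it captures.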

Conclusion

One of the main reasons for the Open Source release of ACP is to provide a framework for the community at large to delve into advanced analytics. Having a tool available that automates some of the analysis of ShimCache and AmCache artifacts at enterprise scale should lower the entry barrier for investigators interested in developing advanced analytics. There’s a tremendous amount of untapped value in advanced execution trace analytics. The examples presented here, while immediately useful, are just the beginning.

Download AppCompatProcessor today.