Challenges in Modern Digital Investigative Analysis


Ovie Carroll
Director
US Department of Justice Cybercrime Lab
Computer Crime & Intellectual Property Section

In the last 15 years, significant challenges have arisen in the field formerly known as “computer forensics.” Among these challenges are the dramatic increase in the volume of digital evidence, the rise in use of effective encryption, the creation of new technologies that cause digital evidence to become increasingly evanescent (i.e., ephemeral), and an increased expectation amongst jurists that prosecutors not only prove that evidence was on the defendant’s computer, but attribute the evidence to the defendant. This article discusses some of these challenges and identifies techniques that prosecutors, agents, and analysts can consider to effectively respond to these challenges.

Introduction

The Cybercrime Lab is a group of highly trained digital investigative analysts located in the Computer Crime and Intellectual Property Section of the Criminal Division in Washington, DC. The Cybercrime Lab provides support to prosecutors through advanced digital investigative analysis, technical and investigative consultations, and research and training to support Department of Justice initiatives. Digital Investigative Analysis (DIA) is the evolution of what was previously referred to as “computer forensics.” It is important for prosecutors to appreciate the three aspects of the profession that caused this evolution:

Digital. Digital Investigative Analysts (analysts) no longer limit their analysis to standard computer systems. Today, analysts examine everything “digital,” including desktop computers, laptops, mobile devices (cell phones and tablets), GPS navigation devices, vehicle computer systems, Internet of Things (IoT) devices, and much more. We are still in the infancy of the digital age, but developers of many products, from shoes and sports bras to lightbulbs and doorbells, are already incorporating technology into their products to collect, store, and transmit information about the user that they can analyze and hopefully monetize.

Investigative. While technology progresses at lightning speed, the legal system and those who uphold our laws are just beginning to appreciate the need for analysts to conduct deeper “investigative” analysis on digital devices to obtain a better understanding of issues being investigated. Each year we are generating or replicating eight zettabytes of information. That is equivalent to a stack of paper 1.6 trillion miles high. To manage the high volume of data that needs to be analyzed, some organizations have applied a raw data extraction process to digital evidence. This non-analytical approach blindly identifies types of files (e.g., pictures, documents, spreadsheets) in the storage media, without further analysis to determine if the user opened the file or even knew the file was there. This raw data extraction process allows an organization to quickly process a large volume of data and may be an excellent first step in the simplest cases.

Raw data extraction, however, does little to satisfy many of the offense elements necessary to establish guilt. In contrast, DIA requires analysts to investigate or even “interrogate” digital devices. Analysts ask questions in the form of keyword searches and review digital artifacts to form additional questions or logical investigative leads based on the answers received. Even when the response to a question is silence (a lack of recorded information), an analyst may ask why there is no response or recorded information. Was counter-forensics conducted? Is there something unique about the digital device being investigated that prevents the technique or tool from reading or displaying the information?

Analysis. Lastly, an analyst must “analyze” the response to each question and determine its relevance to other digital artifacts, as well as how it relates to information available from the non-digital investigation. An excellent example of this was used in “The Physical Computer and the Fourth Amendment” by acting Principal Deputy Chief of CCIPS, Josh Goldfoot, where he explained that in isolation, the fact that a suspect downloaded tide tables for a particular beach in Oregon at 5 a.m. might mean nothing. Josh Goldfoot, The Physical Computer and the Fourth Amendment, 16 BERKELEY J. CRIM. L. 112 (2011). When combined with the fact that a young woman's body was discovered in the surf on that beach an hour and a half later, however, the significance of the tide tables became apparent. Id.

Incident Response and Encryption

For years, law enforcement has debated the value of imaging Random Access Memory (RAM) when they encounter a powered-on computer with an active user account logged in. RAM is the place in a computing device where the operating system, applications, and data in use are kept so they can be quickly reached by the device's processor. RAM is much faster than other kinds of storage. Data remains in RAM as long as the computer is running. When the computer is turned off, information in RAM rapidly dissipates and is lost. In 2016, the majority of law enforcement agents still elected to pull the power plug from the computer rather than image RAM.

Many agents still prefer not to acquire RAM because they believe that RAM is unlikely to contain relevant evidence. Sometimes agents base this belief on the specific nature of the investigation (e.g., white collar crime) or the latency of the crime under investigation. Today, however, the most appropriate practice is to image RAM where practicable. First, an aggressive defense counsel may argue that RAM might have contained exculpatory evidence and its intentional destruction amounts to a knowing Brady violation. Second, today there is an increased possibility that the hard drive of the computer to be searched will be encrypted. That possibility is becoming more likely each day.

Encryption is default on new computers

Investigative agencies are already beginning to see an increased use of “BitLocker” whole disk encryption. It is not just that our targets are getting savvy about securing their data. In many instances, the providers are doing the work for our targets. For example, starting with the core edition of Windows 8.1, Windows RT, and Windows 10, Microsoft began automatically encrypting the system boot volume (typically the entire C-drive) without notifying the user.

Thankfully, as of the writing of this article, whole disk encryption is still not the default on every Windows computer; it is hardware conditional. These conditions must be met for Microsoft to encrypt the operating system drive:

  1. the device meets Connected Standby or Modern Standby hardware specifications;
  2. the device features non-removable (soldered) RAM (this protects against the rarely used cold-boot technique where RAM is removed and placed in a separate reader device and imaged without allowing the computer to be powered down);
  3. the device has a Trusted Platform Module (TPM) 2.0 chip; and
  4. at least one account with administrative privileges logs in with Microsoft Account credentials (as opposed to using a local Windows account).

While this may sound like a lot of very specific requirements, it is worth noting that every Windows Surface and Surface Pro computer meets all of these requirements and is encrypted by default. And even if a computer does not initially meet all the requirements (e.g., it has no solid-state drive, or no account with administrative privileges is using Microsoft account credentials), the moment the device meets all the prerequisites, Windows will begin silently encrypting the boot partition in the background without notice to the user.
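The four conditions are conjunctive: automatic encryption is expected only when every one of them holds. A minimal sketch of that logic follows. The class and field names are hypothetical labels for the listed conditions, not part of any Microsoft API, and the sketch does not query a real device.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    """Hypothetical summary of the four listed conditions for one device."""
    modern_standby: bool                # Connected Standby or Modern Standby hardware
    soldered_ram: bool                  # non-removable (soldered) RAM
    tpm_2_0: bool                       # Trusted Platform Module 2.0 chip present
    admin_uses_microsoft_account: bool  # an administrator signs in with a Microsoft Account


def automatic_encryption_expected(device: DeviceProfile) -> bool:
    """Return True only when every listed prerequisite is met."""
    return all((
        device.modern_standby,
        device.soldered_ram,
        device.tpm_2_0,
        device.admin_uses_microsoft_account,
    ))
```

A responder could use a check like this as a planning aid: a device that satisfies three conditions today may begin encrypting as soon as the fourth is satisfied.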

It is important for prosecutors to be aware that because BitLocker Device Encryption encrypts Windows devices without user awareness, it also automatically stores a 48-character recovery key in the user’s Microsoft OneDrive account. Prosecutors may be able to serve legal process upon Microsoft to obtain the BitLocker Recovery Key from the user’s Microsoft OneDrive account. CCIPS recommends that prosecutors use a search warrant to obtain the recovery key in most instances. If you find that any of your personal computers have been automatically encrypted, you can see all your BitLocker recovery keys by logging into your OneDrive account and going to https://onedrive.live.com/recoverykey.

Four Basic Incident Response Steps

With the increased likelihood of encountering encryption, prosecutors and agents should familiarize themselves with the four basic, recommended steps for responding to a computer that is powered on with a user logged in.

First, isolate and preserve the state of the computer as it is when law enforcement first encounter it. Do a visual assessment to determine if anything requires immediate action. For example, consider disconnecting the system from the network. If the responder detects excessive hard drive activity suggesting the drive is being wiped, consider terminating the wiping program if possible or removing power from the computer to prevent further damage.

Second, preserve volatile data by imaging RAM. There are many simple ways this can be accomplished, but all require the introduction of incident response software. Incident response software is typically introduced to the target computer by inserting external storage media such as a USB drive. Some incident responders have expressed concern that introducing anything to the target computer changes evidence and may render the computer inadmissible. While that is always a theoretical risk, the risk is quite small, and it is usually a greater risk not to image RAM.

As an initial matter, the “changes” to the computer caused by imaging RAM are minimal, contained, and usually identifiable. These changes are especially de minimis when one recognizes that any computer powered on is always in a fluid state of motion, and changes are taking place regardless of what actions are taken by the examiner. Thus, the risk created by imaging RAM is quite minimal. The incident responder can further minimize the risk by using a sanitized storage device to introduce the incident response software and by carefully documenting any actions they take on a live computer system for later reference.

The risk created by not imaging RAM is often much more significant. The average computer sold between 2015 and 2016 came with at least six gigabytes of RAM. Six gigabytes of text roughly equals a stack of paper 6,000 feet high. An aggressive defense counsel may argue that by removing power from the computer without preserving RAM, your agent just destroyed the equivalent of a 6,000-foot stack of information (most of which was surely exculpatory).

Third, once RAM has been preserved, check for signs of encryption. The two most common encryption detection tools are “Encrypted Disk Detector” (EDD) by Magnet Forensics and “Crypthunter” by the Software Engineering Institute at Carnegie Mellon University. When executed, both tools will report the presence of a number of different volume- and disk-based encryption programs. More information about EDD can be found at www.magnetforensics.com/free-tool-encrypted-disk-detector/. More information about Crypthunter can be found at www.cert.org/forensics.

Finally, create a forensic image. If there are no indications of encryption, and RAM has been successfully imaged, power should be removed from the system to abruptly stop all operations. Removing power prevents any maintenance or counter forensic programs from running and causing changes to the system during the standard shutdown process. A “write block” (preferably a “hardware write block”) should be applied to the hard drive before any further actions are taken to prevent the imaging process from writing any information to the drive being imaged or otherwise changing the data being investigated.

Before beginning the process of creating a full forensic image (whether a physical image or a faster “logical” copy), consider creating a “triage” image. Analysts can typically image at 60 to 80 gigabytes per hour. A complete copy (full “physical” image) of a one terabyte drive would typically take between 12 and 16 hours. In contrast, a triage image uses a more surgical approach to create a smaller, partial image of high value digital artifacts that can reveal key information. For example, a triage image may alert investigators to online accounts that need to be immediately preserved, or actions recently taken on the computer that may aid in taking immediate investigative actions (e.g., searches conducted, files opened, chat sessions). Analysis of the high value digital artifacts can then be conducted while the more time-consuming full forensic image takes place.
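The imaging-time estimate is simple division, and a short sketch makes the range easy to recompute for other drive sizes. It assumes a decimal terabyte (1,000 gigabytes) and a constant imaging rate, neither of which is guaranteed in practice.

```python
def imaging_hours(drive_gb: float, rate_gb_per_hour: float) -> float:
    """Estimate hours needed to image a drive at a constant rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("imaging rate must be positive")
    return drive_gb / rate_gb_per_hour


ONE_TERABYTE_GB = 1000  # assumption: decimal terabyte

# Bounds for a one terabyte drive at the 60 to 80 gigabytes per hour cited above.
fastest = imaging_hours(ONE_TERABYTE_GB, 80)
slowest = imaging_hours(ONE_TERABYTE_GB, 60)
print(f"Estimated imaging time: {fastest:.1f} to {slowest:.1f} hours")
```

Under these assumptions the result lands in roughly the same half-day-or-more range cited above, which is why a triage image taken first can be so valuable.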

If encryption is detected or suspected because of step three, incident responders should consider creating a live “logical” image of the computer before removing power. Several tools can image RAM and create images of a live system; one of the most popular is FTK Imager by AccessData, which can image RAM, create both live logical and physical images, and accomplish many additional incident response tasks. While a live logical image is not the preferred method of copying a hard drive, it allows investigators to capture all the active files in an unencrypted state so that if the encryption cannot be circumvented, at least the active files are available for the investigation.

Electromagnetic vs Solid State Hard Drives

Another issue prosecutors and analysts should consider is the impact that new “solid state” hard drives (SSDs) have on DIA. Standard hard drives, also called “electromagnetic drives,” consist of platters that spin between 5,400 and 15,000 RPM and hold positive and negative charges read by the computer as binary data. From this binary data, the computer can read the files and programs that store information and make the computer work. Since the beginning of the computer forensic profession, it has been well known that nothing is ever truly “deleted” from an electromagnetic drive. Instead, when information is “deleted” from an electromagnetic drive, the computer is simply told that the space where that information resides is now available for new files, if that space becomes needed. A deleted file can be recovered indefinitely, as long as no other data is written to the area of the hard drive where that file resides.

SSDs change this fundamental principle. The benefits of SSDs include increased speed of access. There is no longer a motor moving a head of a hard drive across a spinning platter to read the polarity of binary data stored on it. As a result, the access to data is instantaneous. With no moving parts, SSDs are also silent, less fragile, and stay relatively cool compared to electromagnetic drives.

One negative aspect of SSDs, however, is the “write endurance.” Write endurance is the number of “write cycles” (the number of times data can be written) that a block of flash memory can sustain. Once a user has reached the write endurance limit, the disk may become unreliable or unable to use any of the cells. As a result, there is a tendency for repeated writes to eventually corrupt the flash memory, making the SSD partially or completely unusable. SSDs employ two features to reduce this phenomenon and expand the life of an SSD. These features are called “wear leveling” and “trim.”

Wear leveling is the process of moving data around on the SSD to prevent any specific area of the drive from wearing out prematurely. When active data is moved to a location marked as being inactive, any data previously in that location is overwritten. This process decreases the time deleted files can be recovered on SSDs because data on the drive is constantly being overwritten.

Trim is used to increase the speed data can be written to the drive. As an analogy, if you think of each cell that holds data on an SSD as a paint can, trim is the process that looks at which cells are holding active files, then occasionally pops the lid on all paint cans that are not holding active data. This increases the write speed because data can be immediately written to a clear, open cell rather than first having to pop the lid and clear the “inactive” or deleted files.

Wear leveling and trim have at least two effects that may relate to prosecutors and analysts. First, the amount of time deleted files can be recovered drops from “indefinitely” on an electromagnetic drive to potentially weeks or months on an SSD. Time is now of the essence for imaging an SSD. If you have reason to believe your target has an SSD, act quickly. Second, when wear leveling or trim occurs, data in inactive cells of the SSD are being destroyed, causing the drive to constantly change. As a result, an SSD with a particular hash value when imaged originally may have a different hash value if the drive is later reimaged because trim or wear leveling may occur during the reimaging process.

Unfortunately, the trim and wear leveling functions are accomplished at the hard drive controller level, and nothing can currently be done to suspend these functions. While some operating systems can invoke trim, disabling it through the operating system does not prevent trim from being initiated by the drive firmware. Even attaching an SSD to a hardware write block will not prevent wear leveling or trim.

What is a hash value?

A hash value is a unique identifier representing a specific data set (for example, a particular file, record, or hard drive). The result, which is generated by an algorithm, is a distinct fixed length alphanumeric string, using a combination of letters and numbers. The following is an example of a particular hash value called an “MD5” hash:

26a981554d7d761230bc7ef3a6645375

Such an algorithm result is sometimes called a hash value, hash sum, checksum, or message digest. A hash value can refer to the hash function calculation for any data set, such as a file, record, or hard drive.

Hash values play a fundamental role in forensic examinations concerning the review and analysis of data. Analysts can authenticate digital evidence by determining the hash value of the original evidence, making a physical copy of the evidence, and then confirming that the copy has the exact same hash value as the original evidence. If a corrupt or sloppy agent changed even a single character in one Word document saved on a 10-terabyte hard drive after imaging it, the entire drive would have a different hash value. Thus, the fact that two hash values match is powerful evidence that the prosecutor is presenting a perfect image of the original drive.
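The verification step described above can be sketched with Python's standard hashlib module. This is an illustrative sketch, not the procedure any particular forensic tool follows. It uses MD5 only because that is the algorithm shown in the example above, and the function names and file paths are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_original(original_path: str, image_path: str) -> bool:
    """Compare the hash of the original evidence with the hash of its copy."""
    return md5_of_file(original_path) == md5_of_file(image_path)
```

Changing a single byte in either file produces a different digest, so image_matches_original returns False for any altered copy. The same module also provides stronger algorithms such as hashlib.sha256, which can be substituted without changing the structure of the code.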

No such thing as a full forensic analysis

As the digital age matures, the number of devices collecting and storing information, and the volume of digital evidence to be examined in any investigation, are becoming a significant challenge. Over the past 15 years, the maximum capacity of a single storage device has doubled every 12 to 18 months. As the maximum capacity of individual devices has dramatically increased, the cost of storage has considerably decreased. The substantial volume of data has had a considerable impact on investigative agencies and their efforts to keep up with the tsunami of digital evidence to be analyzed. One major change to the digital investigative analysis profession is that there is no longer such a thing as a “full forensic analysis.”

For years now, the most sophisticated analysts have applied a phased approach to digital analysis. The phased approach consists of a variant of at least three phases: Triage, Identification, and Deep Analysis.

Partially because of the increased storage capacity of individual devices, and secondarily because the investigative value of information tends to decrease with time, the timing of the triage phase is often critical. The earlier triage can be conducted, the more potential value the information may have, whether it is used to confront a suspect in hopes of obtaining a confession, or to identify other critical time-sensitive evidence that needs to be preserved (e.g., web-based email, storage, or social networking accounts).

Another change occurring in the digital investigative analysis profession is the shift from analysis being conducted by a single examiner to a team approach. The SANS Institute Digital Forensic and Incident Response (DFIR) program has conducted extensive research over the past six years, constructing teams of three, four, and five analysts. The teams were given a forensic image and approximately six hours to identify and analyze digital artifacts and present their findings. A four-person team was found to be the optimal size to efficiently conduct a collaborative analysis of digital evidence. Focusing on high-value digital artifacts, also called “compass points,” analysts can quickly reconstruct events that occurred.

Compass Points or High Value Digital Artifacts

Often, when supporting an investigation, it is helpful to focus on compass points that help prove particular elements of the investigation. This section will highlight some of the compass points that are frequently of most value. This is not an exhaustive list, but only a few of the most valuable digital artifacts in each category.

The information below is provided so prosecutors will have a general awareness of the type of information that may be available through digital investigative analysis. Prosecutors should not use the information below as a “checklist” or “to do list” when working with agents or analysts. As digital artifacts can change with every operating system update or patch, the Cybercrime Lab is available to discuss and consult on any digital evidence matter and can be reached by calling CCIPS at (202) 514-1026 and asking for any available digital investigative analyst.

Location information

NetworkLists Signatures and Profiles — Since connecting to a network is generally proximity dependent, that is, you must be within the range of the wired or wireless network to connect to it, one easy way to prove a computer was at a specific location at a specific time is to identify when the computer was connected to particular networks. The most valuable artifacts that document the networks to which a computer connected are in the “Windows Software registry hive.” The Windows registry is essentially several databases that track system and application configuration information, as well as user activity. Although additional registry keys exist, two registry keys in the Software registry hive track every network to which the computer has ever connected. The “\Microsoft\Windows NT\CurrentVersion\NetworkList\Signatures\Unmanaged” key tracks the network description, the MAC address of the default gateway router, and the domain name of each network to which the computer ever connects. It also records a profile “global unique identifier” (GUID) for each network. The “\Microsoft\Windows NT\CurrentVersion\NetworkList\Profiles” registry key tracks all profile GUIDs and, for each network, records the date and time the computer first and last successfully connected and how the connection was made (e.g., wireless, wired, 3G).
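The first and last connection times in the Profiles key are stored as raw binary rather than readable text. The sketch below decodes one such value, assuming it is a 16-byte Windows SYSTEMTIME structure (eight little-endian 16-bit fields) recorded in the computer's local time, and assuming the values are the ones commonly named DateCreated and DateLastConnected. Those details are not stated in this article and should be confirmed against the specific Windows version under examination.

```python
import struct
from datetime import datetime


def decode_systemtime(raw: bytes) -> datetime:
    """Decode a 16-byte SYSTEMTIME value into a naive (local time) datetime.

    Field order: year, month, day of week, day, hour, minute, second, milliseconds.
    """
    if len(raw) != 16:
        raise ValueError("a SYSTEMTIME value is expected to be exactly 16 bytes")
    year, month, _day_of_week, day, hour, minute, second, millis = struct.unpack("<8H", raw)
    return datetime(year, month, day, hour, minute, second, millis * 1000)


# Illustrative value only: 15 March 2016, 09:30:45 (a Tuesday, day-of-week 2).
sample = struct.pack("<8H", 2016, 3, 2, 15, 9, 30, 45, 0)
print(decode_systemtime(sample))
```

Because the decoded time carries no time zone, an analyst would still need the computer's configured time zone to place the connection on a reliable timeline.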

A MAC Address, short for “Media Access Control” address, is a hardware address that uniquely identifies each node of a network, similar to a serial number. Because MAC addresses serve like a unique serial number for each wireless router, you can search available databases to attempt to identify each router’s geolocation. By default, all Apple and Android phones are configured to routinely scan for wireless networks around you and collect the network names and MAC addresses (along with other information) and send the collected information (along with your phone’s GPS location) back to Apple and Google, which use this information to provide you and others with quicker and more accurate location information. If you ever turn off your WiFi antenna on your Apple or Android phone and then use any application that queries location services, this is why you will receive a notification informing you that you can receive more accurate location information if you turn on your WiFi antenna. A free open source database frequently used to look up the location of a network MAC address is www.wigle.net.

Event Logs — While the software registry hive tracks the first and last successful connection to each network, Microsoft started keeping more robust Windows event logs starting with Windows Vista. The “WLAN-Autoconfig” event log creates an event ID-8001 record with the network name and MAC address for each successful connection to a wireless network. An event ID-8002 record is created and records each unsuccessful wireless connection attempt (e.g., the user does not have the password or types it incorrectly). If a user attempts to connect to the Internet without a proper network connection, Windows will offer to diagnose the problem. If the user agrees to the diagnostics, an event ID-6100 record is created and records the name, MAC address, network name, and signal strength for every wireless network the computer can see at that point in time.
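Once event log records have been exported, separating successful connections from failed attempts is a simple filter on the event ID. The sketch below assumes the records have already been exported into Python dictionaries. The field names (event_id, timestamp, network_name, mac) are hypothetical and do not correspond to any particular tool's export format.

```python
SUCCESSFUL_CONNECTION = 8001  # successful wireless connection
FAILED_CONNECTION = 8002      # unsuccessful wireless connection attempt


def successful_connections(records):
    """Return (timestamp, network name, MAC) tuples for event ID 8001, oldest first."""
    hits = [
        (r["timestamp"], r["network_name"], r["mac"])
        for r in records
        if r["event_id"] == SUCCESSFUL_CONNECTION
    ]
    return sorted(hits)


def failed_attempts(records):
    """Return (timestamp, network name) tuples for event ID 8002, oldest first."""
    return sorted(
        (r["timestamp"], r["network_name"])
        for r in records
        if r["event_id"] == FAILED_CONNECTION
    )
```

Timestamps are assumed to be ISO 8601 strings in a single time zone so that they sort chronologically as text.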

SRUM — An additional lesser known digital artifact that tracks networks to which a computer is connected is the “System Resource Usage Monitor” (SRUM). Starting in Windows 8, Microsoft began monitoring system resource usage and recording that information in an “extensible storage engine” (ESE) database called SRUM. SRUM records each network to which the computer is connected, the network name, the connection start time and duration, the user account responsible for the connection, and the volume of network activity from all applications running (even if the application is not installed on the computer and runs from an external USB drive). SRUM collects and documents this information on an hourly basis, so an examination of SRUM data would allow you to determine within 59 minutes which applications were running and how much data each application transferred (uploaded or downloaded) across the network.

In addition to using SRUM to identify when a user connected to a specific network and for how long, SRUM data may be evidence of an employee transferring mass amounts of data from the corporate network to her laptop before leaving the company. This activity would likely appear in SRUM as the Windows Explorer application transferring the large amount of network data inbound to her computer. If the employee then went to the local coffee shop, connected to its wireless network, and uploaded the data to a web-based storage location (like Dropbox), SRUM would show the large outbound network transfer (likely proportionate to the inbound transfer on the corporate network) on the coffee shop wireless network connection. The SRUM Database is located in the “C:\Windows\System32\sru\” directory.
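The pattern in this scenario, a large inbound transfer on one network followed by a comparable outbound transfer on another, can be expressed as a simple comparison of totals. The sketch below works on rows that have already been extracted from SRUM into dictionaries. The field names, the minimum size, and the 25 percent tolerance are hypothetical choices for illustration, and the sketch does not read the ESE database itself.

```python
from collections import defaultdict


def totals_by_network(rows):
    """Sum bytes received and sent for each network name."""
    totals = defaultdict(lambda: {"received": 0, "sent": 0})
    for row in rows:
        entry = totals[row["network"]]
        entry["received"] += row["bytes_received"]
        entry["sent"] += row["bytes_sent"]
    return dict(totals)


def possible_exfiltration(rows, source_network, min_bytes, tolerance=0.25):
    """Return other networks whose outbound total roughly matches the
    inbound total seen on source_network (within the given tolerance)."""
    totals = totals_by_network(rows)
    inbound = totals.get(source_network, {"received": 0})["received"]
    if inbound < min_bytes:
        return []
    return sorted(
        name
        for name, total in totals.items()
        if name != source_network
        and abs(total["sent"] - inbound) <= tolerance * inbound
    )
```

A match from a check like this is only an investigative lead. The analyst would still need to confirm which applications were responsible and when the transfers occurred.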

File knowledge and access

Windows Searches — For years, one challenge in digital investigative analysis has been proving that a user not only had something significant to an investigation on his computer, but that he knew it was there. Two of the easiest ways to help prove knowledge of a file are to prove the user was searching for it or accessed it. In order for Microsoft to enhance the user experience, Windows tracks the names of files you access and search for in multiple locations. As previously discussed, the Windows registry is essentially several databases called registry hives. Each user has his own primary registry hive called the NTUSER.DAT. This registry hive tracks information specific to each user’s activity and preferences. Starting in Windows 7, when a user conducts a search on his computer using the Windows search function or the “Charm Bar” in Windows 8-10 (the magnifying glass that appears when you move your mouse to the right edge of the screen), Windows records each search in temporal order in the “NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery” registry key. Because the searches are recorded in temporal order, an analyst can frequently see indications of the user’s thought process as he searched for particular files.
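The temporal ordering can be reconstructed from the raw values of the key. The sketch below assumes the common layout for this kind of key: an MRUListEx value holding 32-bit little-endian indexes, most recent first and terminated by 0xFFFFFFFF, plus numbered values holding each search term as a null-terminated UTF-16LE string. That layout is an assumption drawn from general registry practice rather than from this article, and the input dictionary stands in for whatever registry parser the analyst uses.

```python
import struct

MRU_TERMINATOR = 0xFFFFFFFF


def mru_order(mru_list_ex: bytes):
    """Return value names in the order listed (assumed most recent first)."""
    usable = mru_list_ex[: len(mru_list_ex) // 4 * 4]  # ignore any trailing partial entry
    order = []
    for (index,) in struct.iter_unpack("<I", usable):
        if index == MRU_TERMINATOR:
            break
        order.append(str(index))
    return order


def decode_searches(values: dict):
    """Return search terms in MRUListEx order from a name-to-bytes mapping."""
    searches = []
    for name in mru_order(values["MRUListEx"]):
        raw = values.get(name)
        if raw is None:
            continue  # index listed but value missing
        searches.append(raw.decode("utf-16-le").rstrip("\x00"))
    return searches
```

Reading the decoded list from last to first gives the sequence in which the user refined his searches.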

File Access — Windows also records in numerous artifacts when a user opens or attempts to open non-executable files. Four of the most useful digital artifacts to identify files opened or attempted to be opened are “LNK” files (pronounced as “link” files), Jump Lists, and several “most recently used” registry keys.

LNK files — A LNK file is an artifact that has existed since Windows XP. LNK files are also known as “Windows Shortcut” files and are created anytime a user opens or attempts to open a non-executable file. A LNK file is created even if the file opened is on a network or external drive. When an opened file is later deleted, its LNK file does not get deleted with it. Windows creates and stores approximately 149 LNK files in the user’s home directory under the “AppData\Roaming\Microsoft\Windows\Recent” directory. LNK files contain a wealth of information including the modified, accessed, and created dates and times of the file opened; the full directory path, volume name, and volume serial number from which the file was last opened; and the file size.
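The timestamps and file size mentioned above sit in the fixed 76-byte header at the start of every LNK file. The sketch below reads those fields, following the header layout in Microsoft's published Shell Link (.LNK) binary format specification, a source outside this article. It does not parse the path or volume information that follows the header, and the function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

LNK_HEADER_SIZE = 0x4C
LNK_CLSID = bytes.fromhex("0114020000000000c000000000000046")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int):
    """Convert 100-nanosecond intervals since 1601-01-01 UTC; zero means 'not set'."""
    if filetime == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_lnk_header(data: bytes) -> dict:
    """Extract target timestamps and size from the start of a LNK file."""
    if len(data) < LNK_HEADER_SIZE:
        raise ValueError("data is shorter than a LNK header")
    # Fields: header size, class ID, link flags, file attributes,
    # creation time, access time, write time, file size.
    size, clsid, _flags, _attributes, created, accessed, modified, file_size = (
        struct.unpack_from("<I16sIIQQQI", data, 0)
    )
    if size != LNK_HEADER_SIZE or clsid != LNK_CLSID:
        raise ValueError("data does not begin with a LNK header")
    return {
        "created": filetime_to_datetime(created),
        "accessed": filetime_to_datetime(accessed),
        "modified": filetime_to_datetime(modified),
        "file_size": file_size,
    }
```

A typical use would be parse_lnk_header(open(path, "rb").read(76)) for each file in the Recent directory, with the results sorted by timestamp.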

Starting in Windows 10, Microsoft added rules for when LNK files are created beyond when files are opened. On earlier versions of Windows 10, a LNK file was created for the directory to which any file was copied. The creation of a LNK file for the directory to which a file was copied was stopped on later versions of Windows 10. However, on versions as early as version 1607, Microsoft created a LNK file for the directory a file is opened from. Additionally, when a directory is created, Windows creates a LNK file for the directory created and for the created directory’s “parent” and “grandparent” directory. In addition to all the information LNK files record, LNK files also record the last time a file was opened.

Jump Lists — One of the newest artifacts to identify files opened by a user is the “Jump List.” Starting in Windows 7, Microsoft introduced two types of jump lists: “AutomaticDestinations” and “CustomDestinations.” Automatic and Custom jump lists are created and stored in their respective directories in each user’s home directory under the “AppData\Roaming\Microsoft\Windows\Recent” directory. Each application can incorporate its own jump lists as a “mini-start” menu. AutomaticDestinations allow a user to quickly “jump” to or access files they recently or frequently used, usually by right-clicking the application in the Windows taskbar. CustomDestinations allow a user to pin recent tasks, such as opening a new browser window or creating a new spreadsheet, to the jump list.

Jump lists are essentially mega LNK files. Each jump list can record upwards of the last 1,000 files opened by each application. As jump lists are essentially compound LNK files, they contain all the same information as LNK files, such as when each file was opened; the modified, accessed, and created dates and times of the file; the full directory path, volume name, and volume serial number from where the file was last opened; and the file size.

Most Recently Used (MRU) Registry Keys – As previously mentioned, the Windows Registry is a series of massive databases that track system configuration and user activity. There are several registry keys that track most recently used items. An analysis of these registry keys can help an analyst quickly identify files accessed. Every application developer has the option of creating registry keys specific to his application configuration and user activity. Three of the most useful registry keys that track files accessed are “RecentDocs,” “Microsoft Office FileMRU,” and “OpenSavePIDMRU.”

RecentDocs — The “RecentDocs” registry key tracks the name and order of the last 10 files opened for every file extension (e.g., .doc, .docx, .jpg, etc.). The registry organizes each of the last 10 files opened in sub keys named by the file extension. A sub key named “folder” is also created when the first folder is opened using the Windows Explorer. This sub key tracks the name of the last 30 folders opened. Each user has his own RecentDocs registry key located in his NTUSER.DAT registry hive under the “\Software\Microsoft\Windows\CurrentVersion\Explorer” registry key. The master RecentDocs key maintains a master list, organized in temporal order, of the last 150 files or folders opened. By analyzing the order in which particular files were opened, analysts have often been able to refute claims that a single type of file was opened by mistake. In one trade secret case, it was helpful for the analyst to show the pattern of files opened that all related to the same subject matter.
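As an illustration of how the temporal order can be recovered, the sketch below assumes the commonly documented layout in which the order is kept in a binary value named “MRUListEx”: a run of 4-byte little-endian integers, most recent first, ending with 0xFFFFFFFF, where each integer names a numbered value holding one file or folder entry. The value name, the function names, and the sample data are not taken from this article and should be treated as assumptions.

```python
import struct

MRU_TERMINATOR = 0xFFFFFFFF  # assumed end-of-list marker


def parse_mrulistex(data):
    """Return entry numbers in most-recent-first order from raw MRUListEx bytes."""
    order = []
    usable = len(data) - (len(data) % 4)  # ignore any trailing partial integer
    for offset in range(0, usable, 4):
        (index,) = struct.unpack_from("<I", data, offset)
        if index == MRU_TERMINATOR:
            break
        order.append(index)
    return order


def names_in_opened_order(mrulistex, names_by_index):
    """Map entry numbers to already-decoded names, most recently opened first."""
    return [names_by_index[i] for i in parse_mrulistex(mrulistex) if i in names_by_index]
```

For example, with hypothetical entries `{0: "budget.xlsx", 1: "formula.docx", 2: "clients.docx"}` and an order list of 2, 0, 1, the helper returns the names in the order clients.docx, budget.xlsx, formula.docx, which is the kind of sequence an analyst would present to show a pattern of access.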

Application-Specific Most Recently Used (MRU) — With every Windows application, developers have the ability to create their own set of registry keys to track specific configuration and user activity for their application. If a specific application is used to commit or facilitate a crime or is otherwise significant to an investigation, it is often advantageous for the analyst to determine both whether the application has its own set of registry keys and what actions those keys record. Two excellent examples are “WinZip,” which records the names of the last several zip files created, and the Microsoft Office suite of applications. Each application in the Office suite has its own “FileMRU” (most recently used files) key that tracks the most recent files used and when they were opened. Additionally, starting with Office 365 and Office 2016, Microsoft Office tracks the “reading location” for each Word, PowerPoint, and Excel document opened and when each file was closed. Using this information, an analyst can determine not only what document was last opened and when it was closed, but also, for example, that the user had scrolled to page 32 of the document when it was closed.

OpenSavePIDMRU — Windows has some basic dialog boxes that all programs can use when a user opens or saves a file. Some may have noticed that when saving files, a dropdown arrow appears in the file name entry field of the dialog box. By clicking on the arrow, you will see several of the most recent file names you have saved with that application. These file names are saved as part of the “OpenSavePIDMRU” registry key, which is located under the “NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU” registry key. A record of the names of the last 10 to 25 files opened or saved using the Windows Common Dialog Box is stored under sub keys based on file extension.

Directory locations used

With the extensive storage capacity of standard hard drives today, it is often a challenge to find where users are storing information, particularly if they are trying to hide it. One technique digital investigative analysts can use to locate the directories from where a user is saving or accessing files is to analyze where the user has navigated, even when they did not open or save a file. We have already discussed several artifacts, such as LNK files, jump lists, and several MRU registry keys that document the full directory path where files were opened or saved. There is one additional artifact, the “LastVisitedPIDMRU,” that, for each application, specifically tracks the last directory navigated to when opening or saving a file. Another artifact that also tracks the directories a user navigates, even when they do not open or save a file, is “ShellBags.”

LastVisitedPIDMRU — The “LastVisitedPIDMRU” is a registry key located in the user’s NTUSER.DAT registry hive in the “\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\” registry key. This key tracks the last directory in which a file was opened or saved for each application. This is why, when you go to open a document, the MS Word dialog box opens the directory in which you last opened or saved a Word document. In a recent case, the analysis of the LastVisitedPIDMRU registry key revealed the user had last opened a Word document from a hidden TrueCrypt container that was previously mounted as “e:\HiddenTruecryptFolder.” The data is stored in binary format, so conversion is necessary, but many registry forensic tools make this easy work.
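The binary conversion can be sketched roughly as follows. The sketch assumes, as is commonly documented but not stated in this article, that each value begins with the application's executable name as a null-terminated UTF-16LE string, followed by binary shell item data describing the folder. The function name is hypothetical, and decoding the folder portion is left to a dedicated registry tool.

```python
def split_leading_utf16_string(data):
    """Split raw value bytes into (leading UTF-16LE string, remaining bytes).

    Scans two bytes at a time for the UTF-16 null terminator so that a zero
    high byte inside an ordinary character is not mistaken for the end.
    """
    for offset in range(0, len(data) - 1, 2):
        if data[offset:offset + 2] == b"\x00\x00":
            text = data[:offset].decode("utf-16-le", errors="replace")
            return text, data[offset + 2:]
    # No terminator found: treat the whole (even-length) buffer as text.
    usable = len(data) - (len(data) % 2)
    return data[:usable].decode("utf-16-le", errors="replace"), b""
```

Applied to a value that starts with the bytes for “WINWORD.EXE”, the first element of the result is the application name and the second is the still-encoded folder data.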

ShellBags — Windows tracks user display preferences for the Windows Explorer in a registry key called “ShellBags,” located in the “UsrClass.dat” registry hive. Anytime a user changes the way files are displayed in the Windows Explorer, everything from what columns are visible to display mode (e.g., large icons, small icons, details list, etc.), the user’s preferences are updated in ShellBags, and the recent navigation history is recorded. If you have ever changed a folder’s view settings and returned to that folder to find your new preferences intact, then you have seen ShellBags in action.

ShellBags records information only for folders that have been opened and closed in Windows Explorer at least once. In other words, the simple existence of a directory in ShellBags is evidence the specific user account once visited that folder. ShellBags also records when that directory was first visited or last updated. Sometimes, ShellBags also records information regarding the files in the listed folders. An analyst can use ShellBags information to refute an individual’s claims to have no knowledge about a directory with incriminating information inside. On more than one occasion, information from ShellBags has been used to prove someone using a specific user account had knowledge of an encrypted container because they had navigated there previously.

Applications Used

Prefetch — A good first stop for identifying applications run on a computer is the Windows “Prefetch” directory. Prefetching is the process of loading information from the hard drive into memory before it is needed. Prefetch began in Windows XP and is located in the “Windows\Prefetch” directory. Prior to Windows 10, a maximum of 134 prefetch files were stored at a time, as compared to 1,024 prefetch files stored with Windows 10. The prefetch file is designed to essentially be an audit log of all the files needed to execute a particular application. Any time an application is launched, Windows monitors the launch and records in the prefetch file the name and full directory path of every file accessed during the initial execution of the application. Starting in Windows 8, prefetch also maintains the date and time the application was executed and the total number of times that application was run. Because users frequently open files (e.g., pictures, documents, spreadsheets, etc.) by double-clicking on the file, analysts can often find the name and full path of several files opened by each application inside the prefetch file.

Imagine the value of identifying an otherwise covert application used to facilitate a crime that was launched from a USB drive or inside an encrypted directory. The application could even have been deleted from the computer, but a prefetch file will likely still exist showing when and how often the user executed the application. In one investigation, the defendant was identified using a portable Firefox browser on a thumb drive to surf the Internet, leaving no temporary Internet cache or other evidence on the office computer. Examination of the computer’s prefetch files showed Firefox being launched from a USB drive, and when investigators obtained a warrant to search the portable thumb drive, they found it contained significant evidence of criminal activity and incriminating bookmarks.
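One quick triage step can be sketched from the prefetch file names alone. Prefetch files are conventionally named with the executable name, a hyphen, and an eight-digit hexadecimal hash that is generally derived from the location the program was launched from, so the same executable name appearing with more than one hash can suggest it was run from more than one location (for example, both an installed copy and a portable copy). The function names and sample file names below are hypothetical, and the hash interpretation is an assumption to be confirmed by parsing the files themselves.

```python
import re
from collections import defaultdict

# EXECUTABLE-XXXXXXXX.pf, where XXXXXXXX is eight hexadecimal digits
PF_NAME = re.compile(r"^(?P<exe>.+)-(?P<hash>[0-9A-F]{8})\.pf$", re.IGNORECASE)


def parse_prefetch_name(filename):
    """Return (executable name, hash) or None if the name does not match."""
    match = PF_NAME.match(filename)
    if match is None:
        return None
    return match.group("exe").upper(), match.group("hash").upper()


def executables_with_multiple_hashes(filenames):
    """Group prefetch file names by executable and keep those with 2+ hashes."""
    hashes = defaultdict(set)
    for filename in filenames:
        parsed = parse_prefetch_name(filename)
        if parsed is not None:
            exe, path_hash = parsed
            hashes[exe].add(path_hash)
    return {exe: sorted(found) for exe, found in hashes.items() if len(found) > 1}
```

A result that lists a browser under two different hashes is only a lead; the full paths recorded inside each prefetch file are what show where the application was actually launched from.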

UserAssist — The “UserAssist” registry key tracks all applications run with a graphical user interface. The UserAssist registry key is frequently an artifact that complements the Windows Prefetch artifact previously discussed. Like Prefetch, the UserAssist key tracks applications run and the number of times each application is executed; however, UserAssist also tracks the “focus count” and “focus time.” Focus count records the number of times the application has come into primary focus of the Windows desktop. Focus time tracks the total time, down to the millisecond, each application was in primary focus on the Windows Desktop. This artifact has been useful when a defendant claims he had no knowledge that a specific application had run and suggests it must have been running in the background. With UserAssist, the analyst can tell exactly how often the application was run and how many hours, minutes, and seconds the application was the foremost active application on the desktop.
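Two small conversions usually come up when reading UserAssist data by hand, and they can be sketched as follows. The entry names under the key are stored ROT13-encoded, and the focus time is a count of milliseconds that needs to be expressed in hours, minutes, and seconds for a report. The function names and the sample path are hypothetical; extracting the focus time from the binary value is assumed to have been done already by a registry tool.

```python
import codecs


def decode_userassist_name(encoded_name):
    """Decode a ROT13-encoded UserAssist entry name into readable text."""
    return codecs.decode(encoded_name, "rot_13")


def format_focus_time(milliseconds):
    """Express a focus time in milliseconds as hours, minutes, and seconds."""
    seconds, millis = divmod(milliseconds, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}.{millis:03d}s"


print(decode_userassist_name(r"P:\Gbbyf\sversbk.rkr"))  # C:\Tools\firefox.exe
print(format_focus_time(3723004))                       # 1h 02m 03.004s
```

ROT13 only shifts letters, so digits, backslashes, and punctuation in the stored name pass through unchanged.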

SRUM — As mentioned before, the System Resource Usage Monitor (SRUM) is an Extensible Storage Engine (ESE) database located in the “c:\Windows\system32\sru” directory. Each hour, SRUM records every application running at that time and what user account is responsible for executing the application. Each hour, SRUM also records for each application the number of bytes written to and read from disk and the bytes sent and received over the network. SRUM can be particularly useful in documenting the amount of data shared or downloaded by a particular peer-to-peer network program or, in a hacking case, how much data was exfiltrated out of the corporate network and when.
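Once the hourly SRUM records have been exported from the ESE database by a separate tool, totaling them is straightforward. The sketch below assumes each exported row is a dictionary with the hypothetical field names `app`, `user`, `bytes_sent`, and `bytes_received`; real export tools will use their own column names.

```python
from collections import defaultdict


def total_network_bytes(rows):
    """Sum bytes sent and received per (application, user account) pair."""
    totals = defaultdict(lambda: {"sent": 0, "received": 0})
    for row in rows:
        key = (row["app"], row["user"])
        totals[key]["sent"] += row["bytes_sent"]
        totals[key]["received"] += row["bytes_received"]
    return dict(totals)


def largest_senders(rows, limit=5):
    """Return the (application, user) pairs that sent the most data."""
    totals = total_network_bytes(rows)
    ranked = sorted(totals.items(), key=lambda item: item[1]["sent"], reverse=True)
    return ranked[:limit]
```

Keeping the hourly rows alongside the totals matters in practice, because the per-hour figures are what show when a large transfer took place.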

External USB Storage Devices

USB Storage Devices — Whether you are prosecuting theft of trade secrets, computer crime, or child pornography, tracking “thumb drives” (more accurately referred to as “USB storage devices”) that have been connected to a single computer or across multiple systems can be crucial to an investigation.

To qualify for the Windows Logo Program “Designed for Windows,” a USB device must have a unique serial number. This device serial number is burned into the firmware of the USB device and cannot be changed. Because this is unique to the USB device, it can be used to track when a specific USB device was inserted into multiple computer systems. If a device does not conform to the Windows Logo Program and has no unique device serial number, analysts can track such a USB device across multiple computers by using the “volume serial number.” As long as the USB device is not reformatted, the volume serial number will remain the same across all Windows devices. The volume name of the device (e.g., “Kingston Data Traveler” or “My Evil USB”), its device serial number, manufacturer, product identification (PID) number, and revision of a specific USB drive can be located in the “SYSTEM\CurrentControlSet\Enum\USBStor” registry key.

Starting in Windows 8, the first and last time a USB device was connected to a computer and the last time it was disconnected are recorded in the “\CurrentControlSet\Enum\USBStor\Ven_Prod_Version\Device_serial#\Properties\{83da6326-97a6-4088-9453-a1923f573b29}\” registry key. There are three sub-keys that have a 64-bit hex timestamp in the key value that will identify these times: (a) first time USB device was inserted (Sub-Key: 0064); (b) last time USB device was inserted (Sub-Key: 0066); (c) last time USB device was unplugged (Sub-Key: 0067). Analysts can also identify the user account that was logged on when a device was connected by taking the USB device’s Globally Unique Identifier (GUID) from the “SYSTEM\MountedDevices” registry key and looking for it in the user’s “NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\” registry key.
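Converting one of these 64-bit timestamps into a readable date can be sketched as follows, assuming the value is a Windows FILETIME (a count of 100-nanosecond intervals since January 1, 1601, UTC) stored as eight little-endian bytes. The function name is hypothetical, and the sample bytes are constructed for illustration rather than taken from a real device.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_bytes_to_datetime(raw):
    """Convert eight little-endian FILETIME bytes to a UTC datetime."""
    if len(raw) != 8:
        raise ValueError("a FILETIME is exactly 8 bytes")
    ticks = int.from_bytes(raw, "little")  # 100-nanosecond intervals
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Illustrative value only: the FILETIME for 2017-01-01 00:00:00 UTC
sample = (131277024000000000).to_bytes(8, "little")
print(filetime_bytes_to_datetime(sample).isoformat())  # 2017-01-01T00:00:00+00:00
```

The result is in UTC, so it still needs to be adjusted to the relevant local time zone before being compared with witness statements or other locally recorded times.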

Although not as definitive as the artifacts above, starting in Windows 8, an Event ID 20 entry is created in the “Application and Services Logs\Microsoft\Windows\Audio\PlaybackManager” log every time a USB device is connected to a system or removed without being properly ejected. Each event is associated with the audio tone heard when a device is connected or disconnected without going through the eject process. This event log does not identify a specific device, but an entry is created each time a device is attached or removed improperly and may be corroborative when combined with other artifacts, such as LNK files or jump lists.

Validating the Time is Correct

Analysts are often reluctant to commit to a timestamp of a file on a computer being accurate. Some will correctly point out that the time stored in the “Complementary metal–oxide–semiconductor” (CMOS /ˈsiːmɒs/) could have been changed prior to the operating system being booted. When the time an activity took place on a computer is critical, there are two artifacts an analyst can check.

Event ID 1 and 4616 — The Windows operating system routinely reaches out to one of several time servers (e.g., time.windows.com, time.nist.gov) to synchronize the computer’s clock. Each time the clock is synchronized, the time of the computer is adjusted by milliseconds. An Event ID 4616 or Event ID 1 record is created whenever the time on the computer is changed. The event log records when the time was changed, the previous time, and the new time. If a significant event occurred and it is critical to validate that the time of the event was recorded correctly on the computer, an analysis of event logs surrounding the event could be conducted to determine whether the computer was synchronized with the time server before and after the event at issue.
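The reasoning above can be sketched as a simple filter: routine synchronizations move the clock by a tiny amount, so any time-change record with a large difference between the previous and new time deserves a closer look. The sketch assumes the time-change events have already been exported into dictionaries with the hypothetical keys `previous_time` and `new_time`, and the five-second threshold is an arbitrary illustration rather than a recommended value.

```python
from datetime import datetime, timedelta


def significant_time_changes(events, threshold=timedelta(seconds=5)):
    """Return (event, delta) pairs where the clock moved more than threshold."""
    flagged = []
    for event in events:
        delta = event["new_time"] - event["previous_time"]
        if abs(delta) > threshold:  # catches both forward and backward changes
            flagged.append((event, delta))
    return flagged


events = [
    # Routine synchronization: a 12-millisecond correction
    {"previous_time": datetime(2017, 1, 5, 9, 0, 0, 0),
     "new_time": datetime(2017, 1, 5, 9, 0, 0, 12000)},
    # Clock set back by roughly two days
    {"previous_time": datetime(2017, 1, 5, 10, 0, 0),
     "new_time": datetime(2017, 1, 3, 10, 0, 0)},
]
for event, delta in significant_time_changes(events):
    print(event["previous_time"], "->", event["new_time"], "change of", delta)
```

An empty result for the period surrounding the event at issue supports the position that the clock was only ever adjusted by routine synchronization.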

Event Log Record Sequence Numbers — All event logs have a hidden field about which most users and even administrators are unaware. For every event log, each event record is given a sequentially numbered “record number.” If the time of an event is called into question by the suggestion that someone could have changed the date or time on the computer before the operating system started, an analyst could refute the claim by reviewing all of the event logs to show that no event record was out of chronological order relative to its record number. If someone changed the computer time forward or backward (typically in an attempt to establish an alibi), the event record numbers would clearly reveal this activity.
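This check can be sketched as follows: sort the records of one log by record number and look for any place where the recorded time goes backward even though the record number went up. The sketch assumes the records have already been exported as (record number, timestamp) pairs; the function name and the sample values are hypothetical.

```python
from datetime import datetime


def find_time_regressions(records):
    """Return adjacent record pairs whose timestamps run backward.

    records: iterable of (record_number, timestamp) pairs from a single log.
    """
    ordered = sorted(records, key=lambda record: record[0])
    regressions = []
    for previous, current in zip(ordered, ordered[1:]):
        if current[1] < previous[1]:
            regressions.append((previous, current))
    return regressions


log = [
    (101, datetime(2017, 1, 5, 9, 0)),
    (102, datetime(2017, 1, 5, 9, 30)),
    (103, datetime(2017, 1, 3, 22, 15)),  # later record, earlier time
    (104, datetime(2017, 1, 3, 22, 20)),
]
for previous, current in find_time_regressions(log):
    print("record", current[0], "is timestamped before record", previous[0])
```

This only detects the clock being set backward; a clock set forward shows up instead as an unusually large gap between consecutive records, which is best judged alongside the time-change events described earlier.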

Going back in time

Volume Shadow Copies — Beginning with Windows Vista, Microsoft started taking snapshots of almost every file on the system by default. Volume Shadow Copies are copies of files that have been modified since the last “system restore point” was made. Volume Shadow Copies have great potential to help law enforcement identify and document earlier versions of files or folders.

While Volume Shadow Copies are not as granular as saving every version of a saved document, they do provide significant information. In the user interface, the existence of previous versions of a document can be identified by right-clicking on the file or folder and then selecting “restore previous versions.” The user has the option to open, copy, or restore any of the previous versions. With previous versions, it may be possible to restore a shadow copy of a file that was deleted, even after the recycle bin has been emptied. The one caveat is that the analyst must know the original location of the file or folder. Testing has shown that if previous versions of a file are available and the file is moved to a new location on the hard drive, the list of previous versions will appear empty. This presents an interesting opportunity for forensic examiners to mount the volume shadow copy to their forensic workstation and examine previous versions of significant files or any specific digital artifacts.

Conclusion

There can be no doubt that significant challenges have recently arisen in the field formerly known as “computer forensics.” Law enforcement can help manage these challenges by rethinking the analyst’s role in the investigatory process. The prosecutor, agent, and analyst are all best served when collaborating and becoming an integral part of the investigation. Analysts can become more effective through a team-based, phased approach to digital investigative analysis. There is no longer any such thing as a “full forensic analysis,” so all analysis must be iterative. Once the analyst understands the changing needs of the prosecutor and agent, she is best positioned to identify the critical artifacts that will help establish the elements of the crime and respond to any likely defenses that may arise.

Prosecutors interested in these and other digital evidence issues and techniques can call CCIPS and the Cybercrime Lab at (202) 514-1026; both are also available for consultation on digital investigative analysis and other technical investigative matters. Many other resources are available on our section's public website, www.cybercrime.gov.


About the Author


    Ovie Carroll is the Director of the Department of Justice Cybercrime Lab at the Computer Crime and Intellectual Property Section (CCIPS) and has over 30 years of law enforcement experience. The Cybercrime Lab provides advanced digital investigative analysis, cybercrime investigative support, and other technical support to DOJ prosecutors as it applies to implementing the Department's national strategies in digital evidence, combating electronic penetrations, data thefts, and cyber attacks on critical information systems.

    Mr. Carroll is also an adjunct professor with George Washington University for the Masters of Forensic Science program and is also a course author and certified instructor with the SANS Institute, where he teaches advanced computer forensics.

    Mr. Carroll was also the Special Agent in Charge of the Computer Investigations and Operations Branch, Washington Field Office, Air Force Office of Special Investigations (AFOSI), where he was responsible for coordinating national level computer intrusions occurring within the United States Air Force. He has extensive field experience applying his training to a broad variety of investigations and operations. As a special agent with the AFOSI, Mr. Carroll worked general crimes and counterintelligence and conducted investigations into a variety of offenses, including murder, rape, fraud, bribery, theft, gangs, and narcotics.



Department of Justice Journal of Federal Law and Practice
65 U S Attorneys’ Bulletin, Jan 2017.

Article posted April 26, 2019