Reverse-engineering malware enables threat hunters to understand what software does and how it affects a system. It also lets them analyze malware already present on a system to help prevent the software from doing further harm.
Malware analysts have multiple reverse-engineering frameworks to choose from. One option is Ghidra, which was originally developed for internal use by the National Security Agency (NSA) and officially released to the public in 2019.
Malware analyst and author A.P. David wrote Ghidra Software Reverse Engineering for Beginners because he felt there weren't any books available that offered a deep dive into the open source reverse-engineering tool.
"All the relevant reverse-engineering frameworks have at least one book for learning it. At the time of writing my book, there was not a book explaining Ghidra in the depth it deserves," David said. "I considered it necessary to explain the framework in an organized way and for beginners, while still covering the entire framework and advanced reverse-engineering techniques."
Here, David explains what beginner and experienced malware analysts will learn in the book, from what to expect from Ghidra to how complicated the reverse-engineering tool may be for beginner or veteran security researchers and whether to use the framework on its own or in combination with others.
Editor's note: The following interview was edited for clarity and length.
What do you want readers to get from your book on Ghidra reverse-engineering?
A.P. David: Readers should be able to reverse-engineer programs -- i.e., malware samples -- using Ghidra. Even if readers don't master assembly language, they should be able to reverse-engineer programs by reading their decompiled versions. In Ghidra's case, the decompiled version of the program consists of a high-level, pseudo-C programming language code listing. Readers should also be able to use and write their own Ghidra scripts in Java and Python -- and automate them. To reverse-engineer programs efficiently, it is necessary to automate repetitive and time-consuming tasks.
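As an illustration of the kind of script David describes, here is a minimal sketch that lists every function in the program currently open in Ghidra. It is not taken from the book. The helper name `list_functions` is hypothetical, and the sketch assumes it runs under Ghidra's Python scripting environment, which provides the `currentProgram` variable. Outside Ghidra, the guard at the bottom simply does nothing.

```python
# Minimal Ghidra script sketch, kept Python 2 (Jython) compatible.
# Assumption: Ghidra's script runtime injects `currentProgram`.
# `list_functions` is a hypothetical helper name, not part of Ghidra.


def list_functions(program):
    """Return (name, entry point) pairs for every function in `program`."""
    manager = program.getFunctionManager()
    # getFunctions(True) iterates over functions in ascending address order.
    return [(f.getName(), str(f.getEntryPoint()))
            for f in manager.getFunctions(True)]


# Only runs inside Ghidra, where `currentProgram` exists.
if "currentProgram" in globals():
    for name, entry in list_functions(currentProgram):
        print("{} @ {}".format(name, entry))
```

Keeping the logic in a function that takes the program as a parameter makes it easier to reuse the same code from other scripts.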
In the book, I wanted to explain the Ghidra reverse-engineering framework in depth. It is extremely useful for beginners to understand how a reverse-engineering framework works and how the tool can take and show information from a given program.
Since the Ghidra reverse-engineering framework is open source, users can modify it and add features as needed through plugins and extensions. Readers can also submit custom code and suggestions to the NSA repository for possible inclusion in a future release.
Lastly, I wanted to engage readers. I included real examples, discussed Windows structures and covered common malware tricks. This gives readers a better background in reverse-engineering while they also learn the Ghidra framework.
Who will benefit most from reading this book?
David: People who have zero knowledge about Ghidra will get the most out of it. I tried to cover the entire framework using simple language to make it easy to understand. I noticed from reviews and general feedback that advanced reverse-engineers find the book useful, too -- especially when it comes to how to compile Ghidra, use PCode [also written as p-code, it is Ghidra's intermediate representation, into which machine instructions from any supported processor are translated] for scripting and more.
When did you start using Ghidra? What made you use it over other existing frameworks?
David: I started using Ghidra as soon as it came out. I also began to collaborate and report security bugs. I like that the Ghidra reverse-engineering framework is open source and well documented. I enjoyed exploring the source code and being able to debug it. The PCode is impressive because you can support all the architectures by writing a script once. This gives users an amazing advantage -- for instance, to efficiently perform IoT malware analysis where you usually find malware compiled for different kinds of architectures.
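The "write a script once" point can be sketched as follows. Because Ghidra translates each machine instruction into p-code operations, a script that only inspects p-code does not need to know the processor. This is a minimal, hedged sketch rather than David's own code: `pcode_histogram` is a hypothetical helper name, and the script assumes Ghidra's Python scripting environment supplies `currentProgram`.

```python
# Architecture-independent sketch: count p-code operations in a program.
# Assumption: runs as a Ghidra script, where `currentProgram` is injected.
# `pcode_histogram` is a hypothetical helper name, not part of Ghidra.
from collections import Counter


def pcode_histogram(program):
    """Count how often each p-code mnemonic occurs across all instructions."""
    counts = Counter()
    listing = program.getListing()
    # getInstructions(True) walks the instructions in forward address order.
    for instruction in listing.getInstructions(True):
        # getPcode() returns the p-code ops this instruction translates to.
        for op in instruction.getPcode():
            counts[op.getMnemonic()] += 1
    return counts


# Only runs inside Ghidra, where `currentProgram` exists.
if "currentProgram" in globals():
    for mnemonic, total in pcode_histogram(currentProgram).most_common(10):
        print("{}: {}".format(mnemonic, total))
```

The same script should work unchanged on samples built for different processors, since it never refers to a processor-specific instruction name.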
Extending the Ghidra framework by writing new tools, plugins and scripts is neat and makes it powerful. Some of the more interesting things about Ghidra are its collaboration feature, version tracker and huge preexisting script arsenal that is ready to use and modify as needed.
The NSA released the debugger for Ghidra later but also included the PCode emulator. This allows users to write advanced reverse-engineering tools. It was a great surprise.
How long would it take beginner and advanced malware analysts to pick up and start using Ghidra?
David: Readers of all levels should be able to start using Ghidra after a few hours. Beginners will take more time to truly understand the reverse-engineering framework, of course. But, for anyone with basic C programming language knowledge, the decompiled view of Ghidra will be useful, and they will be able to start analyzing malware.
Ghidra is probably the best framework for beginners to learn. Its GUI mode is easy to use and perfect for a beginner, and it also has a non-GUI mode. Since Ghidra is open source, beginners will be able to understand the tool better than if they were using a private reverse-engineering framework.
An advanced malware analyst should be able to use Ghidra or any other GUI-based framework intuitively because the GUI is similar to existing frameworks.
How does Ghidra compare to other available reverse-engineering frameworks, such as Interactive Disassembler (IDA) and Radare2?
David: Radare2 is amazing -- it is extremely fast, has a console mode and, if you enter the 'V' command, you get a visual mode from the console (switch between views using the 'P' key).
But I cannot compare Ghidra to Radare2 because the two frameworks are fundamentally different. I recommend security researchers use them in combination. Ghidra can work with just about any reverse-engineering framework because it is extensible.
IDA and Ghidra are similar, but I recommend Ghidra unless IDA supports a process or something specific that Ghidra does not. I like Ghidra more because it has PCode, which allows you to write a script once and support all architectures, and which has more granularity than assembly language.
Ghidra has two modes that users can take advantage of. The headless mode takes automation to another level. It's useful for processing multiple files; for instance, you can apply automated deobfuscation to hundreds of files. The GUI mode allows you to deeply analyze files with colorful graphs and comments on code snippets, etc. I use both modes, depending on the task.
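The batch workflow described above can be sketched with Ghidra's headless launcher, `analyzeHeadless`, which ships in the `support` directory of a Ghidra installation and accepts `-import` and `-postScript` options. The following is a minimal sketch that only builds the command line. The installation path, project location, sample names and the `Deobfuscate.py` script name are all hypothetical placeholders, as is the helper name `build_headless_command`.

```python
# Sketch: build an analyzeHeadless command that imports several samples and
# runs a post-analysis script on each one.
# Assumptions: every path and the script name below are hypothetical.
import os


def build_headless_command(ghidra_home, project_dir, project_name,
                           samples, post_script):
    """Return the argument list for one headless batch run."""
    # On Windows the launcher is analyzeHeadless.bat instead.
    launcher = os.path.join(ghidra_home, "support", "analyzeHeadless")
    command = [launcher, project_dir, project_name, "-import"]
    command.extend(samples)  # -import accepts one or more files
    command.extend(["-postScript", post_script])  # runs after analysis
    return command


if __name__ == "__main__":
    cmd = build_headless_command(
        "/opt/ghidra",            # hypothetical install location
        "/tmp/ghidra_projects",   # hypothetical project directory
        "batch",                  # hypothetical project name
        ["samples/a.bin", "samples/b.bin"],
        "Deobfuscate.py",         # hypothetical Ghidra script
    )
    print(" ".join(cmd))
```

To actually launch the run, the returned list can be passed to `subprocess.call`. Building the list separately makes it easy to inspect the command before executing it against hundreds of files.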
How common is it for someone using Ghidra to compile a version of it themselves rather than using the executable on GitHub?
David: Compiling Ghidra is necessary when developing Ghidra. From the user's perspective, it is good to understand and experiment with the source code to get more background on what is happening internally when completing an action. Additionally, when a new feature is in development but not released yet, users can compile it instead of waiting for the release version. You can learn new features early, research bugs early and contribute to the development.
Are there any vulnerabilities in Ghidra?
David: Some vulnerabilities in an earlier version of Ghidra allowed adversaries to execute arbitrary code on a victim's machine. Another vulnerability allowed overwriting of arbitrary files. These vulnerabilities are a good reason I suggest readers learn to compile Ghidra on their own. You can patch the program and compile the patched version yourself without waiting for the next release.
Are there any features you would add to Ghidra?
David: I wish it had support for Python 3. Ghidra uses Jython -- a Python interpreter written in the Java programming language -- which is limited to Python 2. Unfortunately, Python 2 is deprecated.
How do you come across files to review via Ghidra?
David: There are several reasons I put files in Ghidra to analyze. For instance, I like to reverse-engineer programs before I install them on my computer to make sure they aren't invading my privacy.
For my work as a malware analyst, I have different powerful resources for getting malware samples but cannot discuss them publicly. One tool that I can suggest security researchers use is VirusTotal.
About the author
A.P. David is a senior malware analyst and reverse-engineer. He has more than seven years of experience in IT, having previously worked on his own antivirus product. He started working for a company to reverse-engineer bank malware and help automate the process. Afterward, David joined the critical malware department of an antivirus company. He is currently working as a security researcher at the Galician Research and Development Center in Advanced Telecommunications (Gradiant) while doing a malware-related Ph.D. In his free time, he has hunted vulnerabilities in well-known products, including Microsoft's Windows 10 and the NSA's Ghidra project.