Building & programming with “diet” engine

This documentation explains how to build Capstone so that the libraries are as small as possible for embedding purposes.

The later part presents the APIs related to this feature and points out the areas programmers should pay attention to in their code.

1. Building “diet” engine

Typically, Capstone is used in ordinary applications, where the library size does not really matter. Indeed, as of version 2.1-RC1, the whole engine is only 1.9 MB including all architectures, a size that raises no issue for most people.

However, there are cases when we want to embed Capstone into special environments, such as OS kernel drivers or firmware, where its size should be as small as possible due to space restrictions. While we can always compile only selected architectures to make the libraries more compact, we still want to slim them down further.

To this end, since version 2.1, Capstone supports diet mode, in which some non-critical data is removed, making the engine considerably smaller (40% or more for some architectures).

By default, Capstone is built in standard mode. To build the diet engine, do the following (demonstrated on *nix systems):

$ sudo ./ install

If we build only selected architectures, the engine is even smaller. The table below lists the size of each individual architecture compiled in diet mode.

Architecture       Library            Standard binary  “Diet” binary  Reduced size
Arm                libcapstone.a      730 KB           603 KB         ~17%
                   libcapstone.dylib  599 KB           491 KB         ~18%
Arm64              libcapstone.a      519 KB           386 KB         ~26%
                   libcapstone.dylib  398 KB           273 KB         ~31%
Mips               libcapstone.a      206 KB           136 KB         ~34%
                   libcapstone.dylib  164 KB           95 KB          ~42%
PowerPC            libcapstone.a      140 KB           69 KB          ~51%
                   libcapstone.dylib  103 KB           50 KB          ~51%
X86                libcapstone.a      809 KB           486 KB         ~40%
                   libcapstone.dylib  728 KB           452 KB         ~38%
Combine all archs  libcapstone.a      2.3 MB           1.6 MB         ~30%
                   libcapstone.dylib  1.9 MB           1.3 MB         ~32%

(Above statistics were collected as of version 2.1-RC1, built on Mac OSX 10.9.1 with clang-500.2.79)
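
The size reduction for each architecture follows directly from the standard and diet sizes. The sketch below shows the arithmetic for the static library; it assumes the figures above pair up as 730 KB standard and 603 KB diet for Arm, and likewise for the other architectures. The names SIZES_KB and reduction_percent are introduced here for illustration only.

```python
# Sizes in KB as (standard, diet) for libcapstone.a, taken from the figures above.
# The pairing of standard and diet values is an assumption about the table layout.
SIZES_KB = {
    "Arm": (730, 603),
    "Arm64": (519, 386),
    "Mips": (206, 136),
    "PowerPC": (140, 69),
    "X86": (809, 486),
}


def reduction_percent(standard, diet):
    """Percentage by which the diet build is smaller than the standard build."""
    return 100.0 * (standard - diet) / standard


for arch, (standard, diet) in SIZES_KB.items():
    print("%-8s %5.1f%%" % (arch, reduction_percent(standard, diet)))
```

The same function applies to the other library figures and to the combined build once the sizes are expressed in one unit.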

2. Programming with “diet” engine

2.1 Irrelevant data fields with “diet” engine

To significantly reduce the engine size, some internal data has to be sacrificed. Specifically, the following data fields in the cs_insn struct become irrelevant.

Data field   Meaning                                       Replaced with
@mnemonic    Mnemonic of instruction                       @id
@op_str      Operand string of instruction                 @detail->operands
             Registers implicitly read by instruction      No
             Registers implicitly written by instruction   No
             Semantic groups instruction belongs to        No

While this information is missing, we can still work out the critical information from the remaining data fields of the cs_insn struct.

Besides, all the details in architecture-dependent structures such as cs_arm, cs_arm64, cs_mips, cs_ppc & cs_x86 are still there, so we can work out all the information needed even without the missing fields.
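
For instance, a tool that only needs to recognise a handful of instructions can key its own small name table on the @id field rather than reading @mnemonic. The following is a minimal sketch under stated assumptions: NAMES and describe are hypothetical names introduced here, the X86_INS_CALL and X86_INS_RET constants are taken from the capstone.x86_const module of the Python binding, and a stand-in object replaces a real instruction so the snippet also runs where Capstone is not installed.

```python
from types import SimpleNamespace

try:
    # Real instruction ids, if the Capstone Python binding is installed.
    from capstone.x86_const import X86_INS_CALL, X86_INS_RET
except ImportError:
    # Placeholder values so the sketch still runs without Capstone.
    # They are NOT the real instruction ids.
    X86_INS_CALL, X86_INS_RET = 1, 2

# Our own names for the instructions we care about, keyed by instruction id.
NAMES = {X86_INS_CALL: "call", X86_INS_RET: "ret"}


def describe(insn):
    """Label an instruction using only its address and id, never its mnemonic."""
    name = NAMES.get(insn.id, "id=%d" % insn.id)
    return "0x%x: %s" % (insn.address, name)


# Stand-in for an instruction object; real code would pass what disasm() yields.
fake_insn = SimpleNamespace(address=0x1000, id=X86_INS_RET)
print(describe(fake_insn))  # prints "0x1000: ret"
```

Keeping the table limited to the instructions an application actually inspects keeps the embedded footprint small, which is the point of building in diet mode in the first place.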

2.2 Irrelevant APIs with “diet” engine

While most Capstone APIs still function exactly the same, due to these absent data fields, the following APIs become irrelevant.

By irrelevant, we mean that the above APIs return undefined values. Programmers should therefore not use these APIs in diet mode.
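
One way to follow this advice is to route accesses to diet-affected data through a small guard that fails loudly instead of handing back an undefined value. The sketch below is an illustration, not part of Capstone: DietModeError and mnemonic_or_raise are hypothetical names, the diet flag is expected to come from a check such as cs_support(CS_SUPPORT_DIET), and a stand-in object is used in place of a real instruction.

```python
from types import SimpleNamespace


class DietModeError(RuntimeError):
    """Raised when code asks for data that a diet build does not provide."""


def mnemonic_or_raise(insn, diet):
    """Return insn.mnemonic, refusing to read it when the engine is in diet mode."""
    if diet:
        raise DietModeError("mnemonic is not available in diet mode; use insn.id")
    return insn.mnemonic


# Stand-in for an instruction object, for illustration only.
fake_insn = SimpleNamespace(id=1, mnemonic="nop")

print(mnemonic_or_raise(fake_insn, diet=False))
try:
    mnemonic_or_raise(fake_insn, diet=True)
except DietModeError as err:
    print("refused:", err)
```

Failing early in this way turns a silent misuse into an error that shows up the first time the code runs against a diet build.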

2.3 Checking engine for “diet” status

Capstone lets us check whether the engine was compiled in diet mode with the cs_support() API, as in the following C sample.

if (cs_support(CS_SUPPORT_DIET)) {
	// Engine is in "diet" mode.
	// ...
} else {
	// Engine was compiled in standard mode.
	// ...
}

With Python, we can check for diet mode via the cs_support function of the capstone module, as follows.

from capstone import *

if cs_support(CS_SUPPORT_DIET):
    # engine is in diet mode
    # ....
    pass
else:
    # engine was compiled in standard mode
    # ....
    pass

Alternatively, we can use the diet getter of the Cs class for the same purpose, as follows.

cs = Cs(CS_ARCH_X86, CS_MODE_64)
if cs.diet:
    # engine is in diet mode
    # ....
    pass
else:
    # engine was compiled in standard mode
    # ....
    pass