Hermes Concepts

This document explains the concepts behind Hermes assembly and the Hasm format.

Names, Acronyms, and Fundamental Concepts

The hasmer ecosystem has many parts, and it can be difficult to tell them apart. The definitions below explain the basic concepts and how the pieces fit together.

hasmer

hasmer is an open source command line tool for (dis)assembling and decompiling Hermes bytecode.

Hermes

Hermes is a JavaScript engine optimized for executing JavaScript in React Native. Hermes has its own JavaScript compiler which outputs Hermes bytecode files. These bytecode files are optimized for execution in the Hermes engine at runtime, and many React Native apps use this precompiled format.

HBC

HBC is an acronym for “Hermes bytecode”. HBC is a binary bytecode format that is executed by the Hermes engine at runtime. It represents a precompiled JavaScript file containing HBC instructions.

Array Buffer

The Array Buffer is a set of constant data encoded in an HBC file. Constant array data from the original JavaScript file is encoded into the Array Buffer. At runtime, when constructing arrays with constant data, the data is retrieved from the Array Buffer.

As an example, if the original JavaScript had this code:

var myArray = [1, 2, 3, "hello world"];

The values 1, 2, 3, and "hello world" would be encoded consecutively into the Array Buffer. HBC instructions which load constant array data retrieve the values from the Array Buffer by index, where the index is the offset in the buffer at which the values are encoded.
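The offset-based lookup can be pictured with a small model. This is only a sketch: a plain JavaScript array stands in for the binary buffer, and the names arrayBuffer, appendConstants, and loadConstants are hypothetical helpers invented for illustration, not part of Hermes or hasmer.

```javascript
// Simplified model: a plain array stands in for the binary Array Buffer.
const arrayBuffer = [];

// Append the constant elements of one array literal and return the offset
// at which they start (hypothetical helper, for illustration only).
function appendConstants(values) {
    const offset = arrayBuffer.length;
    arrayBuffer.push(...values);
    return offset;
}

// Read `count` consecutive values starting at `offset`, similar in spirit
// to an instruction that loads constant array data by index.
function loadConstants(offset, count) {
    return arrayBuffer.slice(offset, offset + count);
}

const firstOffset = appendConstants([1, 2, 3, "hello world"]);
const secondOffset = appendConstants([true, false]);

console.log(firstOffset, loadConstants(firstOffset, 4));   // starts at offset 0
console.log(secondOffset, loadConstants(secondOffset, 2)); // starts at offset 4
```

In this model, every array literal in a file shares one buffer, so an instruction needs only an offset and an element count to recover its data.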

Object Key Buffer

The Object Key Buffer is very similar to the Array Buffer, except it contains the keys (i.e. properties) of objects.

As an example, if the original JavaScript had this code:

var myObject = {
    key1: "hello",
    key2: "World"
};

The keys "key1" and "key2" would be encoded into the Object Key Buffer.

Object Value Buffer

The Object Value Buffer is very similar to the Array Buffer, except it contains the values of objects.

As an example, if the original JavaScript had this code:

var myObject = {
    key1: "hello",
    key2: "World"
};

The values "hello" and "World" would be encoded into the Object Value Buffer.
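The way the Object Key Buffer and Object Value Buffer work as a pair can be sketched as follows. The sketch assumes that keys and values are stored in matching order and uses plain arrays in place of the binary buffers. The helper names encodeObjectLiteral and decodeObjectLiteral are hypothetical.

```javascript
// Simplified model: two parallel arrays stand in for the binary buffers.
const objectKeyBuffer = [];
const objectValueBuffer = [];

// Split an object literal across the two buffers and remember where its
// entries start in each buffer and how many entries it has.
function encodeObjectLiteral(literal) {
    const entry = {
        keyOffset: objectKeyBuffer.length,
        valueOffset: objectValueBuffer.length,
        count: 0,
    };
    for (const [key, value] of Object.entries(literal)) {
        objectKeyBuffer.push(key);
        objectValueBuffer.push(value);
        entry.count++;
    }
    return entry;
}

// Rebuild the object by pairing keys and values at matching positions.
function decodeObjectLiteral({ keyOffset, valueOffset, count }) {
    const result = {};
    for (let i = 0; i < count; i++) {
        result[objectKeyBuffer[keyOffset + i]] = objectValueBuffer[valueOffset + i];
    }
    return result;
}

const entry = encodeObjectLiteral({ key1: "hello", key2: "World" });
console.log(objectKeyBuffer);            // the keys: key1, key2
console.log(objectValueBuffer);          // the values: hello, World
console.log(decodeObjectLiteral(entry)); // the original object, rebuilt
```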

String Table

The String Table is an array-like structure containing all of the strings in the HBC file. The String Table includes property names, function names, constant strings, etc.

As an example, if the original JavaScript had this code:

function main() {
    print(global.someProperty);
}

The String Table would contain entries for the strings main, print, global, and someProperty.
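A minimal sketch of an array-like string table is shown below. It assumes that each distinct string is stored once and is referred to by its index. The order of entries in a real HBC file is not implied by this model, and the names stringTable and intern are hypothetical.

```javascript
// Simplified model: the String Table as an array plus a lookup map.
const stringTable = [];
const indexByString = new Map();

// Return the index of `str`, adding it to the table on first use
// (hypothetical helper, for illustration only).
function intern(str) {
    if (!indexByString.has(str)) {
        indexByString.set(str, stringTable.length);
        stringTable.push(str);
    }
    return indexByString.get(str);
}

// Identifiers from the example above; "print" appears twice on purpose.
const ids = ["main", "print", "global", "someProperty", "print"].map((s) => intern(s));

console.log(stringTable); // four entries, one per distinct string
console.log(ids);         // the repeated "print" maps to the same index
```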

Environment

Environments are how variables are exchanged between nested functions and closures. A parent function stores the variables referenced by its closures in its environment, and each closure can access those variables through the environment of its parent.

As an example, if the original JavaScript had this code:

function main() {
    var val = "hello ";
    var myClosure = () => {
        val += "world";
    };
    myClosure();
    print(val); // prints "hello world"
}

The main function stores the val variable into its environment. When myClosure is invoked, it obtains a reference to the parent environment and retrieves the val variable. It then modifies val (i.e. appends "world") and stores that value back into the environment of its parent.
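To make the mechanism concrete, the same program can be rewritten with the environment as an explicit object. This is a conceptual sketch rather than decompiler output: the environment object and the parentEnvironment parameter are hypothetical stand-ins for what the engine manages internally.

```javascript
function main() {
    // The environment holds every variable that a closure captures.
    const environment = { val: "hello " };

    // The closure receives its parent's environment explicitly here;
    // in real bytecode it obtains the reference through instructions.
    const myClosure = (parentEnvironment) => {
        const current = parentEnvironment.val;     // load from the environment
        parentEnvironment.val = current + "world"; // store the result back
    };

    myClosure(environment);
    return environment.val;
}

console.log(main()); // hello world
```

Because main and myClosure share one environment object, the modification made inside the closure is visible to main after the call.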

Global Function

The Global Function is the function into which all top-level code is placed at compile time. The Hermes execution engine runs the Global Function at runtime.

As an example, if the original JavaScript had this code:

print("hello, world");

The decompiled HBC would look something like this:

function global() {
    print("hello, world");
}

Global Object

The Global Object (i.e. the identifier global) is where all global functions and properties are stored. At compile time, all global functions and properties are assigned as properties of the global object.

As an example, if the original JavaScript had this code:

function main() {
    print(Object.keys(global));
}

The output would include the main function, as well as other Hermes built-ins (such as the Global Function).

Variant Instructions

Many HBC instructions have “variants”. A variant instruction is defined as an instruction which performs the same action as another instruction, but takes differently sized operands.

As an example, there are two variants for the JNotEqual instruction:

JNotEqual <Addr8> <Reg8> <Reg8>
JNotEqualLong <Addr32> <Reg8> <Reg8>

The JNotEqualLong instruction performs the exact same action as JNotEqual when executed, but its jump offset is encoded in 4 bytes (per the Addr32 operand), so it can jump much farther. The JNotEqual instruction encodes its jump offset in only 1 byte (per the Addr8 operand), which limits it to short jumps.

The hasmer (dis)assembler has two modes: auto and exact.

When an HBC file is disassembled in auto mode, variant instructions (e.g. JNotEqualLong) are converted to their base instruction (e.g. JNotEqual).

When an HBC file is disassembled in exact mode, variant instructions are kept literally and not converted to their base instruction.

When a Hasm file is assembled in auto mode, the assembler automatically converts each instruction to the appropriate variant based on the sizes its operands require.

When a Hasm file is assembled in exact mode, the instructions are interpreted literally. If an operand is outside the range of values the instruction accepts, an error is thrown.

This makes programming in exact mode quite difficult. In the vast majority of use cases, auto mode is preferable to exact mode when using hasmer.
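The difference between the two assembly modes can be sketched with a toy variant selector. This is not hasmer's implementation: the function names are hypothetical, and the sketch assumes that jump offsets are signed two's-complement values of the stated width.

```javascript
// Address operand widths of the two JNotEqual variants described above.
const VARIANTS = [
    { name: "JNotEqual", addrBits: 8 },
    { name: "JNotEqualLong", addrBits: 32 },
];

// Assumption: offsets are signed, so 8 bits cover -128..127.
function fitsSigned(value, bits) {
    const limit = 2 ** (bits - 1);
    return Number.isInteger(value) && value >= -limit && value < limit;
}

// auto mode: pick the smallest variant whose operand can hold the offset.
function assembleAuto(offset) {
    const variant = VARIANTS.find((v) => fitsSigned(offset, v.addrBits));
    if (!variant) throw new RangeError(`offset ${offset} does not fit any variant`);
    return variant.name;
}

// exact mode: keep the instruction as written and reject operands
// that are out of range for it.
function assembleExact(name, offset) {
    const variant = VARIANTS.find((v) => v.name === name);
    if (!variant) throw new Error(`unknown instruction ${name}`);
    if (!fitsSigned(offset, variant.addrBits)) {
        throw new RangeError(`offset ${offset} is out of range for ${name}`);
    }
    return variant.name;
}

console.log(assembleAuto(100));  // JNotEqual
console.log(assembleAuto(1000)); // JNotEqualLong

try {
    assembleExact("JNotEqual", 1000);
} catch (err) {
    console.log(err.message); // the literal instruction cannot hold the offset
}
```

In auto mode the author writes JNotEqual everywhere and lets the assembler widen it when a jump grows. In exact mode the same edit can turn a valid file into one that fails to assemble.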