Reading the ECMA-262 Specification: What are Operations and Semantics?

What is ECMA-262?

As a frontend developer, most of my work involves using the JavaScript programming language. The standards for the JavaScript language are mostly defined in the “ECMAScript language specification,” which is ECMA-262.

The name “ECMA” comes from Ecma International, the organization that creates the specification. Since 2015, the organization has published a new edition of the specification every year on its official website: https://ecma-international.org/publications-and-standards/standards/ecma-262/

ECMA-262 defines ECMAScript as a general-purpose, cross-platform, and vendor-neutral language standard. It covers only the syntax, semantics, and built-in libraries of the language. It does not prescribe how an engine should implement those features, and it does not define the DOM or other browser-specific APIs.

The Fundamental Concepts about Reading the Spec

When I first started reading the specification, I jumped directly into the topic I was interested in. However, I soon realized that some prerequisite knowledge makes the text much easier to understand.

Although I’m very familiar with JavaScript and its features, the specification was hard for me to read at first because I had no idea about the terms, the grammar, and the conventions it uses.

I would recommend reading the “5.1.5 Grammar Notation” and “5.2 Algorithm Conventions” sections before delving into any specific topic. These sections cover fundamental concepts and grammar used throughout the entire specification.

Two concepts appear in almost every section of the specification: “operations” and “semantics.” I will introduce them in the following sections.

Operations

An operation is an action, task, or function of the language that needs to be defined. Defining how the language engine should behave means defining these operations. For example, Evaluation, VarScopedDeclarations, and BlockDeclarationInstantiation are all operations. Most operations are defined as a sequence of algorithm steps.

There are two kinds of operations: “Abstract Operations” and “Syntax-Directed Operations.”

“Abstract Operations” are reusable algorithms defined in a parameterized functional form. They are designed to be used in multiple parts of the specification and can be invoked by name from other algorithms. We can think of them as playing the role that functions play in an ordinary programming language. Thanks to abstract operations, the specification does not need to repeat the same steps in many places.
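To make the “function” analogy concrete, here is a minimal sketch that models one abstract operation as a plain JavaScript function and invokes it by name from two other algorithms. The function body is loosely modeled on the ToBoolean abstract operation; the two callers, logicalNot and conditional, are hypothetical names I made up for illustration and are not operations from the specification.

```javascript
// A loose model of the ToBoolean abstract operation.
// It ignores the Annex B special case for objects such as document.all.
function ToBoolean(argument) {
  if (typeof argument === "boolean") return argument;
  if (argument === undefined || argument === null) return false;
  // `argument === 0` is also true for -0.
  if (typeof argument === "number") return !(argument === 0 || Number.isNaN(argument));
  if (typeof argument === "bigint") return argument !== 0n;
  if (typeof argument === "string") return argument.length > 0;
  return true; // every remaining value (objects, symbols) is truthy
}

// Two hypothetical "algorithms" that reuse the same operation by name,
// the way spec algorithms say "Let x be ToBoolean(value)".
function logicalNot(value) {
  return !ToBoolean(value);
}

function conditional(test, whenTrue, whenFalse) {
  return ToBoolean(test) ? whenTrue : whenFalse;
}

console.log(logicalNot(0));
console.log(conditional("", "yes", "no"));
```

Both callers depend on a single definition, so a change to the conversion rules would only need to be made in one place.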

“Syntax-Directed Operations” are operations whose definitions consist of algorithms associated with specific grammar productions. These operations are defined in terms of the syntactic structure of the language.

An operation might be both an abstract operation and a syntax-directed operation at the same time: it can be defined over specific grammar productions and also be designed for reuse across the specification.

Semantics

Semantics refers to the meaning or behavior of a syntactic construct. While syntax defines the structure or form of valid statements in a language, semantics defines what those statements actually do when executed.

Many sections in the ECMAScript specification are titled Static Semantics: <operation_name> or Runtime Semantics: <operation_name>.

“Static Semantics” sections define rules for how an ECMAScript implementation (such as a JavaScript engine) should gather specific information from the source code during the parsing phase, before any actual execution occurs. These rules focus on analyzing the structure and declarations within the code to collect the necessary data.

On the other hand, the “Runtime Semantics” defines the behavior of the language constructs during program execution. It specifies how the program acts and how its state changes when a piece of code runs.
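The difference between the two phases can be observed from ordinary JavaScript. The sketch below relies on the fact that declaring the same name twice with let is reported as an early error, one of the rules the specification lists under static semantics, whereas reading a property of null only fails while the code is running. The helper name compile and the log array are my own, chosen for illustration.

```javascript
const log = [];

// Parse `source` as a function body without running it.
function compile(source) {
  try {
    return { ok: true, run: new Function("log", source) };
  } catch (error) {
    return { ok: false, error };
  }
}

// Rejected while parsing: no statement of this body ever executes.
const early = compile('log.push("ran"); let a; let a;');
const logLengthAfterEarly = log.length;

// Parses fine; the failure only happens once the code runs.
const late = compile('log.push("ran"); null.x;');
let runtimeError = null;
if (late.ok) {
  try {
    late.run(log);
  } catch (error) {
    runtimeError = error;
  }
}

console.log(early.ok, early.error.name, logLengthAfterEarly);
console.log(late.ok, runtimeError.name, log);
```

In the first case the log stays empty because the whole body is rejected before execution. In the second case the first statement has already run and changed the program state by the time the error is thrown.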

By this definition, an operation that appears in such a section title alongside “Static Semantics” or “Runtime Semantics” is a “Syntax-Directed Operation,” because it is defined over specific syntactic structures.

The specific syntactic structure that a definition applies to is called a “production.” An operation may have multiple alternative definitions, each pairing one production with its own algorithm steps. The execution or analysis of a given piece of code falls into exactly one of these alternatives.
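One way to picture this is a lookup table from productions to algorithms. The sketch below uses a made-up toy grammar with three productions for Expr, not the real ECMAScript grammar; the node shapes and the names evaluationDefinitions and evaluation are assumptions for illustration only.

```javascript
// Toy grammar (hypothetical):
//   Expr : Number
//   Expr : Expr + Expr
//   Expr : - Expr
// Each alternative definition pairs one production with its own steps.
const evaluationDefinitions = new Map([
  ["Expr : Number", (node) => node.value],
  ["Expr : Expr + Expr", (node) => evaluation(node.left) + evaluation(node.right)],
  ["Expr : - Expr", (node) => -evaluation(node.operand)],
]);

// Applying the operation selects exactly one alternative:
// the one whose production matches the node.
function evaluation(node) {
  const definition = evaluationDefinitions.get(node.production);
  if (!definition) {
    throw new Error(`No Evaluation definition for ${node.production}`);
  }
  return definition(node);
}

// Parse tree for the source text `1 + -2`.
const tree = {
  production: "Expr : Expr + Expr",
  left: { production: "Expr : Number", value: 1 },
  right: {
    production: "Expr : - Expr",
    operand: { production: "Expr : Number", value: 2 },
  },
};

console.log(evaluation(tree));
```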

A production is separated by a colon “:” into a left-hand side and a right-hand side. The left-hand side is a non-terminal symbol representing a syntactic category or construct in the language, while the right-hand side describes how the left-hand side can be constructed using more specific symbols (terminal or non-terminal).

“Terminal symbols” are symbols that appear directly in our source code, such as let, const, {, }, and +=. In contrast, “non-terminal symbols” are named syntactic categories or constructs that are used only in the specification.
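The structure of a production can be written down as plain data. The following is a minimal sketch, assuming a representation of my own in which every symbol is tagged as terminal or non-terminal; the helper names are hypothetical and the specification itself does not define such objects.

```javascript
// Hypothetical helpers that tag each symbol with its kind.
const terminal = (text) => ({ kind: "terminal", text });
const nonterminal = (name) => ({ kind: "nonterminal", name });

// The production  Block : { StatementList }
const blockProduction = {
  lhs: nonterminal("Block"), // left-hand side: always a non-terminal
  rhs: [terminal("{"), nonterminal("StatementList"), terminal("}")],
};

// Render the production the way it is written in the text: lhs, colon, rhs.
function formatProduction(production) {
  const rhs = production.rhs
    .map((symbol) => (symbol.kind === "terminal" ? symbol.text : symbol.name))
    .join(" ");
  return `${production.lhs.name} : ${rhs}`;
}

// Collect the right-hand-side symbols of one kind.
function symbolsOfKind(production, kind) {
  return production.rhs
    .filter((symbol) => symbol.kind === kind)
    .map((symbol) => (kind === "terminal" ? symbol.text : symbol.name));
}

console.log(formatProduction(blockProduction));
console.log(symbolsOfKind(blockProduction, "terminal"));
console.log(symbolsOfKind(blockProduction, "nonterminal"));
```

The braces are the only parts of this production that would appear literally in source code; StatementList is a name that exists only in the grammar.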

Here are two examples of how the ECMAScript specification defines the semantics of operations:

Figure: ECMAScript specification runtime semantics example

Figure: ECMAScript specification static semantics example

Addendum: What is the implicit definition of “chain productions”?

While studying the static semantics of VarDeclaredNames and VarScopedDeclarations for a block, I wondered why there were no explicit rules for the production Block: { StatementList } in the specification.

As Bergi pointed out, the rules for Block: { StatementList } are implicitly defined through the alternative definitions for the StatementList productions.

According to the specification:

A chain production is a production that has exactly one nonterminal symbol on its right-hand side along with zero or more terminal symbols.

The production Block: { StatementList } is a chain production because it has exactly one nonterminal symbol, StatementList, on its right-hand side.

And:

Unless explicitly specified otherwise, all chain productions have an implicit definition for every operation that might be applied to that production’s left-hand side nonterminal. The implicit definition simply reapplies the same operation with the same parameters, if any, to the chain production’s sole right-hand side nonterminal and then returns the result. For example, assume that some algorithm has a step of the form: “Return Evaluation of Block” and that there is a production:

Block: { StatementList }

but the Evaluation operation does not associate an algorithm with that production. In that case, the Evaluation operation implicitly includes an association of the form:

Runtime Semantics: Evaluation
Block : { StatementList }
1. Return Evaluation of StatementList.

Following these rules, we can infer that VarDeclaredNames of Block: { StatementList } is equivalent to VarDeclaredNames of StatementList. The same principle applies to VarScopedDeclarations.
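The implicit rule can be sketched as a fallback in the lookup of an operation’s definitions. What follows is a heavily simplified, hypothetical model of VarDeclaredNames: the node shape (a production string plus a list of non-terminal children), the helper names, and the boundNames shortcut are all assumptions of mine, and only a handful of productions are covered.

```javascript
// Explicit definitions, keyed by production.
const explicitRules = new Map([
  ["StatementList : StatementList StatementListItem",
    (node) => node.nonterminals.flatMap((child) => varDeclaredNames(child))],
  // Simplification: the real rule returns BoundNames of VariableDeclarationList.
  ["VariableStatement : var VariableDeclarationList ;", (node) => node.boundNames],
  // Lexical declarations contribute no var-declared names.
  ["StatementListItem : Declaration", () => []],
]);

function varDeclaredNames(node) {
  const rule = explicitRules.get(node.production);
  if (rule) return rule(node);
  // Implicit definition: a chain production has exactly one non-terminal on
  // its right-hand side, so reapply the operation to it and return the result.
  if (node.nonterminals.length === 1) return varDeclaredNames(node.nonterminals[0]);
  throw new Error(`VarDeclaredNames is not defined for ${node.production}`);
}

// Hypothetical helpers for building a parse tree by hand.
const makeNode = (production, nonterminals = [], extra = {}) =>
  ({ production, nonterminals, ...extra });
const varStatement = (name) =>
  makeNode("Statement : VariableStatement", [
    makeNode("VariableStatement : var VariableDeclarationList ;", [], { boundNames: [name] }),
  ]);
const item = (statement) => makeNode("StatementListItem : Statement", [statement]);
const letItem = makeNode("StatementListItem : Declaration", [
  makeNode("Declaration : LexicalDeclaration"),
]);

// Parse tree for the source text `{ var a; let b; var c; }`.
const list1 = makeNode("StatementList : StatementListItem", [item(varStatement("a"))]);
const list2 = makeNode("StatementList : StatementList StatementListItem", [list1, letItem]);
const list3 = makeNode("StatementList : StatementList StatementListItem", [list2, item(varStatement("c"))]);
const block = makeNode("Block : { StatementList }", [list3]);

console.log(varDeclaredNames(block));
```

There is no entry for Block : { StatementList } in the table, yet the operation still produces a result for the block because the fallback forwards it to the StatementList child. The same fallback handles the other chain productions in the tree, such as StatementListItem : Statement.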