Please note that as of Aug 3, 2015, StrongOps has been EOL’d.

In our last weekly performance tip, we discussed in detail how the Node.js event-loop works as the orchestrator of requests, events and callbacks. We also troubleshot a blocked event-loop which could wreck havoc on application performance. In this week’s post we’ll dive into the fundamentals of garbage collection (GC) in V8 and how it holds the “keys to the kingdom” in the optimization of Node applications. We will also look at some tools to triage GC and memory management issues in V8.

Where did it all start

Contrary to folklore, V8 was not designed explicitly for Node; rather was built to power a fast browser (Chrome/Chromium) or client side JavaScript (V8.NET). Eventually it got ported to the server as a high performance engine for running server side JavaScript and became the basis of Node. But, just like any server side runtime like the JVM or CLR, V8 runs on memory and needs memory management.

In some systems or languages, it is up to the application program to manage all the bookkeeping details of allocating memory from the heap and freeing it when it is no longer required. This is known as manual memory management. Manual memory management may be appropriate for small programs, but it does not scale well in general, nor does it encourage modular or object-oriented programming.

Who/What is the Garbage Collector?

V8 embraces garbage collection (GC) also known as managed memory.  Built in GC provides huge simplification for developers by not having to explicitly handle memory bookkeeping in code as was done in the “C” era. It reduces a large class of errors and memory leaks, which are typical in large long-running applications and in some cases, it can even improve performance.

However, you also give up control on memory management, which could be a problem particularly for mobile applications aiming for the optimization of device resources. In particular for JavaScript, the ECMAScript specification doesn’t expose any interface to the garbage collector blocking visibility and forced GC.

Performance wise, it is a wash. In C, allocating (malloc) and freeing objects can be costly, since heap bookkeeping tends to be more complicated. With managed memory, allocation usually means just incrementing a pointer, but  you pay for it eventually when you run out of memory and the garbage collector kicks in.  The fact is that V8  uses garbage collection for better or worse.

How does V8 Organize the heap?

V8 divides the heap into several different spaces for effective memory management:

  • New-space: Most objects are allocated here. New-space is small and is designed to be garbage collected very quickly
  • Old-pointer-space: Contains most objects which may have pointers to other objects. Most objects are moved here after surviving in new-space for a while.
  • Old-data-space: Contains objects which just contain raw data (no pointers to other objects). Strings, boxed numbers, and arrays of unboxed doubles are moved here after surviving in new-space for a while.
  • Large-object-space: Contains objects which are larger than the size limits of other spaces. Each object gets its own mmap’d region of memory. Large objects are never moved by the garbage collector.
  • Code-space: Code objects, which contain JITed instructions, are allocated here. This is the only space with executable memory.
  • Cell-space, property-cell-space and map-space: Contain Cells, PropertyCells, and Maps, respectively. Each space contains objects which are all the same size and is restricted in pointers, which simplifies collection.

Each space is composed of a set of “pages”. A Page is a contiguous chunk of memory, allocated from the operating system with mmap. Pages are always 1 MB in size and 1 MB aligned, except in large-object-space, where they may be larger.

How does the Garbage Collector work?

The central concept of Node.js/JavaScript application memory management is a concept of reachability.

  • A distinguished set of objects are assumed to be reachable or in live scope: these are known as the “roots.” Typically, these include all the objects referenced from anywhere in the call stack (that is, all local variables and parameters in the functions currently being invoked), and any global variables.
  • Objects are kept in memory while they are accessible from roots through a reference or a chain of references.
  • Root objects are pointed directly from V8 or the Web browser like DOM elements

The fundamental problem garbage collection solves is to identify dead memory regions (unreachable objects/garbage) which are not reachable through some chain of pointers from an object which is live. Once identified, these regions can be re-used for new allocations or released back to the operating system

To ensure fast object allocation, short garbage collection pauses, and the “no memory fragmentation V8” employs a stop-the-world, generational, accurate, garbage collector. V8 essentially:

  • stops program execution when performing a garbage collection cycle.
  • processes only part of the object heap in most garbage collection cycles. This minimizes the impact of stopping the application.
  • always knows exactly where all objects and pointers are in memory. This avoids falsely identifying objects as pointers which can result in memory leaks.

In V8, the object heap is segmented into many parts; hence If an object is moved in a garbage collection cycle, V8 updates all pointers to the object.

No, no…how does it really work ?

The GC needs to follow pointers in order to discover live objects. Most garbage collection algorithms can migrate objects from one part of memory to another (to reduce fragmentation and increase locality), so we also need to be able to rewrite pointers without disturbing plain old data.

V8 uses “tagged” pointers. Most objects on the heap just contain a list of tagged words, so the garbage collector can quickly scan them, following the pointers and ignoring the integers. Some objects, such as strings, are known to contain only data (no pointers), so their contents do not have to be tagged.

Next, let’s check which algorithms V8 uses to execute garbage collection.

Short GC/scavenging

V8 divides the heap into two generations. Objects are allocated in new-space, which is fairly small (between 1 and 8 MB, depending on behavior heuristics). Allocation in new space is very cheap: we just have an allocation pointer which we increment whenever we want to reserve space for a new object. When the allocation pointer reaches the end of new space, a scavenge (minor garbage collection cycle) is triggered, which quickly removes the dead objects from new space.

Scavenging/copying GC is a kind of tracing garbage collection that operates by relocating reachable objects  and then reclaiming objects that are left behind, which must be unreachable and therefore dead.

A copying garbage collection relies on being able to find and correct all references to copied objects.

The Scavenge algorithm is great for quickly collecting and compacting a small amount of memory, but it has large space overhead, since we need physical memory backing both to-space and from-space. This is acceptable as long as we keep new-space small, but it’s impractical to use this approach for more than a few megabytes.

Scavenging is supposed to be fast by design. Hence, it is suitable for frequently occurring short GC cycles.

Full GC/mark-sweep & mark-compact

Objects which have survived two minor garbage collections are promoted to “old-space.” Old-space is garbage collected in full GC (major garbage collection cycle), which is much less frequent. A full GC cycle is triggered when we get over a certain amount of memory in old space.

To collect old space, which may contain several hundred megabytes of data, we use two closely related algorithms, Mark-sweep and Mark-compact.

Mark-sweep collection is a kind of tracing garbage collection that operates by marking reachable objects, then sweeping over memory and recycling objects that are unmarked (which must be unreachable), putting them on a free list.

The mark phase follows reference chains to mark all reachable objects. Once marking is complete, we can reclaim memory by either sweeping or compacting. Both algorithms work at a page level.  The sweep phase performs a sequential (address-order) pass over memory to recycle all unmarked objects. A mark-sweep collector doesn’t move objects.

Mark-compact collection is a kind of tracing garbage collection that operates by marking reachable objects, then compacting the marked objects (which must include all the live objects).

The mark phase follows reference chains to mark all reachable objects; the compaction phase typically performs a number of sequential passes over memory  to move objects and update references. Due to compaction, all the marked objects are moved into a single contiguous block of memory (or a small number of such blocks); the memory left unused after compaction is recycled. The compacting algorithm also tries to reduce actual memory usage by migrating objects from fragmented pages (containing a lot of small free spaces) to free spaces on other pages

However, the pause time in mark-sweep and mark-compact were found to be high, slowing down overall performance.

Incremental marking & lazy sweeping

In mid-2012, Google introduced two improvements that reduced garbage collection pauses significantly: incremental marking and lazy sweeping.

Incremental marking means being able to do a bit of marking work, then let the mutator (JavaScript program) run a bit, then do another bit of marking work. Thus, instead of one long pause for marking, the GC marks the heap in a series of short pauses in the order of 5-10 ms each as an example. Incremental marking begins when the heap reaches a certain threshold size. Post threshold, every time a certain amount of memory is allocated, execution is paused to perform an incremental marking step.

If the allocation rate is high during incremental GC, the engine may run out of memory before finishing the incremental cycle. If so, the engine must immediately restart a full, non-incremental GC in order to reclaim some memory and continue execution.

After marking, lazy sweeping begins. All objects have been marked live or dead, and the heap knows exactly how much memory could be freed by sweeping. All this memory doesn’t necessarily have to be freed up right away though, and delaying the sweeping doesn’t hurt. Hence instead of one-go, the garbage collector sweeps pages on an as-needed basis until all pages have been swept. At that point, the garbage collection cycle is complete, and incremental marking is free to start again.

What’s next?

Read about eight exciting new Node v0.12 features and how to make the most of them, from the authors themselves.