Friday, January 29, 2021

Avoiding Memory Leaks in Node.js: Best Practices for Performance

Memory leaks are something every developer eventually has to face. They are common in most languages, even those that manage memory automatically. Memory leaks can cause problems such as application slowdowns, crashes, and high latency.

In this blog post, we will look at what memory leaks are and how you can avoid them in your NodeJS application. Though this is more focused on NodeJS, it should generally apply to JavaScript and TypeScript as well. Avoiding memory leaks helps your application use resources efficiently and it also has performance benefits.

Memory Management in JavaScript

To understand memory leaks, we first need to understand how memory is managed in NodeJS. This means understanding how memory is managed by the JavaScript engine used by NodeJS. NodeJS uses the V8 Engine for JavaScript. You should check out Visualizing memory management in V8 Engine to get a better understanding of how memory is structured and utilized by JavaScript in V8.

Let’s do a short recap from the above-mentioned post:

Memory is mainly categorized into Stack and Heap memory.

  • Stack: This is where static data, including method/function frames, primitive values, and pointers to objects are stored. This space is managed by the operating system (OS).
  • Heap: This is where V8 stores objects or dynamic data. This is the biggest block of memory area and it’s where garbage collection (GC) takes place.

V8 manages the heap memory through garbage collection. In simple terms, it frees the memory used by orphan objects, i.e., objects that are no longer referenced from the stack, directly or indirectly (via a reference in another object), to make space for new object creation.

The garbage collector in V8 is responsible for reclaiming unused memory for reuse by the V8 process. V8 garbage collectors are generational (Objects in Heap are grouped by their age and cleared at different stages). There are two stages and three different algorithms used for garbage collection by V8.

Figure: Mark-sweep-compact GC

What Are Memory Leaks

In simple terms, a memory leak is nothing but an orphan block of memory on the heap that is no longer used by the application and hasn’t been returned to the OS by the garbage collector. So in effect, it’s a useless block of memory. An accumulation of such blocks over time could lead to the application not having enough memory to work with or even your OS not having enough memory to allocate, leading to slowdowns and/or crashing of the application or even the OS.
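One way to watch this kind of accumulation is to sample the heap size before and after references pile up. The sketch below is only an illustration: `leaked` and `heapUsedMB` are hypothetical names, and the only Node.js API it relies on is `process.memoryUsage()`, whose `heapUsed` field is reported in bytes.

```javascript
// Sketch: watch heap usage grow while references accumulate.
// `leaked` and `heapUsedMB` are hypothetical names used for illustration.
const leaked = [];

function heapUsedMB() {
    // process.memoryUsage() is built into Node.js; heapUsed is in bytes
    return process.memoryUsage().heapUsed / 1024 / 1024;
}

const before = heapUsedMB();
for (let i = 0; i < 100; i++) {
    // Each chunk stays reachable through `leaked`, so the GC cannot reclaim it
    leaked.push(new Array(10000).fill(i));
}
const after = heapUsedMB();

console.log(`heap changed by about ${(after - before).toFixed(1)} MB`);
```

In a long-running service, logging the same number periodically and seeing it climb without ever dropping back is a common first hint of a leak.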

What Causes Memory Leaks in JS

Automatic memory management, like garbage collection in V8, aims to avoid such memory leaks. Circular references, for example, are no longer a concern. Leaks can still happen, however, because of unwanted references in the heap, and these can arise for different reasons. Some of the most common reasons are described below.

  • Global variables: Since global variables in JavaScript are referenced by the root node (window or global this), they are never garbage collected throughout the lifetime of the application, and will occupy memory as long as the application is running. This applies to any object referenced by the global variables and all their children as well. Having a large graph of objects referenced from the root can lead to a memory leak.
  • Multiple references: When the same object is referenced from multiple objects, it might lead to a memory leak when one of the references is left dangling.
  • Closures: JavaScript closures have the cool feature of remembering their surrounding context. When a closure holds a reference to a large object in the heap, it keeps the object in memory for as long as the closure is in use. This means a closure holding such a reference can easily be used improperly and lead to a memory leak.
  • Timers & Events: The use of setTimeout, setInterval, Observers and event listeners can cause memory leaks when heavy object references are kept in their callbacks without proper handling.

Best Practices to Avoid Memory Leaks

Now that we understand what causes memory leaks, let’s see how to avoid them and the best practices to use to ensure efficient memory use.

REDUCE USE OF GLOBAL VARIABLES

Since global variables are never garbage collected, it’s best to ensure you don’t overuse them. Below are some ways to ensure that.

Avoid Accidental Globals

When you assign a value to an undeclared variable, JavaScript automatically creates it as a global variable in default (non-strict) mode. This could be the result of a typo and could lead to a memory leak. Another way is assigning a property to this inside a plain function call, where this refers to the global object in non-strict mode, a behavior that still confuses many JavaScript developers.

// This will become an implicit global variable (non-strict mode only)
function hello() {
    foo = "Message";
}

// This will also become a global variable as global functions have
// global `this` as the contextual `this` in non strict mode
function hello() {
    this.foo = "Message";
}

To avoid such surprises, always write JavaScript in strict mode using the 'use strict'; annotation at the top of your JS file. In strict mode, the above will result in an error. When you use ES modules or transpilers like TypeScript or Babel, you don’t need it as it’s automatically enabled. In recent versions of NodeJS, you can enable strict mode globally by passing the --use_strict flag when running the node command.

"use strict";

// This will not become a global variable
function hello() {
    foo = "Message"; // throws a ReferenceError at runtime
}

// This will not become a global variable either: in strict mode,
// `this` is undefined in a plain function call
function hello() {
    this.foo = "Message"; // throws a TypeError when called as hello()
}

When you use arrow functions, you also need to be mindful not to create accidental globals, and unfortunately, strict mode will not help with this. You can use the no-invalid-this rule from ESLint to avoid such cases. If you are not using ESLint, just make sure not to assign to this from global arrow functions.

// This will also become a global variable as arrow functions
// do not have a contextual `this` and instead use a lexical `this`
const hello = () => {
    this.foo = "Message";
}

Finally, keep in mind not to bind global this to any functions using the bind or call method, as it will defeat the purpose of using strict mode and such.

Use Global Scope Sparingly

In general, it’s a good practice to avoid using the global scope whenever possible and to also avoid using global variables as much as possible.

  1. As much as possible, don’t use the global scope. Instead, use local scope inside functions, as those will be garbage collected and memory will be freed. If you have to use a global variable due to some constraints, set the value to null when it’s no longer needed.
  2. Use global variables only for constants, cache, and reusable singletons. Don’t use global variables for the convenience of avoiding passing values around. For sharing data between functions and classes, pass the values around as parameters or object attributes.
  3. Don’t store big objects in the global scope. If you have to store them, make sure to nullify them when they are not needed. For cache objects, set a handler to clean them up once in a while and don’t let them grow indefinitely.
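As a rough illustration of the last point, a cache can be given a size limit so it cannot grow indefinitely. The `BoundedCache` class below is a hypothetical sketch, not part of any library. It assumes that evicting the oldest entry is acceptable and relies on the fact that a `Map` iterates in insertion order.

```javascript
// Hypothetical bounded cache: evicts the oldest entry once maxEntries is reached.
class BoundedCache {
    constructor(maxEntries) {
        this.maxEntries = maxEntries;
        this.entries = new Map(); // Map iterates in insertion order
    }

    set(key, value) {
        if (this.entries.has(key)) {
            // Re-insert so the key counts as the newest entry
            this.entries.delete(key);
        } else if (this.entries.size >= this.maxEntries) {
            // Drop the oldest entry so the cache never exceeds its limit
            const oldestKey = this.entries.keys().next().value;
            this.entries.delete(oldestKey);
        }
        this.entries.set(key, value);
    }

    get(key) {
        return this.entries.get(key);
    }

    get size() {
        return this.entries.size;
    }
}

const cache = new BoundedCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.set("c", 3); // evicts "a"
console.log(cache.size, cache.get("a")); // 2 undefined
```

A time-based cleanup would work as well. The important part is that something removes entries, so the global reference does not pin an ever-growing object graph.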

USE STACK MEMORY EFFECTIVELY

Using stack variables as much as possible helps with memory efficiency and performance as stack access is much faster than heap access. This also ensures that we don’t accidentally cause memory leaks. Of course, it’s not practical to only use static data. In real-world applications, we would have to use lots of objects and dynamic data. But we can follow some tricks to make better use of stack.

  1. Avoid heap object references from stack variables when possible. Also, don’t keep unused variables.
  2. Destructure and use fields needed from an object or array rather than passing around entire objects/arrays to functions, closures, timers, and event handlers. This avoids keeping a reference to objects inside closures. The fields passed might mostly be primitives, which will be kept in the stack.
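function outer() {
    const obj = {
        foo: 1,
        bar: "hello",
    };

    // Destructure outside the closure so it captures only the
    // primitive `foo`, not a reference to the whole `obj`
    const { foo } = obj;

    const closure = () => {
        myFunc(foo);
    };
}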

function myFunc(foo) {}

USE HEAP MEMORY EFFECTIVELY

It’s not possible to avoid using heap memory in any realistic application, but we can use it more efficiently by following some of these tips:

  1. Copy objects where possible instead of passing references. Pass a reference only if the object is huge and a copy operation is expensive.
  2. Avoid object mutations as much as possible. Instead, use object spread or Object.assign to copy them.
  3. Avoid creating multiple references to the same object. Instead, make a copy of the object.
  4. Use short-lived variables.
  5. Avoid creating huge object trees. If they are unavoidable, try to keep them short-lived in the local scope.
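The copy-instead-of-mutate tips above can be sketched as follows. The object and its field names are made up for illustration. Note that both object spread and `Object.assign` make shallow copies, so nested values are still shared.

```javascript
// Illustrative object; the field names are hypothetical
const original = { name: "report", tags: ["a", "b"] };

// Shallow copy with an updated field; `original` is left untouched
const updated = { ...original, name: "report-v2" };

// Object.assign produces an equivalent shallow copy
const viaAssign = Object.assign({}, original, { name: "report-v3" });

console.log(original.name); // report
console.log(updated.name); // report-v2
console.log(viaAssign.name); // report-v3

// Caveat: nested values are shared between the copies
console.log(updated.tags === original.tags); // true
```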

PROPERLY USING CLOSURES, TIMERS AND EVENT HANDLERS

As we saw earlier, closures, timers and event handlers are other areas where memory leaks can occur. Let’s start with closures as they are the most common in JavaScript code. Look at the code below from the Meteor team. This leads to a memory leak as the longStr variable is never collected and keeps growing memory. The details are explained in this blog post.

var theThing = null;
var replaceThing = function () {
    var originalThing = theThing;
    var unused = function () {
        if (originalThing) console.log("hi");
    };
    theThing = {
        longStr: new Array(1000000).join("*"),
        someMethod: function () {
            console.log(someMessage);
        },
    };
};
setInterval(replaceThing, 1000);

The code above creates multiple closures, and those closures hold on to object references. The memory leak, in this case, can be fixed by nullifying originalThing at the end of the replaceThing function. Such cases can also be avoided by creating copies of the object and following the immutable approach mentioned earlier.
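A minimal sketch of the nullifying fix might look like the following. It keeps the shape of the snippet above with two changes made so that it runs standalone: `someMessage` is replaced with a string literal, and `replaceThing` is called in a loop instead of on an interval.

```javascript
let theThing = null;

function replaceThing() {
    let originalThing = theThing;
    const unused = function () {
        if (originalThing) console.log("hi");
    };
    theThing = {
        longStr: new Array(1000000).join("*"),
        someMethod: function () {
            console.log("someMethod called");
        },
    };
    // Drop the reference so the scope shared by the closures
    // no longer retains the previous `theThing`
    originalThing = null;
}

// Called a few times synchronously here; the original runs it on an interval
for (let i = 0; i < 5; i++) replaceThing();
console.log(theThing.longStr.length); // 999999
```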

When it comes to timers, always remember to pass copies of objects and avoid mutations. Also, clear timers when done, using clearTimeout and clearInterval methods.
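One possible pattern for timers is to have whoever starts an interval also hand back a way to stop it, so the handle is never lost. `startPolling` below is a hypothetical helper written for this sketch; `setInterval` and `clearInterval` are the standard timer functions.

```javascript
// Hypothetical helper: starts an interval and returns a function that stops it.
function startPolling(task, intervalMs) {
    const timer = setInterval(task, intervalMs);
    // Returning a stop function means callers never lose the timer handle
    return function stop() {
        clearInterval(timer);
    };
}

let ticks = 0;
const stop = startPolling(() => {
    ticks += 1;
    if (ticks === 3) {
        // Once cleared, the timer no longer keeps the callback
        // (and anything it references) alive
        stop();
        console.log("stopped after", ticks, "ticks");
    }
}, 10);
```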

The same goes for event listeners and observers. Clear them once the job is done. Don’t leave event listeners running forever, especially if they are going to hold on to any object reference from the parent scope.
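With Node's built-in `EventEmitter`, that cleanup could look like the sketch below. The event names and the `bigBuffer` variable are assumptions made for illustration; `on`, `off`, `once`, and `listenerCount` are standard `EventEmitter` methods.

```javascript
const { EventEmitter } = require("events");

const emitter = new EventEmitter();
const bigBuffer = Buffer.alloc(1024 * 1024); // stands in for a heavy object

function onData(chunk) {
    // This listener closes over bigBuffer
    console.log("received", chunk, "with a buffer of", bigBuffer.length, "bytes");
}

emitter.on("data", onData);
emitter.emit("data", "first");

// Remove the listener once the job is done so the emitter no longer
// keeps it (and what it closes over) alive
emitter.off("data", onData);
console.log(emitter.listenerCount("data")); // 0

// For one-shot handlers, once() removes the listener automatically
emitter.once("done", () => console.log("finished"));
emitter.emit("done");
console.log(emitter.listenerCount("done")); // 0
```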

Conclusion

Memory leaks in JavaScript are not as big of a problem as they used to be, due to the evolution of the JS engines and improvements to the language, but if we are not careful, they can still happen and will cause performance issues and even application/OS crashes. The first step in ensuring that our code doesn’t cause memory leaks in a NodeJS application is to understand how the V8 engine handles memory. The next step is to understand what causes memory leaks. Once we understand this, we can try to avoid creating those scenarios altogether. And when we do hit memory leak/performance issues, we will know what to look for. When it comes to NodeJS, some tools can help as well. For example, Node-Memwatch and Node-Inspector are great for debugging memory issues.

Our guest author Deepu K Sasidharan is the co-lead of the JHipster platform. He is a polyglot developer and Cloud-Native Advocate currently working as a Developer Advocate at Adyen. He is also a published author, conference speaker, and blogger.

Wednesday, December 30, 2020

Easy profiling for Node.js Applications

There are many third-party tools available for profiling Node.js applications but, in many cases, the easiest option is to use the Node.js built-in profiler. The built-in profiler uses the profiler inside V8, which samples the stack at regular intervals during program execution. It records the results of these samples, along with important optimization events such as JIT compiles, as a series of ticks:

code-creation,LazyCompile,0,0x2d5000a337a0,396,"bp native array.js:1153:16",0x289f644df68,~
code-creation,LazyCompile,0,0x2d5000a33940,716,"hasOwnProperty native v8natives.js:198:30",0x289f64438d0,~
code-creation,LazyCompile,0,0x2d5000a33c20,284,"ToName native runtime.js:549:16",0x289f643bb28,~
code-creation,Stub,2,0x2d5000a33d40,182,"DoubleToIStub"
code-creation,Stub,2,0x2d5000a33e00,507,"NumberToStringStub"

In the past, you needed the V8 source code to be able to interpret the ticks. Luckily, tools have been introduced since Node.js 4.4.0 that facilitate the consumption of this information without separately building V8 from source. Let's see how the built-in profiler can help provide insight into application performance.

To illustrate the use of the tick profiler, we will work with a simple Express application. Our application will have two handlers, one for adding new users to our system:

app.get('/newUser', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || users[username]) {
    return res.sendStatus(400);
  }

  const salt = crypto.randomBytes(128).toString('base64');
  const hash = crypto.pbkdf2Sync(password, salt, 10000, 512, 'sha512');

  users[username] = { salt, hash };

  res.sendStatus(200);
});

and another for validating user authentication attempts:

app.get('/auth', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

  const { salt, hash } = users[username];
  const encryptHash = crypto.pbkdf2Sync(password, salt, 10000, 512, 'sha512');

  if (crypto.timingSafeEqual(hash, encryptHash)) {
    res.sendStatus(200);
  } else {
    res.sendStatus(401);
  }
});

Please note that these are NOT recommended handlers for authenticating users in your Node.js applications and are used purely for illustration purposes. You should not be trying to design your own cryptographic authentication mechanisms in general. It is much better to use existing, proven authentication solutions.

Now assume that we've deployed our application and users are complaining about high latency on requests. We can easily run the app with the built-in profiler:

NODE_ENV=production node --prof app.js

and put some load on the server using ab (ApacheBench):

curl -X GET "http://localhost:8080/newUser?username=matt&password=password"
ab -k -c 20 -n 250 "http://localhost:8080/auth?username=matt&password=password"

and get an ab output of:

Concurrency Level:      20
Time taken for tests:   46.932 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      50250 bytes
HTML transferred:       500 bytes
Requests per second:    5.33 [#/sec] (mean)
Time per request:       3754.556 [ms] (mean)
Time per request:       187.728 [ms] (mean, across all concurrent requests)
Transfer rate:          1.05 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   3755
  66%   3804
  75%   3818
  80%   3825
  90%   3845
  95%   3858
  98%   3874
  99%   3875
 100%   4225 (longest request)

From this output, we see that we're only managing to serve about 5 requests per second and that the average request takes just under 4 seconds round trip. In a real-world example, we could be doing lots of work in many functions on behalf of a user request, but even in our simple example, time could be lost compiling regular expressions, generating random salts, generating unique hashes from user passwords, or inside the Express framework itself.

Since we ran our application using the --prof option, a tick file was generated in the same directory as your local run of the application. It should have the form isolate-0xnnnnnnnnnnnn-v8.log (where n is a digit).

In order to make sense of this file, we need to use the tick processor bundled with the Node.js binary. To run the processor, use the --prof-process flag:

node --prof-process isolate-0xnnnnnnnnnnnn-v8.log > processed.txt

Opening processed.txt in your favorite text editor will give you a few different types of information. The file is broken up into sections which are again broken up by language. First, we look at the summary section and see:

 [Summary]:
   ticks  total  nonlib   name
     79    0.2%    0.2%  JavaScript
  36703   97.2%   99.2%  C++
      7    0.0%    0.0%  GC
    767    2.0%          Shared libraries
    215    0.6%          Unaccounted

This tells us that 97% of all samples gathered occurred in C++ code and that when viewing other sections of the processed output we should pay most attention to work being done in C++ (as opposed to JavaScript). With this in mind, we next find the [C++] section which contains information about which C++ functions are taking the most CPU time and see:

 [C++]:
   ticks  total  nonlib   name
  19557   51.8%   52.9%  node::crypto::PBKDF2(v8::FunctionCallbackInfo<v8::Value> const&)
   4510   11.9%   12.2%  _sha1_block_data_order
   3165    8.4%    8.6%  _malloc_zone_malloc

We see that the top 3 entries account for 72.1% of CPU time taken by the program. From this output, we immediately see that at least 51.8% of CPU time is taken up by a function called PBKDF2 which corresponds to our hash generation from a user's password. However, it may not be immediately obvious how the lower two entries factor into our application (or if it is we will pretend otherwise for the sake of example). To better understand the relationship between these functions, we will next look at the [Bottom up (heavy) profile] section which provides information about the primary callers of each function. Examining this section, we find:

   ticks parent  name
  19557   51.8%  node::crypto::PBKDF2(v8::FunctionCallbackInfo<v8::Value> const&)
  19557  100.0%    v8::internal::Builtins::~Builtins()
  19557  100.0%      LazyCompile: ~pbkdf2 crypto.js:557:16

   4510   11.9%  _sha1_block_data_order
   4510  100.0%    LazyCompile: *pbkdf2 crypto.js:557:16
   4510  100.0%      LazyCompile: *exports.pbkdf2Sync crypto.js:552:30

   3165    8.4%  _malloc_zone_malloc
   3161   99.9%    LazyCompile: *pbkdf2 crypto.js:557:16
   3161  100.0%      LazyCompile: *exports.pbkdf2Sync crypto.js:552:30

Parsing this section takes a little more work than the raw tick counts above. Within each of the "call stacks" above, the percentage in the parent column tells you the percentage of samples for which the function in the row above was called by the function in the current row. For example, in the middle "call stack" above for _sha1_block_data_order, we see that _sha1_block_data_order occurred in 11.9% of samples, which we knew from the raw counts above. However, here, we can also tell that it was always called by the pbkdf2 function inside the Node.js crypto module. We see that similarly, _malloc_zone_malloc was called almost exclusively by the same pbkdf2 function. Thus, using the information in this view, we can tell that our hash computation from the user's password accounts not only for the 51.8% from above but also for all CPU time in the top 3 most sampled functions since the calls to _sha1_block_data_order and _malloc_zone_malloc were made on behalf of the pbkdf2 function.

At this point, it is very clear that the password-based hash generation should be the target of our optimization. Thankfully, you've fully internalized the benefits of asynchronous programming and you realize that the work to generate a hash from the user's password is being done synchronously and is thus tying up the event loop. This prevents us from working on other incoming requests while computing a hash.
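The blocking effect can be seen in isolation with a small sketch. Here a 0 ms timer stands in for another incoming request, and the key length is reduced to 64 bytes to keep the run short. The `events` array is a hypothetical name used only to record ordering.

```javascript
const crypto = require("crypto");

const events = [];

// A 0 ms timer stands in for "another incoming request"
setTimeout(() => events.push("timer"), 0);

// Synchronous hashing: nothing else can run until this call returns
crypto.pbkdf2Sync("password", "salt", 10000, 64, "sha512");
events.push("sync hash done");

// The timer callback could not run while the synchronous hash held the event loop
console.log(events); // [ 'sync hash done' ]

// Asynchronous hashing: the work happens off the main thread,
// so the timer above is free to fire while it runs
crypto.pbkdf2("password", "salt", 10000, 64, "sha512", (err, key) => {
    if (err) throw err;
    events.push("async hash done");
    console.log(events);
});
```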

To remedy this issue, you make a small modification to the above handlers to use the asynchronous version of the pbkdf2 function:

app.get('/auth', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

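  crypto.pbkdf2(password, users[username].salt, 10000, 512, 'sha512', (err, hash) => {
    if (err) {
      // Hashing failed, so there is no hash to compare
      return res.sendStatus(500);
    }
    // Constant-time comparison, as in the synchronous handler
    if (crypto.timingSafeEqual(users[username].hash, hash)) {
      res.sendStatus(200);
    } else {
      res.sendStatus(401);
    }
  });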
});
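If you prefer promises over callbacks, the same asynchronous check can be sketched with `util.promisify`. The names `pbkdf2Async` and `verifyPassword` are hypothetical helpers, and the password and salt are throwaway values. Like the handlers above, this is for illustration only and not a recommended authentication design.

```javascript
const crypto = require("crypto");
const { promisify } = require("util");

// Promise-returning wrapper around the callback-based crypto.pbkdf2
const pbkdf2Async = promisify(crypto.pbkdf2);

// Hypothetical helper: resolves to true when the password matches the stored hash
async function verifyPassword(password, salt, expectedHash) {
    const hash = await pbkdf2Async(password, salt, 10000, 512, "sha512");
    // Both buffers are 512 bytes long, as timingSafeEqual requires equal lengths
    return crypto.timingSafeEqual(hash, expectedHash);
}

// Usage sketch with a throwaway salt and password
const salt = crypto.randomBytes(128).toString("base64");
const stored = crypto.pbkdf2Sync("password", salt, 10000, 512, "sha512");

verifyPassword("password", salt, stored).then((ok) => console.log(ok)); // true
```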

A new run of the ab benchmark above with the asynchronous version of your app yields:

Concurrency Level:      20
Time taken for tests:   12.846 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      50250 bytes
HTML transferred:       500 bytes
Requests per second:    19.46 [#/sec] (mean)
Time per request:       1027.689 [ms] (mean)
Time per request:       51.384 [ms] (mean, across all concurrent requests)
Transfer rate:          3.82 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   1018
  66%   1035
  75%   1041
  80%   1043
  90%   1049
  95%   1063
  98%   1070
  99%   1071
 100%   1079 (longest request)

Yay! Your app is now serving about 20 requests per second, roughly 4 times more than it was with the synchronous hash generation. Additionally, the average latency is down from the 4 seconds before to just over 1 second.

Hopefully, through the performance investigation of this (admittedly contrived) example, you've seen how the V8 tick processor can help you gain a better understanding of the performance of your Node.js applications.