Wednesday, December 30, 2020

Easy profiling for Node.js Applications

There are many third party tools available for profiling Node.js applications but, in many cases, the easiest option is to use the Node.js built in profiler. The built in profiler uses the profiler inside V8 which samples the stack at regular intervals during program execution. It records the results of these samples, along with important optimization events such as jit compiles, as a series of ticks:

code-creation,LazyCompile,0,0x2d5000a337a0,396,"bp native array.js:1153:16",0x289f644df68,~
code-creation,LazyCompile,0,0x2d5000a33940,716,"hasOwnProperty native v8natives.js:198:30",0x289f64438d0,~
code-creation,LazyCompile,0,0x2d5000a33c20,284,"ToName native runtime.js:549:16",0x289f643bb28,~
code-creation,Stub,2,0x2d5000a33d40,182,"DoubleToIStub"
code-creation,Stub,2,0x2d5000a33e00,507,"NumberToStringStub"

In the past, you needed the V8 source code to be able to interpret the ticks. Luckily, tools have been introduced since Node.js 4.4.0 that facilitate the consumption of this information without separately building V8 from source. Let's see how the built-in profiler can help provide insight into application performance.

To illustrate the use of the tick profiler, we will work with a simple Express application. Our application will have two handlers, one for adding new users to our system:

app.get('/newUser', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || users[username]) {
    return res.sendStatus(400);
  }

  const salt = crypto.randomBytes(128).toString('base64');
  const hash = crypto.pbkdf2Sync(password, salt, 10000, 512, 'sha512');

  users[username] = { salt, hash };

  res.sendStatus(200);
});

and another for validating user authentication attempts:

app.get('/auth', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

  const { salt, hash } = users[username];
  const encryptHash = crypto.pbkdf2Sync(password, salt, 10000, 512, 'sha512');

  if (crypto.timingSafeEqual(hash, encryptHash)) {
    res.sendStatus(200);
  } else {
    res.sendStatus(401);
  }
});

Please note that these are NOT recommended handlers for authenticating users in your Node.js applications and are used purely for illustration purposes. You should not be trying to design your own cryptographic authentication mechanisms in general. It is much better to use existing, proven authentication solutions.

Now assume that we've deployed our application and users are complaining about high latency on requests. We can easily run the app with the built in profiler:

NODE_ENV=production node --prof app.js

and put some load on the server using ab (ApacheBench):

curl -X GET "http://localhost:8080/newUser?username=matt&password=password"
ab -k -c 20 -n 250 "http://localhost:8080/auth?username=matt&password=password"

and get an ab output of:

Concurrency Level:      20
Time taken for tests:   46.932 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      50250 bytes
HTML transferred:       500 bytes
Requests per second:    5.33 [#/sec] (mean)
Time per request:       3754.556 [ms] (mean)
Time per request:       187.728 [ms] (mean, across all concurrent requests)
Transfer rate:          1.05 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   3755
  66%   3804
  75%   3818
  80%   3825
  90%   3845
  95%   3858
  98%   3874
  99%   3875
 100%   4225 (longest request)

From this output, we see that we're only managing to serve about 5 requests per second and that the average request takes just under 4 seconds round trip. In a real world example, we could be doing lots of work in many functions on behalf of a user request but even in our simple example, time could be lost compiling regular expressions, generating random salts, generating unique hashes from user passwords, or inside the Express framework itself.

Since we ran our application using the --prof option, a tick file was generated in the same directory as your local run of the application. It should have the form isolate-0xnnnnnnnnnnnn-v8.log (where n is a digit).

In order to make sense of this file, we need to use the tick processor bundled with the Node.js binary. To run the processor, use the --prof-process flag:

node --prof-process isolate-0xnnnnnnnnnnnn-v8.log > processed.txt

Opening processed.txt in your favorite text editor will give you a few different types of information. The file is broken up into sections which are again broken up by language. First, we look at the summary section and see:

 [Summary]:
   ticks  total  nonlib   name
     79    0.2%    0.2%  JavaScript
  36703   97.2%   99.2%  C++
      7    0.0%    0.0%  GC
    767    2.0%          Shared libraries
    215    0.6%          Unaccounted

This tells us that 97% of all samples gathered occurred in C++ code and that when viewing other sections of the processed output we should pay most attention to work being done in C++ (as opposed to JavaScript). With this in mind, we next find the [C++] section which contains information about which C++ functions are taking the most CPU time and see:

 [C++]:
   ticks  total  nonlib   name
  19557   51.8%   52.9%  node::crypto::PBKDF2(v8::FunctionCallbackInfo<v8::Value> const&)
   4510   11.9%   12.2%  _sha1_block_data_order
   3165    8.4%    8.6%  _malloc_zone_malloc

We see that the top 3 entries account for 72.1% of CPU time taken by the program. From this output, we immediately see that at least 51.8% of CPU time is taken up by a function called PBKDF2 which corresponds to our hash generation from a user's password. However, it may not be immediately obvious how the lower two entries factor into our application (or if it is we will pretend otherwise for the sake of example). To better understand the relationship between these functions, we will next look at the [Bottom up (heavy) profile] section which provides information about the primary callers of each function. Examining this section, we find:

   ticks parent  name
  19557   51.8%  node::crypto::PBKDF2(v8::FunctionCallbackInfo<v8::Value> const&)
  19557  100.0%    v8::internal::Builtins::~Builtins()
  19557  100.0%      LazyCompile: ~pbkdf2 crypto.js:557:16

   4510   11.9%  _sha1_block_data_order
   4510  100.0%    LazyCompile: *pbkdf2 crypto.js:557:16
   4510  100.0%      LazyCompile: *exports.pbkdf2Sync crypto.js:552:30

   3165    8.4%  _malloc_zone_malloc
   3161   99.9%    LazyCompile: *pbkdf2 crypto.js:557:16
   3161  100.0%      LazyCompile: *exports.pbkdf2Sync crypto.js:552:30

Parsing this section takes a little more work than the raw tick counts above. Within each of the "call stacks" above, the percentage in the parent column tells you the percentage of samples for which the function in the row above was called by the function in the current row. For example, in the middle "call stack" above for _sha1_block_data_order, we see that _sha1_block_data_order occurred in 11.9% of samples, which we knew from the raw counts above. However, here, we can also tell that it was always called by the pbkdf2 function inside the Node.js crypto module. We see that similarly, _malloc_zone_malloc was called almost exclusively by the same pbkdf2 function. Thus, using the information in this view, we can tell that our hash computation from the user's password accounts not only for the 51.8% from above but also for all CPU time in the top 3 most sampled functions since the calls to _sha1_block_data_order and _malloc_zone_malloc were made on behalf of the pbkdf2 function.

At this point, it is very clear that the password based hash generation should be the target of our optimization. Thankfully, you've fully internalized the benefits of asynchronous programming and you realize that the work to generate a hash from the user's password is being done in a synchronous way and thus tying down the event loop. This prevents us from working on other incoming requests while computing a hash.

To remedy this issue, you make a small modification to the above handlers to use the asynchronous version of the pbkdf2 function:

app.get('/auth', (req, res) => {
  let username = req.query.username || '';
  const password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

  crypto.pbkdf2(password, users[username].salt, 10000, 512, 'sha512', (err, hash) => {
    if (users[username].hash.toString() === hash.toString()) {
      res.sendStatus(200);
    } else {
      res.sendStatus(401);
    }
  });
});

A new run of the ab benchmark above with the asynchronous version of your app yields:

Concurrency Level:      20
Time taken for tests:   12.846 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      50250 bytes
HTML transferred:       500 bytes
Requests per second:    19.46 [#/sec] (mean)
Time per request:       1027.689 [ms] (mean)
Time per request:       51.384 [ms] (mean, across all concurrent requests)
Transfer rate:          3.82 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   1018
  66%   1035
  75%   1041
  80%   1043
  90%   1049
  95%   1063
  98%   1070
  99%   1071
 100%   1079 (longest request)

Yay! Your app is now serving about 20 requests per second, roughly 4 times more than it was with the synchronous hash generation. Additionally, the average latency is down from the 4 seconds before to just over 1 second.

Hopefully, through the performance investigation of this (admittedly contrived) example, you've seen how the V8 tick processor can help you gain a better understanding of the performance of your Node.js applications. 

Wednesday, December 2, 2020

install node/npm with nvm

 

Installing and Updating

Install & Update Script

To install or update nvm, you should run the install script. To do that, you may either download and run the script manually, or use the following cURL or Wget command:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.2/install.sh | bash
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.2/install.sh | bash

Running either of the above commands downloads a script and runs it. The script clones the nvm repository to ~/.nvm, and attempts to add the source lines from the snippet below to the correct profile file (~/.bash_profile~/.zshrc~/.profile, or ~/.bashrc).

export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm

Additional Notes

  • If the environment variable $XDG_CONFIG_HOME is present, it will place the nvm files there.

  • You can add --no-use to the end of the above script (...nvm.sh --no-use) to postpone using nvm until you manually use it.

  • You can customize the install source, directory, profile, and version using the NVM_SOURCENVM_DIRPROFILE, and NODE_VERSION variables. Eg: curl ... | NVM_DIR="path/to/nvm". Ensure that the NVM_DIR does not contain a trailing slash.

  • The installer can use gitcurl, or wget to download nvm, whichever is available.

Troubleshooting on Linux

On Linux, after running the install script, if you get nvm: command not found or see no feedback from your terminal after you type command -v nvm, simply close your current terminal, open a new terminal, and try verifying again.

Troubleshooting on macOS

Since OS X 10.9, /usr/bin/git has been preset by Xcode command line tools, which means we can't properly detect if Git is installed or not. You need to manually install the Xcode command line tools before running the install script, otherwise, it'll fail. (see #1782)

If you get nvm: command not found after running the install script, one of the following might be the reason:

  • Since macOS 10.15, the default shell is zsh and nvm will look for .zshrc to update, none is installed by default. Create one with touch ~/.zshrc and run the install script again.

  • If you use bash, the previous default shell, run touch ~/.bash_profile to create the necessary profile file if it does not exist.

  • You might need to restart your terminal instance or run . ~/.nvm/nvm.sh. Restarting your terminal/opening a new tab/window, or running the source command will load the command and the new configuration.

If the above doesn't fix the problem, you may try the following:

  • If you use bash, it may be that your .bash_profile (or ~/.profile) does not source your ~/.bashrc properly. You could fix this by adding source ~/<your_profile_file> to it or follow the next step below.

  • Try adding the snippet from the install section, that finds the correct nvm directory and loads nvm, to your usual profile (~/.bash_profile~/.zshrc~/.profile, or ~/.bashrc).

  • For more information about this issue and possible workarounds, please refer here

Usage

To download, compile, and install the latest release of node, do this:

nvm install node # "node" is an alias for the latest version

To install a specific version of node:

nvm install 6.14.4 # or 10.10.0, 8.9.1, etc

The first version installed becomes the default. New shells will start with the default version of node (e.g., nvm alias default).

You can list available versions using ls-remote:

nvm ls-remote

And then in any new shell just use the installed version:

nvm use node

Or you can just run it:

nvm run node --version

Or, you can run any arbitrary command in a subshell with the desired version of node:

nvm exec 4.2 node --version

You can also get the path to the executable to where it was installed:

nvm which 5.0

In place of a version pointer like "0.10" or "5.0" or "4.2.1", you can use the following special default aliases with nvm installnvm usenvm runnvm execnvm which, etc:

  • node: this installs the latest version of node
  • iojs: this installs the latest version of io.js
  • stable: this alias is deprecated, and only truly applies to node v0.12 and earlier. Currently, this is an alias for node.
  • unstable: this alias points to node v0.11 - the last "unstable" node release, since post-1.0, all node versions are stable. (in SemVer, versions communicate breakage, not stability).

Long-term Support

Node has a schedule for long-term support (LTS) You can reference LTS versions in aliases and .nvmrc files with the notation lts/* for the latest LTS, and lts/argon for LTS releases from the "argon" line, for example. In addition, the following commands support LTS arguments:

  • nvm install --lts / nvm install --lts=argon / nvm install 'lts/*' / nvm install lts/argon
  • nvm uninstall --lts / nvm uninstall --lts=argon / nvm uninstall 'lts/*' / nvm uninstall lts/argon
  • nvm use --lts / nvm use --lts=argon / nvm use 'lts/*' / nvm use lts/argon
  • nvm exec --lts / nvm exec --lts=argon / nvm exec 'lts/*' / nvm exec lts/argon
  • nvm run --lts / nvm run --lts=argon / nvm run 'lts/*' / nvm run lts/argon
  • nvm ls-remote --lts / nvm ls-remote --lts=argon nvm ls-remote 'lts/*' / nvm ls-remote lts/argon
  • nvm version-remote --lts / nvm version-remote --lts=argon / nvm version-remote 'lts/*' / nvm version-remote lts/argon

Any time your local copy of nvm connects to https://nodejs.org, it will re-create the appropriate local aliases for all available LTS lines. These aliases (stored under $NVM_DIR/alias/lts), are managed by nvm, and you should not modify, remove, or create these files - expect your changes to be undone, and expect meddling with these files to cause bugs that will likely not be supported.

To get the latest LTS version of node and migrate your existing installed packages, use

nvm install 'lts/*' --reinstall-packages-from=current

Tuesday, December 1, 2020

How To Run TypeScript Scripts with ts-node

 

Introduction

TypeScript has been gaining in popularity. With TypeScript being a superset of JavaScript, using it means compiling your TypeScript files down to pure JavaScript before the V8 engine can understand them. You could watch for file changes and automate the compiling. But sometimes, you just want to run your script and get results. This is where ts-node comes in. With ts-node, you can skip the fuss and execute your TypeScript scripts with ease.

Prerequisites

To successfully complete this tutorial, you will need the following:

Step 1 — Getting Started

To get things started, you need to install typescript and ts-node:

  • npm install typescript ts-node

Since ts-node is an executable you can run, there’s nothing to import or require in your scripts.

If you don’t already have a TypeScript project to work with, you can just grab use this script to test ts-node with:

Script: reptile.ts
class Reptile {
  private reptiles: Array<string> = [
    'Alligator',
    'Crocodile',
    'Chameleon',
    'Komodo Dragon',
    'Iguana',
    'Salamander',
    'Snake',
    'Lizard',
    'Python',
    'Tortoise',
    'Turtle',
  ];

  shuffle(): void {
    for (let i = this.reptiles.length - 1; i > 0; i--) {
      let j: number = Math.floor(Math.random() * (i + 1));
      let temp: string = this.reptiles[i];
      this.reptiles[i] = this.reptiles[j];
      this.reptiles[j] = temp;
    }
  }

  random(count: number = 1, allowDupes?: boolean): Array<string> {
    let selected: Array<string> = [];

    if (!allowDupes && count > this.reptiles.length) {
      throw new Error(`Can't ensure no dupes for that count`);
    }

    for (let i: number = 0; i < count; i++) {
      if (allowDupes) {
        // Dupes are cool, so let's just pull random reptiles
        selected.push(this.reptiles[
          Math.floor(Math.random() * this.reptiles.length)
        ]);
      } else {
        // Dupes are no go, shuffle the array and grab a few
        this.shuffle();
        selected = this.reptiles.slice(0, count);
      }
    }

    return selected;
  }
}

const reptile = new Reptile();
console.log(`With Dupes: ${reptile.random(10, true)}`);
console.log(`And Without: ${reptile.random(10)}`);

The above script pulls random values from an array and returns two lists of different types of reptiles: one with duplicates and one without. Again, you can use your own script if you prefer.

Ensure that your script returns a value and prints something to the console. You will want to see the results of your code. With your TypeScript script in place, you can now move on to running your script.

Step 2 — Running Scripts

Before you usets-node, it’s good practice to know what happens when you run a TypeScript script with Node.

You will run the reptile.ts script (or your own TypeScript script) using the node command:

  • node reptile.ts

After running this command, you will see an error message:

Output
SyntaxError: Unexpected identifier

The SyntaxError: Unexpected identifier message specifically points out the private class variable on the second line of reptile.ts.

Using Node to run TypeScript scripts returns an error. Now that you know what not to do and what to expect when you do it. This is where ts-node comes in.

Use ts-node to run the reptile.ts script:

  • npx ts-node reptile.ts

If you would like to know more about the npx command, it’s a tool that now comes with npm that allows you to run binaries that are local to the project from the command line.

You will see this output from running this script:

Output
With Dupes: Komodo Dragon,Python,Tortoise,Iguana,Turtle,Salamander,Python,Python,Salamander,Snake And Without: Alligator,Iguana,Lizard,Snake,Tortoise,Chameleon,Crocodile,Komodo Dragon,Turtle,Salamander

Running reptile.ts with ts-node will return two lists of types of reptiles, one potentially having duplicates and one without. The ts-node command efficiently runs TypeScript scripts. But there is a way to make it even faster.

Step 3 — Speeding Things Up

Under the hood, ts-node takes your script, does some semantic checking to ensure your code is error-free, and then compiles your TypeScript into JavaScript.

This is the safest option. But if you’re not worried about TypeScript errors, you can pass in the -T or --transpileOnly flag. This flag tells ts-node to transpile down to JavaScript without checking for any TypeScript errors.

While it’s not always advisable to use this flag, there are scenarios where it makes sense. If you’re trying to run someone else’s script or if you’re confident that your editor and linter are catching everything, using the -T or --transpileOnly flag is appropriate.

  • npx ts-node -T reptile.ts

Running this command will give you the same output as npx ts-node reptile.ts. There is still more ts-node can do. This command also offers a TypeScript REPL.

Step 4 — Using the TypeScript REPL

Another added bonus to ts-node is being able to use a TypeScript REPL (read-evaluate-print loop) similar to running node without any options.

This TypeScript REPL allows you to write TypeScript right on the command-line and is super handy for quickly testing something out.

To access the TypeScript REPL, run ts-node without any arguments:

  • npx ts-node

And now you can enjoy all of the strictness that TypeScript has to offer, right in your favorite terminal!

Conclusion

In this article, you used ts-node to run TypeScript scripts. You also used the TypeScript REPL with ts-node.

If you’re interested in moving further with TypeScript, you may find this [How To Work With TypeScript in Visual Studio Code](digitalocean.com/community/tutorials/how-to-work-with-typescript-in-visual-studio-code) article useful.