Friday, October 9, 2020

Javascript - Single-thread liệu đã lỗi thời?

Introduction

Sống trong 1 thế giới công nghệ thay đổi đến chóng mặt, trong trí nhớ của tôi thì mấy con PC những năm tôi học cấp 2, cấp 3 tầm 200x cấu hình còn không mạnh bằng smartphone bây giờ nữa.

Khoảng năm 2005 trở về trước là thời đại của Pentium 4 và Athlon64, bước ngoặt có lẽ là khi Intel ra mắt Pentium D, còn AMD ra mắt Athlon 64 X2, đánh dấu kỉ nguyên của chip multi-core.

Trước đấy thì đa phần các nhà sản xuất CPU đua performance với nhau bằng xung nhịp (clock speed - bị ảnh hưởng bởi số lượng, kích thước của các transistor). Đơn giản thời kỳ này so chip là so con nào clock speed cao hơn là con đấy win. Sau này việc thu nhỏ transistor hay tăng clock speed ngày càng trở nên khó khăn hơn do các vấn đề về giới hạn vật lý, chi phí, v..v.. buộc các NSX chip phải phát triển những phương án khác để gia tăng hiệu năng CPU. Cuộc đua clock speed dừng lại tại đây, con chip có xung nhịp cao nhất thời bấy giờ là Pentium EE với mức xung nhịp lên tới 4.0Ghz, và cho đến bây giờ nó vẫn nằm top những CPU có clock speed cao nhất thế giới - nhưng tất nhiên là hiệu năng thì chắc bằng không bằng 1/10 mấy em chip bây giờ.

CPU multi-core ra đường trong hoàn cảnh này và cuộc đua số lượng core trong CPU lại 1 lần nữa được bắt đầu, lần này không chỉ gói trong mảng PC, mà các NSX mobile chip cũng tham gia vào.

Cùng với sự xuất hiện của CPU multi-core là cũng là sự chuyển mình của ngành công nghiệp phần mềm. Chip multi-core thời kỳ đầu không đạt được kỳ vọng về hiệu năng 1 phần do phần mềm ít hỗ trợ. Nhưng dần dần các nhà phát triển cũng đã refactor lại theo hướng mới này.

Những khái niệm multi-processes , multi-threads,.. như kim chỉ nam cho bất cứ dev nào muốn become guru. Các ngôn ngữ thời thượng C#, C++, Java, cũng đều support multi-threading, các máy chủ web Apache, IIS,.. cũng là những ứng dụng multi-thread, nói chung là người người, nhà nhà multi-thread.

Ấy vậy mà trong thế giới đa luồng - đa tiến trình ấy, Javascript vẫn nổi lên là 1 trong những ngôn ngữ thông dụng nhất thế giới và thậm chí Nodejs run-time environment còn đang lăm le soán ngôi của các ngôn ngữ kỳ cựu khác trong cuộc chiến server-side.

Nhưng javascript - và Nodejs environment - hoàn toàn là 1 ngôn ngữ đơn luồng (single - thread)

Liệu điều này có đi ngược lại quy luật phát triển không? Chắc chắn về mặt nhận thức thực tế thì là không rồi. Người ta đang dùng nó ngày 1 nhiều và nhất là về tốc độ thì nó đang tỏ ra vượt trội so với các đối thủ khác. Server xây dựng bằng Nodejs có khả năng chịu được nhiều hơn hàng ngàn connection đồng thời so với Apache, IIS ,...

Vậy liệu gì đã biến Javascript trở nên như vậy, chúng ta sẽ cùng tìm hiểu qua bài viết này nhé các bạn !

Trước khi bắt đầu, tôi remind lại 1 chút cho các bạn là Nodejs bản chất của nó là 1 runtime-environment dùng để chạy code javascipt, điều đó có nghĩa là nodejs chính là code javascript và tất nhiên chúng cũng chia sẻ nhau những khái niệm về cấu trúc và cách hoạt động.

Bởi vì có những bài viết trên mạng thậm chí từ nguồn uy tín định nghĩa Nodejs là 1 ngôn ngữ => điều này là không đúng.

Threads và Processes

Đầu tiên chúng ta sẽ quay lại lịch sử 1 chút, thời kỳ đầu thì các máy tính chỉ có thể làm 1 việc vào mỗi thời điểm chứ không đa tác vụ như bây giờ. Tức là bạn chỉ có thể làm được 1 trong 2 việc: "gõ văn bản" hay "nghe nhạc" chứ không thể làm được cả 2 việc này cùng 1 lúc.

Tất cả các chương trình đều chạy 1 cách tuần tự, chương trình này chạy hết rồi mới đến chương trình tiếp theo, đây gọi là batch processing. Với sự mạnh mẽ lên từng ngày của phần cứng, batch processing gây nên lãng phí tài nguyên, 1 tác vụ đơn giản không thể tận dụng được toàn bộ sức mạnh của CPU.

Các hệ điều hành Multitasking ra đời khắc phục nhược điểm này, các hệ thống multitasking cho phép người dùng chuyển đổi giữa các tác vụ như nghe nhạc, lướt web và làm chúng dường như "có thể" chạy 1 cách đồng thời.

Tại sao tôi dùng từ có thể thì chúng ta sẽ làm rõ hơn vấn đề này bên dưới. Note lại nhé.

Sự ra đời của Multitasking OS kéo theo 2 khái niệm: Thread và Process

Process

Process là 1 tiến trình => Khi bạn click đúp vào icon Chrome => 1 process chrome đã được khởi tạo.

Mỗi process khi được khởi tạo sẽ chạy 1 thread chính, tuy nhiên nó cũng có thể tạo ra nhiều thread con trong quá trình hoạt động.

Thread

Thread là 1 thực thể hiện hữu trong process. Có thể coi thread là 1 process con của process cha. 1 Process có thể có nhiều thread con.

1 đặc điểm quan trọng của thread là trình tự thực thi trong CPU của nó có thể sắp xếp (schedule) được, là tiền đề để máy tính có thể xử lý đa tác vụ

Để so sánh kỹ hơn về process và thread các bạn có thể tham khảo ở đây

Máy tính xử lý đa tác vụ như thế nào?

Các bạn còn nhớ tôi đã dùng "có thể" khi nói về cách các hệ điều hành xử lý đa tác vụ ? Đúng vậy, trên thực tế CPU đơn nhân không thể chạy được nhiều hơn 1 tác vụ tại mỗi thời điểm, nó không khác gì CPU từ những thế kỷ trước cả. Vậy làm cách nào hệ điều hành có thể biến quy trình xử lý tuần tự trở thành quy trình xử lý đa tác vụ?

Nếu như các bạn đã biết thì mỗi tác vụ - ngay cả đơn giản như 1 click chuột - bao gồm hàng ngàn, hàng vạn , thậm chí hàng triệu phép tính => lợi dụng đặc điểm này mà các nhà phát triển nghĩ ra cách cho phép OS chỉ thực thi 1 phần các phép tính của tác vụ này sau đó chuyển sang làm tác vụ khác. Và vì mỗi phép tính là cực kỳ nhanh nên nó làm cho ta bị đánh lừa rằng máy tính đang làm nhiều tác vụ cùng 1 lúc.

Thôi nói nhảm đi, zing mp3 của tôi không bị khựng lại tí nào trong lúc tôi đang lướt facebook

Đúng vậy, OS xử lý đa tác vụ không chỉ nhờ vào tốc độ cực kỳ nhanh của CPU, 1 chiếc máy tính còn nhiều con chip khác nữa, bên trong mỗi con chip đó được lấp đầy bởi bộ nhớ cache và hàng đợi các công việc nó phải làm. CPU sẽ tính toán vài giây mp3 trước khi nó chạy tới và ném kết quả sang cho chip audio, trong lúc chip audio phát ra nhạc thì CPU rảnh rỗi để có thể xử lý tác vụ khác.

CPU ngày nay cực kỳ nhanh, tốc độ của nó được đo bằng flops (thao tác tính số thực dấu phẩy động - Floating Point Operations per second). 1 CPU thường thì có tốc độ khoảng 50 - 70 GigaFlops tương đương với 1 phép tính được xử lý trong vòng 1/triệu giây. Để so sánh thì 1 người ấn nút bàn phím nhanh nhất cỡ 4mili giây. Sau khi bấm xong nút đó thì máy tính đã xử lý xong ~200 000 phép tính rồi. Cực kỳ nhanh!

Áp dụng chung lối thiết kế đó, các chương trình ngày nay cũng được phân thành các threads hoạt động phụ thuộc vào nhau nhằm tận dụng hoàn toàn sức mạnh của CPU.

1 CPU đơn lõi xử lý multi-threading như ví dụ sau:

Cho 2 threads A và B cần phải chạy 1 tập các lệnh. Sau mỗi lệnh, threads cần kết quả trả về thì mới thực hiện lệnh tiếp theo.

CPU sẽ thực thi bắt đầu từ thread A (with multi-threading) và như tôi đã nói ở phần trên, thread có thể được sắp xếp lại trình tự hoạt động của nó khi được thực thi trong CPU, nghĩa là việc CPU thực hiện command của A => thực hiện xong nhảy sang thread B => rồi lại nhảy trở về thực hiện tiếp lệnh tiếp theo của thread A là khả thi ?!?

Thread A và Thread B được luân phiên xử lý. ĐIều này giúp cho tài nguyên CPU không bị lãng phí khi thread A chạy đến lệnh waiting, trong thời gian chờ kết quả thì CPU đã nhảy sang thực thi tập lệnh của thread B. Khi không chạy multi-threading CPU sẽ xử lý 1 cách tuần tự như sau:

Dễ dàng nhận thấy xử lý multy-threading hiệu quả và tận dụng tài nguyên tốt hơn rất nhiều! Tuy là giữa những lần thay đổi ngữ cảnh thực thi giữa 2 thread sẽ phải đánh đổi 1 lượng tài nguyên (delay) nhất định - còn gọi là time slicing - nhưng multi-thread vẫn hoàn toàn chiến thắng trong các bài test kiểu như thế này. (thậm chí với công nghệ HT - Hyper Threading - điều này còn được tối ưu hơn nữa).

Từ nãy tới giờ chỉ thấy nói đến điểm mạnh của multi-threading, vậy Javascript đứng ở đâu trong bức tranh này?

Multi-threading rất mạnh, nhưng xin được nhắc lại 1 lần nữa, js hoàn toàn là single-thread, nhưng tại sao nó lại được trọng dụng đến vậy? Để làm sáng rõ thì phần tiếp theo tôi sẽ nói 1 chút về cái hồn của js - Eventloop

~~Note: Eventloop tài liệu trên mạng rất nhiều, và nếu bạn nào đã biết về Eventloop thì có thể bỏ qua đọc phần next-next luôn.~~

Eventloop trong javascript

Tôi sẽ chỉ nói qua về event-loop trong javascript thôi, bởi vì trên mạng đã có rất nhiều tài liệu rồi, nhưng chính thống nhất thì vẫn là trang chủ Mozilla Ngoài ra tôi cũng đã đọc được 1 series khá hay và dễ hiểu dành cho những ai chưa biết tý gì về Eventloop -hongkiat

OK, vậy thì đã có ai từng tự hỏi là tại sao cơ chế lắng nghe sự kiện hay gửi request AJAX trong javascript dường như được thực hiện theo thời gian thực, bạn vẫn có thể tương tác với trang web mà đồng thời vẫn gửi được các request đi? Thực ra về mặt lý thuyết thì đây không phải là real-time mà được gọi là xử lý bất đồng bộ "ASYNCHRONOUS", và nó cũng không hoạt động theo kiểu multi-threading để xử lý nhiều việc song song nhau 1 lúc.

Javascript sử dụng mô hình xử lý Concurrency dựa trên Event-loop để xử lý bất đồng bộ.

Cũng như mọi ngôn ngữ khác, javascript được biên dịch từ trên xuống dưới và từ trái sang phải.

Khi 1 function() được gọi trong code, javascript không thực thi nó ngay mà đưa nó vào 1 hàng đợi (queue), những function() đã được đưa vào hàng đợi sẽ lần lượt được được vào Callstack ngay khi CallStack rỗng. Đây cũng là nơi function() sẽ được thực thi. Thứ tự của Queue là function nào được đưa vào trước thì sẽ được đưa sang Callstack trước, function nào sau thì sang sau.

Đối với Callstack khi 1 function được đưa vào đây nó sẽ trở thành ngữ cảnh thực thi - gọi đây là function context - và Callstack sẽ gọi đệ quy tất cả các function được function context gọi đến. Bởi vì có 1 vòng lặp chạy mãi mãi thực thi Queue nên nó có tên gọi là Event-loop.

while (queue.waitForMessage()) {
  queue.processNextMessage();
}

Nhưng đây không phải cách duy nhất để gọi 1 function trong javascript.

JS web API là những API được cung cấp bởi trình duyệt, nó có nhiệm vụ đưa eventHandle và callback function vào hàng đợi Queue khi có sự kiện xảy ra. Javascript dùng 1 process chạy ngầm để theo dõi DOM và thành phần khác xem khi nào có sự kiện sẽ dùng những API để đưa eventHandle vào Queue.

Timer cũng tương tự như vậy nhưng nó dùng để đưa function vào Queue khi ta sử dụng hàm setTimeout.

Thì vẫn là javascript đang thực hiện các tác vụ song song không phải vậy sao?

Không, chỉ có duy nhất 1 tác vụ được thực hiện trong 1 khoảng thời gian, hãy nhìn vào hình sau để rõ hơn, trong callstack luôn luôn chỉ có 1 function được thực thi hết function này thì tới function khác.

Vậy thì cái gì mà gọi là Concurrency, chẳng phải đây chính là cách CPU xử lý multi-threads đó sao, vậy có thể nói javascript là ngôn ngữ single-thread được ?

Đúng vậy, Concurrency là 1 khái niệm dễ gây hiểu nhầm, Concurrency có thể đem lại kết quả tương tự như khi ta xử lý đa luồng, vừa gửi request lại vừa xem video chả hạn? nhưng nó hoàn toàn không phải là xử lý mọi việc song song, CPU vẫn có thể chỉ sử dụng 1 thread duy nhất để xử lý Concurrency và tôi sẽ làm rõ vấn đề này ngay bây giờ.

Concurrency và Paralelism

Nhiều người mới tìm hiểu Javascript rất dễ nhầm lẫn xử lý Concurrency tức là xử lý song song (Paralelism). Thực ra đây là 2 khái niệm khác nhau nhưng liên quan đến nhau. Xử lý Concurrency không giống với Multi-thread, multi-core, multi-processes hay bất cứ gì tương tự như thế.

Concurrency

Concurrency là khi 2 hoặc nhiều task có thể bắt đầu, chạy và hoàn thành trong những khoảng thời gian chồng lên nhau. Nó không nhất thiết là chúng cần phải chạy cùng 1 lúc Tồn tại khi ít nhất 2 threads đang xử lý. Đây là 1 thuộc tính của chương trình hoặc hệ thống.

Parallelism

Parallelism là khi các task chạy cùng 1 lúc theo nghĩa đen. Là khi 2 threads được thực thi đồng thời. Đây là 1 hành vi của run-time thực thi đa tác vụ cùng 1 lúc.

Đúng vậy, Parallelism mới thực sự là đa tác vụ !

Lý thuyết thì hơi khó hiểu chúng ta xem thử 1 ví dụ sau:

Thời đại này ta có HHVM, ta có Node, ta có Swift, ta có Golang, Haskell.. chắc chả ai còn đọc sách lập trình C nữa, đem đốt nó đi thôi, nhưng vì cái xe và cái lò có hạn nên mỗi lần chỉ đốt được vài quyển thôi =)) (vui thôi nhé đừng gạch đá - vì thực ra tôi cũng không có quyển sách C nào =))))

OK, bây giờ thanh niên trong ví dụ sẽ phải làm những task sau :

Chất sách vào xe
Đẩy xe đến lò
Đốt sách
Đẩy xe về

Concurrency là phân chia công việc đó ra thành các phần nhỏ hơn, mà cụ thể trong trường hợp này là 4 task trên: 4 thằng, mỗi thằng 1 việc - không thằng nào chờ thằng nào hết, công việc nhanh hơn 4 lần:

Nếu có 2 cái lò, chia đống sách thành 2 phần , có thêm xe đẩy, có thêm người thì việc này còn nhanh hơn nhiều lần nữa:

Về lý thuyết, ta hoàn thành công việc nhanh hơn 16 lần.

Ví dụ chả liên quan nhỉ ?

Chưa liên quan thôi chứ không phải là không liên quan, tưởng tượng 1 chút nào :

Đống sách: Nội dung web
Người làm: CPU
Xe đẩy: Băng thông mạng
Lò: Trình duyệt

Nếu bạn đã hình dung ra thì Vâng, đây chính là thiết kế chương trình theo hướng concurrency, 1 tác vụ được chia thành nhiều phần nhỏ. Tuy thế việc nó có được xử lý đa tác vụ (multithreading) hay không thì còn tùy thuộc vào ngôn ngữ lập trình và CPU (số lượng cores - và các tập lệnh trên CPU)

Công việc đốt sách sẽ không thể nhanh hơn 16 lần, hay 4 lần nếu như trong 1 khoảng thời gian chỉ có 1 người làm việc (1 CPU core). Đây chính là khi chương trình được viết theo hướng concurrency nhưng không được runtime xử lý parallel, hay nói cách khác nó là single-thread.

Với 2 CPU tốc độ về lý thuyết tăng lên gấp 2, đây là điều mà ở thời kỳ đầu phát triển của CPU multicore không thể đạt được dù là trên lý thuyết do các chương trình không được viết theo hướng concurrency. Tất nhiên với những chương trình thiết kế tốt 16 cores của CPU được tận dụng hoàn toàn đạt được tốc độ xử lý nhanh gấp 16 lần so với 1 core. Tuyệt vời !!!

Tóm lại, JavaScript là Concurrency chứ không phải Parallelism

Single-thread ? Is it good?

Như vậy chúng ta đã thống nhất với nhau JS là 1 ngôn ngữ thuần đơn luồng, vậy lý do gì khiến nó trở nên đặc biệt?

Trong PHP (hoặc Java/ASP.NET/Ruby) web-server mỗi client request sẽ được khởi tạo 1 thread mới, những trong Nodejs tất cả các client chạy trên cùng 1 thread - thậm chí chia sẻ nhau chung biến). Eventloop cho phép chúng ta làm điều này bởi vì nó không block thread chính.

Nhưng nếu có 1 tác vụ nặng chạy trên CPU server node thì nó sẽ block thread này lại và những connection tiếp theo sẽ phải chờ thread xử lý xong tác vụ. Điều này không xảy ra với những webserver multithreads khi cho phép các tác vụ chạy trên những thread khác nhau.

Có thể thấy là mỗi cái đều có điểm lợi và hại

Nodejs được tạo ra 1 cách chính thức sau 1 nghiên cứu về tiến trình async( async process). Theo lý thuyết này thì xử lý async đơn luồng sẽ mang lại nhiều hiệu năng và khả năng mở rộng hơn cho trình duyệt web. Và lý thuyết này đã được thực tế xác minh. 1 nodejs có thể chạy nhiều hơn hàng ngàn connection so với máy chủ Apache hoặc IIS. Thường thì các máy chủ PHP/Java/ASP.NET/Ruby mỗi connection sẽ tạo ra 1 thread. Nhưng Nodejs thì tất cả client đều chạy trên cùng 1 thread. Nodejs được tạo ra 1 cách có chủ đích như vậy và sẽ KHÔNG BAO GIỜ hỗ trợ đa luồng.

Để giải quyết vấn đề khi có những process nặng chạy trên server node chúng ta sẽ chọn giải pháp là tạo ra các child process, từ 1 cái process to chia ra làm các cái nhỏ thì sẽ đơn giản hơn nếu làm ngược lại các bạn ạ. Làm sao để mở rộng máy chủ node

No locking bugs/ No Overloading

Mỗi 1 task trong JS đòi hỏi 1 lượng rất nhỏ trong bộ nhớ Heap do nó chạy trên chỉ 1 thread. Còn việc tạo ra thread mới thì đòi hỏi nhiều bộ nhớ hơn nhiều, vài MB đối với 1 số nền tảng. Nhưng nhược điểm chí mạng của đa luồng tới từ việc chuyển đổi ngữ cảnh thực thi. Khi số lượng thread lên tới con số trăm ngàn, thậm chí hàng triệu thì việc chuyển đổi thread là hết sức nặng nề.

Nếu ở trên bạn nhớ tôi đã nói vấn đề delay (time slicing) khi chuyển đổi thread trong CPU là không đáng kể, điều đó vẫn đúng nhưng là với máy tính cá nhân, rất ít máy tính cá nhân đạt đến con số vài ngàn threads, còn với máy chủ server thì con số hàng triệu, thậm chí hàng chục triệu threads thì đây thực sự là 1 vấn đề lớn.

Việc lập trình tốt multithread cũng khó hơn rất rất nhiều, để các threads tương tác với nhau tốt cũng là một vấn đề lớn phải quan tâm.

No DOM collapse

Còn javascript phía client-side thì sao? 1 khi code khởi tạo chạy xong (khi trang web đã được load và js code đã chạy, tất cả event handler đã được gắn vào element của nó, AJAX request thì đã được gửi) và trình duyệt đã bắt đầu lắng nghe sự kiện DOM.

Ngay tại lúc này , sao chúng ta lại không handle các event theo cách parallelism truyền thống? Hay nói cách khác là sử dụng nhiều thread để điều khiển DOM và các event gắn với nó, chắc chắn sẽ giúp hiệu năng tăng lên như đã phân tích phía trên, nó cũng không bị gò bó bởi số lượng threads như khi triển khai code server-side. Giả như nếu code js có thể thực thi parallelism, thì rõ ràng ta đủ quyền năng để khống chế số lượgn threads đảm bảo tối ưu hiệu năng.

Nhưng tại sao những nhà phát triển JS lại không làm như vậy?

Lý do là bởi vì các bạn sẽ không muốn thao túng DOM bằng nhiều threads đâu. DOM tree không phải 1 thứ an toàn để chỉnh sửa, nó rất dễ trở thành 1 mớ hỗn loạn nếu cứ cố gắng làm điều đó. Việc sử dụng nhiều thread thao túng DOM có thể ví von với việc ta làm ô nhiễm biến tòan cục bởi nhiều function liên kết với nhau hết sức chặt chẽ. Mà như đã biết, lập trình dependency threads thậm chí còn khó khăn hơn nhiều và không có gì đảm bảo nó có thể chạy trơn tru, chỉ 1 sai sót nhỏ và cái ta nhận được sẽ là DOM tree collapse.

Is it enough?

Đúng với câu, chất lượng hơn số lượng, với 2 đặc điểm nổi bật đã nêu giúp JS tranh bá với các ngôn ngữ khác bất chấp việc nó còn rất nhiều thiếu sót

Giờ đây ,JS gần như là sự lựa chọn duy nhất khi lập trình web client-side , Nodejs thì ngày càng lớn mạnh vững chắc.

Nhưng các bạn thấy đấy, multithreading là xu hướng, là tương lai của công nghiệp phần mềm, làm web bằng JS tốt không có nghĩa là nó là hoàn hảo. Thậm chí ngay cả khi làm web cũng còn nhiều tranh cãi không có hồi kết giữa các ngôn ngữ với nhau. Dù sao thì mục đích của bài viết không phải là so sánh giữa các ngôn ngữ mà để đi tới 1 kết luận:

Single-thread hữu dụng và sẽ vẫn còn hữu dụng trong tương lai

Và tôi cũng muốn nhắc bạn 1 câu cũ rích, không phải cứ công nghệ mới hơn là tốt hơn và thậm chí nếu nó tốt hơn thật thì cũng chưa chắc nó đã làm công việc của bạn tốt hơn cách bạn đang làm.

References

Rob Pike - 'Concurrency Is Not Parallelism' - Golang co-Creator Philip Roberts: What the heck is the event loop anyway? https://en.wikipedia.org https://techmaster.vn/posts/33604/su-khac-nhau-giua-process-va-thread http://www.dtp.fmph.uniba.sk/javastuff/javacourse/week11/01.html https://www.simple-talk.com/dotnet/asp-net/javascript-single-threaded/

Tuesday, October 6, 2020

NodeJS:Multiple client requests

Given a NodeJS application, since Node is single threaded, say if processing involves a Promise.all that takes 8 seconds, does this mean that the client request that comes after this request would need to wait for eight seconds?

No. NodeJS event loop is single threaded. The entire server architecture for NodeJS is not single threaded.

Before getting into the Node server architecture, to take a look at typical multithreaded request response model, the web server would have multiple threads and when concurrent requests get to the webserver, the webserver picks threadOne from the threadPool and threadOne processes requestOne and responds to clientOne and when the second request comes in, the web server picks up the second thread from the threadPool and picks up requestTwo and processes it and responds to clientTwo. threadOne is responsible for all kinds of operations that requestOne demanded including doing any blocking IO operations.

The fact that the thread needs to wait for blocking IO operations is what makes it inefficient. With this kind of a model, the webserver is only able to serve as much requests as there are threads in the thread pool.

NodeJS Web Server maintains a limited Thread Pool to provide services to client requests. Multiple clients make multiple requests to the NodeJS server. NodeJS receives these requests and places them into the EventQueue .
NodeJS server has an internal component referred to as the EventLoop which is an infinite loop that receives requests and processes them. This EventLoop is single threaded. In other words, EventLoop is the listener for the EventQueue.
So, we have an event queue where the requests are being placed and we have an event loop listening to these requests in the event queue. What happens next?
The listener(the event loop) processes the request and if it is able to process the request without needing any blocking IO operations, then the event loop would itself process the request and sends the response back to the client by itself.
If the current request uses blocking IO operations, the event loop sees whether there are threads available in the thread pool, picks up one thread from the thread pool and assigns the particular request to the picked thread. That thread does the blocking IO operations and sends the response back to the event loop and once the response gets to the event loop, the event loop sends the response back to the client.

How is NodeJS better than traditional multithreaded request response model?
With traditional multithreaded request/response model, every client gets a different thread where as with NodeJS, the simpler request are all handled directly by the EventLoop. This is an optimization of thread pool resources and there is no overhead of creating the threads for every client request.

A simple guide to JavaScript concurrency in Node.js and a few traps that come with it

I bet that you are familiar with JavaScript concurrency in Node.js. Also, most probably you have already heard that Node excels at handling multiple asynchronous I/O operations. But have you ever wondered what does it really mean? There are lots of potential questions. How exactly it’s done in Node.js? Isn’t it single-threaded? What about operations other than I/O? Is there any way you can handle them without making your app unresponsive? In the article, I’m hoping to clarify how Node deals with asynchronicity under the hood. I’ll try to explain what are the potential traps that you should be aware of. Also, I’ll focus on how Node’s new features might help you push your applications even further than ever before. Let’s dive in!

Let’s start with the theory

Before I’ll start talking about Node.js, let’s quickly distinguish two terms that might be a little confusing. These are concurrent and parallel concurrency in Node.js. Let’s imagine that you have two operations: A and B. If you weren’t dealing with any concurrency at all, you would start with the operation A. Once it’s completed, you would start the operation B, so they would simply run sequentially. Such execution will look like the one on the diagram below.

The diagram shows the turn of events without JavaScript concurrency in Node.js — Once operation A is completed, you start the operation B – they are run sequentially.

If you want to handle these operations concurrently, the execution will look like this.

The execution changes if you need to handle operations concurrently — Alternative solution is to handle the operations concurrently.

These operations don’t run sequentially, but neither are they executed simultaneously. Instead, the process “jumps” or switches context between them.

You can also execute these operations in parallel. Then, they run simultaneously on two separate CPUs. In theory, they can be started and finished at the same time.

Another solution is to run both operations in parallel — The third option is to run the operations in parallel.

One thread to rule them all

Node took a slightly different approach to handling multiple concurrent requests at the same time if you compare it to some other popular servers like Apache. Spawning a new thread for each request is expensive. Also, threads are doing nothing when awaiting other operations’ result (i.e.: database read). That’s why Node is using one thread instead. Such an approach has numerous advantages. No overhead comes with creating new threads. Also, your code is much easier to reason about, as you don’t have to worry about what will happen if two threads access the same variable. It’s because that simply cannot happen. There are some drawbacks as well. Node isn’t the best choice for applications that mostly deal with CPU intensive computing. On the other hand, it excels at handling multiple I/O requests. So, let’s focus on this part for a bit.

I/O operations and Node

Firstly, I should answer the question: what do you mean by I/O operations? These are operations that communicate with stuff from the outside of your application. It means HTTP requests, disk reads and writes or database operations, just to name a few. I/O in Node comes in two “flavors”: blocking and non-blocking. It’s quite important to distinguish these two. “Blocking” in “blocking I/O operations” is quite self-descriptive. It means that next operations are blocked and that they have to wait as long as the currently running operation is taking. Let’s take a look at the example.

fs.readFileSync(filePath1)
fs.readFileSync(filePath2)
console.log(‘Logging after both reads are finished’)

view raw blocking-io.js hosted with ❤ by GitHub

In the example, you trigger the first read and then, only after it’s finished you start the second one. console.log will happen once both reads are completed.

The non-blocking model works in a pretty different way. Let’s take a look at an example once again.

fs.readFile(filePath1, (err, content) => {
   if (err) // handle error 
   // otherwise do something with the content
})

fs.readFile(filePath2, (err, content) => { /* same story */ })

console.log(‘Will happen before any of the reads is finished’)

view raw non-blocking-io.js hosted with ❤ by GitHub

Here, the first read operation is triggered, but the code is not stopped. You also trigger the second read and right after that – console.log writes to the console. What about the results of the two read operations? Well, they will be handled asynchronously. It means that you’ll receive their results in the future, and you have no guarantee which operation finishes first.

Which one should you prefer?

Instead of giving the answer, let’s conduct an experiment. Let’s use Express and create a very simple HTTP server with two endpoints. First one reads some data from the file asynchronously (non-blocking I/O) and the other reads them synchronously (blocking I/O). Let’s assume that you wrapped standard readFile and readFileSync methods with some magic, so they take an extra 75 ms to finish. It will look like this.

const express = require('express')

const app = express()
const port = 3055

app.get('/async', async (req, res) => {
  const user = await readUserAsync()
  res.send(user)
})

app.get('/sync', (req, res) => {
  const user = readUserSync()
  res.send(user)
})

app.listen(port, () => console.log(`Example app listening on port ${port}!`))

view raw express-endpoints.js hosted with ❤ by GitHub

If you want to compare the performance of these two endpoints you could use one of the available benchmarking tools (let’s say – you’re interested in how many requests you can handle within 5 seconds and up to 10 requests are concurrent). I chose Apache Benchmark. It’s been there for quite some time, but it’s built into macOS. It’s doing its job just fine. Let’s run the following benchmark, first for the synchronous/blocking endpoint.

ab -c 10 -t 5 "http://localhost:3055/sync"

view raw benchmark.sh hosted with ❤ by GitHub

The benchmarking tool will give you plenty of interesting statistics, but let’s focus on how many requests you can handle per second.

Benchmarking tool gives a few interesting statistics

As you can see – you’re capable of handling about 12.1 requests per second, which makes sense. You’re only able to handle one request at the time, it takes about 80ms (75 of your “extra” read time, plus a few milliseconds to read the file and handle the request). So, you can easily calculate how many requests per seconds you can handle. It’s 1000 ms (1 second) / 80 ms = 12.5 request per second, so you’re quite close.

Now, let’s take a look at the benchmark for the asynchronous/non-blocking endpoint below.

Benchmark for the asynchronous/non-blocking endpoint

You can see that the result is much better in this example. You can handle over nine times more results per second than in the blocking version (and the difference would be more significant for longer read times). So it should be clear which version you should favor.

Highlight: Always favor non-blocking I/O over blocking I/O.

But… how?

You can see that Node works quite well when it comes to handling I/O operations asynchronously. It might be a little surprising when one knows that it works with just one thread. Doesn’t one thread equals to one operation at the time? Well… yes and no. To be less vague, you can indeed handle only one JavaScript operation at the time, however, you can take advantage of the fact that modern operating systems’ kernels are multi-threaded. So, you can delegate I/O operations to them. That brings us another question. When such operations are completed, how does Node know that it’s time to handle them? You can imagine that it can’t happen anytime, otherwise – your applications would be a nightmare to work with. Imagine that you’re doing an operation and suddenly in the middle of it – a callback is triggered because the disk operation has just finished. I think you’ll agree that it’s not the way to go. So, let’s introduce someone that’ll help Node manage all this mess.

What is the JavaScript Event Loop?

Let’s briefly explain what Event Loop is and how it works. Previously, I’ve mentioned that you need some kind of “manager” to be able to handle asynchronous operations. That’s precisely what Event Loop does. Let’s take a look at an example of an asynchronous operation.

asyncOperation(param1, param2, function(error, result) { /* … */ })

view raw async-operation.js hosted with ❤ by GitHub

The interesting part here is a function that you pass as the last argument. It’s a callback function that will be called once your operation is finished, but not immediately. We agreed that such things shouldn’t happen anytime and Event Loop will make sure of that. Once your operation is finished, instead of calling the callback right away, it will place it into a special queue. Once it’s possible to handle it safely, your callbacks will be pushed to the stack and then executed one by one. You should be careful here, even if your I/O operation was asynchronous, the callback will be handled on the main thread, assuming it’s a JS operation. So, if you’re doing something time consuming, you’ll be blocking the Event Loop.

This is, of course, a great simplification of how Event Loop works. If you’d like to learn more about it – this video is a great introduction.

This series in the Insider Attack blog explains it much deeper.

What about non I/O operations?

Okay, so I covered a piece of information about a great deal with the use of blocking and non-blocking I/O. You saw that Node deals quite well with the latter and now you know how to do it. Event Loop for the win! What about the stuff that doesn’t deal with I/O, but may process some heavy computing for you? For example – sorting a huge list. Unfortunately, this is where Node single-threaded environment won’t shine that bright. Let’s jump back into our example with Express API. You saw that Node deals quite well with handling requests to your asynchronous endpoint. What would happen if the same API exposed a new endpoint which did some heavy computing that takes a few seconds?

  app.get('/timeConsumingEndpoint', (req, res) => {
  doSomeHeavyComputing()

  res.sendStatus(200)
})

view raw time-consuming-endpoints.js hosted with ❤ by GitHub

Now, let’s imagine a scenario where this endpoint is being used by some batch job to process some data. In such a scenario, you’re not even doing any concurrent requests. Instead, you’re making sure that your server is constantly spinning and doing some job for you. What would happen if you’d run some benchmarks on your old asynchronous endpoint (let’s name it /async), while your CPU intensive task is constantly running? Let’s see.

Too many CPU intensive tasks can make your API unresponsive

Yup, you’re right. Your API is pretty much unresponsive at the moment. If you’re unaware of the limits of Node’s single thread, then unfortunately, you can achieve such unresponsiveness pretty easily. Is there anything you can do to somehow fix this case? One idea that comes to mind is to scale your app. But you don’t need help from the DevOps team to do that. Fortunately, Node has some built-in mechanisms that allow you to achieve it.

Cluster mode

The official documentation explains it quite nice, so I won’t be reinventing the wheel and will simply quote it here.

A single instance of Node.js runs in a single thread. To take advantage of multi-core systems, the user will sometimes want to launch a cluster of Node.js processes to handle the load.

So, what Cluster Mode gives you is a very basic load balancer. It will distribute the load in the round-robin approach between the nodes. Sounds pretty nice, doesn’t it? Nowadays, we have fancy CPUs with multiple cores, so it makes sense to take advantage of them. Especially if your application became so hopelessly unresponsive.

Let’s take a look at how this may look.

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`)

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork()
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died`)
  })
} else {
  const express = require('express')

  const app = express()

  // define our endpoints here
   
  app.listen(port)

  console.log(`Process ${process.pid} started`)
}

view raw api-with-cluster.js hosted with ❤ by GitHub

Nothing too complicated, right? The fact that you don’t have to bother the DevOps team makes it even better! If you’ll run your API, you can see that you indeed spawned some processes.

A screenshot of a benchmark.

Now, let’s see if this fixed the problem. Your long-running job is opened in the background, your Super Important Algorithm is still doing its job, so let’s run the benchmark again.

Example of another algorithm run

Looks good, doesn’t it? But you will probably agree that your CPU intensive endpoint isn’t that busy right now. What if someone decided to speed some things up and run, let’s say, five concurrent jobs (or just any number of jobs that exceeds the number of CPUs on the server) that are consuming this endpoint? You will see the same thing as you did before introducing Cluster Mode. No responses whatsoever. And what would happen if you expose this endpoint to the public, for the whole world to use it? You can see that as nice and easy to use Cluster Mode is, it doesn’t really solve the problem if you’re able to make each process unresponsive so quickly.

Are you doomed then? Well, luckily you are not. Some time ago, one of the Node releases gave us all a nice surprise, namely…

Worker Threads!

Worker threads are still an experimental feature, meaning using them on production environment probably isn’t the best idea. But there’s a chance this will change soon. Worker Threads allow you to execute JavaScript in the parallel. It sounds exactly like something that you can use in the aforementioned scenario. Let’s take a quick look at how you could introduce Worker Threads to your API. The first thing you have to do is to create a new file. I named it heavy-computing-with-threads.js.

const { Worker, isMainThread, parentPort } = require('worker_threads')

if (isMainThread) {
  module.exports = async function timeConsumingOperationOnThreads(raw) {
      return new Promise((resolve, reject) => {
          const worker = new Worker(__filename, {
              workerData: raw
          })
          worker.on('message', resolve)
          worker.on('error', reject)
          worker.on('exit', (code) => {
              if (code !== 0) {
                  reject(new Error(`Worker stopped with exit code ${code}`))
              }
          })
      })
  }
} else {
  const result = doSomeHeavyComputing()
  parentPort.postMessage({ result})
}

view raw heavy-computing-with-threads.js hosted with ❤ by GitHub

Not too complicated, right? First, you check whether you are on the main thread. If you are, then you spin a new worker thread and handle some basic communication with it. After that, you wrap it all up with a Promise which is quite a useful pattern in Node. It makes working with a callback or event-based code much easier. If you are not on the main thread that means you’re on one of the spawned threads, and that’s where you can run your long-running operation.

Now, you simply have to modify your Express API so it uses this version of the CPU intensive algorithm.

const timeConsumingOperationWithThreads = require('./heavy-computing-with-threads')
/* … */

app.get('/timeConsumingEndpoint', async (req, res) => {
  const result = await timeConsumingOperationWithThreads()

  res.send(result)
})

view raw api-with-threads.js hosted with ❤ by GitHub

If you run benchmarks again (even with the version which was making concurrent requests to the long-running endpoint) everything should be fine. Your other endpoints are responsive again!

Interested in developing microservices? 🤔 Make sure to check out our State of Microservices 2020 report – based on opinions of 650+ microservice experts!

Few more things to remember

Besides being useful when using experimental Worker Threads, you should also keep in mind that they are not necessarily useful for handling I/O operations. What’s already been built into Node before Worker Threads will work much better. And one last thing – if you take a look at your API which is using threads, you’ll see that you’re spinning a new thread for each request. That won’t be too efficient in the long run. What you should do instead is to have a pool of threads and re-use these. Otherwise, the cost of creating new threads will overcome their benefits.

Summing it up

I hope I briefly walked you through some of the traps that you can encounter if you’re not aware of Node single-threaded nature. Remember to favor non-blocking I/O over blocking I/O. Also, you should keep in mind that each JavaScript operation will block the Event Loop and that long-running operations are especially dangerous in that matter. Be aware of the built-in scaling that Cluster Mode gives you, but also be aware that it won’t necessarily solve the problems that come with CPU intensive operations in Node. For these, there’s a much better solution that came quite recently – Worker Threads. Just bear in mind that they’re still experimental, but follow the newest Node releases, as this may change anytime!

	fs.readFileSync(filePath1)
	fs.readFileSync(filePath2)
	console.log(‘Logging after both reads are finished’)

	fs.readFile(filePath1, (err, content) => {
	if (err) // handle error
	// otherwise do something with the content
	})

	fs.readFile(filePath2, (err, content) => { /* same story */ })

	console.log(‘Will happen before any of the reads is finished’)

	const express = require('express')

	const app = express()
	const port = 3055

	app.get('/async', async (req, res) => {
	const user = await readUserAsync()
	res.send(user)
	})

	app.get('/sync', (req, res) => {
	const user = readUserSync()
	res.send(user)
	})

	app.listen(port, () => console.log(`Example app listening on port ${port}!`))

	app.get('/timeConsumingEndpoint', (req, res) => {
	doSomeHeavyComputing()

	res.sendStatus(200)
	})

	if (cluster.isMaster) {
	console.log(`Master ${process.pid} is running`)

	for (let i = 0; i < numCPUs; i++) {
	cluster.fork()
	}

	cluster.on('exit', (worker, code, signal) => {
	console.log(`worker ${worker.process.pid} died`)
	})
	} else {
	const express = require('express')

	const app = express()

	// define our endpoints here

	app.listen(port)

	console.log(`Process ${process.pid} started`)
	}

	const { Worker, isMainThread, parentPort } = require('worker_threads')

	if (isMainThread) {
	module.exports = async function timeConsumingOperationOnThreads(raw) {
	return new Promise((resolve, reject) => {
	const worker = new Worker(__filename, {
	workerData: raw
	})
	worker.on('message', resolve)
	worker.on('error', reject)
	worker.on('exit', (code) => {
	if (code !== 0) {
	reject(new Error(`Worker stopped with exit code ${code}`))
	}
	})
	})
	}
	} else {
	const result = doSomeHeavyComputing()
	parentPort.postMessage({ result})
	}

	const timeConsumingOperationWithThreads = require('./heavy-computing-with-threads')
	/* … */

	app.get('/timeConsumingEndpoint', async (req, res) => {
	const result = await timeConsumingOperationWithThreads()

	res.send(result)
	})