What’s New in Node.js and libuv – May 30, 2013

Welcome to this week’s wrap-up of Node and libuv development, covering May 23 through May 29. The purpose of this blog is to recap a subset of the non-documentation commits to Node.js and add a little color and commentary on the ongoing development of Node.

Node v0.10 branch highlights

Node v0.10.9 (Stable) is out

A new stable release of v0.10 was released on May 30. Read the official blog for the release notes.

tls: ignore .shutdown() syscall error
tls: invoke write cb only after opposite read end
tls: proper .destroySoon

Fedor fixed a few TLS/HTTPS bugs in fa170dd, 4f14221 and 9ee86b7 that could cause a connection to terminate prematurely (i.e. before sending the full request or response.)

repl: fix JSON.parse error check

Brian fixed JSON.parse() error handling in the REPL in 774b28f.  Before that commit, the REPL would just sit there waiting when it got bad input.

You can view the complete Node v0.10 commit history on GitHub.

Node master branch highlights

events: define properties on prototype

Ryunosuke Sato sped up EventEmitter construction by 15-20% in 7ce5a31. As Ben pointed out, sometimes it’s the simplest of fixes in Node that can deliver significant speed improvements.

vm: fix race condition in watchdog cleanup

Andrew Paprocki fixed a race condition in the vm module’s timeout watchdog in 49e3fcd that would make Node.js abort (occasionally, that’s the thing with race conditions.)

process: remove maxTickDepth from _tickCallback

Trevor removed the process.nextTick() maxTickDepth restriction in commit 0761c90. Now, we are going to say “restriction” here because it was a well-intentioned if rather arbitrary limitation. The user-visible change is that you won’t get maxTickDepth warnings anymore in code with deeply recursive nextTick() chains.  Users will need to take care not to starve the event loop by posting nextTick() callbacks ad infinitum. In layman’s terms, the following snippet will stall the event loop:

function f() { process.nextTick(f) }
f();
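
One way to keep a recurring task from starving the loop is to reschedule it with setImmediate() (available since v0.10), which queues its callback behind pending I/O events instead of ahead of them. A minimal sketch; poll and its arguments are hypothetical names used only for illustration:

```javascript
// Hypothetical recurring task: runs `remaining` more iterations, then calls done().
function poll(remaining, done) {
  if (remaining === 0) return done();
  // setImmediate() queues the next iteration behind pending I/O events,
  // whereas process.nextTick() would run it before the loop can continue.
  setImmediate(function () {
    poll(remaining - 1, done);
  });
}
```

Timers and socket callbacks still get a turn between iterations, at the cost of slightly higher latency per step.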

process: remove spinner

Trevor also removed the process.nextTick() idle spinner in commit b846842. There are no user-visible changes, but it does simplify the implementation quite a bit.

You can view the complete Node master commit history on GitHub.

libuv v0.10 branch highlights

linux: fix cpu model parsing on newer arm kernels

Ben fixed /proc/cpuinfo parsing on newer (>=3.8) Linux ARM kernels in 92c72f5, squashing bug #812. The file format had changed, and that was throwing off uv_cpu_info().

include: document uv_update_time() and uv_now()

For those who have been pondering the subject for a while, uv_now() returns the loop timestamp in milliseconds.  Now it’s documented in dfff2e9 along with why and when you’d call uv_update_time().

darwin: make uv_fs_sendfile() respect length param

Wynn Wilkes fixed uv_fs_sendfile() on OS X in b4c658c.  Before, it always sent the whole file rather than the requested range. Enjoy!

You can view the complete libuv v0.10 commit history on GitHub.

libuv master branch highlights

windows: call idle handles on every loop iteration

In bc56a4e, Bert made it so that uv_idle_t handles have the exact same semantics on Windows and Unices.

unix, windows: run expired timers in run-once mode

Ben fixed a bug in f6d8ba3 where an expired timer’s callback wouldn’t always get invoked when you called uv_run() with UV_RUN_ONCE.

build: remove CSTDFLAG, use only CFLAGS

Ben removed the CSTDFLAG build flag from the Makefile in a8da229. I think Ben is being a little sarcastic when he says, “Presumably no one cares (or even knew it was there) but I mention it for posterity.”

unix: support for android builds
uv: support android libuv standalone build

Linus Mårtensson added Android support in fc6a2ad and 3fdd2a1.

unix: avoid extra read, short-circuit on POLLHUP

Ben made end-of-stream handling a little faster on Linux in 442d11d. It now short-circuits on EPOLLHUP, meaning it does one less read() system call and avoids a call to handle->alloc_cb (meaning one less malloc/free call for most users.)

You can view the complete libuv master commit history on GitHub.

What’s next?

  • Ready to develop APIs in Node.js and get them connected to your data? Check out the Node.js LoopBack framework. We’ve made it easy to get started either locally or on your favorite cloud, with a simple npm install.

 

What’s New in Node.js and libuv – May 23, 2013

Welcome to this week’s wrap-up of Node and libuv development, covering May 16 through May 22. (We were at GlueCon last week and busy getting StrongNode 1.0 GA out the door, so we fell behind a week.) The purpose of this blog is to recap a subset of the non-documentation commits to Node.js and add a little color and commentary on the ongoing development of Node.

Node v0.10 branch highlights

Node v0.10.8 (Stable) is out

A new stable release of v0.10 was released on May 24. Read the official blog for the release notes.

crypto: Clear error after DiffieHellman key errors

Isaac fixed bug #5499 that was causing some crypto operations to report inconsistent errors.

 buffer: throw when writing beyond buffer

Trevor served up a fix so that Buffer#write() calls are now checked and will throw an exception on out-of-range indexes.
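
A small sketch of the behavior described above, written against a current Node release (Buffer.alloc postdates v0.10, so treat that call as an assumption about your runtime):

```javascript
var buf = Buffer.alloc(4);

// Writing inside the buffer succeeds and returns the number of bytes written.
var written = buf.write('ab', 0);

// An offset past the end of the buffer is rejected with a RangeError.
var error = null;
try {
  buf.write('ab', 10);
} catch (err) {
  error = err;
}
```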

timers: internal unref’d timer for API timeouts
net: use timers._unrefActive for internal timeouts

TJ fixed a bug in f46ad01 and a846d93 where an unref’d socket handle would still keep the event loop alive if it had an active timeout timer.

configure: respect the --dest-os flag consistently

Nathan made cross-compiling a little easier in 99b737b. “Prior to this patch, for example, DTrace probes would incorrectly attempt to be enabled because the configure script was running on a MacOS machine, even though you were compiling a binary for Linux. With this patch the --dest-os flag is respected throughout the configure script, thus leaving DTrace probes disabled in this cross-compiling scenario.”

http: save roundtrips, convert buffers to strings

Ben sped up making many POST requests with the HTTP client by impressive amounts: 150x on localhost, 10x to 100x on physical networks in commit fda2b31. Headers and the first body chunk + TE chunk headers and the chunk itself are now much more likely to end up in the same TCP packet, which has a rather dramatic impact on the number of round-trips (and therefore throughput.)

http: Return true on empty writes, not false

Isaac fixed a bug in a2f93cf where HTTP connections could stall if your script sent an empty string.

v8: update to 3.14.5.9

TJ updated V8 to 3.14.5.9 in 279361b.  There’s a new flag, --abort_on_uncaught_exception, to “abort program (dump core) when an uncaught exception is thrown.”

uv: upgrade to 0.10.8
npm: Upgrade to 1.2.23

Isaac upgraded libuv to v0.10.8 and npm to 1.2.23.

tls: retry writing after hello parse error

Fedor (with some help from Bert and Ben) fixed a bug in f7ff8b4 where lib/tls.js (and therefore lib/https.js) would hit an assert if your script sent a large response.

http: remove bodyHead from ‘upgrade’ events 

Nathan optimized handling of HTTP Upgrade requests a little in a40133d.  We’ve seen at least one report from a user that this change is causing disconnects with socket.io so it may have to be reverted.

Node master branch highlights

tls: add localAddress and localPort properties

Ben added localAddress and localPort properties to tls.CleartextStream objects in d820b64. They proxy the corresponding net.Socket properties, as was already the case for remoteAddress and remotePort.

v8: upgrade to v3.19.3

Trevor upgraded V8 to v3.19.3 in 506fc4d.  There have been further API changes that affect native add-ons, so add-on authors will probably want to retest their modules.

timers: use uv_now instead of Date.now

TJ added an optimization in f8193ab that should reduce the overhead of setTimeout() and friends a little.  lib/timers.js now makes fewer or no gettimeofday() system calls, which should speed things up a little on Solaris and the BSDs. For what it’s worth, gettimeofday() is not usually a system call on Linux, so it probably doesn’t matter much on that platform.

systemtap: add tapset for node user probes

TJ’s patch makes systemtap tracing prettier. You can now do things like:

stap -e 'probe node_http_server_request { println(probestr); }'

You can view the complete Node master commit history on GitHub.

libuv v0.10 branch highlights

darwin: assume CFRunLoopStop() isn’t thread-safe

Fedor fixed an OS X-only bug, #799, in d5fa633 where uv_loop_delete() would sometimes hang.  It turns out that CFRunLoopStop(), which is used for FSEvents directory watching, is not thread-safe.

unix: implicitly signal write errors to libuv user

Ben fixed a potential infinite loop in c53fe81. “Potential” because it only happens when you use uv_write() in a certain way and the underlying socket/pipe/whatever hits an error. Regrettably, the patch forgets to remove a handful of obsolete assert() statements. Ben rectified that in b38c9c1, but by that time Node.js v0.10.8 was already released, so it won’t be fixed until v0.10.9.

unix: fix assert on signal pipe overflow

Bert fixed a bug in c5d570d where libuv would assert() if it received a ton of UNIX signals. This fix is only relevant if you’re using libuv to listen for signals (which Node.js does).

unix: turn off POLLOUT after stream disconnect

Ben fixed a bug in 4146805 where the event loop would sometimes wake up unnecessarily after connect(). Zero impact otherwise, it’s just a little more efficient now.

unix: fix stream refcounting buglet

Ben fixed a stream refcounting issue in 636a13b where the event loop could exit prematurely, but only when very specific conditions are met. There’s no bug report as Ben discovered it when tracking down something unrelated.


Practical Examples of the New Node.js Streams API

Node brought a simplicity and beauty to streaming.  Streams are now a powerful way to build modules and applications.  Yet the original streams API had some problems.  So in Node v0.10, we saw the streams API change in order to fix the prior problems, extend the APIs to encapsulate more common use cases, and be simpler and easier to use.

As I tried to make the adjustment to the new APIs, I found some documentation on it but not many practical examples.  This article explores some of the Node documentation that may be confusing about the new APIs.  It will also apply the new APIs in practical terms to help you get started using these APIs in your programs.  Let’s dive in!

The “line by line” problem

Good log data can be an invaluable resource to a company and developer team. However, sifting through that data can be time consuming and you can only get so far with shell commands.  Wouldn’t it be helpful to programmatically get statistics or locate patterns in your logs?  For many log formats, in order to do that, we need a way to get at our data line by line.

The beauty of Node streams is that we don’t have to do this all in memory (log files can be huge) and we can process data as soon as it’s been read.  Streams also work with any I/O source (file system, network).

Using the new stream APIs, we can create a reusable I/O component that transforms our data into individual lines for further processing.

The Transform stream

Node 0.10 provides a nifty stream class called Transform, intended for cases where the output is causally related to the input.  In our problem, the input and output are actually the same data; it is simply split into separate lines for further processing down the road (such as collecting stats or searching).

Transforms sit in the middle of a pipeline and are both readable and writeable:
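
As a small illustration of both sides, here is a sketch of a Transform that upper-cases text. The name upper is hypothetical and is not part of the liner example that follows:

```javascript
var stream = require('stream');

// A Transform that upper-cases whatever text passes through it.
var upper = new stream.Transform();
upper._transform = function (chunk, encoding, done) {
  this.push(chunk.toString().toUpperCase());
  done();
};

// Writable side: data goes in.
upper.write('hello');

// Readable side: transformed data comes out.
var out = upper.read().toString();
```

In a real pipeline the same object would sit between a source and a destination, for example process.stdin.pipe(upper).pipe(process.stdout).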

To set up our transformation, we need to include the stream module and instantiate a new Transform stream:
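
A minimal sketch of that setup, using the variable name liner that the rest of the article refers to:

```javascript
var stream = require('stream');

// The Transform that will turn incoming chunks into individual lines.
var liner = new stream.Transform({ objectMode: true });
```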

Switching on objectMode

Whoa!  What is that { objectMode: true } I threw in there?  Well, we want the destination of our transformation to be able to handle the data line by line.  objectMode allows a consumer to get at each value that is pushed from the stream.  Without objectMode, the stream defaults to pushing out chunks of data.  As the name suggests, objectMode is not just for string values, like in our problem, but for any object in JavaScript ({ "my": [ "json", "record" ] }).
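
To see the effect in isolation, here is a sketch using a bare Readable in objectMode. The names records and record are hypothetical, and the no-op _read exists only because values are pushed by hand:

```javascript
var stream = require('stream');

var records = new stream.Readable({ objectMode: true });
records._read = function () {}; // nothing to fetch; values are pushed below

var record = { my: ['json', 'record'] };
records.push(record);
records.push('a plain string');

// Each read() hands back exactly one pushed value, untouched.
var first = records.read();  // the very same object that was pushed
var second = records.read(); // 'a plain string'
```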

The _transform method

So that’s cool but we aren’t done yet.  Transform classes require that we implement a single method called _transform and optionally implement a method called _flush.  Our example will implement both, but let’s cover the _transform method first.

The _transform method is called every time our source has data for us.  Let’s look at the code and then talk about it:
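
The listing below is a sketch reconstructed from the description that follows, not necessarily the author’s exact code. It assumes the liner instance created earlier and stores the held-back partial line on the instance as _lastLineData:

```javascript
var stream = require('stream');
var liner = new stream.Transform({ objectMode: true });

liner._transform = function (chunk, encoding, done) {
  var data = chunk.toString();
  // Prepend the partial line held back from the previous chunk, if any.
  if (this._lastLineData) data = this._lastLineData + data;

  var lines = data.split('\n');
  // Hold back the last element: it may be an incomplete line.
  this._lastLineData = lines.splice(lines.length - 1, 1)[0];

  // Push each complete line to the consumer.
  for (var i = 0; i < lines.length; i++) {
    this.push(lines[i]);
  }
  done();
};
```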

As data from a source stream arrives, _transform will be called.  Along with it comes a chunk of available data, the type of encoding that data has been provided in and a done function that signals that you are done with this chunk and ready for another.

In our case, we don’t care about the underlying encoding.  We just want the chunk to be a string value, so we will perform a toString() conversion.  Once we have our chunk as a string, we can split('\n') to get an array of individual lines. Next, we push each line separately to whatever is consuming the transformation.

Note: The push method is part of the Readable stream class (which Transform inherits from).  If you are familiar with Node 0.8, push is akin to emitting data events.

Lastly, we signal that we are finished with the chunk by calling done().  Since done is a callback, it allows us to also perform asynchronous actions in our _transform if desired.

What is the _lastLineData stuff all about? We don’t want a chunk of data to get cut off in the middle of a line.  In order to avoid that, we splice out the last line we find so that it is not pushed to the consumer.  When a new chunk comes in, we prepend that saved line data to the front and continue.  This way we can safeguard against half lines being pushed out.

The _flush method

However, we still have a problem.  When the last call to _transform happens, we have a _lastLineData value sitting around that never got pushed.  Thankfully, the Transform class also provides a _flush method for this scenario.  After all the source data has been read and transformed, the _flush method will be called if it has been implemented.  The _flush method is a great place to push out any lingering data and clean up any existing state. Here is what it would look like in our case:
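
Again a sketch based on the surrounding description rather than the original listing. It shows _flush in isolation, so the liner here has no _transform and is not usable as a stream on its own:

```javascript
var stream = require('stream');
var liner = new stream.Transform({ objectMode: true });

liner._flush = function (done) {
  // Push out the final line that _transform was still holding back.
  if (this._lastLineData) this.push(this._lastLineData);
  // Clean up the instance state.
  this._lastLineData = null;
  done();
};
```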

If we have any _lastLineData left, we push it out to the consumer and then clean up our instance variable.  Finally, we call done() to signal that we are finished flushing.  This will also signal to the consumer that the stream has ended.  Remember, the _flush method is optional and may be unnecessary for some Transform streams.

The solution

That wraps up our little liner module.  Let’s look at it in full:
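
Putting the pieces together, the module might look like the sketch below. It is assembled from the steps described above; exporting the instance through module.exports is an assumption about how the module is packaged:

```javascript
var stream = require('stream');
var liner = new stream.Transform({ objectMode: true });

liner._transform = function (chunk, encoding, done) {
  var data = chunk.toString();
  if (this._lastLineData) data = this._lastLineData + data;

  var lines = data.split('\n');
  // Keep the trailing (possibly partial) line for the next chunk.
  this._lastLineData = lines.splice(lines.length - 1, 1)[0];

  for (var i = 0; i < lines.length; i++) {
    this.push(lines[i]);
  }
  done();
};

liner._flush = function (done) {
  // Emit whatever is left once the source has ended.
  if (this._lastLineData) this.push(this._lastLineData);
  this._lastLineData = null;
  done();
};

module.exports = liner;
```

Because the module exports a single stream instance, it can only be piped through once per process; exporting a factory function instead would be a reasonable variation.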

Testing our solution

Groovy.  So how do we test this?

First, you need a data source.  Any file that uses lines to delineate records will do. The most universal file I can think of is an access log from Apache.  To pull this file data, we’ll use a Readable stream:
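
A sketch of such a test harness follows. The helper name readLines and its callback parameters are hypothetical; it takes the line-splitting Transform as an argument so that the snippet stands on its own:

```javascript
var fs = require('fs');

// Pipe a file through a line-splitting Transform and hand each line to onLine.
function readLines(path, liner, onLine, onEnd) {
  var source = fs.createReadStream(path);
  source.pipe(liner);

  liner.on('readable', function () {
    var line;
    // Compare against null: a blank line is an empty string, which is falsy.
    while ((line = liner.read()) !== null) {
      onLine(line);
    }
  });
  liner.on('end', onEnd);

  return source;
}
```

Assuming the liner module is saved as liner.js next to an Apache log named access_log, a call could look like readLines('./access_log', require('./liner'), console.log, function () {}).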

As data becomes readable from the transformation, we can access each line individually through objectMode.

Wrapping Up

We are only scratching the surface when it comes to all that you can do with streams.  However, I hope you can take this little example further and come up with your own stuff.  If you’ve dismissed streams before in Node, take another look!  I think you’ll find the new stream API powerful and simple to use.

Appendix A: Backwards Compatibility

Since the Transform class was added in Node 0.10, running our liner example in previous versions of Node will fail with an error, because stream.Transform is undefined there.

To get around this we can use the readable-stream module (npm install readable-stream). Despite its name, readable-stream has grown from a preview version of the new Stream classes before 0.10 into a drop-in shim for Node 0.8. Now, the top of our example should look a little more like this:
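
A sketch of that fallback, assuming readable-stream has been installed from npm for the Node 0.8 case:

```javascript
var stream = require('stream');

// Node 0.8 has no stream.Transform; fall back to the readable-stream shim.
if (!stream.Transform) {
  stream = require('readable-stream');
}

var liner = new stream.Transform({ objectMode: true });
```

On Node 0.10 and later the condition is false, so the built-in module is used and the shim is never loaded.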


What’s New in Node.js and libuv this Week – May 17, 2013

Welcome to this week’s wrap-up of Node and libuv development, covering May 9 through May 15. The purpose of this blog is to recap a subset of the non-documentation commits to Node.js and add a little color and commentary on the ongoing development of Node.

This week’s Node v0.10 branch highlights

debugger: breakpoints in scripts not loaded yet

Miroslav Bajtoš’s (bajtos) debugger patch got backported to v0.10 in a32a243. If you recall from the last update, this patch lets you call setBreakpoint with the name of a script that hasn’t been loaded yet; the script name is converted into a regular expression matching all paths ending with the given name (the name can be a relative path too).

stream: make Readable.wrap support empty streams
stream: make Readable.wrap support objectMode

Daniel Moore (danielmoore) made Readable.wrap handle objectMode and empty streams properly in two patches, 1ad93a6 and 3b6fc60.

child_process: fix handle delivery

Ben Noordhuis (bnoordhuis) fixed a bug in 21bd456 where handles / file descriptors were sometimes delivered to the wrong message when sending them to other processes. As Ben points out, “the exact issue is rather complex, but the commit log goes into excruciating detail.” So, if you want to dig deep on this one, check the commit log.

src: Add StringBytes static util class
crypto: Pass encodings to C++ for Sign/Verify

Isaac Schlueter (isaacs) tackled crypto string performance in a variety of commits. If you have looked at how crypto performs with strings in v0.10, you know there was a big regression compared to v0.8.

timers: fix setInterval() assert

Ben fixed a bug in 22533c0 that was causing Node to hit a C++ assert when you unref’d a setInterval() timer in a particular way.

buffer, crypto: fix default encoding regression

Ben fixed regression #5482 in patch f59ab10 that was introduced by Isaac’s crypto work, where Node hit a C++ assert if you passed it a string with encoding='buffer'.

You can view the complete Node v0.10 commit history on GitHub.

This week’s Node master branch highlights

debugger, cluster: each worker has new debug port

Miroslav fixed the debugger when used against clustered applications in commit 43ec1b1. Current users of the debugger will recall that all processes used to try to bind to the same TCP port, which of course doesn’t work. With this fix, each process now binds to a different port in ascending order. Node now also understands --debug-port=<port>, which tells it to claim the port but not to start the debugger just yet.

os: Include netmask in os.networkInterfaces()

Ben Kelly (wanderview) added netmask to the output of os.networkInterfaces() in 8a407f5.
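
A short sketch of reading the new field; listNetmasks is a hypothetical helper name, and the output depends entirely on the machine it runs on:

```javascript
var os = require('os');

// Collect "name address netmask" strings for every configured address.
function listNetmasks() {
  var interfaces = os.networkInterfaces();
  var result = [];
  Object.keys(interfaces).forEach(function (name) {
    interfaces[name].forEach(function (info) {
      result.push(name + ' ' + info.address + ' ' + info.netmask);
    });
  });
  return result;
}
```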

http: Use writev instead of the hacky hot end

Now there’s an interesting title for a commit! In this commit Isaac made the http module use the writev() API. It replaces about 100 lines of code which did what writev() does, but in an ad-hoc manner.

fs.watch on OSX returns file paths

After the upgrade to libuv v0.11.2 in fede68f, fs.watch() on OS X finally returns file paths and even tracks renames.

v8 upgraded to 3.19.0

Ben upgraded V8 to 3.19.0 in commit 7ee538d. This is the first V8 release that supports Harmony generators, enabled with --harmony_generators. If you are a native add-on author, note that the new V8 API has some backwards-incompatible changes.

util: make util.log handle non strings like console.log

Nick Sullivan (gorillamania) made util.log() handle non-string arguments like console.log() in 8db693a.

cluster: use round-robin load balancing

Ben made the cluster module use the round-robin load balancing algorithm (except on Windows) in commit e72cd41. The commit log has the details of the why and how. In a nutshell: more predictable load distribution.

deps: upgrade c-ares to 1.10.0

Ben upgraded the bundled c-ares from 1.9 to 1.10 in commit 9498fd1.

dns: add getServers and setServers

TJ Fontaine (tjfontaine) added dns.getServers() and dns.setServers() in 8886c6b.

  • getServers returns an array of the IPs that are currently being used for resolution.
  • setServers takes an array of IPs that are to be used for resolution. It throws an exception on invalid input, but preserves the original configuration.
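
The two calls can be sketched as follows. The addresses are arbitrary examples, and note that the snippet changes the resolver configuration for the rest of the process:

```javascript
var dns = require('dns');

var before = dns.getServers(); // addresses currently used for resolution

dns.setServers(['8.8.8.8', '8.8.4.4']);
var updated = dns.getServers();

// Invalid input throws and leaves the previous configuration in place.
var rejected = false;
try {
  dns.setServers(['not-an-ip']);
} catch (err) {
  rejected = true;
}
var afterFailure = dns.getServers();
```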

net: emit dns ‘lookup’ event before connect

Ben made net.connect() and net.createConnection() emit a DNS ‘lookup’ event in commit b3d1e50. This is useful if you want to instrument your code or profile DNS resolve times.

http: don’t escape request path, reject bad chars

Ben reworked the http request path escaping approach in 7124387 (originally introduced in 38149bb).  It no longer escapes, just checks for invalid characters; see the commit log for details. This is great for folks who were finding it next to impossible to be compatible with both v0.10 and v0.11+ without undue effort.

dtrace: enable uv’s probes if enabled

TJ Fontaine enabled the embedded libuv’s dtrace probes in f0d80d7. The current probes remain quite basic, but they do let you, for example, time how long a tick of the event loop took.

You can view the complete Node master commit history on GitHub.

This week’s libuv v0.10 branch highlights:

EMFILE handling on Linux

Ben fixed an EMFILE handling bug in commit b3ab332. The behavior people were reporting was that on Linux (and only Linux), Node processes would start busy-looping and eating 100% CPU. The culprit here was how the kernel’s accept() syscall checks for EMFILE before checking for EAGAIN. This threw off the EMFILE handling code in libuv.

build: set soname in shared library

In 3eb6eb3 Ben addressed a popular request from distro packagers: the shared object file (make libuv.so) now sets the soname. TJ followed up with fix 55c150a because the SunOS ld linker doesn’t like -soname in non-dynamic builds.

darwin: fix iOS build, don’t require Application Services

Ben fixed the iOS build in f22163c.

windows: kill child processes when the parent dies

Bert Belder (piscisaureus) changed the behavior of uv_spawn() in 415f4d3 to match what Unix does: child processes are now killed when the parent process dies, unless the UV_PROCESS_DETACHED flag is set.

windows: make uv_spawn not fail under job control

Bert fixed uv_spawn() in commit 4f61ab2 to not fail when running under job control.

You can view the complete libuv v0.10 commit history on GitHub.

This week’s blogs, tutorials, how-to’s and news

The Road to Node.js v1.0

Last week Bert and Ben were in town from Amsterdam (a rare occasion), so I set up an event at the Klout offices called “The Road to Node.js v1.0”.

Here’s Isaac Schlueter talking about what can be improved as development continues towards v1.0 and some of the philosophy that guides Node development. Isaac splatters “warning: vaporware” tape across one of the slides to ensure we don’t get too wrapped up in absolutes on the Node.js Roadmap.

And here’s Bert recounting some war stories about developing libuv. Equal parts informative and funny, you’ll want to watch this all the way through to hear some rarely told stories and opinions about libuv and developing Node.js. At the end of Bert’s talk is a Q&A session with Isaac.

To see who attended and to read feedback on the event, head to the meetup event page. And if you’re interested in seeing more talks like this in the future, be sure to join the group!