A few weeks ago our friend Matt Debergalis from Meteor reached out and said they’d been seeing intermittent slowdowns on Atmosphere.js, the package management system for Meteor packages. They could reproduce the problem on certain applications given enough time and load, and they suspected an issue with how libuv or the event loop interacts with the fibers infrastructure in Meteor. Could StrongLoop lend some of its expertise to help get to the root of the problem?
To start investigating, I asked them to install a pre-release version of lapse, an upcoming strong-agent feature.
- The strong-agent module is StrongLoop’s monitoring agent. It pipes performance data to either the StrongOps console or your favorite visualization tool, such as DataDog or Graphite.
- Lapse is a tool that monitors the event loop. When it detects a stall, it starts profiling the application with a built-in sampling CPU profiler.
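To make the idea of event loop stall detection concrete, here is a minimal sketch in Node.js. It is not lapse's implementation or API; `createStallDetector`, its options, and the threshold values are hypothetical names chosen for illustration. The assumption is that a stall can be approximated by measuring how much later than scheduled a repeating timer fires, and the sketch only invokes a callback where lapse would start its profiler.

```javascript
'use strict';

// Hypothetical helper (not part of strong-agent or lapse): reports a stall
// when a repeating timer fires much later than it was scheduled to.
function createStallDetector(options) {
  const intervalMs = options.intervalMs;   // how often to sample the loop
  const thresholdMs = options.thresholdMs; // lag above this counts as a stall
  const onStall = options.onStall;         // called with the measured lag in ms
  // Monotonic clock in milliseconds; injectable so the logic can be tested.
  const now = options.now || function () {
    const t = process.hrtime();
    return t[0] * 1e3 + t[1] / 1e6;
  };

  let expected = now() + intervalMs;
  let timer = null;

  // Compare the actual wake-up time with the expected one.
  function check() {
    const current = now();
    const lagMs = current - expected;
    if (lagMs > thresholdMs) {
      onStall(lagMs); // a real tool might start a CPU profiler here
    }
    expected = current + intervalMs;
    return lagMs;
  }

  return {
    check: check,
    start: function () {
      timer = setInterval(check, intervalMs);
      timer.unref(); // do not keep the process alive just for monitoring
    },
    stop: function () {
      clearInterval(timer);
      timer = null;
    }
  };
}
```

In an application you would call `start()` once at boot, for example with `intervalMs: 100` and `thresholdMs: 200`, and log or profile inside `onStall`. A timer can only observe a stall after the loop is running again, which is one reason pairing detection with a sampling profiler is useful for finding the code that was blocking.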
After running with lapse for some time, they were able to catch the problem. For now, lapse writes a log file because we don’t yet have a visualization for it. In analyzing the log file, the critical portions I noticed looked like this: