Node.js Application Troubleshooting Manual — Node.js Performance Platform User Guide

Intro

In the previous article, we used Chrome DevTools to troubleshoot CPU and memory problems in Node.js applications. In production, however, Chrome DevTools turns out to be better suited to local development, because it does not itself generate the dump files that problem analysis requires. Developers therefore have to add tools such as v8-profiler and heapdump to their online projects and implement extra services to export the state of running processes in real time.
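To make the manual export work concrete, here is a minimal sketch of capturing a CPU profile from inside a running process. It uses Node's built-in inspector module rather than v8-profiler, so it is a swapped-in technique and not what the previous article or the platform does. The helper names (post, captureCpuProfile, busyLoop) are hypothetical.

```javascript
const inspector = require('inspector');

// Promise wrapper around the callback-style session.post().
function post(session, method, params) {
  return new Promise((resolve, reject) => {
    session.post(method, params, (err, result) => (err ? reject(err) : resolve(result)));
  });
}

// Profile the current process while `work` runs and resolve with the profile object.
async function captureCpuProfile(work) {
  const session = new inspector.Session();
  session.connect();
  try {
    await post(session, 'Profiler.enable');
    await post(session, 'Profiler.start');
    await work();
    const { profile } = await post(session, 'Profiler.stop');
    return profile;
  } finally {
    session.disconnect();
  }
}

// Hypothetical CPU-heavy workload, used only to give the profiler something to sample.
function busyLoop(ms) {
  const end = Date.now() + ms;
  let n = 0;
  while (Date.now() < end) n += Math.sqrt(n + 1);
  return n;
}
```

Calling captureCpuProfile(() => busyLoop(200)) and writing JSON.stringify(profile) to a file ending in .cpuprofile should give a file that Chrome DevTools can load. An online project would still need its own endpoint or signal handler to trigger the capture, which is the extra service work described above.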

Architecture

To put it simply, Node.js Performance Platform consists of three parts: the cloud console, the AliNode runtime, and Agenthub. It provides the following capabilities:

  • Alerting on abnormal metrics: The platform notifies developers of abnormal metric data through SMS and DingTalk messages.
  • Exporting status information about online Node.js applications: The status information that can be exported includes, but is not limited to, the CPU and memory status described in the Chrome DevTools section.
  • Online analysis and better UI: The platform supports customized application status analysis and better suits developers in China.

Best Practices

1. Configure Appropriate Alerts

Online application alerting is essentially a self-discovery mechanism. Without it, problems are discovered only when the users who run into them report them, which makes for a poor user experience.
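The idea of an alert rule can be sketched as a threshold check over collected metrics. The rule shape ({ name, metric, threshold }), the metric name heapUsedRatio, and the 0.8 threshold below are all hypothetical and chosen only for illustration. This is not the platform's rule syntax.

```javascript
const v8 = require('v8');

// Hypothetical rule shape: { name, metric, threshold }.
// A rule fires when the named metric is greater than its threshold.
function evaluateRules(metrics, rules) {
  return rules
    .filter((rule) => metrics[rule.metric] > rule.threshold)
    .map((rule) => `${rule.name}: ${rule.metric}=${metrics[rule.metric].toFixed(2)} exceeds ${rule.threshold}`);
}

// Collect one metric from the current process: the fraction of the V8 heap limit in use.
function collectMetrics() {
  const stats = v8.getHeapStatistics();
  return { heapUsedRatio: stats.used_heap_size / stats.heap_size_limit };
}

// The threshold is arbitrary; pick one that matches your application's normal behavior.
const rules = [{ name: 'Heap usage high', metric: 'heapUsedRatio', threshold: 0.8 }];

for (const message of evaluateRules(collectMetrics(), rules)) {
  console.error(message); // a real setup would notify someone instead of logging
}
```

Keeping evaluateRules a pure function makes the rules easy to test with fixed metric values before relying on them.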

2. Perform Proper Analysis Based on Alerts

After you follow instructions in the previous section to configure proper alert rules, you can perform analysis accordingly when you receive SMS alert messages. This section describes how to perform analysis based on the five main metrics described in the preliminary section.

A. Disk Monitoring

This is a relatively easy problem. In the quick rule list, the default disk monitoring rule issues an alert when server disk usage exceeds 85%. When you receive a disk alert, connect to the server and run the following command to see which directories use the most disk space:

sudo du -h --max-depth=1 /
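If you would rather run the same kind of check from a Node.js script, the following is a rough sketch. The function names are hypothetical. It sums apparent file sizes instead of allocated disk blocks, so its numbers will not match du exactly.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively sum the sizes (in bytes) of all regular files under `target`.
// Symbolic links are not followed; unreadable entries count as 0.
function sizeOf(target) {
  let stat;
  try {
    stat = fs.lstatSync(target);
  } catch {
    return 0;
  }
  if (stat.isFile()) return stat.size;
  if (!stat.isDirectory()) return 0;
  let entries;
  try {
    entries = fs.readdirSync(target);
  } catch {
    return 0;
  }
  return entries.reduce((sum, name) => sum + sizeOf(path.join(target, name)), 0);
}

// List the immediate children of `dir` with their total sizes, largest first.
function largestChildren(dir) {
  return fs.readdirSync(dir)
    .map((name) => ({ name, bytes: sizeOf(path.join(dir, name)) }))
    .sort((a, b) => b.bytes - a.bytes);
}
```

For example, console.table(largestChildren('/var/log')) prints one row per child directory or file. Walking a large tree synchronously blocks the event loop, so a script like this is better run as a one-off than inside the application process.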

B. Error Logs

After receiving an error log alert, go to the Node.js Performance Platform console for the corresponding project, find the problematic Instance, and view its Exception Log.

C. Process High CPU Usage

Next comes the kind of problem that, in the previous section, we analyzed by exporting a CPU profile file with v8-profiler and loading it into Chrome DevTools. Because Node.js Performance Platform provides a complete solution, you no longer need third-party libraries such as v8-profiler to export process state. When you receive an alert saying that the CPU usage of a Node.js application process is above the configured threshold, click CPU Profile for the corresponding Instance in the console.
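To give a feel for what a CPU profile analysis looks for, the sketch below ranks functions by self time in a parsed .cpuprofile object. It assumes the standard V8 profile layout (nodes, samples, timeDeltas) and uses a simplified attribution of each time delta to the sampled function. The function name hottestFunctions is hypothetical, and this is not the platform's own analysis.

```javascript
// Aggregate self time per function from a parsed .cpuprofile object.
// Each entry in `samples` is the id of the node on top of the stack, and the
// entry at the same index in `timeDeltas` is an elapsed time in microseconds.
function hottestFunctions(profile, limit = 5) {
  const nodesById = new Map(profile.nodes.map((node) => [node.id, node]));
  const selfTime = new Map();
  profile.samples.forEach((nodeId, i) => {
    const frame = nodesById.get(nodeId).callFrame;
    const key = `${frame.functionName || '(anonymous)'} ${frame.url}:${frame.lineNumber}`;
    selfTime.set(key, (selfTime.get(key) || 0) + (profile.timeDeltas[i] || 0));
  });
  return [...selfTime.entries()]
    .map(([fn, micros]) => ({ fn, ms: micros / 1000 }))
    .sort((a, b) => b.ms - a.ms)
    .slice(0, limit);
}
```

A function that dominates this list is usually the first place to look when a process is pinned at high CPU usage.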

D. Memory Leaks

As with a high CPU usage alert, when you receive an alert saying that the heap memory usage of a Node.js application process is above the configured threshold percentage, you no longer need third-party modules such as heapdump to export heap snapshots for analysis. Click Heap Snapshot for the corresponding instance in the console to generate a heap snapshot of the target Node.js process. The generated snapshot is summarized by the following metrics:

  • File Size: The size of the heap snapshot file itself.
  • Total Shallow Size: As the previous section showed, the Retained Size of the GC root equals the heap size, that is, the sum of the Shallow Sizes of all objects allocated on the heap. This metric is therefore the heap space in use.
  • Objects: The total number of heap objects allocated on the current heap.
  • Object Edges: A slightly more abstract metric. If A.b points to object B, the property b, which expresses the reference from A to B, counts as one edge.
  • GC Roots: The number of actual GC roots. The V8 heap contains more than one GC root. The previous section mentioned only one to keep the explanation simple, but a real running heap includes many.

An object that appears to hold too much memory usually falls into one of two cases:

  • The object itself is not released as expected and occupies too much heap space.
  • Some properties of the object are not released as expected, so the object only appears to occupy too much heap space.
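The Objects, Object Edges, and Total Shallow Size figures described above can be recomputed from a snapshot file, which helps clarify what they measure. The sketch below assumes the JSON layout V8 writes for .heapsnapshot files (a snapshot.meta.node_fields list plus a flat nodes array). The function name summarizeSnapshot is hypothetical, and the platform may compute its figures differently.

```javascript
// Summarize a parsed .heapsnapshot object.
// `nodes` is a flat integer array in which every node occupies
// `node_fields.length` consecutive slots, in the order given by `node_fields`.
function summarizeSnapshot(snapshot) {
  const fields = snapshot.snapshot.meta.node_fields;
  const stride = fields.length;
  const selfSizeOffset = fields.indexOf('self_size');
  let totalShallowSize = 0;
  for (let i = selfSizeOffset; i < snapshot.nodes.length; i += stride) {
    totalShallowSize += snapshot.nodes[i]; // self_size is the node's Shallow Size in bytes
  }
  return {
    objects: snapshot.snapshot.node_count,
    objectEdges: snapshot.snapshot.edge_count,
    totalShallowSize,
  };
}
```

A snapshot for experimenting can be produced with require('v8').writeHeapSnapshot(), then read with fs.readFileSync and JSON.parse before being passed to summarizeSnapshot. Very large snapshots may not fit in a single string, in which case a streaming parser would be needed.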

E. Core Dump

When you receive a core dump alert for the server, it means that your process crashed unexpectedly. If Agenthub is properly configured, the generated core dump file is automatically listed under File -> Coredump File.

Conclusion

This chapter describes the monitoring, alerting, analysis, and troubleshooting solutions and best practices that Node.js Performance Platform offers for Node.js applications. I hope it has given you a comprehensive understanding of the platform's capabilities and shown how to use the platform to better serve your own projects.

Original Source

https://www.alibabacloud.com/blog/node-js-application-troubleshooting-manual---node-js-performance-platform-user-guide_594966?spm=a2c41.13092552.0.0

Alibaba Cloud
