Queue Worker Observability
In this guide, we'll walk you through using the Telemetry SDK to monitor worker queue lengths and processing latencies, including how to calculate P90, P95, and P99 latencies. By the end of this guide, you'll have a setup that logs and analyzes queue lengths and processing latencies to help you maintain optimal performance.
Prerequisites
A valid API key for Telemetry
Basic understanding of JavaScript and Node.js
A system with worker queues to monitor
Step 1: Install Telemetry SDK
First, install the Telemetry SDK in your project. If you haven't already, run the following command:
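For example, if the SDK is distributed through npm (the package name below is a placeholder; use the name given in your Telemetry account or documentation):

```bash
npm install @telemetry/sdk
```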
Step 2: Initialize Telemetry
After installing the SDK, import and initialize Telemetry in your project. Replace YOUR_API_KEY with your actual Telemetry API key.
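A minimal initialization sketch follows; the import path and constructor shape are assumptions, so check the SDK's documentation for the exact API:

```javascript
// telemetry.js
// Minimal initialization sketch. The package name and constructor options
// are assumptions about the SDK; adjust to match its documentation.
const { Telemetry } = require('@telemetry/sdk');

const telemetry = new Telemetry({
  apiKey: 'YOUR_API_KEY', // Replace YOUR_API_KEY with your actual Telemetry API key
});

module.exports = telemetry;
```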
Step 3: Log Queue Lengths and Latencies
To monitor your worker queues and latencies, you need to log relevant data points such as queue length and processing time. Here’s an example function to log this data:
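The sketch below records one data point per observation; telemetry.logEvent() and the payload field names are assumptions about the SDK's API and your event schema:

```javascript
// logQueueMetrics.js
// Hypothetical helper that records queue length and processing time for one queue.
const telemetry = require('./telemetry');

async function logQueueMetrics(queueName, queueLength, processingTimeMs) {
  // logEvent() and the field names are assumptions; rename to match your schema.
  await telemetry.logEvent('queue_metrics', {
    queue: queueName,
    queue_length: queueLength,             // jobs currently waiting in the queue
    processing_time_ms: processingTimeMs,  // how long the most recent job took
    timestamp: new Date().toISOString(),
  });
}

module.exports = { logQueueMetrics };
```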
Step 4: Automate Queue Monitoring
You should set up your system to automatically log queue metrics at regular intervals or after each job is processed. Here’s an example where you monitor a queue every minute:
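Here is one way to do that; getQueueLength() and getLastProcessingTime() are placeholders for whatever your queue library actually exposes:

```javascript
// monitorQueue.js
// Polls a queue once a minute and logs its current metrics.
const { logQueueMetrics } = require('./logQueueMetrics');
// Hypothetical helpers; replace with your queue library's own calls
// (e.g. a Redis or Bull queue's count/stats methods).
const { getQueueLength, getLastProcessingTime } = require('./queue');

const QUEUE_NAME = 'email-jobs'; // example queue name

setInterval(async () => {
  try {
    const queueLength = await getQueueLength(QUEUE_NAME);
    const processingTimeMs = await getLastProcessingTime(QUEUE_NAME);
    await logQueueMetrics(QUEUE_NAME, queueLength, processingTimeMs);
  } catch (err) {
    console.error('Failed to log queue metrics:', err);
  }
}, 60 * 1000); // every minute
```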
Step 5: Query and Analyze Queue Metrics with P90, P95, and P99 Latencies
Once you've logged sufficient data, you can query it using Telemetry's query API to analyze your worker queues and latencies. The following query calculates the P90, P95, and P99 latencies, as well as the average latency and maximum queue length for each queue:
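The example below is illustrative only: telemetry.query(), the SQL-like syntax, and the percentile function are assumptions, so adapt the statement to Telemetry's actual query API and your event schema:

```javascript
// analyzeQueues.js
// Sketch of a percentile analysis over the logged queue_metrics events.
const telemetry = require('./telemetry');

async function analyzeQueues() {
  // query() and the statement syntax are assumptions about the query API.
  const results = await telemetry.query(`
    SELECT
      queue,
      percentile(processing_time_ms, 90) AS p90_latency_ms,
      percentile(processing_time_ms, 95) AS p95_latency_ms,
      percentile(processing_time_ms, 99) AS p99_latency_ms,
      avg(processing_time_ms)            AS avg_latency_ms,
      max(queue_length)                  AS max_queue_length
    FROM queue_metrics
    GROUP BY queue
  `);
  console.table(results);
}

analyzeQueues().catch(console.error);
```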
Step 6: Explore Data with Telemetry's UI
Telemetry's UI lets you visualize and explore your queue metrics interactively. Visit the Telemetry Dashboard and log in with your credentials to build dashboards, charts, and more from your worker queue data, including the P90, P95, and P99 latencies.
Conclusion
By following these steps, you can effectively monitor the performance of your worker queues and track latencies using the Telemetry SDK. With the inclusion of P90, P95, and P99 latencies, you gain deeper insights into the tail latencies in your system, helping you optimize performance and maintain high levels of service.