Querying Nested JSON

Handling nested JSON in SQL can be a powerful way to manage complex data structures. With Telemetry, you have the flexibility to ingest any arbitrary JSON, making it a versatile tool for data analysis. This document provides an overview of how to handle nested JSON using SQL, particularly focusing on querying deeply nested elements in your JSON blobs.

JSON Structure Example: Uber Ride Data

Consider the following JSON blob that Telemetry can ingest, which represents data from an Uber ride:

{
  "ride_id": "12345",
  "driver": "John Doe",
  "passenger": "Jane Smith",
  "pricing": {
    "base_fare": 500,
    "cents": 1500,
    "currency": "USD",
    "details": {
      "surge_multiplier": 1.2,
      "distance_cost": 300
    }
  },
  "timestamp": "2023-08-05T10:22:00.000Z"
}

In this example, the JSON has nested objects within the pricing key.

Querying Nested JSON

One of the great strengths of using Telemetry is its ability to handle nested JSON fields effectively. Let's explore how to perform such queries using SQL.

Extracting Top-Level Fields

For top-level fields, querying is straightforward. For example, to get the average cents value from a table named ride_log, you can use:

SELECT AVG(pricing.cents) FROM ride_log;

Extracting Nested Fields

DataFusion, the popular SQL query engine we use under the hood, currently does not support deeply nested operators directly. However, with Telemetry, you can still access these deeply nested fields using nested queries, demonstrating the flexibility and power of your product.

Example: Extracting pricing.details.surge_multiplier

To extract pricing.details.surge_multiplier, you can use a subquery to first extract the nested object pricing.details and then further extract surge_multiplier from it:

SELECT details.surge_multiplier 
FROM (
  SELECT pricing.details AS details 
  FROM ride_log
) AS subquery;

This approach allows you to navigate through multiple layers of nested JSON, ensuring you can access even the most deeply nested fields in your data.

Conclusion

With Telemetry's capability to ingest any arbitrary JSON, you have a highly flexible and powerful tool at your disposal for data analysis. Whether you are working with top-level fields or deeply nested structures, SQL queries can be crafted to extract the exact data you need. Embrace the versatility of Telemetry to unlock the full potential of your JSON data.

Last updated