# Working with Timestamps

Handling time-based data in SQL is essential for analyzing trends, patterns, and other time-related metrics. With Telemetry's flexibility to ingest any arbitrary JSON, you can leverage SQL's powerful time functions to extract meaningful insights from your data. This article provides an overview of how to use time functions in SQL queries, focusing on grouping, truncating, and analyzing events over time, particularly within the constraints of the DataFusion SQL engine.

#### Handling Timestamp Schema in Telemetry

In Telemetry, we expect all logs to include a `timestamp` field that is formatted in the standard **timezone qualified** **ISO 8601** format. This field is stored as a string, but for performance reasons, we derive a secondary field, `timestamp_utc`, which is stored in a more compact format, `Timestamp(Millisecond, Some("UTC"))`. When logging data, you only need to provide the `timestamp` field. However, for faster queries, especially when handling time-based operations, you can utilize the `timestamp_utc` field.

**Example Log Entry**:

```json
{
  "event_id": "e12345",
  "event_type": "click",
  "user_id": "u67890",
  "timestamp": "2023-08-07T14:22:00.000Z",
  "metadata": {
    "browser": "Chrome",
    "device": "Mobile",
    "location": "New York"
  }
}
```

In this log, `timestamp` is the ISO 8601 string representation. Upon ingestion, Telemetry will automatically derive the `timestamp_utc` field as a `Timestamp(Millisecond, Some("UTC"))` type for efficient querying.

**Querying Time-based Data**

While querying data, you can leverage both the `timestamp` and `timestamp_utc` fields. If you're running more complex or time-intensive queries, using `timestamp_utc` is highly recommended to improve performance due to its more efficient storage format.

**Benefits of Using `timestamp_utc`**:

* Faster query execution, especially for large datasets.
* Optimized storage due to the compact timestamp format.
* Compatibility with DataFusion’s SQL engine for time-based operations like `DATE_TRUNC`, `EXTRACT`, and more.

By default, only the `timestamp` field needs to be logged, and Telemetry will handle the transformation to the optimized `timestamp_utc` format behind the scenes, ensuring both flexibility and performance in your time-based queries.

### Time Data Example: Event Logs

Consider the following JSON blob representing an event log that Telemetry can ingest:

```json
{
  "event_id": "e12345",
  "event_type": "click",
  "user_id": "u67890",
  "timestamp": "2023-08-07T14:22:00.000Z",
  "metadata": {
    "browser": "Chrome",
    "device": "Mobile",
    "location": "New York"
  }
}
```

In this example, the JSON includes a timestamp field, which we can use to analyze event data over time.

### Working with Time Functions

#### 1. Extracting Date Parts

You may often need to extract specific parts of a timestamp, such as the year, month, day, or even hour. SQL provides functions like `EXTRACT()` and `DATE_TRUNC()` to help you do this.

**Example: Extracting the Day of the Week**

To extract the day of the week from the timestamp, you can use:

```sql
SELECT 
    EXTRACT(DOW FROM timestamp_utc) AS day_of_week
FROM 
    event_log
```

This query returns the day of the week (0-6) for each event, where 0 represents Sunday and 6 represents Saturday.

#### 2. Grouping by Time Intervals

One common use case for time functions is to group events by specific time intervals, such as days, weeks, or months, to see how many events occurred during each period.

**Example: Grouping Events by Day**

To group events by day and count the number of events per day, you can use `DATE_TRUNC()`:

```sql
SELECT 
    DATE_TRUNC('day', timestamp_utc) AS day,
    COUNT(*) AS event_count 
FROM 
    event_log 
GROUP BY 
    DATE_TRUNC('day', timestamp_utc) 
ORDER BY 
    day
```

This query truncates the timestamp to the day level and then groups the events by these truncated days. The result is a count of events that occurred on each day.

#### 3. Comparing Time Ranges

You may also want to compare data from different time ranges, such as comparing events from one week to the next.

**Example: Week-over-Week Comparison**

To compare the number of events between different weeks, you can use:

```sql
SELECT 
    DATE_TRUNC('week', timestamp_utc) AS week,
    COUNT(*) AS event_count 
FROM 
    event_log 
GROUP BY 
    DATE_TRUNC('week', timestamp_utc) 
ORDER BY 
    week;
```

This query groups events by the week they occurred, allowing you to analyze weekly trends.

#### 4. Calculating Time Differences

In some SQL engines, such as PostgreSQL, you might use the `AGE()` function to calculate the difference between two timestamps. However, DataFusion does not support the `AGE()` function. Instead, you can achieve similar results using other methods.

**Example: Calculating Time Differences Between Events**

To calculate the time difference between two events in DataFusion, you can use the `LAG()` function along with direct arithmetic on the timestamps.

For instance, to calculate the difference in seconds between the current event and the previous event:

```sql
SELECT 
    event_id,
    timestamp,
    LAG(timestamp_utc) OVER (ORDER BY timestamp_utc) AS previous_timestamp,
    EXTRACT(EPOCH FROM timestamp_utc) - EXTRACT(EPOCH FROM LAG(timestamp_utc) OVER (ORDER BY timestamp_utc)) AS time_difference
FROM 
    event_log
```

In this query, `LAG(timestamp) OVER (ORDER BY timestamp)` retrieves the timestamp of the previous event, and `EXTRACT(EPOCH FROM timestamp)` converts the timestamp into the number of seconds since the Unix epoch. Subtracting these values gives you the time difference in seconds.

### Conclusion

Time functions in SQL are indispensable tools for analyzing and managing time-based data. While DataFusion has certain limitations compared to other SQL engines, you can still effectively use its supported functions to group, truncate, and analyze your data over various time intervals. Whether you're tracking daily events, comparing weekly trends, or calculating time differences between events, SQL provides the necessary functions to unlock the full potential of your time-based data.

Leverage these powerful SQL time functions within Telemetry to gain deeper insights and make more informed decisions based on your data.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.telemetry.sh/discussion-topics/working-with-timestamps.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
