Ingestion metrics schema

This document describes the Google Security Operations ingestion metrics fields and their related UDM events for supported ingestion components (collection mechanisms).

Ingestion metrics components

Ingestion components are services or pipelines that ingest logs into the platform from source log feeds. Each ingestion component collects a different set of log fields into its own ingestion metrics schema. These log fields are the dimension fields that appear in the ingestion metrics Explore interface when you create new dashboards.

The following sections describe the ingestion metrics schemas and dimension fields for the following ingestion components: Forwarder, Ingestion API, Collection agent, Normalizer, and Out-of-band processor (Chronicle API feed).

Forwarder ingestion schema

Fields Type Description
component STRING The ingestion service or pipeline type that ingests log entities into the platform. In this schema, the component is Forwarder.
collector_id STRING The unique identifier of the collection mechanism. For push sources, the forwarder ID or generated ID is used. For Chronicle API or Chronicle API feed, the ID has the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
log_type STRING The log source type identifying log entries in a batch. For example, WINDOWS_DNS.
input_type STRING This field is populated if the ingestion source is the Google Security Operations forwarder. Based on the data that the forwarder sends, this field contains pcap, syslog, or splunk.
drop_reason_code STRING The reason for dropping the log.
last_heartbeat_time TIMESTAMP The last timestamp at which the forwarder or API feed was active, in microseconds. This field is populated if the ingestion source is the Google Security Operations forwarder or Chronicle API feed. When the feed is active, it populates the last_heartbeat_time field, and the log_count and log_volume fields remain empty.
log_volume FLOAT64 The volume of logs during the interval, in bytes. The log_volume field remains empty or is populated in the following cases:
  • When the Google Security Operations forwarder or the feed sends data, this field is populated and the last_heartbeat_time field remains empty.
  • When the feed is inactive, no entry is made in the ingestion metrics table.
  • When the feed is active, the last_heartbeat_time, log_count, or log_volume field is populated.
drop_count FLOAT64 The number of logs dropped for the customer.
log_count FLOAT64 The number of logs ingested during the interval. The log_count field remains empty or is populated in the following cases:
  • When the Google Security Operations forwarder or the feed sends data, this field is populated and the last_heartbeat_time field remains empty.
  • When the feed is inactive, no entry is made in the ingestion metrics table.
  • When the feed is active, the last_heartbeat_time, log_count, or log_volume field is populated.
memory_used FLOAT64 The percentage of memory used by the forwarder container.
disk_used FLOAT64 The percentage of disk used by the forwarder container.
cpu_used FLOAT64 The percentage of CPU used by the forwarder container.
buffer_used FLOAT64 The percentage of the buffer in use, for the buffer type reported in buffer_type.
buffer_type STRING The type of buffer used in the forwarder.
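
As an illustration of the heartbeat and data-row rules described for last_heartbeat_time, log_volume, and log_count, the following Python sketch labels a Forwarder metrics row and converts a heartbeat value to a UTC datetime. It assumes that each row is available as a dict keyed by the field names in this table, that empty fields are represented as None, and that last_heartbeat_time arrives as an integer number of microseconds since the Unix epoch. The function names are hypothetical and are not part of any Google Security Operations API.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def classify_forwarder_row(row: Mapping[str, Optional[float]]) -> str:
    """Label a Forwarder metrics row as 'heartbeat', 'data', or 'other'."""
    has_heartbeat = row.get("last_heartbeat_time") is not None
    has_data = (
        row.get("log_count") is not None or row.get("log_volume") is not None
    )
    if has_heartbeat and not has_data:
        return "heartbeat"  # feed is active but this row carries no log data
    if has_data and not has_heartbeat:
        return "data"  # the forwarder or feed sent logs in this interval
    return "other"  # for example, drop_count or resource-usage rows


def heartbeat_to_datetime(micros: int) -> datetime:
    """Convert microseconds since the Unix epoch (assumed unit) to UTC."""
    return EPOCH + timedelta(microseconds=micros)
```

A row that has only last_heartbeat_time set is labeled "heartbeat", and a row that has only log_volume or log_count set is labeled "data".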

Ingestion API schema

Fields Type Description
component STRING The ingestion service or pipeline type that ingests log entities into the platform. In this schema, the component is Ingestion API.
collector_id STRING The unique identifier of the collection mechanism. For push sources, the forwarder ID or generated ID is used. For Chronicle API or Chronicle API feed, the ID has the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
log_type STRING The log source type identifying log entries in a batch. For example, WINDOWS_DNS.
namespace STRING The namespace that the log belongs to.
log_volume FLOAT64 The size of the logs received for the customer by the Ingestion API, in bytes.
log_count FLOAT64 The number of logs received for the customer by the Ingestion API.
quota_limit_per_second FLOAT64 The quota limits set by the customer, enforced by the Ingestion API.
quota_rejected_long_term_log_volume FLOAT64 The size of the logs rejected by the Ingestion API due to insufficient quota, for the LONG_TERM_DAILY_LIMIT quota type, in bytes.
quota_rejected_short_term_log_volume FLOAT64 The size of the logs rejected by the Ingestion API due to insufficient quota, for the SHORT_TERM_DAILY_LIMIT quota type, in bytes.
ingestion_source STRING The ingestion source in the ingestion label when the logs are ingested through the ingestion private API.
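
To show how the quota rejection fields might be used together, the following Python sketch totals the rejected bytes across a set of Ingestion API metrics rows. It assumes that rows are dicts keyed by the field names in this table and that empty fields are None. It sums every field whose name starts with quota_rejected_, so it covers both the long-term and short-term fields. The function name is hypothetical.

```python
from typing import Iterable, Mapping, Optional


def total_quota_rejected_bytes(
    rows: Iterable[Mapping[str, Optional[float]]]
) -> float:
    """Sum every populated quota_rejected_* field across the given rows."""
    total = 0.0
    for row in rows:
        for field, value in row.items():
            # Skip unrelated fields and fields that are empty in this row.
            if field.startswith("quota_rejected_") and value is not None:
                total += float(value)
    return total
```

A nonzero total suggests that some logs were rejected for insufficient quota during the intervals covered by the rows.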

Collection agent

Fields Type Description
component STRING The ingestion service or pipeline type that ingests log entities into the platform. In this schema, the component is Collection Agent.
collector_id STRING The unique identifier of the collection mechanism. For push sources, the forwarder ID or generated ID is used. For Chronicle API or Chronicle API feed, the ID has the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
log_type STRING The log source type identifying log entries in a batch. For example, WINDOWS_DNS.
input_type STRING This field is populated if the ingestion source is the Google Security Operations forwarder. Based on the data that the forwarder sends, this field contains pcap, syslog, or splunk.
drop_count FLOAT64 The number of spans refused by the agent exporter.
log_count FLOAT64 The number of spans accepted by the agent exporter.
memory_used FLOAT64 Memory occupied by the agent process, in kilobytes.
cpu_used FLOAT64 CPU time spent on the agent process, in seconds.
buffer_used FLOAT64 Queue size of the agent exporter.
buffer_capacity FLOAT64 Queue capacity of the agent exporter.
process_uptime FLOAT64 The number of seconds that the agent process has been running.
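
The queue and span fields above lend themselves to two simple derived ratios. The following Python sketch computes exporter queue utilization from buffer_used and buffer_capacity, and the share of spans refused from drop_count and log_count. It assumes the relevant values have already been combined into a single dict per agent and interval, with empty fields as None. The function names are hypothetical.

```python
from typing import Mapping, Optional


def queue_utilization(row: Mapping[str, Optional[float]]) -> Optional[float]:
    """Return buffer_used / buffer_capacity, or None if it cannot be computed."""
    used = row.get("buffer_used")
    capacity = row.get("buffer_capacity")
    if used is None or capacity is None or capacity <= 0:
        return None
    return used / capacity


def refused_ratio(row: Mapping[str, Optional[float]]) -> Optional[float]:
    """Return the fraction of spans refused by the agent exporter."""
    dropped = row.get("drop_count") or 0.0
    accepted = row.get("log_count") or 0.0
    total = dropped + accepted
    return dropped / total if total > 0 else None
```

Both functions return None rather than raising when a value is missing or the denominator is zero, so they can be applied to sparse rows.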

Normalizer ingestion schema

Fields Type Description
component STRING The ingestion service or pipeline type that ingests log entities into the platform. In this schema, the component is Normalizer.
collector_id STRING The unique identifier of the collection mechanism. For push sources, the forwarder ID or generated ID is used. For Chronicle API or Chronicle API feed, the ID has the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
state STRING The final status of the event or log. The status is one of the following:
  • parsed. The log is successfully parsed.
  • validated. The log is successfully validated.
  • failed_parsing. The log has parsing errors.
  • failed_validation. The log has validation errors.
  • failed_indexing. The log has batch indexing errors.
log_type STRING The log source type identifying log entries in a batch. For example, WINDOWS_DNS.
event_type STRING The event type determines which fields are included with the event. The event type includes values such as PROCESS_OPEN, FILE_CREATION, USER_CREATION, and NETWORK_DNS.
drop_reason_code STRING The reason for dropping the log.
log_volume FLOAT64 The total size of the log entries received for parsing, in bytes.
log_count FLOAT64 The number of log entries received for parsing.
event_count FLOAT64 The number of events generated during the interval.
latency_count FLOAT64 The number of values in the latency distribution of the time difference between ingestion and normalization.
buckets FLOAT64 The number of values in each bucket of the latency distribution of the time difference between ingestion and normalization.
bucketer_num_finite_buckets FLOAT64 The number of finite buckets in the latency distribution of the time difference between ingestion and normalization.
bucketer_growth_factor FLOAT64 The bucketer growth factor of the latency distribution of the time difference between ingestion and normalization.
bucketer_scale_factor FLOAT64 The bucketer scale factor of the latency distribution of the time difference between ingestion and normalization.
latency_overflow FLOAT64 The overflow bucket of the latency distribution of the time difference between ingestion and normalization.
latency_underflow FLOAT64 The underflow bucket of the latency distribution of the time difference between ingestion and normalization.
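
The bucketer fields can be read as parameters of a histogram layout. The following Python sketch derives the finite bucket boundaries under one common convention, which is an assumption here and is not confirmed by this document: exponential buckets where finite bucket i (counting from 0) covers the range from scale_factor * growth_factor^i up to, but not including, scale_factor * growth_factor^(i+1). Under that assumption, values below scale_factor fall in the underflow bucket and values at or above the last upper bound fall in the overflow bucket. The function name is hypothetical.

```python
from typing import List, Tuple


def finite_bucket_bounds(
    num_finite_buckets: int, growth_factor: float, scale_factor: float
) -> List[Tuple[float, float]]:
    """Return (lower, upper) bounds for each finite bucket.

    Assumes an exponential layout: bucket i spans
    [scale_factor * growth_factor**i, scale_factor * growth_factor**(i + 1)).
    """
    return [
        (
            scale_factor * growth_factor ** i,
            scale_factor * growth_factor ** (i + 1),
        )
        for i in range(int(num_finite_buckets))
    ]
```

For example, with 3 finite buckets, a growth factor of 2.0, and a scale factor of 1.0, the bounds are (1.0, 2.0), (2.0, 4.0), and (4.0, 8.0). Pairing these bounds with the per-bucket counts in the buckets field gives an approximate view of the latency distribution.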

Out-of-band processor (Chronicle API feed) ingestion schema

Fields Type Description
component STRING The ingestion service or pipeline type that ingests log entities into the platform. In this schema, the component is the Out-of-band processor (Chronicle API feed).
feed_id STRING The ID of the specific Xenon/Gopher feed that a log belongs to.
log_type STRING The log source type identifying log entries in the batch. For example, WINDOWS_DNS.
last_heartbeat_time TIMESTAMP The epoch timestamp of the successful ingestion of the log entry, in seconds. When the feed is active, the last_heartbeat_time field is populated, and the log_count and log_volume fields remain empty.
log_volume FLOAT64 The size of the logs received in the out-of-band processor, in bytes. The log_volume field remains empty or is populated in the following cases:
  • When the feed sends data, this field is populated and the last_heartbeat_time field remains empty.
  • When the feed is inactive, no entry is made in the ingestion metrics table.
  • When the feed is active, the last_heartbeat_time, log_count, or log_volume field is populated.
log_count FLOAT64 The number of logs processed in the out-of-band processor. The log_count field remains empty or is populated in the following cases:
  • When the feed sends data, this field is populated and the last_heartbeat_time field remains empty.
  • When the feed is inactive, no entry is made in the ingestion metrics table.
  • When the feed is active, the last_heartbeat_time, log_count, or log_volume field is populated.
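
Because an inactive feed writes no rows, one way to spot a feed that has gone quiet is to compare its most recent last_heartbeat_time with the current time. The following Python sketch does this, assuming the heartbeat is available as seconds since the Unix epoch, as described for this schema. The function name and the 15-minute default threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_feed_stale(
    last_heartbeat_seconds: float,
    now: datetime,
    max_age: timedelta = timedelta(minutes=15),
) -> bool:
    """Return True if the last heartbeat is older than max_age.

    `now` must be timezone-aware; the heartbeat is interpreted as UTC.
    """
    last_seen = datetime.fromtimestamp(last_heartbeat_seconds, tz=timezone.utc)
    return now - last_seen > max_age
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.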

Filtering ingestion metrics

You can filter ingestion metrics based on the field values. For example, out-of-band processor feeds have the collector_id aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa. The following example query counts metrics rows per component and collector while excluding out-of-band feeds:

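-- Count metrics rows per component and collector, excluding out-of-band feeds.
SELECT
  component,
  collector_id,
  COUNT(component)
FROM
  `chronicle-tla.datalake.ingestion_metrics`
WHERE
  -- Matches only rows whose start_time falls on the date 60 days ago.
  DATE(start_time) = DATE_SUB(CURRENT_DATE(), INTERVAL 60 DAY)
  AND component IN ("Out-of-Band Processor", "Ingestion API", "Forwarder")
  -- Keep rows with no collector_id; the != comparison alone would drop them.
  AND (collector_id != "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
    OR collector_id IS NULL)
GROUP BY 1, 2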
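
For readers who want to apply the same filter to rows that are already exported, the following Python sketch reproduces the component and collector_id conditions of the query above on in-memory dicts and counts rows per (component, collector_id) pair. It omits the date condition and assumes that a missing collector_id is represented as None. The function and constant names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional, Tuple

OOB_COLLECTOR_ID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
COMPONENTS = {"Out-of-Band Processor", "Ingestion API", "Forwarder"}


def count_rows_excluding_oob(
    rows: Iterable[Mapping[str, Optional[str]]]
) -> Counter:
    """Count rows per (component, collector_id), skipping out-of-band feeds."""
    counts: Counter = Counter()
    for row in rows:
        component = row.get("component")
        collector_id = row.get("collector_id")
        if component not in COMPONENTS:
            continue
        if collector_id == OOB_COLLECTOR_ID:
            continue  # out-of-band feed row
        key: Tuple[Optional[str], Optional[str]] = (component, collector_id)
        counts[key] += 1
    return counts
```

In SQL, comparing a NULL collector_id with != yields NULL rather than TRUE, which is why the query needs an explicit null check. In Python, None simply compares as unequal to the out-of-band ID, so rows without a collector_id are kept without extra handling.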

Ingestion metrics examples

The following table shows some metrics and example values to help you understand the ingestion_metrics schema fields:

| Metrics | component | collector_id | feed_id | log_type | start_time | end_time | input_type | last_heartbeat_time | log_volume | drop_count | log_count | memory_used | cpu_used | disk_used | buffer_used | ingestion_source | drop_reason_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Heartbeat | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | syslog | 2022-04-21T13:18:55.000+00:00 | | | | | | | | | |
| Log Bytes Count | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | pcap | | 149.0 | | | | | | | | |
| Log Record Count | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | pcap | | | | 154.0 | | | | | | |
| Drop Count (Backlog) | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | pcap | | | 4.0 | | | | | | | backlog |
| Drop Count (Invalid Config) | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | pcap | | | 4.0 | | | | | | | invalid_config |
| Drop Count (Regex) | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | pcap | | | 4.0 | | | | | | | regex |
| Log Record Count | Ingestion API | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DHCP | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | 3578.0 | | | | | | |
| Log Bytes Count | Ingestion API | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DHCP | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | 2802.0 | | | | | | | | |
| Log Record Count | Out-of-Band Processor | | feeds/aaaaaaaaaaaaaa | ARUBA_IPS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | 3578.0 | | | | | | |
| Log Bytes Count | Out-of-Band Processor | | feeds/aaaaaaaaaaaaaa | ARUBA_IPS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | 319563.0 | | | | | | | | |
| Last Ingested Timestamp | Out-of-Band Processor | | feeds/aaaaaaaaaaaaaa | ARUBA_IPS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | 2022-04-21T13:18:55.000+00:00 | | | | | | | | | |
| Log Count | Normalizer | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | | | | |
| Log Size | Normalizer | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | | | | |
| Event Count | Normalizer | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | | | | |
| Container Memory Used | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | 0.32 | | | | | |
| Container Disk Used | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | 0.5 | | | |
| Container CPU Used | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | 0.545 | | | | |
| Buffer Used | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | | 0.562 | | |
| Ingestion Source | Forwarder | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | PCAP_DNS | 2022-04-21T13:14:50.924+00:00 | 2022-04-21T13:19:50.924+00:00 | | | | | | | | | | windows-spain-dc-1 | |
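
Each example row covers the interval between start_time and end_time, so a count can be turned into a rate. The following Python sketch computes the interval length and a logs-per-second figure. It assumes that the timestamps are ISO 8601 strings with a +00:00 offset and that the row is a dict keyed by the column names above. The function names are hypothetical.

```python
from datetime import datetime
from typing import Mapping


def interval_seconds(start_time: str, end_time: str) -> float:
    """Return the length of a metrics interval in seconds."""
    start = datetime.fromisoformat(start_time)
    end = datetime.fromisoformat(end_time)
    return (end - start).total_seconds()


def logs_per_second(row: Mapping[str, object]) -> float:
    """Return log_count divided by the interval length of the row."""
    seconds = interval_seconds(str(row["start_time"]), str(row["end_time"]))
    return float(row["log_count"]) / seconds  # type: ignore[arg-type]
```

For the example rows, the interval from 13:14:50.924 to 13:19:50.924 is 300 seconds, so the Forwarder row with a log_count of 154.0 corresponds to roughly 0.51 logs per second.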
