Version: NG-2.16

Advanced configurations

Agent: healthbeat

Generic Settings

You can specify advanced settings in the healthbeat.yml configuration file to control the general behaviour of Healthbeat and the events it publishes to vuSmartMaps.

This includes:

  • Healthbeat-specific global options (for example, startup delay between metricsets).
  • Generic agent options, common to all vuSmartMaps Beats-based agents (name, tags, custom fields, processors, etc.).

Global Healthbeat configuration options

healthbeat.max_start_delay

The maximum random delay to apply to the startup of a metricset.

Random delays in the range [0, max_start_delay) are applied to reduce the “thundering herd” effect that can occur if a fleet of machines running Healthbeat are restarted at the same time.

  • Type: duration (e.g. 10s, 1m)
  • Default: 10s
  • Value 0 disables the startup delay.
healthbeat.max_start_delay: 10s
timeseries.enabled

When this is enabled, Healthbeat adds a timeseries.instance field to all generated events. For a given metricset, this field is unique for every individual item being monitored (for example, per disk, per interface, per process).

  • Type: boolean
  • Default: false
timeseries.enabled: true

Agent: logbeat

Configure general settings

You configure Logbeat in logbeat.yml. Settings control:

  • Global Logbeat behaviour (registry, shutdown).

  • General Beat options like name, tags, custom fields.

Global Logbeat configuration options

These options live under the filebeat.* namespace.

filebeat.registry.path

Root path of the Logbeat registry. Relative paths are resolved relative to path.data.

filebeat.registry.path: registry
  • Default: ${path.data}/registry
  • The registry is only updated when new events are flushed (not on a fixed timer).
filebeat.registry.file_permissions

Permissions mask for registry data files (Unix only).

filebeat.registry.file_permissions: 0600
  • Default: 0600
  • Most permissive value allowed: 0640
  • Must be specified as octal.
filebeat.registry.flush

Controls how often registry changes are flushed to disk.

filebeat.registry.flush: 1s
  • Default: 1s
  • 0s → registry written after each successful batch publish.
filebeat.registry.migrate_file

Used when migrating from an older single-file registry to the new directory format.

filebeat.registry.path: ${path.data}/registry
filebeat.registry.migrate_file: /path/to/old/registry_file

Logbeat will migrate only if the new registry directory does not already exist.

filebeat.shutdown_timeout

Maximum time Logbeat waits on shutdown for the publisher to flush events.

filebeat.shutdown_timeout: 5s
  • Default: disabled (no waiting; un-acked events may be resent on restart).

Configure inputs

You define inputs under filebeat.inputs to tell Logbeat which files to read and how to parse them.

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  paths:
    - /var/log/system.log
    - /var/log/wifi.log

Each input is a YAML list item (-), and you can define multiple inputs.

Filestream is the improved replacement for the old log input.

Basic example:

filebeat.inputs:
- type: filestream
  id: app-logs
  paths:
    - /var/log/app/*.log
warning

Each filestream input must have a unique id to track file state correctly.

Key options (filestream):

  • paths: list of glob paths.
  • exclude_lines / include_lines: regexp filters.
  • buffer_size: read buffer size (bytes).
  • message_max_bytes: max length of a single message.
  • parsers: pipeline (multiline, ndjson, container, syslog, include_message).
  • file_identity: fingerprint (default), native, path, inode_marker.
  • close.*, clean_*, backoff.*, harvester_limit: lifecycle & performance tuning.
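A sketch combining several of these options in one input (paths, values, and the id are illustrative, not recommendations):

```yaml
filebeat.inputs:
- type: filestream
  id: app-logs-tuned            # unique id required per filestream input
  paths:
    - /var/log/app/*.log
  exclude_lines: ['^DEBUG']     # drop DEBUG lines before publishing
  buffer_size: 16384            # read buffer size in bytes
  message_max_bytes: 10485760   # max length of a single message
  harvester_limit: 100          # cap the number of concurrently open files
  parsers:
    - ndjson:
        target: ""              # decode JSON lines into the event root
```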
Reading GZIP logs (beta)
filebeat.inputs:
- type: filestream
  id: "gzip-filestream"
  paths:
    - /var/some-app/app.log*
  gzip_experimental: true
  • Requires file_identity: fingerprint (default).
  • Logs are decompressed in memory; ~100KB extra per harvester.

log input

warning

The log input is deprecated. Use filestream instead.

Legacy example:

filebeat.inputs:
- type: log
  paths:
    - /var/log/messages
    - /var/log/*.log

All the classic options (paths, encoding, exclude_lines, include_lines, ignore_older, close_*, clean_*, scan_frequency, tail_files, backoff, harvester_limit, file_identity, etc.) work similarly, but new configs should migrate to filestream.
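As a migration sketch, the legacy example above can be expressed as a filestream input (the id value here is illustrative; filestream requires one):

```yaml
filebeat.inputs:
- type: filestream
  id: system-logs        # required by filestream, not by the old log input
  paths:
    - /var/log/messages
    - /var/log/*.log
```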

Manage multiline messages

Use multiline settings to merge multi-line log events (stack traces, multi-line errors, custom blocks) into single events before sending to Kafka.

With filestream input

filebeat.inputs:
- type: filestream
  id: java-traces
  paths:
    - /var/log/app/*.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^\['
        negate: true
        match: after

With deprecated log input

filebeat.inputs:
- type: log
  paths:
    - /var/log/app/*.log
  multiline.type: pattern
  multiline.pattern: '^\['
  multiline.negate: true
  multiline.match: after

Core multiline options

  • multiline.type: pattern | count | while_pattern
  • multiline.pattern: '<regexp>'
  • multiline.negate: true | false
  • multiline.match: after | before
  • multiline.flush_pattern: '<regexp>'
  • multiline.max_lines: 500 (default)
  • multiline.timeout: 5s (default)
  • multiline.count_lines (for type: count)
  • multiline.skip_newline: true|false
Example – Java stack trace (indented lines)
multiline.type: pattern
multiline.pattern: '^[[:space:]]'
multiline.negate: false
multiline.match: after
Example – timestamped events (lines without timestamp belong to previous)
multiline.type: pattern
multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
multiline.negate: true
multiline.match: after

Filestream – advanced behaviour

Some important filestream-specific features:

File identity (file_identity)

Controls how Logbeat distinguishes files:

  • fingerprint (default, recommended)
    Identifies files by hashing content (first N bytes). Works well with log rotation and cloud / networked filesystems.
  • native
    Uses inode + device id.
  • path
    Uses path (not safe with rename-based rotation).
  • inode_marker
    Uses inode + marker file for device identity.
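As a sketch, selecting an identity method per input might look like this (paths and the marker file location are illustrative):

```yaml
filebeat.inputs:
- type: filestream
  id: netfs-logs
  paths:
    - /mnt/share/app/*.log
  # fingerprint is the default; shown explicitly for clarity
  file_identity.fingerprint: ~
- type: filestream
  id: marker-logs
  paths:
    - /data/app/*.log
  # inode + marker file, for filesystems where device ids are unstable
  file_identity.inode_marker.path: /data/.filebeat-marker
```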
warning

Changing the file identity method improperly can cause mass re-ingestion or duplicates.

Closing harvesters (close.* / close.on_state_change.*)

Examples (filestream):

close.on_state_change.inactive: 5m
close.on_state_change.renamed: false
close.on_state_change.removed: true
close.reader.on_eof: false
close.reader.after_interval: 0

Cleaning registry (clean_*)

clean_inactive: -1          # disable automatic cleanup (recommended default)
clean_removed: true

Removing fully ingested files (optional)

delete.enabled: true
delete.grace_period: 30m

Logbeat will remove files once:

  • Reader is closed,
  • EOF is reached, and
  • All events are acknowledged by the output.
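In context, the delete settings sit under the filestream input; a sketch (paths illustrative):

```yaml
filebeat.inputs:
- type: filestream
  id: batch-exports
  paths:
    - /var/spool/exports/*.csv
  delete:
    enabled: true
    grace_period: 30m   # wait after full acknowledgement before removing the file
```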

Agent configuration options (common to all Beats-based agents)

Generic

These options are not namespaced and are supported by all vuSmartMaps Beats-based agents (Healthbeat, Logbeat, etc.). They control how events are labeled and enriched before being sent to vuSmartMaps.

name

Logical name of the agent instance.

If this option is not set, Healthbeat uses the hostname of the server. The value is included as agent.name in each published event, and can be used to group or filter events per agent instance.

name: "payments-node-01"

tags

A list of tags that Healthbeat includes in the tags field of each event.

Tags make it easy to group servers or instances by logical properties (application, environment, tier, etc.), and to filter or build dashboards in vuSmartMaps.

tags: ["payments-service", "web-tier", "prod"]

fields

Optional custom fields to add additional information to the output.

  • Supported types: scalar values, arrays, maps, or any nested combination.
  • By default, these fields are grouped under a fields sub-object in the event.
fields:
  project: "payments"
  instance_id: "574734885120952459"
  owner_team: "SRE"

fields_under_root

If true, the custom fields are stored as top-level fields in the event instead of under the fields object.

warning

If a custom field name conflicts with an existing field, the custom value overwrites the original.

fields_under_root: true
fields:
  instance_id: "i-10a64379"
  region: "us-east-1"

processors

A list of processors applied to the data generated by Healthbeat before it is sent to vuSmartMaps (for example, to drop fields, rename fields, add metadata, or filter events).

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - drop_fields:
      fields: ["host.hostname", "host.os.build"]

(You can reference the common “Processors” documentation for the full list of supported processors and usage patterns.)

max_procs

Sets the maximum number of CPUs that Healthbeat can use concurrently.

  • Default: number of logical CPUs on the system.
  • In most deployments this is left at the default; in constrained environments you can cap it explicitly.
max_procs: 2
note

The installer’s --cpulimit option also influences max_procs and related resource limits; this YAML option is the manual override at the configuration level.

timestamp.precision

Controls the precision of all timestamps added by Healthbeat.

  • Default: millisecond

  • Valid values:

    • millisecond
    • microsecond
    • nanosecond
timestamp.precision: microsecond

Kafka Output

Configure the Kafka output

The Kafka output sends events to vuSmartMaps.

Note on Kafka timestamps and retention
For Kafka 0.10.0.0+, the event timestamp is normally set by the producer (beat agent) to the original event time.
If your Kafka topic uses a time-based retention policy, an event created long ago but produced now might be dropped immediately (because its timestamp is already older than the retention window).
To avoid this, set the broker config:

log.message.timestamp.type=LogAppendTime

So Kafka uses the append time (arrival time) instead of the original event time for retention.

Example configuration

output.kafka:
  # Initial brokers used to fetch cluster metadata
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]

  # Dynamic topic selection
  topic: '%{[fields.log_topic]}'

  # Partitioning strategy
  partition.round_robin:
    reachable_only: false

  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
note

Events larger than max_message_bytes will be dropped.
Make sure the beat agent does not generate events larger than this limit (or increase the limit and broker message.max.bytes accordingly).

Port reminder (vuSmartMaps)

  • Up to the 2.16 build: Kafka/collector endpoints are typically on 9092 (non-SSL) and 9094 (SSL).
    Confirm with your platform version and network design before finalising hosts.

Compatibility

Beat agent’s Kafka output is compatible with:

  • Kafka 0.8.2.0 and later (older versions may work but are not supported).
  • When using Kafka 4.0+, set version to at least "2.1.0".

The version setting controls the protocol features used by the client; it does not prevent Healthbeat from talking to newer Kafka brokers.

Configuration options (output.kafka in *beat.yml)

All of the following options live under the output.kafka: section.

enabled

Enable or disable the Kafka output.

output.kafka:
  enabled: true
  • Default: true.
  • If false, Kafka output is disabled.
hosts

List of bootstrap broker addresses used to fetch Kafka cluster metadata (topics, partitions, leaders).

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
version

Kafka protocol version Healthbeat should use:

output.kafka:
  version: "2.1.0"
  • Defaults to "2.1.0".
  • Valid values: from "0.8.2.0" up to "2.6.0".
  • For Kafka 4.0+, use "2.1.0" or higher.
username / password

Credentials for SASL authentication (for example SASL/PLAIN or SCRAM):

output.kafka:
  username: "kafka_user"
  password: "kafka_password"

If you set a username, you must also set a password.

sasl.mechanism

SASL mechanism to use when username/password are configured:

output.kafka:
  sasl.mechanism: "SCRAM-SHA-512"

Supported values:

  • PLAIN – SASL/PLAIN

  • SCRAM-SHA-256

  • SCRAM-SHA-512

If sasl.mechanism is not set:

  • If username & password are present → defaults to PLAIN.

  • Otherwise → SASL authentication is disabled.

To use Kerberos (GSSAPI), leave sasl.mechanism empty and use the Kerberos settings instead (see below).

topic

Kafka topic name (or format string) for produced events.

You can:

Set a fixed topic:

output.kafka:
  topic: "healthbeat"

Use a format string based on ECS fields:

output.kafka:
  topic: '%{[data_stream.type]}-%{[data_stream.dataset]}-%{[data_stream.namespace]}'

Use a custom field, for example, fields.log_topic:

output.kafka:
  topic: '%{[fields.log_topic]}'

To populate fields.log_topic, you can use an add_fields processor in your input/module config:

processors:
  - add_fields:
      target: ''
      fields:
        log_topic: '%{[data_stream.type]}-%{[data_stream.dataset]}-%{[data_stream.namespace]}'
topics

Advanced topic routing using a list of selector rules. Healthbeat applies the first matching rule; if no rule matches, the topic setting is used.

Each rule supports:

  • topic – format string.
  • mappings – map returned topic to new names.
  • default – default name if no mapping matches.
  • when – conditional (same syntax as processors).

Example (route CRITICAL/ERR messages to dedicated topics):

output.kafka:
  hosts: ["localhost:9092"]
  topic: "logs-%{[agent.version]}"
  topics:
    - topic: "critical-%{[agent.version]}"
      when.contains:
        message: "CRITICAL"
    - topic: "error-%{[agent.version]}"
      when.contains:
        message: "ERR"

Resulting topics: critical-<version>, error-<version>, logs-<version>.

key

Optional formatted string for the Kafka message key (used by brokers for partitioning):

output.kafka:
  key: '%{[host.name]}'

If not set, Kafka chooses the key according to its default behaviour.

partition

Partitioning strategy:

output.kafka:
  partition.hash:
    hash: ["host.name"]
    random: true

Supported strategies:

  • random
  • round_robin
  • hash (default)

Additional tuning:

  • random.group_events
  • round_robin.group_events
  • hash.hash (fields list)
  • hash.random (fallback to random if no hash/key)

All partitioners can also set:

partition.round_robin:
  reachable_only: true

If reachable_only is true, events are sent only to available partitions (but may become unevenly distributed).

headers

Optional static headers to add to every produced Kafka message:

output.kafka:
  headers:
    - key: "environment"
      value: "prod"
    - key: "source"
      value: "healthbeat"

Values must be strings.

client_id

Client ID used in Kafka logs/metrics:

output.kafka:
  client_id: "server1"

Default is "beats".

codec

Controls how events are encoded before sending to Kafka. If omitted, events are JSON encoded.

output.kafka:
  codec:
    format: json

(Refer to your “output codec” section if you support additional formats.)

metadata

Controls how often Healthbeat refreshes Kafka metadata (brokers, topics, partitions, leaders):

output.kafka:
  metadata:
    refresh_frequency: 10m
    full: false
    retry.max: 3
    retry.backoff: 250ms
  • refresh_frequency – how often to refresh metadata (default: 10m).
  • full – true to fetch metadata for all topics, false for configured topics only (default: false).
  • retry.max – number of metadata retries (default: 3).
  • retry.backoff – backoff between metadata retries (default: 250ms).
max_retries

Number of retries for publishing events after a failure:

output.kafka:
  max_retries: 3
  • < 0 → retry indefinitely until success.
  • Default: 3.
backoff.init / backoff.max

Exponential backoff between retry attempts:

output.kafka:
  backoff.init: 1s
  backoff.max: 60s
  • backoff.init – first delay (default: 1s).
  • backoff.max – maximum delay (default: 60s).
bulk_max_size

Maximum number of events per Kafka batch:

output.kafka:
  bulk_max_size: 2048

Default: 2048.

bulk_flush_frequency

Time-based flush for partial batches:

output.kafka:
  bulk_flush_frequency: 0s
  • 0 (default) – no time-based flush; flush is size-driven.
timeout

Timeout for responses from Kafka brokers:

output.kafka:
  timeout: 30s

Default: 30s.

broker_timeout

Maximum time the broker waits for the required ACKs:

output.kafka:
  broker_timeout: 10s

Default: 10s.

channel_buffer_size

Number of messages buffered per broker in the output pipeline:

output.kafka:
  channel_buffer_size: 256

Default: 256.

keep_alive

TCP keep-alive period:

output.kafka:
  keep_alive: 0s
  • 0s (default) disables keep-alives.
compression / compression_level

Compression codec and level:

output.kafka:
  compression: gzip
  compression_level: 4
  • compression – one of: none, snappy, lz4, gzip, zstd.

    • Default: gzip.
    • Azure Event Hub for Kafka: must be none.
  • compression_level (for gzip):

    • 0 – disable compression.
    • 1–9 – speed vs compression tradeoff (default: 4).
max_message_bytes

Maximum size of the encoded message:

output.kafka:
  max_message_bytes: 1000000
  • Default: 1000000 bytes.
  • Must be ≤ broker message.max.bytes.
  • Larger events are dropped.
required_acks

Reliability level:

output.kafka:
  required_acks: 1

Values:

  • 0 – do not wait for any ACK (high risk of silent loss).
  • 1 – wait for leader to commit (default).
  • -1 – wait for all replicas to commit.
ssl

SSL/TLS settings for broker connections, for example:

output.kafka:
  ssl:
    enabled: true
    certificate_authorities: ["/path/to/ca.pem"]
    certificate: "/path/to/client.crt"
    key: "/path/to/client.key"

Ensure Kafka’s keystore uses a cipher supported by the Healthbeat Kafka client (typically -keyalg RSA).

queue

Internal queue options for buffering events before they are sent to Kafka.

note

queue options can be set either at the top level in healthbeat.yml or under the output section – but not both.

Internal Queue

Configure the internal queue

Beat agents use an internal queue to store events before they are published to the configured output (Kafka, HTTPS, etc.). The queue:

  • Buffers events in memory and/or disk.
  • Groups events into batches.
  • Hands batches to the output, which sends them with bulk operations.

You can configure the queue either:

  • in the top-level queue section of *beat.yml, or
  • under the output section (for some outputs),

but not both at the same time. Only one queue type can be active.

Example (memory queue with 4096 events):

queue.mem:
  events: 4096

Configure the memory queue (queue.mem)

The memory queue keeps all pending events in RAM.

  • If the queue is full, Healthbeat cannot insert new events until the output acknowledges or drops some events.

  • The memory queue is controlled by:

    • events
    • flush.min_events
    • flush.timeout

Batching behavior

  • flush.min_events and flush.timeout together control how batches are filled.

  • If the output also has bulk_max_size, the actual batch size will be:

    • min(bulk_max_size, flush.min_events).
note

flush.min_events is considered legacy for controlling batch size. New configurations should prefer setting batch size via the output’s bulk_max_size parameter.
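A sketch of the preferred new-style configuration, sizing batches through the output's bulk_max_size rather than flush.min_events (values illustrative):

```yaml
queue.mem:
  events: 4096          # total queue capacity
  flush.timeout: 5s     # wait up to 5s to fill a batch

output.kafka:
  hosts: ["kafka1:9092"]
  topic: "healthbeat"
  bulk_max_size: 1024   # batch size is driven by the output
```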

Synchronous vs asynchronous mode

  • Synchronous mode
    • Events are returned to the output as soon as they are available.
    • Use when you want to minimize latency.
    • Config:
      • flush.timeout: 0
        (or for backwards compatibility: flush.min_events: 0 or 1, which also caps batch size at half the queue capacity).
  • Asynchronous mode
    • The queue waits up to flush.timeout to try to fill the requested batch.
    • If flush.timeout expires, a partial batch with whatever is available is returned.
    • Config: flush.timeout: <positive duration>, e.g. 5s.

Example – async mode with size + time triggers

queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 5s

This configuration forwards events when:

  • There are enough events to fill the output’s requested batch (usually driven by bulk_max_size, capped at 512 by flush.min_events), or
  • Events have been waiting 5 seconds.
queue.mem options

All options are under queue.mem:

events

Number of events the queue can hold in memory.

queue.mem:
  events: 4096
  • Default: 3200.
flush.min_events

Controls batch size and, in some cases, mode:

  • If > 1:

    • Maximum number of events per batch.
    • The output waits until:
      • that many events accumulate, or
      • flush.timeout elapses.
  • If 0 or 1:

    • Batch size is set to half the queue capacity.
    • Queue switches to synchronous mode (equivalent to flush.timeout: 0).
queue.mem:
  flush.min_events: 512
  • Default: 1600.
flush.timeout

Maximum time the queue will wait to fill a batch request from the output.

  • If 0s → synchronous mode: events returned immediately.
  • If > 0 → asynchronous mode: wait up to this duration, then return whatever is available.
queue.mem:
  flush.timeout: 5s
  • Default: 10s.

Configure the disk queue (queue.disk)

The disk queue stores pending events on disk instead of (only) in main memory.

Characteristics:

  • Can buffer far more events than the memory queue.
  • Survives Healthbeat or host restarts (until events are sent).
  • Slightly higher overhead because each event is written to and read from disk.

This is useful when:

  • You need high reliability (don’t want to lose events during outages).
  • Disk is not the main bottleneck.

To enable the disk queue with default behavior, set at least max_size:

queue.disk:
  max_size: 10GB

The queue:

  • Uses up to max_size on disk, but only as much as needed.
  • Deletes data from disk once the events are successfully sent.
queue.disk options

All options are under queue.disk:

path

Directory to store disk queue data files.

queue.disk:
  path: "/var/lib/healthbeat/diskqueue"
  • Default: "${path.data}/diskqueue".
    (Typically resolves under Healthbeat’s data directory.)
max_size (required)

Maximum total size of the queue on disk.

queue.disk:
  max_size: 10GB
  • Required.
  • If the queue hits this limit:
    • Inputs may pause or drop events, depending on their configuration.
  • 0 = no enforced maximum:
    • Queue can grow until the disk is full.
    • Use with great caution, ideally only on a dedicated data partition.
  • Default: 10GB.
segment_size

Size of each internal segment file.

queue.disk:
  segment_size: 1GB
  • Default: max_size / 10.
  • Smaller segments:
    • More files.
    • Faster deletion of old data.
  • Larger segments:
    • Fewer files.
    • Some data may remain on disk longer before deletion.

In most cases, you can leave this at the default.

read_ahead

How many events to read from disk into memory while waiting for the output to request them.

queue.disk:
  read_ahead: 512
  • Default: 512.
  • Increase if outputs are slowed down because they cannot read enough events at once (at the cost of more memory).
write_ahead

How many events the queue can accept and hold in memory while waiting for them to be written to disk.

queue.disk:
  write_ahead: 2048
  • Default: 2048.
  • Decrease if the queue’s memory usage is too high because disk writes are too slow.
  • Increase if inputs are blocked or dropping events because disk is the bottleneck (more memory usage, higher throughput).
retry_interval

Base interval between retries after disk-related errors (permissions, disk full, etc.).

queue.disk:
  retry_interval: 1s
  • Default: 1s.
max_retry_interval

Maximum backoff between retries after consecutive disk errors.

queue.disk:
  max_retry_interval: 30s
  • Default: 30s.

  • Increase if you want fewer logs / less load when the disk is unavailable for a long time.
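Putting the options together, a disk queue tuned with the defaults made explicit might look like this (all values illustrative):

```yaml
queue.disk:
  path: "${path.data}/diskqueue"  # data files directory
  max_size: 10GB                  # hard cap on disk usage
  segment_size: 1GB               # per-segment file size (default: max_size / 10)
  read_ahead: 512                 # events pre-read into memory
  write_ahead: 2048               # events held in memory pending disk write
  retry_interval: 1s              # base backoff after disk errors
  max_retry_interval: 30s         # cap on that backoff
```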

Where to configure the queue?

You can configure queue.mem or queue.disk:

  • at the top level in *beat.yml, or
  • under the output section (depending on output capabilities),

but not both places at the same time, and only one queue type (memory or disk) can be active.

Updating Java Agent’s Startup Options

UNIX systems

This is applicable for all the agents (vuHealthagent/vuAppagent/vuLogagent).

Changing startup options

AIX

On AIX, the agent is managed by init scripts under the agent home.
Both the startup script and supervisor.conf live in the same directory:

<AGENT_HOME>/etc/rc.d/init.d/
├── vulogagent # startup script
└── supervisor.conf # command-line args & resource caps

To change vuLogAgent command-line options

Edit:

<AGENT_HOME>/etc/rc.d/init.d/supervisor.conf
Solaris (SunOS)

On Solaris (SunOS), the layout is similar, but under etc/init.d:

<AGENT_HOME>/etc/init.d/
├── vulogagent # startup script
└── supervisor.conf # command-line args & resource caps

To modify vuLogAgent command-line arguments:

Edit:

 <AGENT_HOME>/etc/init.d/supervisor.conf
HP-UX

On HP-UX, the layout is similar, but under sbin/init.d:

<AGENT_HOME>/sbin/init.d
├── vulogagent # startup script
└── supervisor.conf # command-line args & resource caps

To modify vuLogAgent command-line arguments:

Edit:

 <AGENT_HOME>/sbin/init.d/supervisor.conf

Windows and Linux (vuAppagent)

This section is applicable only for vuAppagent on Linux and Windows.

Linux

On Linux, vuAppagent is managed by a systemd startup wrapper under the agent home:

<AGENT_HOME>/etc/systemd/vuappagent-start
└── vuappagent-start # startup script / wrapper with JVM + agent options

To modify vuAppagent command-line arguments:

Edit:

<AGENT_HOME>/etc/systemd/vuappagent-start

Windows

On Windows, vuAppagent startup options are controlled through an XML configuration file under the agent home:

<AGENT_HOME>\vuappagent\vuappagent.xml
└── vuappagent.xml # Windows service / wrapper configuration

To modify vuAppagent command-line arguments:

Edit:

<AGENT_HOME>\vuappagent\vuappagent.xml

Agent: vuLogagent

Command-line options (startup)

You can control vuLogAgent behaviour using the following command-line flags:

  • -quiet – Run in quiet mode; only errors are logged.
  • -debug – Run in debug mode; verbose diagnostic logging.
  • -trace – Run in trace mode; very detailed, low-level logging (useful for deep troubleshooting).
  • -tail – Start reading new files from the end, instead of from the beginning (similar to tail -f).
  • -version – Print the agent version and exit.
  • -config <path/to/config/file> – Path to the configuration file to load (for example: -config /opt/vu/vulogagent/conf.d/vulogagent.json).
  • -idletimeout <seconds> – Time between file reads (idle polling interval) in seconds. Example: -idletimeout 5000.
  • -spoolsize <count> – Event count threshold for the internal spool/buffer. When this many events are queued, the agent forces a network flush. Example: -spoolsize 1024.
  • -signaturelength <bytes> – Maximum length of the file signature used for file identity / sincedb tracking. Default: 4096.
  • -logfile <path/to/logfile> – Path and file name for the agent log file.
  • -logfilesize <size> – Maximum size of each log file, e.g. 10MB. Default: 10MB.
  • -logfilenumber <count> – Number of rotated log files to keep. Default: 5.
  • -workers <count> – Number of worker threads used for event processing and shipping. Example: -workers 16.
  • -diagnosticsfrequency <minutes> – How often to emit diagnostic reports (health / stats) in minutes. Example: -diagnosticsfrequency 5.
  • -sincedb <filename> – Name/path of the sincedb state file used to remember file offsets. Example: -sincedb ".logstash-forwarder-java".
  • -workerBufferSize <count> – Event buffer size per worker: maximum number of events buffered per worker before backpressure kicks in. Default: 10000.
  • -workerQueueSize <count> – Task queue size per worker: maximum number of tasks queued per worker. Default: 1000.
  • -heartbeatport <port> – Port used for the agent heartbeat / health check listener (if enabled).

Configuring vuLogAgent

After installation, vuLogAgent is configured via its JSON config file:

<VULOGAGENT_HOME>/conf.d/vulogagent.json

1. Configuration parameters

  1. Shipper / Target IP

    • The remote vuSmartMaps system where data must be sent.
    • Example: 10.10.10.5 or vu-smartmaps.example.com.
  2. PROTOCOL
    Protocol used to send events:

    • kafka
    • logstash (TCP / Beats-like)
  3. PORT

    • The TCP port on which vuSmartMaps is listening for this agent’s data.
    • Must match the configured listener in vuSmartMaps.
  4. TIMEOUT

    • Maximum time (in milliseconds) to wait for a network connection or response.
    • Example: 15000 (15 seconds).
  5. LOGPATH

    • The path(s) to logs that vuLogAgent should ship.

    • Can be:

      • A single file: /var/log/messages
      • A wildcard: /var/log/*.log
      • A wildcard directory: /var/log/httpd/httpd-*.log
  6. LOGTYPE

    • A logical type used to identify the log source (enriches events).
    • Typical values: "syslog", "apache", "application", etc.
    • This is usually mapped to fields.type in the config.
  7. MULTILINE PATTERN

    • Defines how multiline log messages (e.g. stack traces) are grouped.
    • The pattern is a regular expression that detects the “start” (or continuation) of a multiline message.
    • You can specify:

      • A single pattern, e.g. ^INFO
      • Or multiple alternatives, e.g. ^INFO|^ERROR

vuLogAgent then groups lines between these “pattern hits” into one logical message.

  8. NEGATE & WHAT (multiline strategy)

    These two options define how lines are combined:

    • negate: true or false
    • what: next or previous

    Combined meaning (high-level):

    • negate: false, what: next – Lines that match the pattern are appended to the previous non-matching line.
    • negate: false, what: before – Lines that match the pattern are prepended to the next non-matching line.
    • negate: true, what: next – Lines that do NOT match the pattern are appended to the previous match.
    • negate: true, what: before – Lines that do NOT match the pattern are prepended to the next match.
  9. DEAD TIME

    • dead time tells vuLogAgent to ignore log files that have not been modified within the specified duration.
    • Time suffixes:
      • m – minutes (e.g. 30m)
      • h – hours (e.g. 1h)
    • Any file older than this period (based on modification time) is not harvested.
  10. CLOSE TIMEOUT

  • close timeout is the maximum duration any file is allowed to remain open in vuLogAgent’s internal watch map.
  • After this time, vuLogAgent closes the file handle (even if the file remains present).
  • Same time suffixes as dead time, e.g. 30m, 1h.
  11. FILTER PATTERN & NEGATE (line filtering)

vuLogAgent can include or exclude lines using a regex filter:

  • pattern – regular expression to match lines.

  • negate:

    • false → keep lines that match the pattern.
    • true → drop lines that match the pattern.

Examples:

  • pattern: ^b, negate: false
    → export only lines starting with b.
  • pattern: ^b, negate: true
    → export all lines except those starting with b.
  12. TOPIC (Kafka only)
  • Kafka topic to which events are published when protocol is kafka.
  • Must match a topic configured on the Kafka cluster used by vuSmartMaps.

2. Understanding vulogagent.json

This section explains the main blocks in the vuLogAgent configuration file, typically named:

<VULOGAGENT_HOME>/conf.d/vulogagent.json
2.1 Common blocks

network

Defines how vuLogAgent connects to the remote receiver:

"network": {
  "type": "kafka",
  "topic": "%{[topic_tag]}",
  "sslEnabled": true,
  "trustStore": "/path/to/client-truststore.jks",
  "trustStorePassword": "<TRUSTSTORE_PASSWORD>",
  "hostVerification": false,
  "servers": [
    "kafka-broker-1.example.com:9094",
    "kafka-broker-2.example.com:9094",
    "kafka-broker-3.example.com:9094"
  ],
  "timeout": "30",
  "ackType": 1,
  "retryConfig": 3
}

The fields are as follows:

type

Protocol/target type. For Kafka use:

"type": "kafka"

topic

Kafka topic name or format string (e.g. "%{[topic_tag]}" to derive the topic from the topic_tag field).

sslEnabled

true to enable SSL/TLS for Kafka connections; false for plain-text.

trustStore

Path to the Java truststore (JKS/PKCS12) containing the Kafka broker CA:

"trustStore": "/path/to/client-truststore.jks"

trustStorePassword

Password protecting the truststore:

"trustStorePassword": "<TRUSTSTORE_PASSWORD>"

In production, avoid hardcoding; use secure parameter / secret management.

hostVerification

Set to true to enforce hostname verification against the certificate’s CN/SAN, false to skip it.

servers

List of Kafka broker endpoints:

"servers": [
"broker1:9094",
"broker2:9094"
]

timeout

Network timeout in seconds (as string, e.g. "30").

ackType

Kafka acknowledgment level:

  • 0 – no ACK (fast, but unsafe)
  • 1 – leader-only ACK (default)
  • -1 – all replicas must ACK (safest).

retryConfig

Number of retries on send failure (e.g. 3).
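The effect of a retry count like this can be sketched in Python. `send_with_retries` and `send_fn` are hypothetical stand-ins for the real Kafka producer call, used only to illustrate the policy:

```python
# Sketch of what a retryConfig-style policy means: attempt the send once,
# then retry up to `retries` more times before giving up. send_fn is a
# hypothetical stand-in for the real Kafka producer call.
def send_with_retries(send_fn, payload, retries=3):
    last_err = None
    for attempt in range(retries + 1):
        try:
            return send_fn(payload)
        except Exception as err:
            last_err = err
    raise last_err

# Usage: a sender that fails twice, then succeeds on the third attempt.
calls = []
def flaky(payload):
    calls.append(payload)
    if len(calls) < 3:
        raise RuntimeError("broker unavailable")
    return "acked"

print(send_with_retries(flaky, "event"))  # acked
```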

files

Defines which logs to read and how to treat them:

"files": [
{
"paths": [
"/var/log/messages",
"/var/log/*.log"
],
"fields": {
"type": "syslog"
},
"dead time": "12h",
"close timeout": "30m"
}
]
  • paths
    • List of paths (files, globs, or wildcard directories).
  • fields
    • Extra metadata attached to each event from these paths (e.g. "type": "syslog" or "type": "apache").
  • dead time
    • Ignore files not modified within this timespan.
  • close timeout
    • Maximum duration a file is allowed to stay open in vuLogAgent’s internal watch map.
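The duration strings used by dead time and close timeout can be parsed with a small sketch like the one below. `parse_duration` and `is_dead` are hypothetical helpers (not vuLogAgent APIs) that show how a dead-time check might work:

```python
import os
import time

UNITS = {"s": 1, "m": 60, "h": 3600}

# Sketch of parsing simple duration strings like "12h" or "30m"; supports
# only a single integer plus a one-letter suffix. parse_duration and
# is_dead are hypothetical helpers, not vuLogAgent APIs.
def parse_duration(value: str) -> int:
    return int(value[:-1]) * UNITS[value[-1]]

def is_dead(path: str, dead_time: str) -> bool:
    # A file is "dead" when it hasn't been modified within dead_time.
    return time.time() - os.path.getmtime(path) > parse_duration(dead_time)

print(parse_duration("12h"))  # 43200
print(parse_duration("30m"))  # 1800
```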

filter

Optional block to include/exclude lines:

"filter": {
"pattern": "INFO|ERROR",
"negate": "false"
}
  • pattern
    • Regular expression used to match lines.
  • negate
    • "false" → include only matching lines.
    • "true" → drop matching lines.

3. Sample configurations

3.1 Single-line log monitoring – basic

Example 1 – multiple paths and log types


"network": {
"type": "kafka",
"topic": "%{[topic_tag]}",
"sslEnabled": true,
"trustStore": "/path/to/client-truststore.jks",
"trustStorePassword": "<TRUSTSTORE_PASSWORD>",
"hostVerification": false,
"servers": [
"kafka-broker-1.example.com:9094",
"kafka-broker-2.example.com:9094",
"kafka-broker-3.example.com:9094"
],
"timeout": "30",
"ackType": 1,
"retryConfig": 3
},
"files": [
{
"paths": [
"/var/log/messages",
"/var/log/*.log"
],
"fields": {
"type": "syslog"
}
},
{
"paths": [
"/var/log/apache/httpd-*.log"
],
"fields": {
"type": "apache"
},
"dead time": "12h"
}
]
}

What this does

  • Sends all events to the configured Kafka brokers over SSL/TLS, publishing to the topic resolved from topic_tag.
  • Ships:
    • /var/log/messages and /var/log/*.log as "type": "syslog".
    • /var/log/apache/httpd-*.log as "type": "apache".
  • Ignores Apache logs that haven’t been modified in the last 12 hours.
3.2 Single-line log monitoring – with filtering

Example 2 – filter by INFO/ERROR

"network": {
"type": "kafka",
"topic": "%{[topic_tag]}",
"sslEnabled": true,
"trustStore": "/path/to/client-truststore.jks",
"trustStorePassword": "<TRUSTSTORE_PASSWORD>",
"hostVerification": false,
"servers": [
"kafka-broker-1.example.com:9094",
"kafka-broker-2.example.com:9094",
"kafka-broker-3.example.com:9094"
],
"timeout": "30",
"ackType": 1,
"retryConfig": 3
},
"files": [
{
"paths": [
"/var/log/sample.log"
],
"fields": {
"type": "syslog"
},
"dead time": "2h",
"filter": {
"pattern": "INFO|ERROR",
"negate": "false"
}
}
]
}

What this does

  • Ships only lines in /var/log/sample.log that contain INFO or ERROR.
  • Ignores files that haven't been modified in the last 2 hours (dead time).

4. Multiline configuration

Use the multiline block to stitch together related lines (e.g. stack traces) into a single event.

4.1 Multiline block
"multiline": {
"pattern": "A\\[",
"negate": "true",
"what": "previous"
}

Fields:

  • pattern
    • Regex pattern to match. In the example: A\[
  • negate
    • true → treat non-matching lines as continuation lines.
  • what
    • previous → attach continuation lines to the previous matching line.
    • next → attach continuation lines to the next matching line.
4.2 Full multiline example
"network": {
"type": "kafka",
"topic": "%{[topic_tag]}",
"sslEnabled": true,
"trustStore": "/path/to/client-truststore.jks",
"trustStorePassword": "<TRUSTSTORE_PASSWORD>",
"hostVerification": false,
"servers": [
"kafka-broker-1.example.com:9094",
"kafka-broker-2.example.com:9094",
"kafka-broker-3.example.com:9094"
],
"timeout": "30",
"ackType": 1,
"retryConfig": 3
},
"files": [
{
"paths": [
"/var/log/messages",
"/var/log/*.log"
],
"multiline": {
"pattern": "A\\[",
"negate": "true",
"what": "previous"
},
"fields": {
"type": "syslog"
}
}
]
}

What this does

  • Watches /var/log/messages and /var/log/*.log.
  • Groups related lines into single events based on the A\[ pattern:
    • A line matching A\[ starts a new event.
    • Non-matching lines (with negate: "true" and what: "previous") are appended to the previous event until the next matching line appears.
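The grouping behaviour with negate "true" and what "previous" can be sketched in Python. `stitch` is a hypothetical helper used only to illustrate the semantics, not vuLogAgent's actual implementation:

```python
import re

# Sketch of the multiline behaviour with negate "true" and what "previous":
# a line matching `pattern` starts a new event; every non-matching line is
# appended to the event that precedes it. stitch() is a hypothetical helper.
def stitch(lines, pattern):
    events = []
    for line in lines:
        if re.search(pattern, line) or not events:
            events.append(line)        # matching line starts a new event
        else:
            events[-1] += "\n" + line  # continuation joins the previous event
    return events

raw = ["A[1] request failed", "  at frame 1", "  at frame 2", "A[2] next entry"]
print(stitch(raw, r"A\["))
# ['A[1] request failed\n  at frame 1\n  at frame 2', 'A[2] next entry']
```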