Fluentd is open source software that lets you collect events from many sources, transform them, and ship them to various destinations in a configurable manner. Once installed on a server, it runs in the background to collect, parse, transform and ship various types of data. td-agent is a stable distribution package of Fluentd, QAed by Treasure Data, and using it is recommended. If you are setting up your environment from scratch, we recommend td-agent v4, which bundles Fluentd v1.
- Learn how to install Fluentd in your environment.
- To see a comparison between the versions, go here.
Capture syslog
Events that conform to the RFC3164/RFC5424 standards can be captured using the syslog input plugin.
BSD vs. IETF
RFC3164 (BSD)
<priority>timestamp hostname application: message
In Fluentd, the application field is referred to as ident.
RFC5424 (IETF)
<priority>VERSION ISOTIMESTAMP HOSTNAME APPLICATION PID MESSAGEID STRUCTURED-DATA MSG
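For illustration, sample messages in each format (all values are made up) might look like:
RFC3164: <34>Oct 11 22:14:15 myhost sshd: Failed password for user root
RFC5424: <34>1 2021-10-11T22:14:15.003Z myhost sshd 1234 ID47 - Failed password for user root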
Transport layer
When you configure a syslog source, you choose a transport protocol, either TCP or UDP. TCP is the recommended protocol, as it guarantees in-order delivery without dropped log messages.
Consider UDP only in extreme cases, e.g., when an extremely high volume of log messages causes network and CPU utilization issues that must be worked around.
support_colonless_ident
If your message does not contain the ident field, tune the syslog parser by setting the support_colonless_ident flag to false. This prevents the beginning of the message from being hijacked into the ident field.
Example
<source>
@type syslog
port 5140
bind 0.0.0.0
<transport tcp>
</transport>
<parse>
@type syslog
parser_type string
support_colonless_ident false
</parse>
tag MY-DATA-TYPE
</source>
Forward To S3
Output Format
Set the format carefully in order to achieve the desired seamless effect: a correct configuration will make the S3 object's lines look exactly as originally generated by the product.
If the syslog plugin parser is in use for the given data type, use the single_value formatter. The syslog parser stores the entire message in the message field, while the single_value formatter outputs only the value of that field by default, as desired.
If the JSON plugin parser is in use, use the JSON formatter to reconstruct the original structure.
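For illustration (values made up), a record parsed by the syslog parser might look like:
{"host":"myhost","ident":"sshd","message":"Failed password for user root"}
and the single_value formatter would then write only:
Failed password for user root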
Authentication
The S3 output plugin provides several credential methods for authentication and authorization, including an IAM user (i.e., access key and secret), an IAM role, and an instance profile. All methods and their corresponding parameters are documented here.
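For instance, a minimal sketch of an output authenticating via an assumed IAM role rather than static keys (the role ARN below is a placeholder, and the buffer/format sections are omitted for brevity):
<match MY-DATA-TYPE.**>
@type s3
s3_bucket MY-BUCKET-NAME
s3_region us-west-2
<assume_role_credentials>
role_arn arn:aws:iam::123456789012:role/MY-FLUENTD-ROLE
role_session_name fluentd-s3-upload
</assume_role_credentials>
</match>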
Buffering
There are many considerations to weigh when fine-tuning buffering for your environment. The full parameter list is documented here.
Compression
Store compressed files to reduce network traffic and save storage costs. Hunters detects and handles gzip files without requiring any special action.
The built-in gzip compressor has one main drawback: due to Ruby's GIL, it blocks other jobs while compression takes place. The S3/Treasure Data plugin can instead run compression outside of the Fluentd process using the external gzip command, which frees the Ruby interpreter to handle other tasks. To enable this, set store_as to gzip_command.
Example
<match MY-DATA-TYPE.**>
@type s3
store_as gzip_command
path DATA-TYPE/%Y/%m/%d/
aws_key_id xxxxx
aws_sec_key xxxxx
s3_bucket MY-BUCKET-NAME
s3_region us-west-2
slow_flush_log_threshold 40s
s3_object_key_format %{path}%{time_slice}_%{chunk_id}.%{file_extension}
<buffer time,host>
@type file
path /data/fluent/buffer/MY-DATA-TYPE
timekey 5m
timekey_wait 0s
chunk_limit_size 64MB
total_limit_size 1024MB
retry_timeout 1h
retry_type exponential_backoff
retry_max_interval 30
flush_mode interval
flush_interval 60s
flush_thread_count 2
overflow_action drop_oldest_chunk
</buffer>
<format>
@type single_value
</format>
</match>
Local Fluentd Server Installation
Overview
This article is designed to help you deploy a local Fluentd server.
The instructions provided are a recommendation and deemed a best practice. Hunters will not be able to provide support for any issues with your Fluentd server.
It is recommended to consult with a Fluentd consulting expert (see here for a list of recommended Enterprise consulting firms supporting Fluentd).
There are many ways that you might choose to build and deploy this server. Some possible configurations include:
- Single Fluentd server listening on a variety of network ports.
- Single / Multiple Fluentd servers forwarding to other Fluentd servers.
- Fluentd reading from local files and tailing them over time.
- Fluentd integrated with syslog-ng or other syslog server.
- Any combination of the above.
There are a number of ways that a capture solution built on Fluentd can be created; this guide covers the first option above, a single Fluentd server listening on a variety of ports. This article also assumes a few more points:
- An Ubuntu 20.04 server will be used to host this box. Older versions of Ubuntu will work with very small changes to these steps. Other distributions of Linux could also be used, however the necessary modifications are not covered in this guide.
- Fluentd is CPU bound before anything else, which means that the server's performance will be dictated by the number and type of CPUs only.
Notes
- RAM is only marginally used, as this guide always buffers to disk and NOT RAM.
- The IOPS to the disk do not seem to influence performance.
- All the data written to disk is gzipped by the configurations in this guide, lowering IO requirements and shifting more load onto the CPU.
VM Configuration Examples
- When building Fluentd servers in AWS, it is generally advised to use an m5.xlarge as a base machine.
- This machine has 4 vCPUs and 16 GB of RAM, and is capable of running comfortably at 25K EPS.
- An m5.2xlarge (8 CPU / 32GB) is capable of running even beyond 50K EPS.
- Keep some of these numbers in mind as you size your server.
Installation Instructions
- Starting from a base install of Ubuntu Server 20.04 LTS, connect to your new server with a privileged account. You will need to run commands as root, so it’s mandatory that you have that level of access.
- Upgrade the software on the server to make sure that the box is completely updated.
sudo apt update
sudo apt upgrade
- To install other Fluentd plugins (Prometheus, Azure Blob, GCP, etc.), install the prerequisites at this time.
sudo apt install build-essential ruby ruby-dev
- Install additional debugging tools.
sudo apt install net-tools
- Make changes to the underlying configuration of your Ubuntu server to increase a number of system limits. Edit the file /etc/security/limits.conf and add the following to the bottom of your limits.conf:
root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536
- Next, edit the file at /etc/sysctl.conf and make the following changes:
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
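If you prefer not to wait for the reboot in the next step, the sysctl changes can also be loaded immediately:
sudo sysctl -p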
- Save both files after making these changes, and then reboot your server. Once it comes back up and you can log in, you may continue.
sudo reboot
- The next step is to install the td-agent software. td-agent is a version of the Fluentd software that is built and maintained by a company called Treasure Data, and it is the version of Fluentd that we will be using in this walkthrough. At the time of this writing, td-agent is at version 4.1.1, so that is the version that we will install. td-agent comes as a shell script that will add the proper software repository to your system’s apt configuration and will then install the package. This is preferable, as keeping td-agent up to date will now happen every time you update the server. To install td-agent on Ubuntu 20.04 (Focal):
# td-agent 4
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh
If you are using an older version of Ubuntu, here are the installers. For Ubuntu 18.04 (Bionic Beaver):
# td-agent 4
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent4.sh | sh
And for Ubuntu 16.04 (Xenial) the installer is:
# td-agent 4
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent4.sh | sh
- Install any other needed Fluentd plugins. Below is how you would install the Prometheus plugin (to expose pipeline metrics) and the Azure plugin, if you need to write to Azure Blob Storage. The S3 plugin comes with td-agent so we don’t need to install that. A complete list of plugins is available here. Use td-agent-gem rather than the system gem command, so the plugins are installed into td-agent’s embedded Ruby.
sudo td-agent-gem install fluent-plugin-prometheus
sudo td-agent-gem install fluent-plugin-azurestorage
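To confirm the plugins were installed into td-agent’s embedded Ruby, you can list them (an optional sanity check):
sudo td-agent-gem list | grep fluent-plugin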
- If you are going to utilize the configurations below, you need to create and chown a directory. This directory is used for chunk storage, that is, temporary storage of logs until it’s time to upload those files (based on the timekey setting).
sudo mkdir /var/spool/td-agent/
sudo chown td-agent:td-agent /var/spool/td-agent/
Your installation is now complete.
Fluentd Configuration
Before we dig into the configuration, keep the following paths in mind.
- /etc/td-agent/td-agent.conf is the default configuration file.
- /var/log/td-agent/ is the path where all log files are stored; the main file is td-agent.log.
- /var/spool/td-agent/ is the directory you created for chunk storage (chowned to td-agent:td-agent).
You need to create a proper configuration in the td-agent.conf file so td-agent knows how to run, on what ports to listen, and where to upload received data. This guide suggests building out your td-agent file using a building-block approach: think of each data source as a pipeline of source → filter(s) → output. You only need to build a few source stanzas (one for TCP, one for UDP, and one for any unique sources), a single filter if you need one at all, and likely a single output (with some modifications for each pipeline).
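As a minimal sketch of that pipeline shape (the tag, port, and paths below are placeholders; each @type shown is a built-in plugin):
# source: where events enter the pipeline and receive a tag
<source>
@type syslog
port 5140
bind 0.0.0.0
tag my_pipeline
</source>
# filter (optional): processing applied to events whose tag matches
<filter my_pipeline.**>
@type stdout
</filter>
# match: the output that tagged events leave through
<match my_pipeline.**>
@type file
path /var/spool/td-agent/my_pipeline
</match>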
Source Configuration
An example source stanza might look like this. To learn more about the Fluentd syslog input, it is advised to read Fluentd’s documentation.
<source>
@type syslog
port 5000
bind 0.0.0.0
@log_level trace
<parse>
@type syslog
support_colonless_ident false
message_format rfc5424
</parse>
<transport tcp>
</transport>
tag default_syslog
</source>
In the above configuration there are a few things to note.
Source
- The “port 5000” and “bind 0.0.0.0” settings dictate on what port Fluentd is listening and on what interface(s).
- 0.0.0.0 means all interfaces in this example.
- Remember: If you need to listen on ports below 1024 you will need to change your Fluentd configuration to run Fluentd as root.
Modify Fluentd Systemd Service File
If you’re using systemd to manage Fluentd, you can modify the service file to run as the root user. Locate the Fluentd service file, which is typically located at /lib/systemd/system/td-agent.service
or /etc/systemd/system/fluentd.service.
sudo nano /lib/systemd/system/td-agent.service
Look for the User and Group directives and comment them out or change them to root:
[Service]
# User=td-agent
# Group=td-agent
User=root
Group=root
Restart the Service
After making the changes, you need to reload the systemd configuration and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart td-agent
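After restarting, it’s worth confirming the service came up cleanly and tailing the log for bind or parse errors:
sudo systemctl status td-agent
sudo tail -f /var/log/td-agent/td-agent.log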
Parse
The parse section of the example above tells Fluentd how to parse each line that is received. This is the section that tends to require the most customization in this guide’s configuration, due to differences in the message format itself: even for tools that claim to support a specific syslog standard (RFC3164 or RFC5424), the actual bytes on the wire may deviate from the specification. When working on a parsing statement, always refer to the documentation.
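When the bytes on the wire deviate from the standard, one fallback is Fluentd’s regexp parser; the expression below is purely illustrative and would need to be adapted to your product’s actual format:
<parse>
@type regexp
expression /^(?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ :]+): (?<message>.*)$/
time_format %Y-%m-%dT%H:%M:%S%z
</parse>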
Transport
The transport section is how you specify whether TCP or UDP connectivity will be used. If you would like to use both, create two identical source stanzas, one configured for TCP and one for UDP, as sketched below.
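For example, a sketch of such a pair (assuming your Fluentd version accepts both tcp and udp in the transport directive; TCP and UDP can share the same port number):
<source>
@type syslog
port 5000
bind 0.0.0.0
<transport tcp>
</transport>
tag default_syslog
</source>
<source>
@type syslog
port 5000
bind 0.0.0.0
<transport udp>
</transport>
tag default_syslog
</source>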
Tag
The last part, the tag, is one of the most important settings, so this guide covers it in more depth. Tags are used to determine routing within Fluentd for all events. Each event is tagged at ingestion time with the tag listed on its source, in this case default_syslog; however, for syslog that is not the only tag component that will be applied.
Here is an example that shows an event that was captured by the source statement above:
2021-06-25T22:45:34+00:00 default_syslog.auth.info {"host":"localhost","ident":"prg00000","pid":"1234","msgid":"-","extradata":"-","message":"seq: 0000000008, thread: 0000, runid: 1624661134, stamp: 2021-06-25T22:45:34 PADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPADDPAD"}
By looking at the tag, it’s visible that it is default_syslog.auth.info, not default_syslog as you might expect. What Fluentd actually does is build the tag as tag_value.facility.severity, and that requires a slight modification to the match patterns you write: instead of writing your tag matches as default_syslog.*, write them as default_syslog.**
Filter Configuration
We will use filter stanzas to enable Prometheus metrics on the ingest pipelines that we create. This allows you to monitor and understand the performance of your Fluentd servers. If you have no need for metrics, you can skip this step.
An example filter stanza might look like this:
<filter default_syslog.**>
@type prometheus
<metric>
name default_syslog_input_num_records_total
type counter
desc The total number of records sent to the default_syslog collector.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
<metric>
name fluentd_input_status_num_records_total
type counter
desc The total number of records sent to all inputs.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
</filter>
In this example, we are creating two counters: one is default_syslog_input_num_records_total and the second is fluentd_input_status_num_records_total. For each line that is received, we increment both counters, giving a per-pipeline event count and a total across all pipelines. A very good initial dashboard to view the metrics generated here is this dashboard for Grafana.
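Note that these filters only count records; for Prometheus to actually scrape the counters, the metrics HTTP endpoint must also be exposed via the plugin’s input, for example (24231 is the plugin’s default port):
<source>
@type prometheus
bind 0.0.0.0
port 24231
metrics_path /metrics
</source>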
Match Configuration
A simple local output configuration might look like this:
<match default_syslog.**>
@type file
path /var/spool/td-agent/default_syslog
compress gzip
<buffer time>
@type file
timekey 5m
timekey_use_utc true
timekey_wait 60s
flush_at_shutdown true
</buffer>
</match>
This configuration will write all data sent to the default_syslog input to gzipped files in /var/spool/td-agent/default_syslog/. Those files are written every five minutes (the timekey value).
A more complex configuration with S3 support might look like this:
<match default_syslog.**>
@type s3
aws_key_id YOUR_AWS_KEY_ID_HERE
aws_sec_key YOUR_AWS_SECRET_KEY_HERE
s3_bucket YOUR_S3_BUCKET_NAME_HERE
s3_region YOUR_S3_REGION_HERE
path syslog/
<buffer time>
@type file
path /opt/s3
timekey 5m
timekey_use_utc true
timekey_wait 60s
chunk_limit_size 128m
flush_at_shutdown true
</buffer>
</match>
And a more complex configuration with S3 output and Prometheus metrics might look like this:
<match default_syslog.**>
@type copy
<store>
@type s3
aws_key_id YOUR_AWS_KEY_ID_HERE
aws_sec_key YOUR_AWS_SECRET_KEY_HERE
s3_bucket YOUR_S3_BUCKET_NAME_HERE
s3_region YOUR_S3_REGION_HERE
path syslog/
<buffer time>
@type file
path /opt/s3
timekey 5m
timekey_use_utc true
timekey_wait 60s
chunk_limit_size 128m
flush_at_shutdown true
</buffer>
</store>
<store>
@type prometheus
<metric>
name default_syslog_output_num_records_total
type counter
desc The total number of records sent to the default_syslog output destination.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
<metric>
name fluentd_output_status_num_records_total
type counter
desc The total number of records sent to all outputs.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
</store>
</match>
An example full configuration is as follows:
# - Default Syslog Source Configurations - #
<source>
@type syslog
port 5000
bind 0.0.0.0
@log_level trace
<parse>
@type syslog
support_colonless_ident false
message_format rfc5424
rfc5424_time_format %Y-%m-%dT%H:%M:%S
with_priority true
</parse>
<transport tcp>
</transport>
tag default_syslog
</source>
# - This tracks the number of events into the default_syslog pipeline - #
<filter default_syslog.**>
@type prometheus
<metric>
name default_syslog_input_num_records_total
type counter
desc The total number of records sent to the default_syslog collector.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
<metric>
name fluentd_input_status_num_records_total
type counter
desc The total number of records sent to all inputs.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
</filter>
# in order to track both metrics and send data out an output, it's necessary to use a copy.
<match default_syslog.**>
@type copy
<store>
@type file
path /var/spool/td-agent/default_syslog
compress gzip
<buffer time>
@type file
timekey 5m
timekey_use_utc true
timekey_wait 60s
flush_at_shutdown true
</buffer>
</store>
<store>
@type prometheus
<metric>
name default_syslog_output_num_records_total
type counter
desc The total number of records sent to the default_syslog output destination.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
<metric>
name fluentd_output_status_num_records_total
type counter
desc The total number of records sent to all outputs.
<labels>
tag ${tag}
hostname ${hostname}
</labels>
</metric>
</store>
</match>
# End Of File