Skip to content

multiline_grok crashes when multiline_flush_interval is used. #107

@gadiego92

Description

@gadiego92

Describe the bug

I think there is a bug when using multiline_grok .

If Fluentd receives a multi line log entry with special characters and it flushes before it gets a new log entry Fluentd crashes.
This error only appears when multiline_flush_interval is used.
Fluentd is in a crash loop state after the error and it's not able to recover. A restart is needed.

To Reproduce
Just run the following command with the setup provided in the Configuration section:

/opt/td-agent/bin/fluentd --config /home/td-agent/fluentd.conf

Your Environment

  • Fluentd version: fluentd 1.16.2
  • TD Agent version: td-agent 4.5.1 fluentd 1.16.2 (d5685ada81ac89a35a79965f1e94bbe5952a5d3a
  • Operating system: AlmaLinux 9.2 (Turquoise Kodkod)
  • Kernel version: 5.14.0-284.30.1.el9_2.x86_64

Configuration

# /home/td-agent/fluentd.conf

<source>

  @type tail
  tag encoding

  path /tmp/fluentd-encoding/utf8.log
  read_from_head true
  multiline_flush_interval 5s

  <parse>

    @type multiline_grok

    custom_pattern_path /home/td-agent/patterns.conf
    grok_failure_key grokfailure
    multiline_start_regexp /\d{4}-\d{1,2}-\d{1,2}/

    <grok>
      pattern %{CUSTOM_LOG_FAILS}
    </grok>

    <grok>
      pattern %{CUSTOM_LOG_WORKS}
    </grok>

  </parse>

</source>

<match encoding>

    @type stdout

</match>
# /tmp/fluentd-encoding/utf8.log

2023-11-22 18:18:09.823+0100 Testing Berlin
2023-11-22 18:18:09.823+0100 Testing Nürnberg
2023-11-22 18:18:09.823+0100 Testing Hannover
2023-11-22 18:18:09.823+0100 Testing München
javax.management.RuntimeErrorException: null
      at Main.main(Main.java:16) ~[bin/:na]
# /home/td-agent/patterns.conf

CUSTOM_LOG_WORKS %{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:message}
CUSTOM_LOG_FAILS %{HTTPDATE:timestamp} %{NUMBER:response}

Error Log

[td-agent@5200b897ecea ~]$ /opt/td-agent/bin/fluentd --config /home/td-agent/fluentd.conf
2023-12-12 17:40:48 +0100 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-12-12 17:40:48 +0100 [info]: parsing config file is succeeded path="/home/td-agent/fluentd.conf"
2023-12-12 17:40:48 +0100 [info]: gem 'fluentd' version '1.16.2'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-calyptia-monitoring' version '0.1.3'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-elasticsearch' version '5.3.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-grok-parser' version '2.6.2'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-kafka' version '0.19.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-metrics-cmetrics' version '0.1.2'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-opensearch' version '1.1.4'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-opensearch' version '1.1.3'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-prometheus' version '2.1.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.1'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-record-modifier' version '2.1.1'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-s3' version '1.7.2'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-td' version '1.2.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2023-12-12 17:40:48 +0100 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2023-12-12 17:40:48 +0100 [info]: Expanded the pattern %{CUSTOM_LOG_FAILS} into (?:(?<timestamp>(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))/(?:\b(?:[Jj]an(?:uary|uar)?|[Ff]eb(?:ruary|ruar)?|[Mm](?:a|ä)?r(?:ch|z)?|[Aa]pr(?:il)?|[Mm]a(?:y|i)?|[Jj]un(?:e|i)?|[Jj]ul(?:y|i)?|[Aa]ug(?:ust)?|[Ss]ep(?:tember)?|[Oo](?:c|k)?t(?:ober)?|[Nn]ov(?:ember)?|[Dd]e(?:c|z)(?:ember)?)\b)/(?:(?>\d\d){1,2}):(?:(?!<[0-9])(?:(?:2[0123]|[01]?[0-9])):(?:(?:[0-5][0-9]))(?::(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))(?![0-9])) (?:(?:[+-]?(?:[0-9]+)))) (?<response>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))))))
2023-12-12 17:40:48 +0100 [info]: Expanded the pattern %{CUSTOM_LOG_WORKS} into (?:(?<timestamp>(?:(?>\d\d){1,2})-(?:(?:0?[1-9]|1[0-2]))-(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))[T ](?:(?:2[0123]|[01]?[0-9])):?(?:(?:[0-5][0-9]))(?::?(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))?(?:(?:Z|[+-](?:(?:2[0123]|[01]?[0-9]))(?::?(?:(?:[0-5][0-9])))))?) (?<message>.*))
2023-12-12 17:40:48 +0100 [warn]: 'pos_file PATH' parameter is not set to a 'tail' source.
2023-12-12 17:40:48 +0100 [warn]: this parameter is highly recommended to save the position to resume tailing.
2023-12-12 17:40:48 +0100 [info]: using configuration file: <ROOT>
  <source>
    @type tail
    path "/tmp/fluentd-encoding/utf8.log"
    read_from_head true
    tag "encoding"
    multiline_flush_interval 5s
    <parse>
      @type "multiline_grok"
      multiline_start_regexp "/\\d{4}-\\d{1,2}-\\d{1,2}/"
      grok_failure_key "grokfailure"
      custom_pattern_path "/home/td-agent/patterns.conf"
      unmatched_lines
      <grok>
        pattern "%{CUSTOM_LOG_FAILS}"
      </grok>
      <grok>
        pattern "%{CUSTOM_LOG_WORKS}"
      </grok>
    </parse>
  </source>
  <match encoding>
    @type stdout
  </match>
</ROOT>
2023-12-12 17:40:48 +0100 [info]: starting fluentd-1.16.2 pid=857 ruby="3.1.4"
2023-12-12 17:40:48 +0100 [info]: spawn command to main:  cmdline=["/opt/td-agent/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/td-agent/bin/fluentd", "--config", "/home/td-agent/fluentd.conf", "--under-supervisor"]
2023-12-12 17:40:48 +0100 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2023-12-12 17:40:48 +0100 [info]: adding match pattern="encoding" type="stdout"
2023-12-12 17:40:48 +0100 [info]: adding source type="tail"
2023-12-12 17:40:48 +0100 [info]: #0 Expanded the pattern %{CUSTOM_LOG_FAILS} into (?:(?<timestamp>(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))/(?:\b(?:[Jj]an(?:uary|uar)?|[Ff]eb(?:ruary|ruar)?|[Mm](?:a|ä)?r(?:ch|z)?|[Aa]pr(?:il)?|[Mm]a(?:y|i)?|[Jj]un(?:e|i)?|[Jj]ul(?:y|i)?|[Aa]ug(?:ust)?|[Ss]ep(?:tember)?|[Oo](?:c|k)?t(?:ober)?|[Nn]ov(?:ember)?|[Dd]e(?:c|z)(?:ember)?)\b)/(?:(?>\d\d){1,2}):(?:(?!<[0-9])(?:(?:2[0123]|[01]?[0-9])):(?:(?:[0-5][0-9]))(?::(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))(?![0-9])) (?:(?:[+-]?(?:[0-9]+)))) (?<response>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))))))
2023-12-12 17:40:48 +0100 [info]: #0 Expanded the pattern %{CUSTOM_LOG_WORKS} into (?:(?<timestamp>(?:(?>\d\d){1,2})-(?:(?:0?[1-9]|1[0-2]))-(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))[T ](?:(?:2[0123]|[01]?[0-9])):?(?:(?:[0-5][0-9]))(?::?(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))?(?:(?:Z|[+-](?:(?:2[0123]|[01]?[0-9]))(?::?(?:(?:[0-5][0-9])))))?) (?<message>.*))
2023-12-12 17:40:48 +0100 [warn]: #0 'pos_file PATH' parameter is not set to a 'tail' source.
2023-12-12 17:40:48 +0100 [warn]: #0 this parameter is highly recommended to save the position to resume tailing.
2023-12-12 17:40:48 +0100 [info]: #0 starting fluentd worker pid=862 ppid=857 worker=0
2023-12-12 17:40:48 +0100 [info]: #0 following tail of /tmp/fluentd-encoding/utf8.log
2023-12-12 17:40:48 +0100 [warn]: #0 invalid line found file="/tmp/fluentd-encoding/utf8.log" line="2023-11-22 18:18:09.823+0100 Testing N\xC3\xBCrnberg" error="incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)"
2023-12-12 17:40:48.714389300 +0100 encoding: {"timestamp":"2023-11-22 18:18:09.823+0100","message":"Testing Berlin"}
2023-12-12 17:40:48.714509014 +0100 encoding: {"timestamp":"2023-11-22 18:18:09.823+0100","message":"Testing Hannover"}
2023-12-12 17:40:48 +0100 [info]: #0 fluentd worker is now running worker=0
2023-12-12 17:40:53 +0100 [error]: #0 incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/parser_regexp.rb:50:in `match'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/parser_regexp.rb:50:in `parse'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluent-plugin-grok-parser-2.6.2/lib/fluent/plugin/parser_multiline_grok.rb:21:in `block in parse'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluent-plugin-grok-parser-2.6.2/lib/fluent/plugin/parser_multiline_grok.rb:20:in `each'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluent-plugin-grok-parser-2.6.2/lib/fluent/plugin/parser_multiline_grok.rb:20:in `parse'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/in_tail.rb:606:in `flush_buffer'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/in_tail.rb:1256:in `on_notify'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/in_tail.rb:850:in `on_notify'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/in_tail.rb:407:in `block in setup_watcher'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin/in_tail.rb:783:in `on_timer'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/cool.io-1.7.1/lib/cool.io/loop.rb:88:in `run_once'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/cool.io-1.7.1/lib/cool.io/loop.rb:88:in `run'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
  2023-12-12 17:40:53 +0100 [error]: #0 /opt/td-agent/lib/ruby/gems/3.1.0/gems/fluentd-1.16.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

When the error appears Fluentd is not able to recover and it gets in a crash loop.

Additional context
If you just add a new line after the multi line log entry with special characters the error doesn't appear:

# /tmp/fluentd-encoding/utf8.log

2023-11-22 18:18:09.823+0100 Testing Berlin
2023-11-22 18:18:09.823+0100 Testing Nürnberg
2023-11-22 18:18:09.823+0100 Testing Hannover
2023-11-22 18:18:09.823+0100 Testing München
javax.management.RuntimeErrorException: null
      at Main.main(Main.java:16) ~[bin/:na]
2023-11-22 18:18:09.823+0100 Testing Stuttgart 

Output

2023-12-12 17:45:46 +0100 [info]: #0 starting fluentd worker pid=874 ppid=869 worker=0
2023-12-12 17:45:46 +0100 [info]: #0 following tail of /tmp/fluentd-encoding/utf8.log
2023-12-12 17:45:46 +0100 [warn]: #0 invalid line found file="/tmp/fluentd-encoding/utf8.log" line="2023-11-22 18:18:09.823+0100 Testing N\xC3\xBCrnberg" error="incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)"
2023-12-12 17:45:46 +0100 [warn]: #0 invalid line found file="/tmp/fluentd-encoding/utf8.log" line="2023-11-22 18:18:09.823+0100 Testing M\xC3\xBCnchen\njavax.management.RuntimeErrorException: null\n      at Main.main(Main.java:16) ~[bin/:na]" error="incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)"
2023-12-12 17:45:46.604890113 +0100 encoding: {"timestamp":"2023-11-22 18:18:09.823+0100","message":"Testing Berlin"}
2023-12-12 17:45:46.604995979 +0100 encoding: {"timestamp":"2023-11-22 18:18:09.823+0100","message":"Testing Hannover"}
2023-12-12 17:45:46 +0100 [info]: #0 fluentd worker is now running worker=0
2023-12-12 17:45:51.605177650 +0100 encoding: {"timestamp":"2023-11-22 18:18:09.823+0100","message":"Testing Stuttgart"}

There are some warning but Fluentd keeps working.

Conclussion
It seems it's a matter of how frequent Fluentd receives logs and the multiline_flush_interval parameter frequency.

If you get a multi line log entry with special characters there are two options:

  1. Fluentd flushes before it gets a new log entry. Fluentd crashes.
  2. Fluentd gets a new log entry before it flushes. Fluentd keeps working.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions