Commit 8381e7a

Feat: fully ECS compliant captures (#297)
The joint effort of pattern captures ECS-ification (#278)

Additionally also includes: #298, #299, #273

resolves #278
fixes #248
fixes #258
closing #243
fixes #233
closing #173
1 parent 3b0ea51 commit 8381e7a

58 files changed: 5549 additions & 931 deletions

.ci/setup.sh

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+set -ex
+
+sudo yum install -y git

CHANGELOG.md

Lines changed: 98 additions & 0 deletions
@@ -1,3 +1,101 @@
+## 4.3.0
+
+With **4.3.0** we're introducing a new set of pattern definitions compliant with Elastic Common Schema (ECS), on numerous
+places patterns are capturing names prescribed by the schema or use custom namespaces that do not conflict with ECS ones.
+
+Changes are backwards compatible as much as possible and also include improvements to some of the existing patterns.
+
+Besides fields having new names, values for numeric (integer or floating point) types are usually converted to their
+numeric representation to ease further event processing (e.g. `http.response.status_code` is now stored as an integer).
+
+NOTE: to leverage the new ECS pattern set in Logstash a grok filter upgrade to version >= 4.4.0 is required.
+
+- **aws**
+  * in ECS mode we dropped the (incomplete) attempt to capture `rawrequest` from `S3_REQUEST_LINE`
+  * `S3_ACCESS_LOG` will handle up-to-date S3 access-log formats (6 'new' field captures at the end)
+    Host Id -> Signature Version -> Cipher Suite -> Authentication Type -> Host Header -> TLS version
+  * `ELB_ACCESS_LOG` will handle optional (`-`) in legacy mode
+  * null values such as `-` or `-1` time values (e.g. `ELB_ACCESS_LOG`'s `request_processing_time`)
+    are not captured in ECS mode
+
+- **bacula**
+  - Fix: improve matching of `BACULA_HOST` as `HOSTNAME`
+  - Fix: legacy `BACULA_` patterns to handle (optional) spaces
+  - Fix: handle `BACULA_LOG` 'Job Id: X' prefix as optional
+  - Fix: legacy matching of BACULA fatal error lines
+
+- **bind**
+  - `BIND9`'s legacy `querytype` was further split into multiple fields as:
+    `dns.question.type` and `bind.log.question.flags`
+  - `BIND9` patterns (legacy as well) were adjusted to handle Bind9 >= 9.11 compatibility
+  - `BIND9_QUERYLOGBASE` was introduced for potential re-use
+
+- **bro**
+  * `BRO_` patterns are stricter in ECS mode - won't mistakenly match newer BRO/Zeek formats
+  * place holders such as `(empty)` tags and `-` null values won't be captured
+  * each `BRO_` pattern has a newer `ZEEK_` variant that supports latest Zeek 3.x versions
+    e.g. `ZEEK_HTTP` as a replacement for `BRO_HTTP` (in ECS mode only),
+    there's a new file **zeek** where all of the `ZEEK_XXX` pattern variants live
+
+- **exim**
+  * introduced `EXIM` (`EXIM_MESSAGE_ARRIVAL`) to match message arrival log lines - in ECS mode!
+
+- **firewalls**
+  * introduced `IPTABLES` pattern which is re-used within `SHOREWALL` and `SFW2`
+  * `SHOREWALL` now supports IPv6 addresses (in ECS mode - due `IPTABLES` pattern)
+  * `timestamp` fields will be captured for `SHOREWALL` and `SFW2` in legacy mode as well
+  * `SHOREWALL` became less strict in containing the `kernel:` sub-string
+  * `NETSCREENSESSIONLOG` properly handles optional `session_id=... reason=...` suffix
+  * `interval` and `xlate_type` (legacy) CISCO fields are not captured in ECS mode
+
+- **core** (grok-patterns)
+  * `SYSLOGFACILITY` type casts facility code and priority in ECS mode
+  * `SYSLOGTIMESTAMP` will be captured (from `SYSLOGBASE`) as `timestamp`
+  * Fix: e-mail address's local part to match according to RFC (#273)
+
+- **haproxy**
+  * several ECS-ified fields will be type-casted to integer in ECS mode e.g. *haproxy.bytes_read*
+  * fields containing null value (`-`) are no longer captured
+    (e.g. in legacy mode `captured_request_cookie` gets captured even if `"-"`)
+
+- **httpd**
+  * optional fields (e.g. `http.request.referrer` or `user_agent`) are only captured when not null (`-`)
+  * `source.port` (`clientport` in legacy mode) is considered optional
+  * dropped raw data (`rawrequest` legacy field) in ECS mode
+  * Fix: HTTPD_ERRORLOG should match when module missing (#299)
+
+- **java**
+  * `JAVASTACKTRACEPART`'s matched line number will be converted to an integer
+  * `CATALINALOG` matching was updated to handle Tomcat 7/8/9 logging format
+  * `TOMCATLOG` handles the default Tomcat 7/8/9 logging format
+  * old (custom) legacy TOMCAT format is handled by the added `TOMCATLEGACY_LOG`
+  * `TOMCATLOG` and `TOMCAT_DATESTAMP` still match the legacy format,
+    however this might change at a later point - if you rely on the old format use `TOMCATLEGACY_` patterns
+
+- **junos**
+  * integer fields (e.g. `juniper.srx.elapsed_time`) are captured as integer values
+
+- **linux-syslog**
+  * `SYSLOG5424LINE` captures (overwrites) the `message` field instead of using a custom field name
+  * regardless of the format used, in ECS mode, timestamps are always captured as `timestamp`
+  * fields such as `log.syslog.facility.code` and `process.pid` are converted to integers
+
+- **mcollective**
+  * *mcollective-patterns* file was removed, it's all one *mcollective* in ECS mode
+  * `MCOLLECTIVE`'s `process.pid` (`pid` previously) is not type-casted to an integer
+
+- **nagios**
+  * numeric fields such as `nagios.log.attempt` are converted to integer values in ECS mode
+
+- **rails**
+  * request duration times from `RAILS3` log will be converted to floating point values
+
+- **squid**
+  * `SQUID3`'s `duration` http.response `status_code` and `bytes` are type-casted to int
+  * `SQUID3` pattern won't capture null ('-') `user.name` or `squid.response.content_type`
+  * Fix: allow to parse SQUID log with status 0 (#298)
+  * Fix: handle optional server address (#298)
+
 ## 4.2.0
 - Fix: Java stack trace's JAVAFILE to better match generated names
 - Fix: match Information/INFORMATION in LOGLEVEL [#274](https://github.com/logstash-plugins/logstash-patterns-core/pull/274)
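
The 4.3.0 notes describe two recurring behaviours of the ECS pattern set: numeric captures are converted to numbers, and placeholder values such as `-` are matched but not captured. The Ruby sketch below illustrates the idea with plain `Regexp` objects rather than grok itself. The constants and helpers (`LEGACY_STATUS`, `ECS_STATUS`, `legacy_capture`, `ecs_capture`) are hypothetical names chosen for illustration and are not part of this commit.

```ruby
# Illustrative only: plain Ruby regexps standing in for grok patterns.
# All names below are hypothetical and exist only for this sketch.

# Legacy-style capture: whatever is there gets captured, as a String (even "-").
LEGACY_STATUS = /\A(?<response>\S+)\z/

# ECS-style capture: "-" is consumed but not captured; digits are captured.
ECS_STATUS = /\A(?:-|(?<status_code>\d+))\z/

def legacy_capture(text)
  m = LEGACY_STATUS.match(text)
  m ? { 'response' => m[:response] } : nil
end

def ecs_capture(text)
  m = ECS_STATUS.match(text)
  return nil unless m

  event = {}
  # Set the nested field only when the group took part in the match, and
  # convert it to an Integer (the role of the `:int` suffix in a grok capture).
  if m[:status_code]
    event['http'] = { 'response' => { 'status_code' => Integer(m[:status_code], 10) } }
  end
  event
end

p legacy_capture('200') # value stays a String
p legacy_capture('-')   # placeholder is captured as-is
p ecs_capture('200')    # nested field holding an Integer
p ecs_capture('-')      # matches, but nothing is captured
```

In this sketch a line with a placeholder still matches; only the field is left out of the resulting event, which mirrors the "not captured" wording in the notes.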

Gemfile

Lines changed: 3 additions & 0 deletions
@@ -9,3 +9,6 @@ if Dir.exist?(logstash_path) && use_logstash_source
   gem 'logstash-core', :path => "#{logstash_path}/logstash-core"
   gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api"
 end
+
+# TODO till filter grok with ECS support is released :
+gem 'logstash-filter-grok', git: 'https://github.com/kares/logstash-filter-grok.git', ref: 'ecs-1-support'

README.md

Lines changed: 11 additions & 18 deletions
@@ -2,29 +2,28 @@
 
 [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-patterns-core.svg)](https://travis-ci.com/logstash-plugins/logstash-patterns-core)
 
-This is a plugin for [Logstash](https://github.com/elastic/logstash).
+This plugin provides [pattern definitions][1] used by the [grok filter][2].
 
 It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
 
 ## Documentation
 
-Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
+Logstash provides infrastructure to automatically generate documentation for this plugin.
+We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc
+and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
 
 - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
 - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide
 
 ## Need Help?
 
-Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
+Need help? Try https://discuss.elastic.co/c/logstash discussion forum.
 
 ## Developing
 
 ### 1. Plugin Developement and Testing
 
 #### Code
-- To get started, you'll need JRuby with the Bundler gem installed.
-
-- Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
 
 - Install dependencies
 ```sh
@@ -51,20 +50,16 @@ bundle exec rspec
 
 - Edit Logstash `Gemfile` and add the local plugin path, for example:
 ```ruby
-gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
+gem "logstash-patterns-core", :path => "/your/local/logstash-patterns-core"
 ```
 - Install plugin
 ```sh
 # Logstash 2.3 and higher
 bin/logstash-plugin install --no-verify
-
-# Prior to Logstash 2.3
-bin/plugin install --no-verify
-
 ```
 - Run Logstash with your plugin
 ```sh
-bin/logstash -e 'filter {awesome {}}'
+bin/logstash -e 'filter { grok { } }'
 ```
 At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
 
@@ -74,16 +69,11 @@ You can use the same **2.1** method to run your plugin in an installed Logstash
 
 - Build your plugin gem
 ```sh
-gem build logstash-filter-awesome.gemspec
+gem build logstash-patterns-core.gemspec
 ```
 - Install the plugin from the Logstash home
 ```sh
-# Logstash 2.3 and higher
 bin/logstash-plugin install --no-verify
-
-# Prior to Logstash 2.3
-bin/plugin install --no-verify
-
 ```
 - Start Logstash and proceed to test the plugin
 
@@ -96,3 +86,6 @@ Programming is not a required skill. Whatever you've seen about open source and
 It is more important to the community that you are able to contribute.
 
 For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
+
+[1]: /tree/master/patterns
+[2]: https://github.com/logstash-plugins/logstash-filter-grok

lib/logstash/patterns/core.rb

Lines changed: 10 additions & 2 deletions
@@ -3,8 +3,16 @@ module Patterns
     module Core
       extend self
 
-      def path
-        ::File.expand_path('../../../patterns/legacy', ::File.dirname(__FILE__))
+      BASE_PATH = ::File.expand_path('../../../patterns', ::File.dirname(__FILE__))
+      private_constant :BASE_PATH
+
+      def path(type = 'legacy')
+        case type = type.to_s
+        when 'legacy', 'ecs-v1'
+          ::File.join(BASE_PATH, type)
+        else
+          raise ArgumentError, "#{type.inspect} path not supported"
+        end
       end
 
     end
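
The change above turns `path` into a lookup keyed by pattern-set type, defaulting to `'legacy'` and rejecting unknown types. The following standalone Ruby sketch copies that logic so it can be tried outside the gem. The module name `PatternPaths` and the `/opt/...` base directory are assumptions made for the example, not names from this commit.

```ruby
# Standalone copy of the lookup logic for experimentation.
# PatternPaths and the base directory below are hypothetical.
module PatternPaths
  extend self

  BASE_PATH = '/opt/logstash-patterns-core/patterns'
  private_constant :BASE_PATH

  # Returns the directory for the requested pattern set.
  # Symbols are accepted because the argument is normalised with to_s.
  def path(type = 'legacy')
    case type = type.to_s
    when 'legacy', 'ecs-v1'
      ::File.join(BASE_PATH, type)
    else
      raise ArgumentError, "#{type.inspect} path not supported"
    end
  end
end

puts PatternPaths.path            # default: the legacy directory
puts PatternPaths.path(:'ecs-v1') # ECS pattern directory

begin
  PatternPaths.path('ecs-v2')     # not in the allow-list
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Keeping the allowed types in a single `when` clause means a caller cannot join arbitrary strings onto the base path, and existing callers that pass no argument keep getting the legacy directory.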

logstash-patterns-core.gemspec

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Gem::Specification.new do |s|
 
   s.name = 'logstash-patterns-core'
-  s.version = '4.2.0'
+  s.version = '4.3.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = "Patterns to be used in logstash"
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"

patterns/ecs-v1/aws

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+S3_REQUEST_LINE (?:%{WORD:[http][request][method]} %{NOTSPACE:[url][original]}(?: HTTP/%{NUMBER:[http][version]})?)
+
+S3_ACCESS_LOG %{WORD:[aws][s3access][bucket_owner]} %{NOTSPACE:[aws][s3access][bucket]} \[%{HTTPDATE:timestamp}\] (?:-|%{IP:[client][ip]}) (?:-|%{NOTSPACE:[client][user][id]}) %{NOTSPACE:[aws][s3access][request_id]} %{NOTSPACE:[aws][s3access][operation]} (?:-|%{NOTSPACE:[aws][s3access][key]}) (?:-|"%{S3_REQUEST_LINE:[aws][s3access][request_uri]}") (?:-|%{INT:[http][response][status_code]:int}) (?:-|%{NOTSPACE:[aws][s3access][error_code]}) (?:-|%{INT:[aws][s3access][bytes_sent]:int}) (?:-|%{INT:[aws][s3access][object_size]:int}) (?:-|%{INT:[aws][s3access][total_time]:int}) (?:-|%{INT:[aws][s3access][turn_around_time]:int}) "(?:-|%{DATA:[http][request][referrer]})" "(?:-|%{DATA:[user_agent][original]})" (?:-|%{NOTSPACE:[aws][s3access][version_id]})(?: (?:-|%{NOTSPACE:[aws][s3access][host_id]}) (?:-|%{NOTSPACE:[aws][s3access][signature_version]}) (?:-|%{NOTSPACE:[tls][cipher]}) (?:-|%{NOTSPACE:[aws][s3access][authentication_type]}) (?:-|%{NOTSPACE:[aws][s3access][host_header]}) (?:-|%{NOTSPACE:[aws][s3access][tls_version]}))?
+# :long - %{INT:[aws][s3access][bytes_sent]:int}
+# :long - %{INT:[aws][s3access][object_size]:int}
+
+ELB_URIHOST %{IPORHOST:[url][domain]}(?::%{POSINT:[url][port]:int})?
+ELB_URIPATHQUERY %{URIPATH:[url][path]}(?:\?%{URIQUERY:[url][query]})?
+# deprecated - old name:
+ELB_URIPATHPARAM %{ELB_URIPATHQUERY}
+ELB_URI %{URIPROTO:[url][scheme]}://(?:%{USER:[url][username]}(?::[^@]*)?@)?(?:%{ELB_URIHOST})?(?:%{ELB_URIPATHQUERY})?
+
+ELB_REQUEST_LINE (?:%{WORD:[http][request][method]} %{ELB_URI:[url][original]}(?: HTTP/%{NUMBER:[http][version]})?)
+
+# pattern supports 'regular' HTTP ELB format
+ELB_V1_HTTP_LOG %{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:[aws][elb][name]} %{IP:[source][ip]}:%{INT:[source][port]:int} (?:-|(?:%{IP:[aws][elb][backend][ip]}:%{INT:[aws][elb][backend][port]:int})) (?:-1|%{NUMBER:[aws][elb][request_processing_time][sec]:float}) (?:-1|%{NUMBER:[aws][elb][backend_processing_time][sec]:float}) (?:-1|%{NUMBER:[aws][elb][response_processing_time][sec]:float}) %{INT:[http][response][status_code]:int} (?:-|%{INT:[aws][elb][backend][http][response][status_code]:int}) %{INT:[http][request][body][bytes]:int} %{INT:[http][response][body][bytes]:int} "%{ELB_REQUEST_LINE}"(?: "(?:-|%{DATA:[user_agent][original]})" (?:-|%{NOTSPACE:[tls][cipher]}) (?:-|%{NOTSPACE:[aws][elb][ssl_protocol]}))?
+# :long - %{INT:[http][request][body][bytes]:int}
+# :long - %{INT:[http][response][body][bytes]:int}
+
+ELB_ACCESS_LOG %{ELB_V1_HTTP_LOG}
+
+# pattern used to match a shorted format, that's why we have the optional part (starting with *http.version*) at the end
+CLOUDFRONT_ACCESS_LOG (?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}\t%{TIME})\t%{WORD:[aws][cloudfront][x_edge_location]}\t(?:-|%{INT:[destination][bytes]:int})\t%{IPORHOST:[source][ip]}\t%{WORD:[http][request][method]}\t%{HOSTNAME:[url][domain]}\t%{NOTSPACE:[url][path]}\t(?:(?:000)|%{INT:[http][response][status_code]:int})\t(?:-|%{DATA:[http][request][referrer]})\t%{DATA:[user_agent][original]}\t(?:-|%{DATA:[url][query]})\t(?:-|%{DATA:[aws][cloudfront][http][request][cookie]})\t%{WORD:[aws][cloudfront][x_edge_result_type]}\t%{NOTSPACE:[aws][cloudfront][x_edge_request_id]}\t%{HOSTNAME:[aws][cloudfront][http][request][host]}\t%{URIPROTO:[network][protocol]}\t(?:-|%{INT:[source][bytes]:int})\t%{NUMBER:[aws][cloudfront][time_taken]:float}\t(?:-|%{IP:[network][forwarded_ip]})\t(?:-|%{DATA:[aws][cloudfront][ssl_protocol]})\t(?:-|%{NOTSPACE:[tls][cipher]})\t%{WORD:[aws][cloudfront][x_edge_response_result_type]}(?:\t(?:-|HTTP/%{NUMBER:[http][version]})\t(?:-|%{DATA:[aws][cloudfront][fle_status]})\t(?:-|%{DATA:[aws][cloudfront][fle_encrypted_fields]})\t%{INT:[source][port]:int}\t%{NUMBER:[aws][cloudfront][time_to_first_byte]:float}\t(?:-|%{DATA:[aws][cloudfront][x_edge_detailed_result_type]})\t(?:-|%{NOTSPACE:[http][request][mime_type]})\t(?:-|%{INT:[aws][cloudfront][http][request][size]:int})\t(?:-|%{INT:[aws][cloudfront][http][request][range][start]:int})\t(?:-|%{INT:[aws][cloudfront][http][request][range][end]:int}))?
+# :long - %{INT:[destination][bytes]:int}
+# :long - %{INT:[source][bytes]:int}
+# :long - %{INT:[aws][cloudfront][http][request][size]:int}
+# :long - %{INT:[aws][cloudfront][http][request][range][start]:int}
+# :long - %{INT:[aws][cloudfront][http][request][range][end]:int}
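
The patterns above capture into bracketed field references such as `[url][port]`, which name nested fields, and they wrap optional parts in `(?: ... )?` so that absent values produce no field. The Ruby sketch below shows both ideas on a simplified stand-in for `ELB_URIHOST`. The regexp only approximates `IPORHOST` and `POSINT`, and `set_field`, `URIHOST` and `parse_urihost` are hypothetical names introduced for this example.

```ruby
# Hypothetical helper: store a value under a Logstash-style field reference
# such as "[url][port]" by building nested hashes as needed.
def set_field(event, reference, value)
  keys = reference.scan(/\[([^\]]+)\]/).flatten
  keys = [reference] if keys.empty? # plain top-level name, e.g. "timestamp"
  last = keys.pop
  target = keys.reduce(event) { |hash, key| hash[key] ||= {} }
  target[last] = value
  event
end

# Simplified stand-in for ELB_URIHOST: a host followed by an optional port.
# This is an approximation, not the grok definition of IPORHOST or POSINT.
URIHOST = /\A(?<domain>[A-Za-z0-9.-]+)(?::(?<port>[1-9][0-9]*))?\z/

def parse_urihost(text)
  m = URIHOST.match(text)
  return nil unless m

  event = {}
  set_field(event, '[url][domain]', m[:domain])
  # The port group is optional; store it (as an Integer) only when present.
  set_field(event, '[url][port]', Integer(m[:port], 10)) if m[:port]
  event
end

p parse_urihost('example.com:8080') # domain and integer port
p parse_urihost('example.com')      # domain only, no port key
```

Because the port is written only when its group matched, downstream code can test for the presence of the key instead of comparing against a placeholder value.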
