Ship logs to Logstash with Lumberjack / Logstash Forwarder

In my previous post, I explained how to set up Logstash instances on your servers, acting as log data shippers.

However, as you may already have noticed, Logstash instances have a non-negligible memory footprint on your servers, which prevents their use where memory is limited. Furthermore, you must have Java installed on each platform where you want to run Logstash.

This is where Logstash Forwarder (formerly Lumberjack) becomes interesting: this small tool, developed in Go, lets you securely ship compressed log data (to a Logstash “indexer”, for instance) with minimal resource usage, using the Lumberjack protocol.

I’ve hence decided to replace all of the Logstash “shipper” instances with Logstash Forwarder. This also means no longer using Redis as a log data broker, as Logstash Forwarder won’t talk to Redis (no encryption support). As a consequence, if your Logstash indexer stops running, you may lose data once Logstash Forwarder’s spool max size is reached.

Installing Logstash Forwarder

To install Logstash Forwarder on your log shippers, we’ll need to compile it from source: the full procedure is very well described in the project’s readme. I strongly recommend that you compile it once and build a package (either RPM or DEB) so you can easily deploy it on all of your other servers.

Init script

Once installed from a package, Logstash Forwarder is located in /opt/logstash-forwarder. We’ll use the init script available in LF’s repository to handle startup :

$ cd /etc/init.d/
$ sudo wget https://raw.github.com/elasticsearch/logstash-forwarder/master/logstash-forwarder.init -O logstash-forwarder
$ sudo chmod +x logstash-forwarder
$ sudo update-rc.d logstash-forwarder defaults

SSL certificate generation

First of all, we’ll need to generate an SSL certificate that will be used to secure communications between your shippers and indexer:

$ openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout logstash-forwarder.key -out logstash-forwarder.crt
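As a hedged variant of the command above: unless -days is given, openssl req -x509 issues a certificate that is valid for only 30 days, so you may want to set the validity explicitly. The sketch below assumes a validity of 3650 days and a placeholder common name (logstash.example.com); both values are illustrative and not part of the original setup.

```shell
# Sketch: same command as above, plus an explicit validity period and subject.
# CERT_DAYS and CERT_CN are illustrative values, adjust them to your setup.
CERT_DAYS=3650
CERT_CN="logstash.example.com"

openssl req -x509 -batch -nodes -newkey rsa:2048 \
  -days "$CERT_DAYS" -subj "/CN=$CERT_CN" \
  -keyout logstash-forwarder.key -out logstash-forwarder.crt

# Print the subject and expiry date to confirm what was generated.
openssl x509 -in logstash-forwarder.crt -noout -subject -enddate
```

The last command only reads the certificate, so it is safe to re-run at any time to check when a deployed certificate expires.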

Now move the freshly created logstash-forwarder.key to /etc/ssl/private/ and logstash-forwarder.crt to /etc/ssl/certs/. Note that you’ll need both of these files on each of your shippers and on your indexer.

Configuration

We’re now ready to configure Logstash Forwarder: the config file is in JSON format, and should preferably be saved as /etc/logstash-forwarder (yes, it’s a file), as that’s the location defined in the init script we installed above.

Note: if you need to override any of the init script’s parameters (e.g. the config file location), create a file /etc/default/logstash-forwarder and set your custom parameters there.

In the following configuration example, we’ll assume you want to track your iptables and Apache log data, and that your indexer’s IP is 10.0.0.5:

{
  "network": {
    "servers": [ "10.0.0.5:5043" ],
    "ssl certificate": "/etc/ssl/certs/logstash-forwarder.crt",
    "ssl key": "/etc/ssl/private/logstash-forwarder.key",
    "ssl ca": "/etc/ssl/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "/var/log/syslog" ],
      "fields": { "type": "iptables" }
    },
    {
      "paths": [ "/var/log/apache2/*access*.log" ],
      "fields": { "type": "apache" }
    }
  ]
}
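Because a stray comma or quote in this file is easy to miss, it can help to check that it parses as JSON before starting the service. Here is a minimal sketch, assuming python3 is available on the shipper. The check_lf_config helper name is made up for this example, and it only checks JSON syntax, not whether Logstash Forwarder accepts the keys.

```shell
# check_lf_config is a hypothetical helper: it only verifies JSON syntax.
# It defaults to the config path used by the init script in this post.
check_lf_config() {
  config_file="${1:-/etc/logstash-forwarder}"
  if python3 -m json.tool "$config_file" > /dev/null 2>&1; then
    echo "OK: $config_file is valid JSON"
  else
    echo "ERROR: $config_file is not valid JSON" >&2
    return 1
  fi
}
```

Calling check_lf_config with no argument checks /etc/logstash-forwarder; pass another path if you overrode the config file location.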

iptables logs filtering

To avoid processing and transmitting the whole syslog file to our indexer, I recommend filtering your iptables log entries into a separate file.

First of all, you need a specific criterion to filter on; you may simply add “IPTABLES” to the log-prefix value of your iptables log rules, so that they look something like:

/sbin/iptables -A LogAndDrop -p tcp -j LOG --log-prefix "IPTABLES RULE 1 -- DROP" --log-level=info

If using rsyslog, you’ll have to create an iptables.conf file in /etc/rsyslog.d/ (usually all files in this directory will be read by rsyslog) and set up a very basic filtering rule :

if $programname == 'kernel' and $msg contains 'IPTABLES' then /var/log/iptables.log

Restart rsyslog. You can now replace the iptables log file path in your Logstash Forwarder config file.

Indexer side : Logstash configuration

Next, edit the Logstash config on your indexer server and add the following input:

lumberjack {
  port => 5043
  type => "logs"
  ssl_certificate => "/etc/ssl/certs/logstash-forwarder.crt"
  ssl_key => "/etc/ssl/private/logstash-forwarder.key"
}

Also add these filters to extract fields from the log data:

filter {
  if [type] == "apache" {
    grok {
      pattern => "%{COMBINEDAPACHELOG}"
    }
  }

  if [type] == "iptables" {
    grok {
      patterns_dir => "/usr/share/grok/patterns/iptables"
      pattern => "%{IPTABLES}"
    }
  }
}

You may also want log data not to be stored in Elasticsearch if the Grok patterns didn’t match. In this case, add the following in the output section, surrounding your output plugins (elasticsearch, for instance):

if !("_grokparsefailure" in [tags]) {
  elasticsearch { bind_host => "10.0.0.5" }
}

  • Pingback: Collect & visualize your logs with Logstash, Elasticsearch & Redis | Michael Bouvy

  • Interesting article. I didn’t realize logstash-forwarder didn’t talk to Redis. This is kind of a problem if one ever needs to do maintenance on the indexer.

    One workaround would be to have another simple Logstash instance on the Redis node that does no processing and just receives logs to forward them into Redis. Each queue server would have such a receiving Logstash instance. Failover would be accomplished by setting multiple ‘servers’ on each logstash-forwarder node.

    Which gets me thinking that it would be nice if logstash-forwarder could be used as this bridge.

    • Michael BOUVY

      Yes, Logstash Forwarder won’t talk to Redis since there is no (and will not be?) support for encryption / compression in Redis (see https://github.com/elasticsearch/logstash-forwarder#future-protocol-discussion) … too bad 🙁

      • asif soomro

        Hi Michael,

        Could we use multiple types, like “fields”: { “type”: “error”, “Info”, “warn” }, in logstash-forwarder.conf?

    • swestcott

      Node-logstash (https://github.com/bpaquet/node-logstash) is an alternative which does support Redis – no encryption though. Only has Pub/Sub currently, however there’s a pending PR to add support for lists.

  • Gene

    Why not ship directly from rsyslog?

  • Bill Zhuang

    The SSL certificate generated here expires too soon, after only one month. Better to add -days 3650, or something like that.

  • jordansissel

    This isn’t true, btw: “you may lose data once Logstash Forwarder’s spool max size is reached.”

    One of the design goals of lsf (logstash forwarder) is for lossless transmission of logs. This is achieved through a protocol that uses receipt acknowledgement (“I got those logs!”) and a pipeline that stalls and waits when there’s a network fault or slowness. If logstash is down for maintenance, server failure, or network failure, lsf will simply pause and wait until the error is resolved – it doesn’t just drop logs. If you’ve configured lsf to talk to multiple servers, then a fault with one server will make lsf choose another server to connect to, roughly achieving overload protection (overloading 1 logstash server causes some lsf servers to connect to alternate logstash servers).

    @yggdrasil:disqus This answers your concerns, also.

    lsf reads files and pipes that over the network in a reliable, compressed, and secure way. If the transmission path is broken, file reading simply waits until that condition clears – lsf has lossless flow control by design 😉

    • jerrac

      @jordansissel:disqus So, if I’m reading your comment correctly, there is no need for a redis or rabbitmq queue if you use logstash-forwarder?

      Could you put that information in the README?

      What if you restart the server or logstash-forwarder service while the indexer is down?

    • Guest

      Hello. Is it possible, though, to have logstash-forwarder point directly to a broker (redis, rabbitmq)? Otherwise you need a Logstash instance between the shipper and the broker, since you can’t define output {} in logstash-forwarder. Can you point out a scenario where this is a better solution than using beaver, for example, which can point directly to the broker? That way you can have one layer less in your infrastructure.

      PS. I’m aware of the lossless transmission of logs, btw; I’m using lsf atm as well.

  • Ding Lei

    Usually there will be a broker between the log source host and the log server. The reason is:

    “The only reason to use Redis or another broker, like RabbitMQ, is to avoid as much as possible losing collected logs in the transition between Logstash and Elasticsearch.

    Logstash has fairly poor caching capabilities (it’s not its main role, anyway), so you should use something in the middle to temporarily store those logs.

    Using a broker will also let you decouple your log pipeline: you may be able to stop/restart some of your Logstash instances (in case of upgrade/maintenance) without affecting the flow of logs through Elasticsearch.”

  • Ravi Hasija

    Great article, Michael, and great discussion in the comments. It seems to me that logstash forwarder provides faster performance, a low memory footprint, and encryption of data. Where it seems to fall short is in its caching capabilities and in decoupling your log pipeline, as Ding Lei pointed out below. If that’s true, and if the producer is faster than the consumer, then it makes sense to have redis or another broker in between. Is that a fair assessment? Can logstash forwarder ship to redis?

  • Murasakiiru

    Hi Michael,

    You wrote a nice article, I’m just trying ELK on my new servers, and logstash-forwarder seems good to use.

    But I have an issue with the certificate and private key. I created them with the command from the logstash-forwarder GitHub page: “openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout /etc/ssl/private/logstash-forwarder.key -out /etc/ssl/certs/logstash-forwarder.crt -subj /CN=my_fqdn”, then I copied them to my Logstash server and my logstash-forwarder instance.

    When I run Logstash and logstash-forwarder, I get this error: “Read error looking for ack: read tcp 10.10.0.64:5043: i/o timeout”. I read that it may be caused by network problems, but this goes from my VirtualBox VM (logstash-forwarder instance) to my own computer (Logstash server), so there is no network congestion.

    What could be the cause of this error message?

    • Murasakiiru

      OK, I’ve just found something: the error appears if two servers send their logs to the same port on the Logstash server…

  • Murasakiiru

    Hi,

    Logstash-forwarder lets me replace Logstash on the servers I want to monitor. But I need an input like “Exec”, which is available in Logstash but apparently not in logstash-forwarder. So, am I forced to use a mix of Logstash and logstash-forwarder on my servers, or is there a trick that I didn’t find?

    Is there another way than using ONE port PER server I want to monitor?

    Thanks,

  • Junjie lee

    Does this also work for installing logstash-forwarder on a Mac?

  • Daniel Mack

    Hello Michael, nice tutorial.
    Could you tell me, if I have Redis in front of logstash-forwarder, how and in which configuration file I can tell logstash-forwarder that it should take the data from Redis?

    Perhaps you can also tell me the input needed in this configuration file.

    Greetings

    Dany