Syslogger: Forward syslog to Apache Kafka
Our team rewrote a small but key component within our data infrastructure earlier this year: a daemon to forward log data to Apache Kafka. Last week I made the repository public and thought it would be worth mentioning.
Our log data can be extremely valuable: helping us understand how users interact with our products and letting us measure the service we provide. Lots of services and daemons produce logs that we can usefully look at together to better understand the whole: nginx, varnish, our application services and more.
The problems with log files
Most of our applications and services would write their logs to files on disk. It’s convenient (to a point) when you can just tail a file to see what’s happening.
It’s OK when you’re administering a handful of machines with perhaps a few processes, but it breaks down when we start talking about the composition of systems. This is for a few reasons (nothing new of course, but worth repeating):
- Aggregating log files across your servers is important; no person or process is an island, after all. By extension it’s necessary to replicate log files to a centralised place, so you end up with custom code all over the place to replicate files at particular times, in batches.
- We can’t just continually write to a single unbounded file: we run the risk of consuming all available disk space and inadvertently affecting other processes running on the same machine.
Having said that, this is exactly what we did for a very long time (and still do in places). The primary issue we had was that we would lose data.
We were unable to determine whether this was the result of incorrectly handling the rotation event, or of the volume of writes being high enough that our tail process gradually slowed down and dropped the final segment of messages.
For all these reasons (being a poor citizen in the overall system, and the specific reliability problems) we decided to replace this section of our infrastructure.
Rsyslog and Syslogger
Rather than tailing files, we use syslog to write messages and rsyslog to forward those to files and aggregators. Since rsyslog communicates over a very simple TCP protocol, we can write a daemon that reads a message and sends it directly to Kafka.
Rsyslog also has the advantage that it has a few features for more reliable message forwarding. For more detail please see “Reliable Forwarding of syslog Messages with Rsyslog”.
Implementing the system this way means that the bulk of the complexity we faced before can be borne by Rsyslog; our daemon just needs to read from TCP and send to Kafka.
There is another project that does exactly this but in a slightly different way: Syslog Collector. The central difference is that it queues messages in batches, whereas we wanted to send them synchronously as TCP data are consumed. We tried to use it but couldn’t successfully consume messages, and decided it would be relatively simple to re-implement and ensure we had exactly the behaviour we wanted.
Syslogger Design
Syslogger has a pretty simple set of principles behind it:
- Data are read and sent as part of the same process. Which is to say that a slow send into Kafka results in slow consumption from the listening socket. We lean heavily on Rsyslog.
- Initial Kafka broker details are extracted from ZooKeeper. Given that consistent broker data is held there, it makes sense to seed the connections from it. At uSwitch we have a production ZooKeeper quorum (on Amazon AWS) that uses Elastic IPs, so its addresses are stable; our broker IPs, however, aren’t fixed.
- We try as much as possible to cleanly shut down the connection to ensure messages that have been delivered by Rsyslog are forwarded to Kafka. It’s impossible to guarantee this given TCP data are exchanged with both send and receive buffers, but we try to limit the loss as much as possible.
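As an illustration of seeding broker connections from ZooKeeper, the sketch below turns a broker registration into a `host:port` address. It assumes the layout Kafka has used for ZooKeeper-registered brokers: one znode per broker under `/brokers/ids/`, holding JSON with `host` and `port` fields. Fetching the znode data needs a ZooKeeper client library and is left out. The `brokerInfo` and `brokerAddr` names and the sample payload are made up for the example and are not taken from Syslogger.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net"
	"strconv"
)

// brokerInfo holds the fields needed from a broker registration.
// Other fields in the JSON (such as jmx_port or timestamp) are ignored.
type brokerInfo struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// brokerAddr turns the JSON stored in a broker znode into "host:port".
func brokerAddr(data []byte) (string, error) {
	var b brokerInfo
	if err := json.Unmarshal(data, &b); err != nil {
		return "", fmt.Errorf("decoding broker registration: %w", err)
	}
	if b.Host == "" || b.Port <= 0 {
		return "", fmt.Errorf("broker registration missing host or port: %s", data)
	}
	return net.JoinHostPort(b.Host, strconv.Itoa(b.Port)), nil
}

func main() {
	// Sample payload, made up for illustration.
	sample := []byte(`{"jmx_port":-1,"timestamp":"1400000000000","host":"10.0.0.12","version":1,"port":9092}`)
	addr, err := brokerAddr(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(addr) // prints 10.0.0.12:9092
}
```

Reading the addresses at startup, rather than hard-coding them, matches the situation described above where the ZooKeeper addresses are stable but the broker addresses are not.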
Deployment
We build binaries and packages that are deployed onto EC2. The syslogger process is managed by Upstart and monitored using metrics that are published to Riemann (for more on this take a look at my previous post on pushing metrics from Go to Riemann) so they can be reported alongside our other reporting dashboards.
We’ve been extremely happy with its behaviour and performance in production thus far: it’s been running for a few months without issue.
From: http://oobaloo.co.uk/syslogger-forward-syslog-to-apache-kafka