Kafka Configuration Section

Source: Internet. Editor: 程序博客网. Time: 2024/05/19 23:01

Original source: http://kafka.apache.org/

6.3 Kafka Configuration

Important Client Configurations

The most important producer configurations control:

compression
sync vs async production
batch size (for async producers)
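
As a rough illustration of the settings above, the sketch below builds a small producer property set and renders it in .properties form. The keys assume the newer Java producer (compression.type, batch.size, linger.ms); the older Scala producer used different names. The helper names and the values are hypothetical placeholders, not recommendations.

```python
# Codecs assumed to be available; check the producer version in use.
SUPPORTED_CODECS = {"none", "gzip", "snappy", "lz4"}


def producer_settings(compression="snappy", batch_bytes=16384, linger_ms=5):
    """Return producer properties as a dict (hypothetical helper)."""
    if compression not in SUPPORTED_CODECS:
        raise ValueError(f"unsupported compression codec: {compression}")
    return {
        "compression.type": compression,   # codec applied to each batch
        "batch.size": str(batch_bytes),    # per-partition batch size in bytes
        "linger.ms": str(linger_ms),       # how long to wait for a batch to fill
    }


def to_properties(settings):
    """Render a dict as sorted key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in sorted(settings.items()))


print(to_properties(producer_settings()))
```

Larger batches and a longer linger time generally trade latency for throughput, which is why batch size matters mostly for asynchronous production.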

The most important consumer configuration is the fetch size.
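
To see why the fetch size matters, the following sketch checks a fetch size against the broker's maximum message size and estimates worst-case buffer use. It assumes the fetch size applies per partition, as with the older consumer's fetch.message.max.bytes setting; the function name and the numbers are illustrative only.

```python
def consumer_fetch_check(fetch_bytes, broker_max_message_bytes, partitions):
    """Check a per-partition fetch size and estimate worst-case buffer use."""
    return {
        # A fetch size below the largest allowed message can leave a
        # consumer unable to read that message.
        "can_fetch_largest_message": fetch_bytes >= broker_max_message_bytes,
        # Each partition may buffer up to one full fetch in memory.
        "worst_case_buffer_bytes": fetch_bytes * partitions,
    }


result = consumer_fetch_check(
    fetch_bytes=1048576,               # 1 MiB, an example value
    broker_max_message_bytes=1000000,  # the broker's message.max.bytes
    partitions=8,
)
print(result)
```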

All configurations are documented in the configuration section.

A Production Server Config

Here is our production server configuration:

# Replication configurations
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
controller.socket.timeout.ms=30000
controller.message.queue.size=10

# Log configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.hours=168
log.flush.interval.ms=10000
log.flush.interval.messages=20000
log.flush.scheduler.interval.ms=2000
log.roll.hours=168
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824

# ZK configuration
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000

# Socket server configuration
num.io.threads=8
num.network.threads=8
socket.request.max.bytes=104857600
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576
queued.max.requests=16
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100
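
A few of these values are easier to read once converted, and some of them constrain each other. The sketch below parses a subset of the settings and checks that the replica fetch size and the socket request limit are at least as large as the maximum message size. The parser and the check function are hypothetical helpers, not part of Kafka.

```python
SERVER_PROPERTIES = """
# Subset of the broker settings listed above
message.max.bytes=1000000
replica.fetch.max.bytes=1048576
socket.request.max.bytes=104857600
log.segment.bytes=1073741824
log.retention.hours=168
"""


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def size_warnings(props):
    """Return warnings for limits that are smaller than message.max.bytes."""
    max_message = int(props["message.max.bytes"])
    warnings = []
    for key in ("replica.fetch.max.bytes", "socket.request.max.bytes"):
        if int(props[key]) < max_message:
            warnings.append(f"{key} is smaller than message.max.bytes")
    return warnings


props = parse_properties(SERVER_PROPERTIES)
retention_days = int(props["log.retention.hours"]) / 24
segment_gib = int(props["log.segment.bytes"]) / 2**30

print("warnings:", size_warnings(props))
print("retention (days):", retention_days)
print("segment size (GiB):", segment_gib)
```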

Our client configuration varies a fair amount between different use cases.

Java Version

From a security perspective, we recommend you use the latest released version of JDK 1.8 as older freely available versions have disclosed security vulnerabilities. LinkedIn is currently running JDK 1.8 u5 (looking to upgrade to a newer version) with the G1 collector. If you decide to use the G1 collector (the current default) and you are still on JDK 1.7, make sure you are on u51 or newer. LinkedIn tried out u21 in testing, but they had a number of problems with the GC implementation in that version. LinkedIn's tuning looks like this:

-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80
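
One way to apply flags like these is through the environment variables read by the stock Kafka start scripts. The sketch below splits the flags into heap and GC groups and prints export lines. It assumes the scripts honor KAFKA_HEAP_OPTS and KAFKA_JVM_PERFORMANCE_OPTS, and that overriding the latter replaces the script's built-in defaults; the splitting helper itself is hypothetical.

```python
import shlex

FLAGS = (
    "-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC "
    "-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 "
    "-XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 "
    "-XX:MaxMetaspaceFreeRatio=80"
)


def split_jvm_flags(flags):
    """Separate heap sizing flags from the remaining JVM flags."""
    heap, perf = [], []
    for flag in shlex.split(flags):
        if flag.startswith(("-Xmx", "-Xms")):
            heap.append(flag)
        else:
            perf.append(flag)
    return {
        "KAFKA_HEAP_OPTS": " ".join(heap),
        "KAFKA_JVM_PERFORMANCE_OPTS": " ".join(perf),
    }


env = split_jvm_flags(FLAGS)
for name, value in env.items():
    print(f'export {name}="{value}"')
```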

For reference, here are the stats on one of LinkedIn's busiest clusters (at peak):

60 brokers
50k partitions (replication factor 2)
800k messages/sec in
300 MB/sec inbound, 1 GB/sec+ outbound

The tuning looks fairly aggressive, but all of the brokers in that cluster have a 90th-percentile GC pause time of about 21ms, and they're doing less than 1 young GC per second.
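
For a sense of scale, the back-of-the-envelope sketch below derives per-broker figures from the cluster stats. It assumes 1 MB = 10^6 bytes, an even spread of load across brokers, that the 50k partition count excludes replicas, and that the 21ms figure can stand in for a typical young GC pause. None of these assumptions is stated in the source.

```python
BROKERS = 60
PARTITIONS = 50_000
REPLICATION_FACTOR = 2
MESSAGES_PER_SEC = 800_000
INBOUND_BYTES_PER_SEC = 300 * 10**6
OUTBOUND_BYTES_PER_SEC = 1 * 10**9  # "1 GB/sec+", treated as a lower bound

# Average message size implied by the inbound rate.
avg_message_bytes = INBOUND_BYTES_PER_SEC / MESSAGES_PER_SEC

# Per-broker load, assuming an even spread.
inbound_mb_per_broker = INBOUND_BYTES_PER_SEC / BROKERS / 10**6
replicas_per_broker = PARTITIONS * REPLICATION_FACTOR / BROKERS

# Ratio of bytes read to bytes written.
read_fan_out = OUTBOUND_BYTES_PER_SEC / INBOUND_BYTES_PER_SEC

# Rough ceiling on time spent in young GC pauses: under 1 pause per second
# at about 21 ms each.
gc_pause_fraction = 1 * 0.021

print(f"average message size: {avg_message_bytes:.0f} bytes")
print(f"inbound per broker: {inbound_mb_per_broker:.1f} MB/sec")
print(f"replicas per broker: {replicas_per_broker:.0f}")
print(f"read fan-out: at least {read_fan_out:.1f}x")
print(f"young GC pause share: under {gc_pause_fraction:.1%}")
```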

0 0