Spark Debug


Debug parameters

Parameter notes:
spark.master: use local mode so a remote debugger can attach directly.
spark.eventLog.enabled / spark.eventLog.dir: write all events exchanged between the driver and executors to the local filesystem for inspection. Do not use an HDFS path: while the job is paused at a breakpoint, HDFS reads and writes will make the HDFS client time out.

spark.sql.shuffle.partitions: lower the SQL parallelism; with that many concurrent tasks the interleaved logs are impossible to read.

spark.executor.heartbeatInterval: lengthen the executor heartbeat interval, so executors are not flagged as lost while execution is paused in the debugger.

spark.rpc.askTimeout: the DAGScheduler and the task scheduler communicate over RPC (implemented on Netty). The default is 120 s; after a timeout the DAGScheduler aborts the job with "Job aborted due to stage failure ... ExecutorLostFailure".

spark.sql.codegen
spark.sql.tunsten.enabled: these two parameters were meant to turn codegen off, but they did not take effect (note that "tunsten" is a misspelling of "tungsten"; the Spark 2.0.2 keys spark.sql.tungsten.enabled and spark.sql.codegen.fallback appear commented out in the config below).

spark.master                     local[2]
spark.eventLog.enabled           true
spark.eventLog.dir               file:///var/log/spark-event/
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              1g
spark.executor.memory            1g
spark.executor.instances         2
spark.sql.shuffle.partitions     2
spark.executor.heartbeatInterval 1800
spark.rpc.askTimeout             1800
spark.sql.codegen                false
spark.sql.tunsten.enabled        false
# spark 2.0.2: spark.sql.tungsten.enabled     false
# spark 2.0.2: spark.sql.codegen.fallback     true
spark.logLineage                 true
#spark.rpc.netty.dispatcher.numThreads  10
#spark.driver.extraJavaOptions    -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=2345
#spark.executor.extraJavaOptions  -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=2346
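The block above goes in conf/spark-defaults.conf. A sketch of passing the same settings per job on the command line instead (values here are illustrative, matching the config above):

spark-shell \
  --master local[2] \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=file:///var/log/spark-event/ \
  --conf spark.sql.shuffle.partitions=2 \
  --conf spark.executor.heartbeatInterval=1800 \
  --conf spark.rpc.askTimeout=1800

Per-job --conf flags override spark-defaults.conf, which is convenient when only one debugging session needs these relaxed timeouts.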

Enabling Spark remote debug

Set the following in spark-env.sh:

export SPARK_JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=2345
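SPARK_JAVA_OPTS is deprecated in newer Spark releases; a sketch of the per-JVM alternative using the extraJavaOptions properties (the same options appear commented out in the config block above, with the older -Xrunjdwp syntax):

spark.driver.extraJavaOptions    -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2345
spark.executor.extraJavaOptions  -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=2346

With suspend=y the JVM blocks at startup until a debugger attaches, so attach your IDE's remote debugger to the given port before expecting any progress; suspend=n on the driver lets it run normally until you choose to attach.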

log4j configuration

log4j.rootCategory=INFO, R
log4j.appender.R=org.apache.log4j.DailyRollingFileAppender
log4j.appender.R.File=/var/log/spark/spark.log
log4j.appender.R.DatePattern='.'yyyy-MM-dd
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%d{HH:mm:ss},%p,%t,%c{3},%m%n
log4j.logger.org.apache.spark.sql.catalyst=TRACE
log4j.logger.org.apache.spark.sql.execution=TRACE
log4j.logger.org.apache.spark=TRACE
log4j.logger.org.apache.spark.sql.hive=INFO
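With this ConversionPattern (time, level, thread, last three logger-name components, message, comma-separated), each line in /var/log/spark/spark.log looks roughly like the following illustrative example:

10:27:05,TRACE,main,spark.sql.execution,== Physical Plan ==

The comma-separated layout makes the TRACE-level catalyst/execution output easy to filter or load into a spreadsheet when tracing a single query through the optimizer.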