Failed to merge incompatible data types StringType and BinaryType
Source: Internet. Editor: 程序博客网. Time: 2024/05/21 01:28
Loading Parquet files with Spark 1.4.0 fails with the following error:
org.apache.spark.SparkException: Failed to merge incompatible schemas StructType(StructField(ip,StringType,true), StructField(log_time,StringType,true), StructField(pos_type,StringType,true), StructField(pos_value,StringType,true), StructField(user_id,StringType,true), StructField(device_id,StringType,true), StructField(cookie_id,StringType,true), StructField(from_source,IntegerType,true), StructField(platform,StringType,true), StructField(version,StringType,true), StructField(channel,StringType,true), StructField(c_detail,StringType,true), StructField(user_role,StringType,true), StructField(user_type,StringType,true), StructField(school,StringType,true), StructField(child,StringType,true), StructField(list_version,StringType,true), StructField(tags,StringType,true), StructField(url,StringType,true), StructField(refer,StringType,true), StructField(deal_id,StringType,true), StructField(deal_n,IntegerType,true), StructField(deal_x,IntegerType,true), StructField(deal_y,IntegerType,true), StructField(deal_source_type,StringType,true), StructField(deal_exposure_time,StringType,true), StructField(exposure_num,StringType,true), StructField(img_version,StringType,true), StructField(screen_version,StringType,true), StructField(page,IntegerType,true), StructField(deal_show_type,StringType,true), StructField(log_time_stamp,LongType,true), StructField(deal_exposure_time_stamp,LongType,true)) and StructType(StructField(ip,BinaryType,true), StructField(log_time,BinaryType,true), StructField(pos_type,BinaryType,true), StructField(pos_value,BinaryType,true), StructField(user_id,BinaryType,true), StructField(device_id,BinaryType,true), StructField(cookie_id,BinaryType,true), StructField(from_source,IntegerType,true), StructField(platform,BinaryType,true), StructField(version,BinaryType,true), StructField(channel,BinaryType,true), StructField(c_detail,BinaryType,true), StructField(user_role,BinaryType,true), StructField(user_type,BinaryType,true), 
StructField(school,BinaryType,true), StructField(child,BinaryType,true), StructField(list_version,BinaryType,true), StructField(tags,BinaryType,true), StructField(url,BinaryType,true), StructField(refer,BinaryType,true), StructField(deal_id,BinaryType,true), StructField(deal_n,IntegerType,true), StructField(deal_x,IntegerType,true), StructField(deal_y,IntegerType,true), StructField(deal_source_type,BinaryType,true), StructField(deal_exposure_time,BinaryType,true), StructField(exposure_num,BinaryType,true), StructField(img_version,BinaryType,true), StructField(screen_version,BinaryType,true), StructField(page,IntegerType,true), StructField(deal_show_type,BinaryType,true), StructField(log_time_stamp,LongType,true), StructField(deal_exposure_time_stamp,LongType,true))
at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$readSchema$2.apply(newParquet.scala:531)
at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$readSchema$2.apply(newParquet.scala:529)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
at scala.collection.IndexedSeqOptimized$class.reduceLeft(IndexedSeqOptimized.scala:68)
at scala.collection.mutable.ArrayBuffer.reduceLeft(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.reduceLeftOption(TraversableOnce.scala:190)
at scala.collection.AbstractTraversable.reduceLeftOption(Traversable.scala:105)
at scala.collection.TraversableOnce$class.reduceOption(TraversableOnce.scala:197)
at scala.collection.AbstractTraversable.reduceOption(Traversable.scala:105)
at org.apache.spark.sql.parquet.ParquetRelation2$.readSchema(newParquet.scala:529)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.org$apache$spark$sql$parquet$ParquetRelation2$MetadataCache$$readSchema(newParquet.scala:434)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$11.apply(newParquet.scala:369)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$11.apply(newParquet.scala:369)
at scala.Option.orElse(Option.scala:257)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:369)
at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache$lzycompute(newParquet.scala:126)
at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache(newParquet.scala:124)
at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$dataSchema$1.apply(newParquet.scala:165)
at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$dataSchema$1.apply(newParquet.scala:165)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.parquet.ParquetRelation2.dataSchema(newParquet.scala:165)
at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:506)
at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:505)
at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:438)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:264)
at com.zhe800.toona.lr.computation.QianBai$.main(QianBai.scala:817)
at com.zhe800.toona.lr.computation.QianBai.main(QianBai.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Failed to merge incompatible data types StringType and BinaryType
at org.apache.spark.sql.types.StructType$.merge(StructType.scala:265)
at org.apache.spark.sql.types.StructType$$anonfun$merge$1$$anonfun$apply$4.apply(StructType.scala:239)
at org.apache.spark.sql.types.StructType$$anonfun$merge$1$$anonfun$apply$4.apply(StructType.scala:237)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.types.StructType$$anonfun$merge$1.apply(StructType.scala:237)
at org.apache.spark.sql.types.StructType$$anonfun$merge$1.apply(StructType.scala:233)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.sql.types.StructType$.merge(StructType.scala:233)
at org.apache.spark.sql.types.StructType.merge(StructType.scala:191)
at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$readSchema$2.apply(newParquet.scala:530)
... 36 more
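The code was copied directly from someone else. Same environment, same code, and yet it fails here, which was frustrating. As the two schemas in the exception show, one set of Parquet files declares the text columns as StringType while the other declares the same columns as BinaryType, so the schemas cannot be merged.
The official Spark documentation (http://spark.apache.org/docs/latest/sql-programming-guide.html) describes the relevant option:
Property name: spark.sql.parquet.binaryAsString
Default: false
Meaning: Some other Parquet-producing systems, in particular Impala, Hive, and older versions of Spark SQL, do not differentiate between binary data and strings when writing out the Parquet schema. This flag tells Spark SQL to interpret binary data as a string to provide compatibility with these systems.
So adding the following line to the code, before the Parquet files are loaded, resolves the problem:
sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")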
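For context, here is a minimal sketch of where that setting could sit in a Spark 1.4-style driver program. The object name BinaryAsStringExample, the application name, and the input path are hypothetical placeholders that do not come from the original code. The only essential point is that setConf is called before sqlContext.read.parquet, since the stack trace shows the schema merge happening during the read.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver object, for illustration only.
object BinaryAsStringExample {
  def main(args: Array[String]): Unit = {
    // Placeholder path; pass the real Parquet directory as the first argument.
    val path = args.headOption.getOrElse("/path/to/parquet")

    val sc = new SparkContext(new SparkConf().setAppName("BinaryAsStringExample"))
    val sqlContext = new SQLContext(sc)

    // Must be set before the read: the schemas are merged while the
    // Parquet relation is being created, as the stack trace shows.
    sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")

    val df = sqlContext.read.parquet(path)

    // Columns that were BinaryType in some files should now be reported as strings.
    df.printSchema()

    sc.stop()
  }
}
```

The same property can presumably also be supplied at submit time (for example through --conf on spark-submit) instead of in code. That variant was not tried in the original post, so treat it as an assumption.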