Big Data Learning [09]: Presto 0.184 Cluster | Multiple Data Sources | Problems
Abstract: Download and install Presto 0.184, configure it in cluster mode, and test connectors to multiple data sources, including Hive and MySQL. Problems hit during installation and configuration, and their fixes, are recorded along the way.
Prerequisites
For Hadoop and Hive, see:
Big Data Learning [04]: Hive installation and configuration:
http://blog.csdn.net/ld326/article/details/78023101
Big Data Learning [02]: Hadoop installation and configuration: http://blog.csdn.net/ld326/article/details/78004402
Official site
https://prestodb.io/
Download
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.184/presto-server-0.184.tar.gz
Extract
tar zxvf presto-server-0.184.tar.gz
Create a data directory
In the same parent directory as presto-server-0.184, create a separate folder for data and logs (e.g. presto-data). Keeping it outside the install directory means it does not need to be touched when Presto is upgraded later.
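The layout can be sketched in a couple of commands; a minimal sketch, assuming the paths used in this walkthrough (the helper name is my own):

```shell
# Create the Presto install and data directories side by side, so that
# replacing presto-server-0.184 during an upgrade never touches logs
# or node data.
install_presto_layout() {
    base="$1"                      # parent directory, e.g. /home/hadoop
    mkdir -p "$base/presto-server-0.184"
    mkdir -p "$base/data/presto"   # becomes node.data-dir later
}

# Demonstrate against a throwaway directory:
demo=$(mktemp -d)
install_presto_layout "$demo"
ls "$demo"
```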
Configuration
The final configuration file layout:
etc
├── catalog
│   └── hive.properties
├── config.properties
├── jvm.config
├── log.properties
└── node.properties
Create an etc directory
Create an etc directory under the Presto home directory; it holds the following configuration:
● Node Properties: environmental configuration specific to each node
● JVM Config: command line options for the Java Virtual Machine
● Config Properties: configuration for the Presto server
● Catalog Properties: configuration for Connectors (data sources)
Node properties
Create etc/node.properties first. On a given machine, a node is a single installed instance of Presto. A minimal etc/node.properties:
node.environment=pcluster
node.id=111
node.data-dir=/home/hadoop/data/presto
Remember to create the /home/hadoop/data/presto directory. Once Presto is running, it will look like this:
data
└── presto
    ├── etc -> /home/hadoop/presto-server-0.184/etc
    ├── plugin -> /home/hadoop/presto-server-0.184/plugin
    └── var
        ├── log
        │   ├── http-request.log
        │   ├── launcher.log
        │   └── server.log
        └── run
            └── launcher.pid
● node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.
● node.id: The unique identifier for this installation of Presto. This must be unique for every node and should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
● node.data-dir: The location (filesystem path) of the data directory. Presto will store logs and other data here.
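The three properties above can be generated per node with a small script; a sketch (the helper name and the hostname-derived node.id are my own choices; any scheme works as long as the id is unique per node and stable across restarts):

```shell
# Write etc/node.properties for one node.  node.environment must be
# identical on every node in the cluster; node.id must be unique per
# node and stable across restarts, so it is derived from the hostname
# rather than generated randomly on each run.
write_node_properties() {
    etc_dir="$1"; env_name="$2"; node_id="$3"; data_dir="$4"
    mkdir -p "$etc_dir"
    cat > "$etc_dir/node.properties" <<EOF
node.environment=$env_name
node.id=$node_id
node.data-dir=$data_dir
EOF
}

# Example: same environment name everywhere, per-host id.
demo=$(mktemp -d)
write_node_properties "$demo" pcluster "$(hostname)" /home/hadoop/data/presto
cat "$demo/node.properties"
```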
JVM config
File: etc/jvm.config
-server
-Xmx1G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
Configuring the Presto server
File: etc/config.properties
A Presto server plays one of two roles: coordinator or worker.
Every Presto server can function as both a coordinator and a worker, but dedicating a single machine to only perform coordination work provides the best performance on larger clusters.
coordinator [192.168.137.101]:

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=1GB
query.max-memory-per-node=512MB
discovery-server.enabled=true
discovery.uri=http://hadoop01:8080

workers [192.168.137.102, 192.168.137.103]:

coordinator=false
http-server.http.port=8080
query.max-memory=1GB
discovery.uri=http://hadoop01:8080

Or, to give a single machine both roles:

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=1GB
query.max-memory-per-node=512MB
discovery-server.enabled=true
discovery.uri=http://hadoop01:8080

The only change is node-scheduler.include-coordinator=true, which lets the coordinator also act as a worker node.
Note: hadoop01 here is a hostname; configuring an IP address instead causes errors.
Notes on these properties:
● coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).
● node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.
● http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.
● query.max-memory: The maximum amount of distributed memory that a query may use.
● query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.
● discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.
● discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace example.net:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
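The notes above translate into a quick pre-flight check. A minimal sketch, where the check_config name and the specific rules are my own, based on the pitfalls described later in this article (trailing slash in discovery.uri, raw IP instead of hostname):

```shell
# Sanity-check a config.properties before starting the server.
# Prints a message and returns non-zero for each problem found.
check_config() {
    cfg="$1"; rc=0
    uri=$(sed -n 's/^discovery\.uri=//p' "$cfg")
    case "$uri" in
        */) echo "discovery.uri must not end in a slash"; rc=1 ;;
    esac
    # A raw IPv4 address after http:// caused announce failures in this setup.
    if printf '%s\n' "$uri" | grep -Eq '^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'; then
        echo "discovery.uri uses a raw IP; use the coordinator hostname"
        rc=1
    fi
    grep -q '^coordinator=' "$cfg" || { echo "missing coordinator= line"; rc=1; }
    return $rc
}

# A well-formed coordinator config passes silently:
cfg=$(mktemp)
printf 'coordinator=true\ndiscovery.uri=http://hadoop01:8080\n' > "$cfg"
check_config "$cfg" && echo OK    # prints "OK"
```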
Log configuration
File: etc/log.properties
Contents:
com.facebook.presto=INFO
Configuring data source connections (Catalog Properties)
Connectors
● 5.1. Accumulo Connector
● 5.2. Black Hole Connector
● 5.3. Cassandra Connector
● 5.4. Hive Connector
● 5.5. Hive Security Configuration
● 5.6. JMX Connector
● 5.7. Kafka Connector
● 5.8. Kafka Connector Tutorial
● 5.9. Local File Connector
● 5.10. Memory Connector
● 5.11. MongoDB Connector
● 5.12. MySQL Connector
● 5.13. PostgreSQL Connector
● 5.14. Redis Connector
● 5.15. SQL Server Connector
● 5.16. System Connector
● 5.17. Thrift Connector
● 5.18. TPCDS Connector
● 5.19. TPCH Connector
Configuring the Hive catalog
vim hive.properties

connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop01:9083
hive.config.resources=/home/hadoop/hadoop/etc/hadoop/core-site.xml,/home/hadoop/apache-hive-1.2.1-bin/conf/hdfs-site.xml
hive.allow-drop-table=true

Note that hive-hadoop2 is not an arbitrary name; Presto will not recognize any other value here.
Presto Hive configuration property reference
Startup
There are two ways to start Presto:
As a daemon: bin/launcher start
In the foreground: bin/launcher run
If Presto runs under a supervisor (e.g. supervisord), use the foreground mode.
Presto CLI
Download:
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.184/presto-cli-0.184-executable.jar
Rename it to presto, make it executable with chmod +x, then run:
./presto --server localhost:8080 --catalog hive --schema default
Tip: run the CLI with the --help option to see the available options.
Starting the CLI:
[hadoop@hadoop01 ~]$ ./presto.jar --server 192.168.137.101:8080 --catalog hive --schema default
The --server value must match the host and port configured in config.properties.
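Beyond the interactive prompt, the CLI also supports scripted use through its --execute and --output-format options. A small wrapper sketch (the function name and the environment-variable defaults are my own):

```shell
# Run one statement against Presto non-interactively, e.g. from cron.
# PRESTO_CLI and PRESTO_SERVER can be overridden from the environment,
# which also makes the wrapper easy to test against a stub.
presto_query() {
    "${PRESTO_CLI:-./presto}" \
        --server "${PRESTO_SERVER:-hadoop01:8080}" \
        --catalog hive --schema default \
        --output-format CSV \
        --execute "$1"
}
```

Usage: presto_query 'SHOW TABLES' prints one table name per line in CSV form, which pipes cleanly into other tools.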
2017-09-24T22:04:24.360+0800 INFO main Bootstrap PROPERTY  DEFAULT  RUNTIME  DESCRIPTION
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.allow-corrupt-writes-for-testing  false  false  Allow Hive connector to write data even when data will likely be corrupt
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore.authentication.type  NONE  NONE  Hive Metastore authentication type
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.config.resources  null  [/home/hadoop/hadoop/etc/hadoop/core-site.xml, /home/hadoop/apache-hive-1.2.1-bin/conf/hdfs-site.xml]
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.metastore.uri  null  [thrift://hadoop01:9083]  Hive metastore URIs (comma separated)
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-drop-table  false  true  Allow Hive connector to drop table
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.security  legacy  legacy
... (the rest of the startup dump, trimmed here, lists every Hive connector property with its default and runtime value)
Query:

presto:default> show tables;
   Table
-----------
 dual
 student
 student01
 student02
 student03
 student04
 student05
(7 rows)

Query 20170924_141223_00007_bg85h, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:01 [7 rows, 175B] [11 rows/s, 290B/s]

presto:default> select * from student05;
 student_id | student_name |    born    | sex
------------+--------------+------------+-----
 dlmu_01    | happyprince  | 2015-07-08 | 0
(1 row)

Query 20170924_141229_00008_bg85h, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:01 [1 rows, 358B] [1 rows/s, 448B/s]

presto:default> show tables;
   Table
-----------
 dual
 student
 student01
 student02
 student03
 student04
 student05
(7 rows)

Query 20170924_145732_00004_uaaik, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:01 [7 rows, 175B] [9 rows/s, 238B/s]

presto:default> select * from student;
 student_id | student_name
------------+--------------
 dlmu_01    | happyprince
(1 row)

Query 20170924_145746_00005_uaaik, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
0:05 [1 rows, 20B] [0 rows/s, 3B/s]
Configuring other connectors
Inspecting the system connector
(This one ships with Presto; no configuration needed.)
presto> show schemas from system;
       Schema
--------------------
 information_schema
 jdbc
 metadata
 runtime
(4 rows)

Query 20170924_154001_00002_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:02 [4 rows, 57B] [2 rows/s, 35B/s]
presto> show tables from system.information_schema;
          Table
-------------------------
 __internal_partitions__
 columns
 schemata
 table_privileges
 tables
 views
(6 rows)

Query 20170924_154021_00003_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:02 [6 rows, 233B] [3 rows/s, 136B/s]
presto> select * from system.information_schema.columns;
 table_catalog | table_schema | table_name | column_name | ordinal_position | column_default | is_nullable | data_type | comment | extra_info
---------------+--------------+------------+-------------+------------------+----------------+-------------+-----------+---------+------------
 system        | runtime      | queries    | node_id     |                1 | NULL           | YES         | varchar   | NULL    | NULL
 system        | runtime      | queries    | query_id    |                2 | NULL           | YES         | varchar   | NULL    | NULL
 system        | runtime      | queries    | state       |                3 | NULL           | YES         | varchar   | NULL    | NULL
 system        | runtime      | queries    | user        |                4 | NULL           | YES         | varchar   | NULL    | NULL
 system        | runtime      | queries    | source      |                5 | NULL           | YES         | varchar   | NULL    | NULL
Configuring JMX
[hadoop@hadoop01 catalog]$ vim jmx.properties

connector.name=jmx
jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler
jmx.dump-period=10s
jmx.max-entries=86400
presto> SHOW TABLES FROM jmx.current;
                                    Table
-------------------------------------------------------------------------------
 com.facebook.presto.execution.executor:name=multilevelsplitqueue
 com.facebook.presto.execution.executor:name=taskexecutor
 com.facebook.presto.execution.resourcegroups:name=internalresourcegroupmanager
 com.facebook.presto.execution.scheduler:name=nodescheduler
 com.facebook.presto.execution.scheduler:name=splitschedulerstats
 com.facebook.presto.execution:name=queryexecution
 com.facebook.presto.execution:name=querymanager
 com.facebook.presto.execution:name=remotetaskfactory
 com.facebook.presto.execution:name=taskmanager
 com.facebook.presto.failuredetector:name=heartbeatfailuredetector
 com.facebook.presto.hive.metastore:type=cachinghivemetastore,name=hive
 com.facebook.presto.hive.metastore:type=thrifthivemetastore,name=hive
 com.facebook.presto.memory:name=clustermemorymanager
 com.facebook.presto.metadata:name=discoverynodemanager
 com.facebook.presto.security:name=accesscontrolmanager
 com.facebook.presto.sql.gen:name=expressioncompiler
 ... (output trimmed; 177 rows in total)

Query 20170924_154511_00014_hnbrr, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:03 [177 rows, 15.7KB] [67 rows/s, 5.98KB/s]
presto> SELECT node, vmname, vmversion
     -> FROM jmx.current."java.lang:type=runtime";
 node  |              vmname               | vmversion
-------+-----------------------------------+------------
 12345 | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
 111   | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
 222   | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
(3 rows)

Query 20170924_154519_00015_hnbrr, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
0:00 [3 rows, 140B] [11 rows/s, 523B/s]
presto> SELECT openfiledescriptorcount, maxfiledescriptorcount
     -> FROM jmx.current."java.lang:type=operatingsystem";
 openfiledescriptorcount | maxfiledescriptorcount
-------------------------+------------------------
                     978 |                   4096
                     919 |                   4096
                     925 |                   4096
(3 rows)

Query 20170924_154527_00016_hnbrr, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
0:00 [3 rows, 48B] [8 rows/s, 130B/s]
Configuring the Local File Connector
The local file connector allows querying data stored on the local file system of each worker.
Create the file etc/catalog/localfile.properties:
connector.name=localfile
Configuration properties

Property Name                             Description
presto-logs.http-request-log.location     Directory or file where HTTP request logs are written
presto-logs.http-request-log.pattern      If the log location is a directory, this glob is used to match file names in the directory

Local File Connector schemas and tables: the connector provides a single schema named logs. You can see all the available tables by running SHOW TABLES:

SHOW TABLES FROM localfile.logs;

http_request_log: this table contains the HTTP request logs from each node on the cluster.
Test results:
presto> show schemas from localfile;
       Schema
--------------------
 information_schema
 logs
(2 rows)

Query 20170924_154211_00006_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [2 rows, 32B] [4 rows/s, 79B/s]
presto> show tables from localfile.logs;
      Table
------------------
 http_request_log
(1 row)

Query 20170924_154222_00007_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [1 rows, 30B] [2 rows/s, 89B/s]
presto> show tables from localfile.information_schema;
          Table
-------------------------
 __internal_partitions__
 columns
 schemata
 table_privileges
 tables
 views
(6 rows)

Query 20170924_154232_00008_hnbrr, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:00 [7 rows, 263B] [19 rows/s, 728B/s]
presto> select * from localfile.information_schema.tables;
 table_catalog |    table_schema    |       table_name        | table_type
---------------+--------------------+-------------------------+------------
 localfile     | information_schema | columns                 | BASE TABLE
 localfile     | information_schema | tables                  | BASE TABLE
 localfile     | information_schema | views                   | BASE TABLE
 localfile     | information_schema | schemata                | BASE TABLE
 localfile     | information_schema | __internal_partitions__ | BASE TABLE
 localfile     | information_schema | table_privileges        | BASE TABLE
 localfile     | logs               | http_request_log        | BASE TABLE
(7 rows)

Query 20170924_154251_00009_hnbrr, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [7 rows, 466B] [32 rows/s, 2.11KB/s]
Configuring a MySQL catalog for testing
[hadoop@hadoop01 catalog]$ vim mysql.properties
with the following contents:

connector.name=mysql
connection-url=jdbc:mysql://hadoop01:3306
connection-user=root
connection-password=AAAaaa111

Test results:
presto> SHOW SCHEMAS FROM mysql;
       Schema
--------------------
 hive
 information_schema
 test
(3 rows)

Query 20170924_152408_00007_r5qdw, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [3 rows, 41B] [10 rows/s, 142B/s]
presto> SHOW tables FROM mysql.hive;
           Table
---------------------------
 bucketing_cols
 cds
 columns_v2
 database_params
 dbs
 func_ru
 funcs
 global_privs
 part_col_stats
 partition_key_vals
 partition_keys
 partition_params
 partitions
 roles
 sd_params
 sds
 sequence_table
 serde_params
 serdes
 skewed_col_names
 skewed_col_value_loc_map
 skewed_string_list
 skewed_string_list_values
 skewed_values
 sort_cols
 tab_col_stats
 table_params
 tbls
 version
(29 rows)

Query 20170924_152420_00008_r5qdw, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [29 rows, 737B] [66 rows/s, 1.66KB/s]
presto> desc mysql.hive.version;
     Column      |     Type     | Extra | Comment
-----------------+--------------+-------+---------
 ver_id          | bigint       |       |
 schema_version  | varchar(127) |       |
 version_comment | varchar(255) |       |
(3 rows)

Query 20170924_152526_00009_r5qdw, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:01 [3 rows, 215B] [2 rows/s, 173B/s]
presto> select * from mysql.hive.version;
 ver_id | schema_version |             version_comment
--------+----------------+-----------------------------------------
      1 | 1.2.0          | Set by MetaStore hadoop@192.168.137.101
(1 row)

Query 20170924_152559_00010_r5qdw, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:02 [1 rows, 0B] [0 rows/s, 0B/s]
Limitations of the MySQL connector
Problems
Error 01
java.lang.IllegalArgumentException: No factory for connector hive
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:191)
    at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:170)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:99)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:77)
    at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:120)
    at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:67)
This happens when the connector name is wrong. Remember that for Hive it must be hive-hadoop2; Presto does not recognize any other name for this setup.
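This mistake can be caught before startup by checking each catalog file against the shipped plugin directories; a sketch (the helper name is mine, and matching connector.name against plugin/ directory names is an assumption that holds for the 0.184 layout used here, where connectors live in plugin/hive-hadoop2, plugin/mysql, plugin/jmx, and so on):

```shell
# Verify that every etc/catalog/*.properties names a connector that
# exists as a plugin directory under the Presto home.
check_catalogs() {
    presto_home="$1"; rc=0
    for f in "$presto_home"/etc/catalog/*.properties; do
        name=$(sed -n 's/^connector\.name=//p' "$f")
        if [ ! -d "$presto_home/plugin/$name" ]; then
            echo "$f: no plugin directory for connector '$name'"
            rc=1
        fi
    done
    return $rc
}

# Example against a mock layout: 'hive-hadoop2' passes.
home=$(mktemp -d)
mkdir -p "$home/etc/catalog" "$home/plugin/hive-hadoop2"
printf 'connector.name=hive-hadoop2\n' > "$home/etc/catalog/hive.properties"
check_catalogs "$home" && echo OK    # prints "OK"
```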
Error 02
2017-09-24T22:25:57.586+0800 ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:

1) Error: Invalid configuration property node.environment: is malformed (for class io.airlift.node.NodeConfig.environment)
2) Configuration property 'ordinator' was not used
      at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2 errors

com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Error: Invalid configuration property node.environment: is malformed (for class io.airlift.node.NodeConfig.environment)
2) Configuration property 'ordinator' was not used
      at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2 errors
    at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
    at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
    at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
    at com.google.inject.Guice.createInjector(Guice.java:96)
    at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:242)
    at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:116)
    at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:67)
Fix the node names: node.environment was malformed, so use a plain string of letters for the environment name and a simple alphanumeric id for node.id. The second error means a property name is misspelled; check every configuration file for mistyped or missing letters. In my case, copying the configuration dropped letters from coordinator, leaving "ordinator".
Error 03
2017-09-24T22:40:08.513+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Cannot connect to discovery server for announce: Announcement failed for http://192.169.137.101:8080
2017-09-24T22:40:08.514+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 1.16s. Next request will happen within 0.00s
2017-09-24T22:40:09.278+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 744.71ms. Next request will happen within 1.00ms
2017-09-24T22:40:09.539+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 258.22ms. Next request will happen within 2.00ms
2017-09-24T22:40:10.282+0800 ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 739.25ms. Next request will happen within 4.00ms
... (the announcement failures repeat, backing off to a 1000.00ms retry interval)
Changing the IP in discovery.uri to a domain or hostname fixed this; a raw IP is not recognized here.
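If the raw IP is already baked into config.properties on every node, a one-line rewrite applies the fix. A sketch: fix_discovery_uri is my own helper, and it assumes the hostname resolves on every node (e.g. via /etc/hosts):

```shell
# Replace whatever host appears in discovery.uri with the coordinator
# hostname, keeping the port; a .bak copy of the file is left behind.
fix_discovery_uri() {
    cfg="$1"; host="$2"
    sed -i.bak "s#^discovery.uri=http://[^:/]*#discovery.uri=http://$host#" "$cfg"
}

cfg=$(mktemp)
echo 'discovery.uri=http://192.169.137.101:8080' > "$cfg"
fix_discovery_uri "$cfg" hadoop01
cat "$cfg"    # discovery.uri=http://hadoop01:8080
```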
Also, if the coordinator is restarted, the other nodes first report that they cannot find it, then recover on their own.
Once the Presto engine is configured, it can query the data no matter what form the source takes. For a Windows GUI similar to Oracle's SQL/PL query tools, consider:
Connecting to Presto: SQuirreL SQL Client installation and configuration:
http://blog.csdn.net/ld326/article/details/77987507
[Author: happyprince, http://blog.csdn.net/ld326/article/details/78088863]