Big Data Fundamentals (8): Installing and Configuring IPython and Notebook with Spark 2.0.0
Environment:
spark 2.0.0,anaconda2
1. Installing and configuring IPython and Notebook for pyspark
Method 1:
With this method you can open IPython Notebook through a web page, while pyspark can still be started from a separate terminal. If you have Anaconda installed, you can get the IPython interface directly as follows; if you do not, see the link at the bottom and install the IPython packages yourself.
vi ~/.bashrc
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.open_browser=False --NotebookApp.ip='*' --NotebookApp.port=8880"
source ~/.bashrc
Restart pyspark. You should see the behavior the Cloudera documentation describes under "Starting a Notebook with PySpark":
On the driver host, choose a directory notebook_directory to run the Notebook. notebook_directory contains the .ipynb files that represent the different notebooks that can be served.
In notebook_directory, run pyspark with your desired runtime options. You should see output like the following:
Reference:
IPython and Jupyter on Spark 2.0.0
http://www.cloudera.com/documentation/enterprise/5-5-x/topics/spark_ipython.html
Method 2:
Method 2 works with ipython, but jupyter gave me problems; I am not sure whether that is an isolated case.
It is also possible to launch the PySpark shell in IPython, the enhanced Python interpreter. PySpark works with IPython 1.0.0 and later. To use IPython, set the PYSPARK_DRIVER_PYTHON variable to ipython when running bin/pyspark:
$ PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
To use the Jupyter notebook (previously known as the IPython notebook),
$ PYSPARK_DRIVER_PYTHON=jupyter ./bin/pyspark
You can customize the ipython or jupyter commands by setting PYSPARK_DRIVER_PYTHON_OPTS.
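For instance, the notebook options from Method 1 can be passed inline instead of through ~/.bashrc (a sketch; the port 8880 and wildcard IP are the same illustrative values used above, and $SPARK_HOME is assumed to be set):

```shell
# Start pyspark under Jupyter without opening a browser, listening on port 8880.
# Port and IP below are illustrative; adjust them for your cluster.
PYSPARK_DRIVER_PYTHON=jupyter \
PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --ip='*' --port=8880" \
$SPARK_HOME/bin/pyspark
```

Because the variables are set only for this one command, the regular pyspark shell is unaffected in other terminals.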
root@py-server:/server/bin# PYSPARK_DRIVER_PYTHON=ipython $SPARK_HOME/bin/pyspark
Python 2.7.12 |Anaconda 4.1.1 (64-bit)| (default, Jul 2 2016, 17:42:40)
Type "copyright", "credits" or "license" for more information.
IPython 4.2.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/03 22:24:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/
Using Python version 2.7.12 (default, Jul 2 2016 17:42:40)
SparkSession available as 'spark'.
In [1]:
2. Usage:
Open http://notebook_host:8880/ in a browser.
For example: http://spark01:8880/
New -> Python opens a new Python notebook.
Shift+Enter or Shift+Return runs the current cell.
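As a first sanity check, a cell like the following can be run in the new notebook (a sketch; sc and spark are created automatically by the pyspark driver, so this only works inside such a notebook):

```python
# Run inside a pyspark-backed notebook: 'sc' (the SparkContext) already exists.
print(sc.version)                 # should print the Spark version, e.g. 2.0.0
rdd = sc.parallelize(range(100))  # distribute a small dataset across the workers
print(rdd.sum())                  # 0 + 1 + ... + 99 = 4950
```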
Note:
Once PYSPARK_DRIVER_PYTHON is set to IPython, pyspark will only launch through IPython unless you restore the environment variables.
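To get the plain pyspark shell back for the current session, unset the two variables (a sketch; if they were put in ~/.bashrc as in Method 1, also remove the export lines there):

```shell
# Clear the driver overrides so pyspark falls back to its default shell.
unset PYSPARK_DRIVER_PYTHON PYSPARK_DRIVER_PYTHON_OPTS
$SPARK_HOME/bin/pyspark
```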
3. Test example
Quoted from: Spark for Python Developers
Replace file_in with your own file: for a local file use the commented (#) line, for HDFS keep the default line and just adjust the address.
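The book's example is a word count; a minimal sketch along those lines follows (the paths and host name below are placeholders, not the book's exact values; substitute your own file_in, and run it inside a pyspark session where sc already exists):

```python
# Word count, after the example in "Spark for Python Developers".
# Pick ONE of the two file_in lines; the paths/hosts are placeholders.
# file_in = 'file:///home/user/data/words.txt'            # local file
file_in = 'hdfs://spark01:9000/user/data/words.txt'       # HDFS (adjust address)

lines = sc.textFile(file_in)
counts = (lines.flatMap(lambda line: line.split())    # split each line into words
               .map(lambda word: (word, 1))           # pair each word with a 1
               .reduceByKey(lambda a, b: a + b))      # sum the counts per word
print(counts.takeOrdered(10, key=lambda kv: -kv[1]))  # 10 most frequent words
```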