Installing a popular script editor (Jupyter Notebook): a walkthrough

Source: Internet  Editor: 程序博客网  Time: 2024/05/01 11:02

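Jupyter Notebook is a popular, lightweight, browser-based code editor that supports dozens of programming languages.
Jupyter Notebook is also feature-rich: it is convenient for writing documents, data science analysis, and computation.
Jupyter Notebook is available on both Windows and Linux. Installing it on Windows is very simple, while installing it on Linux is more involved. I spent quite some time getting Jupyter Notebook installed, so I am sharing the procedure here.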

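1, Install Python 2.7 from source

(Python 3.x had some incompatibilities with this setup, so 2.7 is used here.)

Download the Python 2.7.6 source package from the official site:

https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz

Extract it:

xz -d Python-2.7.6.tar.xz
tar -xvf Python-2.7.6.tar

After extracting, enter the source directory, then configure, build, and install:

cd Python-2.7.6
sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make
sudo make altinstall

If no errors are reported, Python 2.7 is now installed on your server.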

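2, Change the Python version that the server's python command runs by default (usually 2.6 or lower)

Typing python in a terminal shows that the system still uses its original 2.6 version rather than our 2.7.

You can run which python to see which interpreter the python command invokes; it is usually /usr/bin/python.

That python points to the 2.6 version shipped with the system, whereas the Python 2.7 we installed is at /usr/local/bin/python2.7 (the leading part of the path depends on your install prefix).

Rather than deleting /usr/bin/python, rename it so the system interpreter stays available as python2.6, then create a symlink to the new one:

sudo mv /usr/bin/python /usr/bin/python2.6
sudo ln -s /usr/local/bin/python2.7 /usr/bin/python

Now type python in the terminal once more.

Bingo! It is now version 2.7.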
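To double-check which interpreter the python command now resolves to, a small script can report the version and the executable path. This is only a sketch; the helper name describe_interpreter is mine, not part of any tool.

```python
import sys


def describe_interpreter():
    """Return ((major, minor), executable_path) for the running interpreter."""
    version = (sys.version_info[0], sys.version_info[1])
    return version, sys.executable


if __name__ == "__main__":
    version, path = describe_interpreter()
    print("python %d.%d at %s" % (version[0], version[1], path))
    if version != (2, 7):
        # This guide assumes 2.7; anything else suggests the symlink is wrong.
        print("warning: expected 2.7; check the /usr/bin/python symlink")
```

Saving this to a file and running it with python shows whether the symlink took effect.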

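3, yum no longer works

If you now run yum install xxxx, it complains that the yum module cannot be found.

yum itself depends on Python. After we changed the default python, yum runs under our 2.7 interpreter, which has no yum module, so it fails.

To fix this, locate yum with which yum (or whereis yum), edit the file, and change its first line from #!/usr/bin/python to #!/usr/bin/python2.6:

whereis yum
sudo vi /usr/bin/yum

(/usr/bin still contains the python2.6 file.) After this change, yum works again.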
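The same shebang edit can be expressed as a small text transformation, which is handy if several system scripts need pinning. A minimal sketch, assuming the script's first line is exactly #!/usr/bin/python; pin_shebang is a hypothetical helper name.

```python
def pin_shebang(script_text, old="#!/usr/bin/python", new="#!/usr/bin/python2.6"):
    """Return script_text with its first line changed from `old` to `new`.

    Only an exact match of the first line is rewritten, so running the
    function twice leaves the text unchanged the second time.
    """
    lines = script_text.split("\n")
    if lines and lines[0].rstrip() == old:
        lines[0] = new
    return "\n".join(lines)
```

Reading and writing /usr/bin/yum itself still requires root, and keeping a backup copy before overwriting it is a sensible precaution.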

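4, Install setuptools and pip

pip is a very convenient tool for working with Python, and it depends on setuptools, so setuptools has to be installed first. (I downloaded the setuptools and pip packages directly from their official sites.)

(1) Install setuptools

The installation failed with an error saying Python's zlib module could not be found.

Download zlib and zlib-dev.

Extract the archive, enter the directory, then build and install:

sudo ./configure --prefix=/usr/local/zlib-1.2.11
sudo make
sudo make install
sudo yum -y install python-setuptools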

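(2) Install pip's dependencies, openssl and openssl-devel

Installing pip produced another error: the HTTPSHandler module could not be loaded.

A search online showed that this happens when the system's openssl and openssl-devel are not installed. On my system only openssl-devel was missing, so I installed it:

sudo yum -y install openssl
sudo yum -y install openssl-devel

Then rebuild and reinstall Python 2.7 so that it picks up the new libraries, using the same commands as before:

sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make
sudo make altinstall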
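After each rebuild it is worth confirming that the interpreter actually gained the modules in question. The sketch below tries to import the C-backed modules this guide runs into (zlib and ssl here, sqlite3 further down); missing_modules is a helper name of my own.

```python
import importlib


def missing_modules(names=("zlib", "ssl", "sqlite3")):
    """Return the subset of `names` that the running interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing: %s" % (missing_modules() or "none"))
```

An empty result means the rebuild picked up the development libraries; a non-empty one usually means the corresponding -devel package was absent when configure ran.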

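Download pip:

wget https://bootstrap.pypa.io/get-pip.py --no-check-certificate

Install pip:

sudo python get-pip.py

Check the pip version:

pip -V

After installing pip, it is a good idea to install py4j as well, because the pyspark environment needs this module.

pip install py4j --upgrade

Once that is done, rebuild Python (run make again).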

Install Jupyter

sudo pip install jupyter

Check the Jupyter version:

jupyter --version

Start Jupyter:

jupyter notebook

This fails with:

Traceback (most recent call last):
  File "/usr/local/bin/jupyter-notebook", line 7, in <module>
    from notebook.notebookapp import main
  File "/usr/local/lib/python2.7/site-packages/notebook/notebookapp.py", line 79, in <module>
    from .services.sessions.sessionmanager import SessionManager
  File "/usr/local/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 13, in <module>
    from pysqlite2 import dbapi2 as sqlite3
ImportError: No module named 'pysqlite2'

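The error means the SQLite development files (sqlite3-dev) are missing. Download the SQLite source:

sudo wget http://www.sqlite.org/2014/sqlite-autoconf-3080500.tar.gz

or:

sudo wget http://sqlite.org/2013/sqlite-autoconf-3080100.tar.gz

Build and install it (the commands below use the 3080100 archive; adjust the file name if you downloaded the other one):

tar xvfz sqlite-autoconf-3080100.tar.gz
cd sqlite-autoconf-3080100
sudo ./configure
sudo make
sudo make install

Note: you may also need libsqlite3-0-32bit-3.8.10.2-10.1.x86_64.rpm; search for it (for example on Baidu), then download and install it.

Then rebuild Python:

sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make
sudo make altinstall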

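Configure Jupyter

Edit /etc/profile and add:

# jupyter------------------------
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

Reload it:

source /etc/profile

Then run again:

jupyter notebook

If the problem persists, it may be a permissions issue that left the build incomplete. Run:

make clean

and build again; repeat a few times if necessary.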

Change Jupyter's default port:

sudo vi ~/.jupyter/jupyter_notebook_config.py

If jupyter_notebook_config.py does not exist, create it first:

jupyter notebook --generate-config

Generate a password hash. From the shell, start IPython:

ipython

Then, at the prompt:

In [1]: from notebook.auth import passwd
In [2]: passwd()
Enter password: 
Verify password: 
Out[2]: 'sha1:ce23d945972f:34769685a7ccd3d08c84a18c63968a41f1140274'

Copy the generated hash ('sha1:ce…').
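The output above has three colon-separated parts: an algorithm name, a 12-character hex salt, and a hex digest. The sketch below reproduces that layout with hashlib so the format is easier to read. It is my own illustration and assumes the digest is taken over the passphrase followed by the salt; for a real deployment, use notebook.auth.passwd itself.

```python
import hashlib
import random


def make_hash(passphrase, algorithm="sha1", salt_len=12):
    """Build an 'algorithm:salt:hexdigest' string from a passphrase."""
    salt = ("%0" + str(salt_len) + "x") % random.getrandbits(4 * salt_len)
    digest = hashlib.new(algorithm)
    # Assumption: the digest covers the passphrase bytes followed by the salt.
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, digest.hexdigest()))


def check_hash(stored, passphrase):
    """Return True if `passphrase` reproduces the digest inside `stored`."""
    try:
        algorithm, salt, expected = stored.split(":", 2)
    except ValueError:
        return False  # not in 'algorithm:salt:digest' form
    digest = hashlib.new(algorithm)
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return digest.hexdigest() == expected
```

Because the salt is random, hashing the same password twice gives different strings; both still verify against the original password.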

Edit the default configuration in jupyter_notebook_config.py:

c.NotebookApp.ip = '*'

c.NotebookApp.password = u'sha1:ce…the hash you just copied'

c.NotebookApp.open_browser = False

c.NotebookApp.port = 8888  # any free port will do

To change the default port from 8888 to 8990:

c.NotebookApp.port = 8990
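Before settling on a port, it can help to confirm that nothing else on the server is already bound to it. A minimal sketch using only the standard socket module; port_is_free is a helper name of my own and the check covers the loopback address by default.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if `port` on `host` can be bound right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        # Typically "address already in use" (or a privileged port).
        return False
    finally:
        sock.close()
```

For example, port_is_free(8990) returning False would suggest picking another value for c.NotebookApp.port.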

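Start Jupyter Notebook again.

If you cannot log in, the server's firewall may be the cause. The simplest workaround is to open an SSH tunnel from your local machine.

In a local terminal, run ssh username@address_of_remote -L 127.0.0.1:1234:127.0.0.1:8888 (use the port you configured on the server in place of 8888).

You can then reach the remote Jupyter directly at localhost:1234.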
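If the tunnel is opened often, the command can be assembled programmatically so the two ports are easy to change. This is a sketch only: tunnel_command is a hypothetical helper, and I add ssh's -N option (do not run a remote command), which the command above does not use.

```python
def tunnel_command(user, remote_host, local_port=1234, remote_port=8888):
    """Return the argv list for an SSH local port forward to a remote Jupyter."""
    spec = "127.0.0.1:%d:127.0.0.1:%d" % (local_port, remote_port)
    return ["ssh", "-N", "-L", spec, "%s@%s" % (user, remote_host)]
```

The list can be passed to subprocess.call; with the port configured earlier, remote_port would be 8990.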

Finally, open Jupyter in a browser: http://10.0.0.120:8990

Create a folder to hold the scripts written in the Jupyter editor:

mkdir ~/jupyter_script
chmod -R 777 ~/jupyter_script

Clicking New > Python in the upper-right corner of the page produced an error:

Permission denied: Untitled.ipynb

Run the following to fix the permissions on some of Jupyter's files (then restart Jupyter):

sudo chmod 777 ~/.local/share/jupyter/
cd ~/.local/share/jupyter/
ls
sudo chmod 777 runtime/
cd runtime/
ls

Reference: http://www.cnblogs.com/uestc-mm/p/7168550.html
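To see which directories are actually blocking the notebook before changing any modes, the current user's access can be checked directly. A small sketch with os.access; unwritable_dirs is a helper name of my own, and the directory list in the usage note is taken from the commands above.

```python
import os


def unwritable_dirs(paths):
    """Return the directories in `paths` that exist but that the current
    user cannot write to (or enter)."""
    blocked = []
    for path in paths:
        full = os.path.expanduser(path)
        if os.path.isdir(full) and not os.access(full, os.W_OK | os.X_OK):
            blocked.append(full)
    return blocked
```

Calling unwritable_dirs(["~/jupyter_script", "~/.local/share/jupyter", "~/.local/share/jupyter/runtime"]) as the user who runs Jupyter lists only the directories that need fixing. Directories created through sudo are often owned by root, in which case changing the owner back may be enough and mode 777 is not needed.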

Installing the Spark editor

Download and install Toree (for Spark 2.1.0 or later with Scala 2.11 or later).

Toree 0.2.0 download site: https://dist.apache.org/repos/dist/dev/incubator/toree/

The toree-0.2.0.tar.gz version is recommended.

pip install -i https://pypi.anaconda.org/hyoon/simple toree

Or download toree-0.2.0.dev1.tar.gz for offline use and install it:

pip install toree-0.2.0.dev1.tar.gz

or:

pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz

This produced an error:

Processing ./toree-0.2.0.dev1.tar.gz
Requirement already satisfied: jupyter_core<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Collecting jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x27af450>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x27af250>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6f410>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6f110>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6fd10>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Could not find a version that satisfies the requirement jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1) (from versions: )
No matching distribution found for jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)

The cause is most likely the GFW. The fix is to add a couple of Google DNS servers (tested and working).

References: https://github.com/moby/moby/issues/30757

https://stackoverflow.com/questions/28668180/cant-install-pip-packages-inside-a-docker-container-with-ubuntu

(1) Add the DNS servers to /etc/resolv.conf:

nameserver 8.8.8.8
nameserver 8.8.4.4

Other DNS servers can be added the same way. This change is not permanent, however. To make it permanent, edit the dhclient configuration:

sudo nano /etc/dhcp/dhclient.conf

Uncomment and edit the line containing prepend domain-name-servers:

prepend domain-name-servers 8.8.8.8, 8.8.4.4;

Then restart dhclient:

sudo dhclient
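After changing the DNS servers, name resolution can be tested from Python before retrying pip. A minimal sketch with the standard socket module; can_resolve is a helper name of my own, and the host to test (for example pypi.org) is an assumption about which index pip contacts.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can look up `hostname`."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False
```

If can_resolve("pypi.org") is still False after the change, the problem lies with DNS or routing rather than with pip.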

(2) Install:

pip install toree-0.2.0.dev1.tar.gz

Result:

Processing ./toree-0.2.0.dev1.tar.gz
Requirement already satisfied: jupyter_core<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Collecting jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)
/usr/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading jupyter_client-4.4.0-py2.py3-none-any.whl (76kB)
    100% |████████████████████████████████| 81kB 194kB/s
Requirement already satisfied: traitlets<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Requirement already satisfied: pyzmq>=13 in ./lib/python2.7/site-packages (from jupyter_client<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: decorator in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: ipython-genutils in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: enum34; python_version == "2.7" in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: six in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Building wheels for collected packages: toree
  Running setup.py bdist_wheel for toree ... done
  Stored in directory: /home/infosouth/.cache/pip/wheels/05/8c/59/313ad78c88005d86c240c7891a8fde548f29f0d64203a9bc07
Successfully built toree
Installing collected packages: jupyter-client, toree
  Found existing installation: jupyter-client 5.1.0
    Uninstalling jupyter-client-5.1.0:
      Successfully uninstalled jupyter-client-5.1.0
Successfully installed jupyter-client-4.4.0 toree-0.2.0.dev1

Reference: https://github.com/apache/incubator-toree

For Spark 1.5.0 or later with Scala 2.10, download and install Toree 0.1.0 instead:

pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.1.0/snapshots/toree-0.1.0.dev8.tar.gz
jupyter toree install

Alternatively, download Toree 0.1.0 from the Apache Toree website.

Install the Spark kernel and Scala (script):

#!/bin/bash
jupyter toree install --spark_home=$SPARK_HOME --user #will install scala + spark kernel
jupyter toree install --spark_home=$SPARK_HOME --interpreters=PySpark --user
jupyter kernelspec list
jupyter notebook #launch jupyter notebook
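To check from a script that the Toree kernels were registered, the kernel list can be read as JSON. The sketch below only parses the text; it assumes the output of jupyter kernelspec list --json has a top-level "kernelspecs" object keyed by kernel name, and the kernel names in the usage note are illustrative, not taken from this installation.

```python
import json


def kernel_names(kernelspec_json):
    """Return the sorted kernel names found in kernelspec JSON text."""
    data = json.loads(kernelspec_json)
    return sorted(data.get("kernelspecs", {}))
```

For instance, feeding it the text returned by subprocess.check_output(["jupyter", "kernelspec", "list", "--json"]) might yield something like ['apache_toree_scala', 'python2'] once the script above has run.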

Change Jupyter's default working directory:

sudo vi ~/.jupyter/jupyter_notebook_config.py

Find the following setting and point it at your own directory:

c.NotebookApp.notebook_dir = 'your own path'

Configure cross-origin access. Embedding Jupyter Notebook in a page from another origin produces this error:

Refused to display

Content Security Policy directive: "frame-ancestors 'self'"

To fix it, edit the configuration file and add the settings below:

sudo vi ~/.jupyter/jupyter_notebook_config.py

c.NotebookApp.allow_origin = '*'
c.NotebookApp.trust_xheaders = True
c.NotebookApp.disable_check_xsrf = True
c.NotebookApp.tornado_settings = {
    'headers': {
        'Content-Security-Policy': ""
    }
}

Reference: http://www.ruanyifeng.com/blog/2016/09/csp.html
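An empty Content-Security-Policy value removes the framing restriction entirely. If only one or two known pages need to embed the notebook, a narrower header value may be enough. The sketch below just builds such a string; frame_ancestors_policy is a helper name of my own and the origin in the usage note is a made-up example.

```python
def frame_ancestors_policy(origins):
    """Return a CSP value allowing framing by this site plus `origins`."""
    sources = ["'self'"] + [origin for origin in origins if origin]
    return "frame-ancestors " + " ".join(sources)
```

For example, frame_ancestors_policy(["http://10.0.0.121:8080"]) gives "frame-ancestors 'self' http://10.0.0.121:8080", which could be used as the 'Content-Security-Policy' value in tornado_settings instead of the empty string.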

Also add a custom script:

cd ~/.jupyter
mkdir custom
chmod -R 755 custom
cd custom
sudo vi custom.js

with the following content (it makes every link open in the same page instead of a new one):

define(['base/js/namespace'], function(Jupyter){
    Jupyter._target = '_self';
});

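Running sample programs

Running Python code (example):

pip install matplotlib

If this fails, the GFW is the likely cause. Download these four packages for offline installation:

numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl

matplotlib-2.1.0-cp27-cp27mu-manylinux1_x86_64.whl

six-1.11.0-py2.py3-none-any.whl

python_dateutil-2.6.0-py2.py3-none-any.whl

Install them one at a time.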
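Instead of installing each wheel by hand, pip can be pointed at the download folder with its --no-index and --find-links options, so that it resolves packages from local files only. The sketch below merely assembles that command; offline_install_command is a hypothetical helper and the directory name in the usage note is a placeholder.

```python
def offline_install_command(wheel_dir, packages):
    """Return the argv list for installing `packages` from local wheels only."""
    return ["pip", "install", "--no-index", "--find-links", wheel_dir] + list(packages)
```

For example, offline_install_command("/tmp/wheels", ["matplotlib"]) can be passed to subprocess.call; pip then looks in that folder for matplotlib and its dependencies, and fails if any required wheel is absent.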

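A test program:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(20)
y = x**2
plt.plot(x, y)

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
# Set the global tick label font size for both axes via rcParams
mpl.rcParams['xtick.labelsize'] = 24
mpl.rcParams['ytick.labelsize'] = 24
np.random.seed(42)
# Sample points on the x axis
x = np.linspace(0, 5, 100)
# The data is generated by adding noise to the curve below, so y itself serves as the fitted model
y = 2*np.sin(x) + 0.3*x**2
y_data = y + np.random.normal(scale=0.3, size=100)
# figure() names the figure
plt.figure('data')
# '.' draws a scatter plot, with a small dot for each point
plt.plot(x, y_data, '.')
# Plot the model; plot() draws a connected line by default
plt.figure('model')
plt.plot(x, y)
# Draw both in one figure
plt.figure('data & model')
# 'k' sets the line color, lw sets the line width
# Besides the color, the third argument can also set the line style, e.g. 'r--' for a red dashed line
# More properties are listed on the official site: http://matplotlib.org/api/pyplot_api.html
plt.plot(x, y, 'k', lw=3)
# scatter() is an easier way to produce a scatter plot
plt.scatter(x, y_data)
# Save the current figure to result.png
plt.savefig('result.png')
# This call is required for the figures to appear on screen
plt.show()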

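Running Scala code (example):

var name = "jupyter"
println(f"hello $name")
sc.parallelize(1 to 100).reduce(_+_)
sc.parallelize(1 to 100).mean()
// The Toree kernel already provides sc, so no new SparkContext is created here.
// Assumption: Gson is on the classpath and User is a simple class like the one below.
import com.google.gson.Gson
class User { var id: Int = 0; var name: String = ""; var pwd: String = ""; var sex: Int = 0 }
val datas: Array[String] = Array(
    "{'id':1,'name':'xl1','pwd':'xl123','sex':2}",
    "{'id':2,'name':'xl2','pwd':'xl123','sex':1}",
    "{'id':3,'name':'xl3','pwd':'xl123','sex':2}")
sc.parallelize(datas).map(v => {
    new Gson().fromJson(v, classOf[User])
}).foreach(user => {
    println("id: " + user.id + " name: " + user.name + " pwd: " + user.pwd + " sex:" + user.sex)
})

If this runs without errors, the installation succeeded.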
