Getting Started with scikit-learn, a Python Machine Learning Toolkit


1. Download

https://github.com/scikit-learn/scikit-learn

Official site: http://scikit-learn.org/stable/

2. Installation

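According to the official documentation, scikit-learn requires numpy and scipy. I first tried running the following directly in the source directory: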

sudo python setup.py install
An error occurred, with the following message:

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sklearn/__init__.py", line 37, in <module>
    from . import __check_build
  File "sklearn/__check_build/__init__.py", line 46, in <module>
    raise_build_error(e)
  File "sklearn/__check_build/__init__.py", line 41, in raise_build_error
    %s""" % (e, local_dir, ''.join(dir_content).strip(), msg))
ImportError: No module named _check_build
___________________________________________________________________________
Contents of sklearn/__check_build:
__init__.py               __init__.pyc              _check_build.c
_check_build.pyx          setup.py                  setup.pyc
___________________________________________________________________________
It seems that scikit-learn has not been built correctly.

If you have installed scikit-learn from source, please do not forget
to build the package before using it: run `python setup.py install` or
`make` in the source directory.

If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.

While trying to reinstall numpy and scipy, I discovered that the Mac system already ships with quite a few libraries, as listed below:

CoreGraphics/
OpenSSL/
PyObjC/
Twisted-12.2.0-py2.7.egg-info/
altgraph/
altgraph-0.10.1-py2.7.egg-info/
bdist_mpkg/
bdist_mpkg-0.4.4-py2.7.egg-info/
bonjour/
dateutil/
macholib/
macholib-1.5-py2.7.egg-info/
matplotlib/
modulegraph/
modulegraph-0.10.1-py2.7.egg-info/
mpl_toolkits/
numpy/
py2app/
py2app-0.7.1-py2.7.egg-info/
python_dateutil-1.5-py2.7.egg-info/
pytz/
pytz-2012d-py2.7.egg-info/
scipy/
setuptools/
setuptools-0.6c12dev_r88846-py2.7.egg-info/
twisted/
xattr/
xattr-0.6.4-py2.7.egg-info/
zope/
zope.interface-3.8.0-py2.7.egg-info/

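I then tried several other approaches, including pip and easy_install, and each of them reported errors. In the end I deleted the original files under site-packages and reinstalled, and that worked. (The earlier failures may have been because I had not restarted the terminal and re-entered python.)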
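When an install behaves like this, it can help to check which copy of each package the interpreter actually imports. The relative path sklearn/__init__.py in the traceback above suggests that Python may have picked up the unbuilt source tree from the current directory, although that is only a guess. The following is a minimal sketch, not part of the original steps. The helper name describe_module is hypothetical, and the script only reports what it finds.

```python
import importlib


def describe_module(name):
    """Return (version, file path) for an importable module, or None if the import fails.

    describe_module is a hypothetical helper introduced only for this illustration.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


# Report the dependencies named in the installation notes, plus sklearn itself.
for name in ("numpy", "scipy", "sklearn"):
    info = describe_module(name)
    if info is None:
        print("{0}: import failed".format(name))
    else:
        print("{0}: version {1}, loaded from {2}".format(name, info[0], info[1]))
```

If the reported path for sklearn points into the source checkout rather than into site-packages, starting python from a different directory is worth trying before reinstalling.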

3. Testing and Learning

➜  ~  python
Python 2.7.5 (default, Sep 12 2013, 21:33:34)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
>>> print(digits.data)
[[  0.   0.   5. ...,   0.   0.   0.]
 [  0.   0.   0. ...,  10.   0.   0.]
 [  0.   0.   0. ...,  16.   9.   0.]
 ...,
 [  0.   0.   1. ...,   6.   0.   0.]
 [  0.   0.   2. ...,  12.   0.   0.]
 [  0.   0.  10. ...,  12.   1.   0.]]
>>>
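As a possible next step after loading the data, the session above could be extended to look at the shape of the digits dataset and fit a simple classifier. This is only a sketch. The choice of svm.SVC and the values gamma=0.001 and C=100.0 are assumptions made for illustration and do not come from the session above.

```python
from sklearn import datasets
from sklearn import svm

# Load the same digits dataset used in the interactive session.
digits = datasets.load_digits()

# Each row of digits.data is one flattened 8x8 image; digits.target holds the labels.
n_samples, n_features = digits.data.shape
print("samples: {0}, features per sample: {1}".format(n_samples, n_features))

# Assumed classifier and parameters, chosen only for illustration.
clf = svm.SVC(gamma=0.001, C=100.0)

# Train on every sample except the last one, then predict the held-out sample.
clf.fit(digits.data[:-1], digits.target[:-1])
predicted = clf.predict(digits.data[-1:])

print("predicted label: {0}, true label: {1}".format(predicted[0], digits.target[-1]))
```

Holding out a single sample is enough to confirm that the installation can train and predict, but it says nothing about accuracy; a proper evaluation would need a larger held-out set.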

4. Next Steps

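I plan to work through the bundled examples and write a follow-up summary of the commonly used machine learning algorithms. The examples are good learning material.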

http://scikit-learn.org/stable/auto_examples/feature_selection_pipeline.html

