How to Use the C4.5 Tool


Original: http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5/tutorial.html

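C4.5 is a classic decision tree algorithm. The page linked above describes the algorithm in detail and provides the source code and worked examples.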


C4.5 is a software extension of the basic ID3 algorithm designed by Quinlan to address the following issues not dealt with by ID3:

  • Avoiding overfitting the data
    • Determining how deeply to grow a decision tree.
  • Reduced error pruning.
  • Rule post-pruning.
  • Handling continuous attributes.
    • e.g., temperature
  • Choosing an appropriate attribute selection measure.
  • Handling training data with missing attribute values.
  • Handling attributes with differing costs.
  • Improving computational efficiency.

C4.5 is installed for use on Grendel (grendel.icd.uregina.ca), but it can also be set up on a local machine as follows:

C4.5 Release 8 Installation Instructions for UNIX

  1. Download the C4.5 source code.
  2. Decompress the archive:
    1. Type "tar xvzf c4.5r8.tar.gz" (the z option is not universally supported), or, alternatively,
    2. Type "gunzip c4.5r8.tar.gz" to decompress the gzip archive, and then
      type "tar xvf c4.5r8.tar" to extract the tar archive.
  3. Change to the ./R8/Src directory.
  4. Type "make all" to compile the executables.
  5. Put the executables into a "bin" subdirectory and include it in the path for command-line usage.
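
The steps above can be sketched as a single shell function. This is only an illustration under stated assumptions: the function name install_c45, its two arguments, and the default paths are hypothetical, and the list of executables is taken from the manual pages named in this tutorial rather than from the Makefile itself.

```shell
#!/bin/sh
# Sketch of the UNIX installation steps as one function.
# install_c45, its arguments, and the default paths are hypothetical names.
install_c45() {
    archive="${1:-c4.5r8.tar.gz}"   # downloaded archive (assumed file name)
    bin_dir="${2:-$HOME/bin}"       # assumed target directory for executables

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    mkdir -p "$bin_dir" || return 1

    # Step 2: decompress with gunzip and pipe the result into tar, which
    # also works with tar implementations that lack the z option.
    gunzip -c "$archive" | tar xvf - || return 1

    # Steps 3-4: build inside ./R8/Src; the subshell leaves the caller's
    # working directory unchanged.
    ( cd ./R8/Src && make all ) || return 1

    # Step 5: copy whichever of the documented executables were built.
    for exe in c4.5 c4.5rules consult consultr; do
        if [ -x "./R8/Src/$exe" ]; then
            cp "./R8/Src/$exe" "$bin_dir/" || return 1
        fi
    done
}
```

After a successful run, a line such as PATH="$HOME/bin:$PATH" in the shell start-up file would make the executables available on the command line.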

Manual Pages

  • c4.5: using the c4.5 decision tree generator.
  • verbose c4.5: interpreting output generated by c4.5.
  • c4.5rules: using the c4.5 rule generator.
  • verbose c4.5rules: interpreting output generated by c4.5rules.
  • consult: using a decision tree to classify items.
  • consultr: using a rule set to classify items.
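
As a rough sketch of how the first two programs fit together, the function below runs the tree generator and then the rule generator on the same file stem. The helper name run_c45 is hypothetical. The -f option selects the file stem, and the sketch assumes the input files are named <stem>.names and <stem>.data.

```shell
#!/bin/sh
# run_c45 is a hypothetical helper name, not part of the C4.5 distribution.
run_c45() {
    stem="$1"

    # Both input files must exist before either program is started.
    for ext in names data; do
        if [ ! -f "$stem.$ext" ]; then
            echo "missing input file: $stem.$ext" >&2
            return 1
        fi
    done

    c4.5 -f "$stem" || return 1        # generate the decision tree
    c4.5rules -f "$stem" || return 1   # generate rules from the tree files
}
```

The interactive classifiers can then be started by hand with "consult -f <stem>" for the tree or "consultr -f <stem>" for the rule set.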

Examples

The original tutorial links to the following examples of C4.5 usage:

  • Example 1 - Golf
    • A simple, detailed example of how C4.5 and C4.5rules work.
  • Example 2 - Sunburn
    • The sunburn example revisited.
  • Example 3 - Homonyms
    • Advanced usage of, and a practical application of, C4.5 and C4.5rules.
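
To give a feel for the input that the Golf example works with, the sketch below writes a names file and a data file in the C4.5 input format. The function name write_golf_files is hypothetical, and the rows follow the commonly reproduced version of the golf data set; treat them as illustrative and compare them against the files that ship with the distribution.

```shell
#!/bin/sh
# write_golf_files is a hypothetical helper name.
# golf.names lists the classes first, then one line per attribute;
# golf.data holds one comma-separated case per line, class last.
write_golf_files() {
    dir="${1:-.}"
    mkdir -p "$dir" || return 1

    cat > "$dir/golf.names" <<'EOF'
Play, Don't Play.

outlook: sunny, overcast, rain.
temperature: continuous.
humidity: continuous.
windy: true, false.
EOF

    cat > "$dir/golf.data" <<'EOF'
sunny, 85, 85, false, Don't Play
sunny, 80, 90, true, Don't Play
overcast, 83, 78, false, Play
rain, 70, 96, false, Play
rain, 68, 80, false, Play
rain, 65, 70, true, Don't Play
overcast, 64, 65, true, Play
sunny, 72, 95, false, Don't Play
sunny, 69, 70, false, Play
rain, 75, 80, false, Play
sunny, 75, 70, true, Play
overcast, 72, 90, true, Play
overcast, 81, 75, false, Play
rain, 71, 80, true, Don't Play
EOF
}
```

With these two files in the current directory, "c4.5 -f golf" would be the command that builds a tree from them.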