A Ruby Implementation of the k-means Algorithm

Source: Internet · Editor: 程序博客网 · Time: 2024/05/21 16:39

I, Hu Hansan, am back again

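Ever since I set that FLAG last time, I haven't touched this blog for ages, long enough for the swelling on my face to go down. Our science and innovation project now needs some data analysis to determine the movement state of goats. When we set up the project I had read that the k-means algorithm could be used for this, but a senior student recently told me the algorithm is too old and recommended that I look at the AP algorithm and Python instead. I suspect he has his own little plan: I write it, and he gets to use mine as a reference. I'm far too stubborn to give in to that, and I have been reading about Ruby for a long time, so this is a good chance to practice. I am definitely going to write k-means. Using this post as a reference, [http://www.csdn.net/article/2012-07-03/2807073-k-means], I wrote an implementation, and it runs reasonably well. For the principles of the algorithm, read that post directly; here I only give the code.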

The code is posted below.

K-means.rb

#!/usr/bin/env ruby
# -*- coding: utf-8 -*-
require 'pg'
require_relative 'sport'

conn = PG.connect(:dbname => 'dbname', :port => 5432, :user => 'username', :password => 'password', :host => 'localhost')

# Read the motion data from PostgreSQL
sportArr = Array.new
res = conn.exec("select * from goat where goatid='G1' and id < 4000 ")
res.each do |row|
    sportArr << Sport.new(row['id'], row['datatime'], row['sportx'], row['sporty'], row['sportz'])
end

seedArr = Array.new      # current cluster centers (seeds)
classArr = Array.new     # members of each cluster during the current pass
classNumArr = Array.new  # cluster sizes from the previous pass
indexArr = Array[3252, 3311, 3381]  # ids of the rows used as initial seeds

sportArr.each do |item|
    indexArr.each do |j|
        if item.id.to_i == j.to_i then
            # Copy the row, so that moving the seed later does not change the data point itself
            seedArr << item.dup
            classArr << Array.new
            classNumArr << 0
        end
    end
end

seedArr.each do |i|
    i.printOut
end

if seedArr.empty? then
    puts "seedArr is null"
else
    flag = true
    while flag do
        # Assignment step: put every point into the cluster of its most similar seed
        sportArr.each do |i|
            index = 0
            1.step(seedArr.size - 1, 1) do |j|
                if i.similarity(seedArr[j]) > i.similarity(seedArr[index]) then
#               if i.distanceWithZ(seedArr[j]) < i.distanceWithZ(seedArr[index]) then
                    index = j
                end
            end
            classArr[index] << i
#           puts index
        end
        classArr.size.times do |i|
            puts "num class#{i}:#{classArr[i].size}"
        end

        # Update step: move every seed to the mean of its cluster
        seedArr.each_with_index do |item, i|
            sum_x = 0; sum_y = 0; sum_z = 0
            classArr[i].each do |j|
                sum_x += j.x
                sum_y += j.y
                sum_z += j.z
            end
            if classArr[i].size > 0 then
                item.x = sum_x / classArr[i].size
                item.y = sum_y / classArr[i].size
                item.z = sum_z / classArr[i].size
            else
                # Empty cluster: the seed is reset to the origin.
                # Note that the cosine similarity to a zero vector is NaN.
                item.x = 0
                item.y = 0
                item.z = 0
            end
        end
        puts "Seeds repositioned"

        # Convergence check: count the clusters whose size is unchanged since the previous pass
        temp = 0
        classNumArr.each_with_index do |item, i|
            if item == classArr[i].size then
                temp += 1
            end
        end
        puts temp
        if temp == seedArr.size then
            puts "Clustering finished"
            flag = false
        else
            classNumArr.size.times do |i|
                classNumArr[i] = classArr[i].size
            end
        end
        classArr.each do |i|
            i.clear
        end
    end
end

seedArr.each do |i|
    i.printOut
end
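For readers who want to try the loop without a PostgreSQL database, here is a minimal sketch of the same idea on in-memory points. The helper names `cosine` and `kmeans_cosine` and the sample points are my own assumptions, not part of the script above. It also differs in two ways that should be stated plainly: it stops when no point changes cluster (the script above compares cluster sizes between passes), and it leaves the seed of an empty cluster where it was instead of resetting it to the origin.

```ruby
# A sketch only: `cosine` and `kmeans_cosine` are hypothetical helper names.

# Cosine similarity between two equal-length arrays; 0.0 if either is a zero vector.
def cosine(a, b)
  dot = a.zip(b).sum { |p, q| p * q }
  norm = Math.sqrt(a.sum { |v| v * v }) * Math.sqrt(b.sum { |v| v * v })
  norm.zero? ? 0.0 : dot / norm
end

# Assign each point to its most similar seed, move each seed to the mean
# of its cluster, and stop once no assignment changes.
def kmeans_cosine(points, seeds, max_iter = 100)
  seeds = seeds.map(&:dup) # work on copies, never on the caller's data
  labels = nil
  max_iter.times do
    new_labels = points.map do |p|
      (0...seeds.size).max_by { |k| cosine(p, seeds[k]) }
    end
    break if new_labels == labels # converged: assignments are unchanged
    labels = new_labels
    seeds.each_index do |k|
      members = points.each_index.select { |i| labels[i] == k }.map { |i| points[i] }
      next if members.empty? # keep the old seed for an empty cluster
      seeds[k] = members.transpose.map { |col| col.sum / col.size }
    end
  end
  [labels, seeds]
end

# Made-up sample data: two points near the x axis, two near the y axis.
points = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
labels, seeds = kmeans_cosine(points, [points[0], points[2]])
p labels # => [0, 0, 1, 1]
```

Comparing assignments rather than cluster sizes avoids stopping early in the case where points swap clusters while the sizes happen to stay the same.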

sport.rb

# -*- coding: utf-8 -*-
# Only Math.sqrt is needed here and it is part of Ruby core, so 'mathn'
# (no longer shipped with current Ruby versions) is not required.

class Sport
    attr_reader :id, :datatime
    attr_accessor :x, :y, :z, :state

    def initialize(inid, indatatime, inx, iny, inz)
        @id = inid
        @datatime = indatatime
        @x = inx.to_f
        @y = iny.to_f
        @z = inz.to_f
        @state = 'not analysis'
    end

    # Length of the (x, y, z) vector
    def len
        Math.sqrt(@x*@x + @y*@y + @z*@z)
    end

    # Similarity measure: cosine similarity is used here; rewrite it to suit your own needs
    def similarity(another)
        (@x*another.x + @y*another.y + @z*another.z) / (len() * another.len())
    end

    # Similarity that uses only the z component
    def similarityWithZ(another)
        temp = Math.sqrt(@z.abs * another.z.abs)
        if temp > 0 then
        #   @z*another.z/Math.sqrt(@z.abs*another.z.abs)
            @z * another.z / temp
        else
#           puts "#{@id} and #{another.id}"
            0
        end
    end

    # Absolute difference of the z components
    def distanceWithZ(another)
        (@z - another.z).abs
    end

    def printOut
        puts "id:#{@id} datatime:#{@datatime} sportx:#{@x} sporty:#{@y} sportz:#{@z}"
    end
end
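The comment on `similarity` says the measure can be swapped for whatever you need. As a rough illustration of what changes when you do, the sketch below compares cosine similarity with Euclidean distance. `Vec3` is a hypothetical stand-in for `Sport` that holds only x, y and z, and the `distance` method is my own addition, not something in sport.rb.

```ruby
# Vec3 is a hypothetical stand-in for Sport, holding only x, y and z.
Vec3 = Struct.new(:x, :y, :z) do
  def len
    Math.sqrt(x * x + y * y + z * z)
  end

  # Cosine similarity, the same formula as Sport#similarity: direction only.
  def similarity(other)
    (x * other.x + y * other.y + z * other.z) / (len * other.len)
  end

  # Euclidean distance: an alternative measure that also reflects magnitude.
  def distance(other)
    Math.sqrt((x - other.x)**2 + (y - other.y)**2 + (z - other.z)**2)
  end
end

a = Vec3.new(1.0, 2.0, 2.0)
b = Vec3.new(2.0, 4.0, 4.0) # same direction as a, twice the magnitude

puts a.similarity(b) # 1.0
puts a.distance(b)   # 3.0
```

Cosine similarity treats `a` and `b` as identical because they point the same way, while the distance still separates them. With a distance, smaller means closer, so the comparison in the assignment loop has to flip from `>` to `<`, as the commented-out `distanceWithZ` line in K-means.rb already does.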

Trial run results

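[Screenshot of the run results]
This was written in just a few hours, after all, so if any of you spot something wrong, please point it out to me. Email: horpoppy@gmail.com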