A Novel Method for Geographical Social Event Detection in Social Media

来源:互联网 发布:斯诺克比赛直播软件 编辑:程序博客网 时间:2024/05/17 08:52

keyword:

Social Event Detection, Geographical Temporal Pattern, Adaptive K-means Clustering


INTRODUCTION:

In this paper, we proposed a novel geographical social event detection method by geographical temporal pattern mining and content analysis. We first mine the geographical temporal pattern of the tweets in social activities, and discover the unusual geographical region by this pattern. Then we adopt the adaptive K-means clustering algorithm for the content of tweets, which involves in the area where unusual geographical area found. Next the geographical social event is detected by the number of the tweets in the cluster. Also, the detected geographical social events can be intuitively displayed by the highly-frequent keywords involved in the cluster.


UNUSUAL GEOGRAPHICAL AREA DISCOVERY

For the purpose of convenient analysis, one day is equally divided into 4 non-overlapping partitions. In order to better reflect change regularity of behaviors in the geographical area, we define the geographical temporal pattern (GTP) of behaviors as


where  is number of tweets posted in a certain geographical area at the time unit of the i day.




Adaptive K-means Clustering

Adaptive K-means算法:
Compute Cosine similarity between each two sample-pairs and set their average as basicStep.
Compute Cosine similarity between ith and the other samples, then the density value of this samples is set as the number of samples, whose distances are smaller than basicStep.
Sort all density values by descending order.
If the ratio of the biggest density and N is greater than 0.5, the step threshold is set as 0.8*basicStep and the density threshold is set as 100; Else if the ratio of the biggest density and N is smaller than 0.01, the step threshold is set as 1.5*basicStep and the density threshold is set as 10;
Set k’ as the number of the pseudo clustering,where k’ denotes the number of the samples whose density values are greater than the density threshold, and the number of the final clustering k is set as 0;
For I = 1 to k’, compute the Euclidean distance between ith sample and k’-I samples. If there is no sample whose distance is smaller than the step threshold, then execute k++,otherwise, k holds no change;


Geographical Social Event Detection by Clustering


0 0
原创粉丝点击