R in Practice: k-means Clustering and Association Rule Algorithms
2017-05-29

1. k-means clustering in R
The dataset has the following format (an identifier column followed by 30 numeric columns):
    ,河东路与岙东路&河东路与聚贤桥路,河东路与岙东路&新悦路与岙东路,河东路与岙东路&火炬路与聚贤桥路,河东路与岙东路&火炬路与汇智桥路,河东路与岙东路&汇智桥与智力岛路,新悦路与岙东路&火炬路与聚贤桥路,新悦路与岙东路&河东路与聚贤桥路,新悦路与岙东路&河东路与岙东路,新悦路与岙东路&汇智桥与智力岛路,新悦路与岙东路&火炬路与汇智桥路,河东路与聚贤桥路&新悦路与岙东路,河东路与聚贤桥路&火炬路与聚贤桥路,河东路与聚贤桥路&河东路与岙东路,河东路与聚贤桥路&汇智桥与智力岛路,河东路与聚贤桥路&火炬路与汇智桥路,火炬路与汇智桥路&新悦路与岙东路,火炬路与汇智桥路&火炬路与聚贤桥路,火炬路与汇智桥路&汇智桥与智力岛路,火炬路与汇智桥路&河东路与聚贤桥路,火炬路与汇智桥路&河东路与岙东路,汇智桥与智力岛路&新悦路与岙东路,汇智桥与智力岛路&火炬路与聚贤桥路,汇智桥与智力岛路&火炬路与汇智桥路,汇智桥与智力岛路&河东路与岙东路,汇智桥与智力岛路&河东路与聚贤桥路,火炬路与聚贤桥路&新悦路与岙东路,火炬路与聚贤桥路&河东路与岙东路,火炬路与聚贤桥路&河东路与聚贤桥路,火炬路与聚贤桥路&汇智桥与智力岛路,火炬路与聚贤桥路&火炬路与汇智桥路  
    蓝鲁BP9G39,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    蓝鲁B7M827,1,23,0,1,0,0,2,55,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    蓝鲁BQ3M79,0,11,0,0,0,0,1,10,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    蓝鲁BU008P,0,4,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    蓝鲁BW6710,14,0,0,0,0,0,0,0,0,0,0,0,14,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0  
    蓝鲁BS180G,0,1,0,0,0,0,0,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    蓝鲁B3HU73,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  


Code:


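    library(fpc)                             # provides plotcluster()
    data <- read.csv('x.csv')
    df <- na.omit(data[2:31])                # keep the 30 numeric columns and drop rows containing NA
    set.seed(252964)                         # make the random initialization reproducible
    (km <- kmeans(df, 100))                  # k-means with 100 clusters; the outer parentheses print the result
    plotcluster(df, km$cluster)              # plot the clusters
    km                                       # view the full clustering result
    km$cluster                               # view the cluster assigned to each row
    km$centers                               # view the cluster centers
    write.csv(km$cluster, '100classes.csv')  # write the cluster assignments to a file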
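The code above fixes the number of clusters at 100 without saying how that value was chosen. As a rough sketch only, one common heuristic is to compare the total within-cluster sum of squares across several values of k and look for the point where it stops dropping sharply. The `toy` matrix below is synthetic data invented for illustration, not the traffic dataset, and the range of k is an arbitrary assumption.

```r
# Synthetic stand-in for the real data: two well-separated groups of 20 points each.
set.seed(1)
toy <- rbind(
  matrix(rnorm(40, mean = 0,  sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..6.
# nstart = 10 reruns k-means from several random starts and keeps the best fit.
wss <- sapply(1:6, function(k) kmeans(toy, centers = k, nstart = 10)$tot.withinss)
print(round(wss, 2))

# Fit the chosen k and check the cluster sizes.
fit <- kmeans(toy, centers = 2, nstart = 10)
table(fit$cluster)
```

Plotting the values with `plot(1:6, wss, type = "b")` makes the "elbow" easier to see; for this toy data the drop from k = 1 to k = 2 dominates.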

2. Association rules in R

Dataset format


    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0  
    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0  

Each column represents one attribute, and a 1 means that attribute is present; each row is one record.

The code is as follows:


    library(arules)  
    groceries <- read.transactions("groceries.csv", sep = ",")  # one transaction per line, items separated by commas
    summary(groceries)  
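Note that `read.transactions` in its default basket format expects each line to list the items of one transaction, whereas the dataset shown earlier is a 0/1 matrix. A minimal sketch of one way to turn such a matrix into a transactions object, assuming the arules package is installed; the matrix `m` and the item names A, B, C are hypothetical and stand in for a matrix loaded from a file:

```r
library(arules)

# Hypothetical 0/1 matrix: 4 records (rows) x 3 attributes (columns).
m <- matrix(c(1, 0, 1,
              0, 1, 1,
              1, 1, 0,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

# A logical matrix coerces directly to the transactions class:
# TRUE marks an item as present in that record.
trans <- as(m == 1, "transactions")
summary(trans)
inspect(trans)
```

The resulting `trans` object can be passed to `eclat()` or `apriori()` in the same way as the output of `read.transactions`.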


    # Eclat algorithm
    frequentsets <- eclat(groceries, parameter = list(support = 0.05, maxlen = 10))  # find the frequent itemsets
    inspect(frequentsets[1:10])                         # view the frequent itemsets found
    inspect(sort(frequentsets, by = "support")[1:10])   # sort by support and view; equivalent to inspect(sort(frequentsets)[1:10])

    # Apriori algorithm
    rules <- apriori(groceries, parameter = list(support = 0.01, confidence = 0.01))  # mine the association rules
    summary(rules)                                                    # summarize the rules found
    x <- subset(rules, subset = rhs %in% "whole milk" & lift >= 1.2)  # keep only the rules of interest
    inspect(sort(x, by = "support")[1:5])                             # sort that subset by support and view the top 5
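To make the `support`, `confidence`, and `lift` values used above more concrete, here is a small worked sketch in base R that computes them by hand for a single rule A => B. The 0/1 matrix and the item names A and B are invented for illustration and are not part of the original data.

```r
# Hypothetical 0/1 matrix: 5 records, items A and B.
m <- matrix(c(1, 1,
              1, 0,
              1, 1,
              0, 1,
              0, 0),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("A", "B")))

n <- nrow(m)
supp_A  <- sum(m[, "A"] == 1) / n                   # fraction of records containing A
supp_B  <- sum(m[, "B"] == 1) / n                   # fraction of records containing B
supp_AB <- sum(m[, "A"] == 1 & m[, "B"] == 1) / n   # support of the itemset {A, B}

confidence <- supp_AB / supp_A     # of the records containing A, the fraction that also contain B
lift       <- confidence / supp_B  # above 1 means A and B co-occur more often than independence predicts

c(support = supp_AB, confidence = confidence, lift = lift)
```

Here the support is 2/5 = 0.4, the confidence is 0.4/0.6 ≈ 0.667, and the lift is about 1.11, so this rule would pass the `support` and `confidence` thresholds above but not a filter of `lift >= 1.2`.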
