2018-12-10 | Views: 804
How should we understand classifying items?

Classifying items

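Now we need to write a function that assigns an item to a group/cluster. For a given item, we compute its distance to each mean and assign the item to the cluster whose mean is closest.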

def Classify(means, item):
    # Classify item to the mean with minimum distance
    # float("inf") replaces sys.maxint, which no longer exists in Python 3
    minimum = float("inf")
    index = -1
    for i in range(len(means)):
        # Find distance from item to mean
        dis = EuclideanDistance(item, means[i])
        if dis < minimum:
            minimum = dis
            index = i
    return index
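Classify depends on an EuclideanDistance helper that this section does not show. Below is a minimal sketch of what it might look like, assuming items and means are equal-length sequences of numbers. The snake_case names are illustrative and are not the author's code.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def classify(means, item):
    """Return the index of the mean closest to item (first one wins on ties)."""
    distances = [euclidean_distance(item, mean) for mean in means]
    return distances.index(min(distances))


means = [[0.0, 0.0], [5.0, 5.0]]
print(classify(means, [4.0, 4.5]))  # 1
```

The point (4.0, 4.5) is about 1.12 away from the second mean and about 6.02 away from the first, so it is assigned to index 1.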

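To actually find the means, we loop through all the items, classify each one to its nearest cluster, and update that cluster's mean. We repeat this process for a fixed number of iterations. If no item changes its classification between two iterations, the algorithm has converged and we stop early. Note that the result is a local optimum, not necessarily the best possible clustering.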
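The mean update itself is not shown in this section. One common way to do it, and a plausible reading of the UpdateMean(cSize, mean, item) call, is an incremental (running) mean: if the mean m currently covers n-1 points and a new point x arrives, the new mean is (m*(n-1) + x) / n. The sketch below is an assumption about that helper, not the author's implementation.

```python
def update_mean(n, mean, item):
    """Fold item into a mean that now covers n points (item included)."""
    return [(m * (n - 1) + x) / n for m, x in zip(mean, item)]


mean = [0.0, 0.0]
for n, point in enumerate([[2.0, 4.0], [4.0, 8.0]], start=1):
    mean = update_mean(n, mean, point)
print(mean)  # [3.0, 6.0]
```

Because the update only needs the current mean and a count, the cluster mean can be adjusted item by item without storing each cluster's members.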

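The following function takes k (the desired number of clusters), the items, and the maximum number of iterations as input, and returns the means. The classification of each item is stored in the array belongsTo, and the number of items in each cluster is stored in clusterSizes.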

def CalculateMeans(k, items, maxIterations=100000):
    # Find the minima and maxima for columns
    cMin, cMax = FindColMinMax(items)

    # Initialize means at random points
    means = InitializeMeans(items, k, cMin, cMax)

    # Initialize clusterSizes, the array to hold
    # the number of items in a class
    clusterSizes = [0 for i in range(len(means))]

    # An array to hold the cluster an item is in
    belongsTo = [0 for i in range(len(items))]

    # Calculate means
    for e in range(maxIterations):
        # If no change of cluster occurs, halt
        noChange = True
        for i in range(len(items)):
            item = items[i]

            # Classify item into a cluster and update the
            # corresponding means. clusterSizes keeps growing across
            # passes, so each mean is a running average of every
            # item assigned to it so far.
            index = Classify(means, item)
            clusterSizes[index] += 1
            cSize = clusterSizes[index]
            means[index] = UpdateMean(cSize, means[index], item)

            # Item changed cluster
            if index != belongsTo[i]:
                noChange = False

            belongsTo[i] = index

        # Nothing changed, return
        if noChange:
            break

    return means
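To show how the pieces fit together, here is a self-contained sketch of the same loop that can be run on a toy dataset. The helpers find_col_min_max, initialize_means, and update_mean are hypothetical stand-ins for FindColMinMax, InitializeMeans, and UpdateMean, which this section does not define. The seed parameter, the -1 starting label, and returning belongs_to alongside the means are also assumptions made for illustration.

```python
import math
import random


def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def classify(means, item):
    distances = [euclidean_distance(item, m) for m in means]
    return distances.index(min(distances))


def find_col_min_max(items):
    # Per-column minima and maxima of the data.
    columns = list(zip(*items))
    return [min(c) for c in columns], [max(c) for c in columns]


def initialize_means(k, c_min, c_max, rng):
    # Assumed strategy: uniform random points inside the data's bounding box.
    return [[rng.uniform(lo, hi) for lo, hi in zip(c_min, c_max)]
            for _ in range(k)]


def update_mean(n, mean, item):
    # Running mean: fold item into a mean that now covers n points.
    return [(m * (n - 1) + x) / n for m, x in zip(mean, item)]


def calculate_means(k, items, max_iterations=1000, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    c_min, c_max = find_col_min_max(items)
    means = initialize_means(k, c_min, c_max, rng)
    cluster_sizes = [0] * k
    belongs_to = [-1] * len(items)  # -1 means "not yet assigned"

    for _ in range(max_iterations):
        no_change = True
        for i, item in enumerate(items):
            index = classify(means, item)
            cluster_sizes[index] += 1
            means[index] = update_mean(cluster_sizes[index], means[index], item)
            if index != belongs_to[i]:
                no_change = False
            belongs_to[i] = index
        if no_change:  # a full pass with no reassignment: stop early
            break
    return means, belongs_to


items = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
         [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]]
means, belongs_to = calculate_means(2, items)
print(means)
print(belongs_to)
```

The result depends on the random starting means. With an unlucky start, one mean can attract every item and leave the other cluster empty, which is why k-means is often run several times with different seeds.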
