

Cluster Analysis

Cluster analysis is a statistical procedure that attempts
to classify a group into like, or ‘homogeneous’,
subgroups. It is usually used as a segmentation tool,
where people are grouped into segments based
on their attitudes, behaviors, demographics, or some
combination of these. However, cluster analysis can
also be used to cluster variables (instead of cases)
into like groups. The task is analogous
to a coder developing a code list: individual
responses are read and classified into groups that
capture the common meaning.

Cluster analysis is often considered to be more of an art
than a science. Of all the common statistical procedures,
cluster analysis gives the least statistical guidance
as to whether the solution it generates is meaningful
or not. The cluster analysis algorithm does not tell
the researcher the ‘correct’ number of clusters
in a data set. Instead, the researcher has to produce
and examine a number of different cluster solutions
and decide which solution is the best. The analyst
may, for example, generate cluster solutions for two
clusters, three, four, and so on, up to 10 or more. Between
the different clustering algorithms, the number of clusters
produced, and the options for how the data is processed,
a considerable number of cluster solutions can be
generated.
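
The process of producing one solution per cluster count can be sketched in a few lines of Python. This is only an illustration, not a reference to any particular package: the data are made up, the helper names (squared_distance, assign, kmeans, within_cluster_ss) are hypothetical, and the algorithm is a plain KMeans with a fixed random seed. The within-cluster sum of squares it prints is one possible summary figure; it does not by itself identify the ‘correct’ number of clusters.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Give each point the index of its nearest center."""
    return [min(range(len(centers)),
                key=lambda c: squared_distance(p, centers[c]))
            for p in points]

def kmeans(points, k, iterations=50, seed=0):
    """Plain KMeans: returns (labels, centers) for k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen cases
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return assign(points, centers), centers

def within_cluster_ss(points, labels, centers):
    """Total squared distance of each point to its own cluster center."""
    return sum(squared_distance(p, centers[lab])
               for p, lab in zip(points, labels))

# Made-up data: 90 respondents scored on two rating scales.
data_rng = random.Random(42)
points = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
          for cx, cy in [(1, 1), (5, 5), (9, 1)]
          for _ in range(30)]

# One solution per cluster count, kept for later inspection.
solutions = {}
for k in range(2, 7):
    labels, centers = kmeans(points, k)
    solutions[k] = (labels, centers)
    print(k, "clusters:", round(within_cluster_ss(points, labels, centers), 2))
```

Changing the seed, the algorithm, or the way the input variables are scaled would each yield a further set of solutions, which is how the number of candidates grows so quickly.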

To evaluate the solutions, the researcher generally compares
the individual groups within each solution on a series
of demographic, attitudinal, or other measures (i.e.,
start by comparing the groups in the two-cluster solution,
then compare the groups in the three-cluster solution,
and so on). Other statistical procedures can
be used in the evaluation process, but often
the analyst tries to interpret each solution by how
it fits with the other variables, and chooses the
solution that seems to fit best.
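
A minimal sketch of this kind of comparison follows. Everything in it is assumed for illustration: the respondent records, the two sets of cluster labels, and the helper name profile_clusters. It simply reports the size of each group and its average on each measure, so that the two-cluster and three-cluster solutions can be read side by side.

```python
from collections import defaultdict

def profile_clusters(labels, rows, fields):
    """Return {cluster: {"n": size, field: mean, ...}} for one solution."""
    grouped = defaultdict(list)
    for label, row in zip(labels, rows):
        grouped[label].append(row)
    profile = {}
    for label, members in sorted(grouped.items()):
        stats = {"n": len(members)}
        for field in fields:
            stats[field] = sum(m[field] for m in members) / len(members)
        profile[label] = stats
    return profile

# Made-up respondents with two profiling measures.
respondents = [
    {"age": 24, "satisfaction": 4.5},
    {"age": 31, "satisfaction": 4.0},
    {"age": 58, "satisfaction": 2.0},
    {"age": 62, "satisfaction": 2.5},
    {"age": 45, "satisfaction": 3.0},
]

# Made-up cluster assignments for the same five respondents.
two_cluster = [0, 0, 1, 1, 1]
three_cluster = [0, 0, 1, 1, 2]

for name, labels in [("two-cluster", two_cluster),
                     ("three-cluster", three_cluster)]:
    print(name, "solution")
    profile = profile_clusters(labels, respondents, ["age", "satisfaction"])
    for cluster, stats in profile.items():
        print("  cluster", cluster, stats)
```

Whether the extra group in the three-cluster solution is worth keeping remains a judgment about how well it fits the profiling measures; the code only lays out the numbers.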

Kohonen Self-Organizing Maps (SOMs) are a form of Neural Network
(an Artificial Intelligence technology) that also
‘clusters’ cases into like groups, using
a different mathematical approach. To the researcher,
SOMs do just what a KMeans cluster program does,
but in a different way. However, if a SOM and a KMeans
cluster program are both set to produce the
same number of cluster groups, the cases will be assigned
somewhat differently. Often the SOM solution
will be superior.
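
To give a feel for the ‘different way’ a SOM works, here is a deliberately small sketch. It is a simplification under stated assumptions: the nodes sit on a one-dimensional line rather than the two-dimensional grid SOMs usually use, the learning-rate and neighbourhood schedules are arbitrary choices, the data are made up, and the function names (best_matching_unit, train_som) are hypothetical. The key difference from KMeans is that each case pulls not only its closest node toward it but also that node's neighbours on the map.

```python
import math
import random

def best_matching_unit(p, nodes):
    """Index of the node whose weights are closest to point p."""
    return min(range(len(nodes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, nodes[i])))

def train_som(points, n_nodes, epochs=100, seed=0):
    """Train a 1-D Kohonen map; returns the list of node weight vectors."""
    rng = random.Random(seed)
    nodes = [list(p) for p in rng.sample(points, n_nodes)]
    total_steps = epochs * len(points)
    step = 0
    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)  # present the cases in a random order
        for p in order:
            frac = step / total_steps
            rate = 0.5 * (1 - frac)  # learning rate decays toward 0
            radius = max(n_nodes / 2 * (1 - frac), 0.5)  # neighbourhood shrinks
            bmu = best_matching_unit(p, nodes)
            for i, node in enumerate(nodes):
                # Nodes near the winner on the map move more than distant ones.
                influence = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(node)):
                    node[d] += rate * influence * (p[d] - node[d])
            step += 1
    return nodes

# Made-up data: 90 cases scored on two variables.
data_rng = random.Random(42)
points = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
          for cx, cy in [(1, 1), (5, 5), (9, 1)]
          for _ in range(30)]

nodes = train_som(points, n_nodes=3)
som_labels = [best_matching_unit(p, nodes) for p in points]
print("node weights:", [[round(w, 2) for w in node] for node in nodes])
print("cases per node:", [som_labels.count(i) for i in range(len(nodes))])
```

Each node plays the role of one cluster group, and a case belongs to the node it matches best. Running a KMeans program on the same cases with the same number of groups and cross-tabulating the two sets of labels would show where the assignments differ.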



