Hierarchical Clustering
Mads Møller
LinkedIn: https://www.linkedin.com/in/madsmoeller1/
Mail: mads.moeller@outlook.com
This paper is the sixth in the series about machine learning algorithms. The Hierarchical Clustering
algorithm is used for clustering problems. Hierarchical Clustering is also our first unsupervised machine
learning algorithm, meaning that our data do not need to be labelled: you do not need to know your
target classes in order to use the Hierarchical Clustering algorithm.
1 Intuition
In many ways clustering relates to the task of nearest-neighbor algorithms. We would like to group
observations that are similar according to some dissimilarity measure. Each observation belongs to exactly
one group, and different groups do not overlap. In unsupervised learning a group is often referred to as a
cluster. Within each cluster we want intra-cluster homogeneity, meaning that observations within a cluster
should be similar. We call the collection of clusters a clustering. Across a clustering we want inter-cluster
heterogeneity, meaning that each cluster should be dissimilar to the other clusters.
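To make the two notions concrete, here is a minimal sketch (not from the paper; the two hand-made point sets are illustrative assumptions) that compares the average distance within a tight group of points to the average distance between two such groups. A good clustering keeps the first quantity much smaller than the second.

```python
# Illustrative sketch: intra-cluster homogeneity vs. inter-cluster heterogeneity
# measured with plain Euclidean distances on two hand-made 2-D clusters.
import math

def euclidean(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_pairwise(points_a, points_b):
    """Average distance over all distinct pairs drawn from the two point sets."""
    pairs = [(p, q) for p in points_a for q in points_b if p != q]
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)

cluster_1 = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.4)]   # tight group near the origin
cluster_2 = [(5.0, 5.0), (5.3, 4.8), (4.9, 5.4)]   # tight group far away

within = mean_pairwise(cluster_1, cluster_1)   # intra-cluster homogeneity
between = mean_pairwise(cluster_1, cluster_2)  # inter-cluster heterogeneity
print(within < between)  # True: observations within a cluster are closer together
```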
Clustering methods are often used for customer segmentation, grouping of products, and recommender
systems (segmenting both customers and products), so they are indeed useful in real-world applications
as well.
2 Hierarchical Clustering
There are different methods of clustering. In this paper we will inspect the clustering type called
Hierarchical Clustering. In the next paper we will look at Partitioning Clustering.
2.1 Dissimilarities (between observations)
The basic way of measuring a dissimilarity is through a distance. We already looked into dissimilarities
in the paper about K Nearest Neighbors (KNN). Therefore, we will not go into dissimilarities in depth
again, but some of the ways you could measure them are listed here: