Coronavirus Data Clustering using Complete Linkage Hierarchical Clustering Technique
The main code is in the clustering.py file.
To run it from the Linux command line, type: python3 clustering.py
The dataset must be named 'COVID_1_unlabelled.csv' and must be kept in the same directory as the .py file.
The full script takes about 30 seconds to run.
'Report.pdf' contains a brief report of the project.
'kmeans.txt' contains clusters obtained from k-means clustering.
'agglomerative.txt' contains clusters obtained from agglomerative clustering.
There's also an additional file, 'Clustering.ipynb', which was used for generating plots and for analysis.
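
For readers unfamiliar with the technique named in the title, here is a minimal sketch of complete-linkage agglomerative clustering in plain Python. It is not taken from clustering.py: the function name complete_linkage, the parameter k, and the toy points are hypothetical and chosen only for illustration. It assumes Euclidean distance and Python 3.8+ (for math.dist).

```python
import math


def complete_linkage(points, k):
    """Merge clusters bottom-up until k remain (hypothetical helper, not from clustering.py)."""
    # Start with every point in its own cluster, stored as a list of point indices.
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Complete linkage: the distance between two clusters is the
        # distance between their two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        # Find the pair of clusters with the smallest complete-linkage distance.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair into one cluster.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    return [sorted(c) for c in clusters]


# Toy data (made up for this example, not from the COVID dataset).
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = complete_linkage(toy_points, 3)
print(result)  # [[0, 1], [2, 3], [4]]
```

This naive version recomputes all pairwise cluster distances on every merge, so it is only practical for small inputs.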