Research Article

INTUITIVE DATA UNDERSTANDING: INTERACTIVE VISUALIZATIONS OF K-MEANS AND HIERARCHICAL CLUSTERING TECHNIQUES

ISSN: 3067-4395

DOI Prefix: 10.5281/zenodo.

Authors: Aditi Rajesh Sharma
Published: Volume 12, Issue 2 (2025)
Date: September 9, 2025

Abstract

This study examines the structure of geothermal geochemistry data from India using k-means and hierarchical cluster analyses. The first attempt to list hot springs in India was made by Schlagintweit in 1852. The Geological Survey of India (GSI) published a special publication titled 'Geothermal Atlas of India', and the Government of India constituted a 'Hot Spring Committee' to examine the possibility of developing geothermal plants for power generation and other uses. In the Puga valley and Parvati projects, an estimated 5000 MWh of geothermal energy could be harnessed from the Puga valley, sufficient to sustain a 20 MWe power plant.
The GSI, the repository of most geological and related data in the country, included R&D item No. 7/WB-5 in its field season 1993-94 program for the development of a computerised geothermal database system referred to as GTHERMIS. The computational strategy of k-means involves assigning each data point to the cluster with the nearest center (or "centroid"), recalculating the centroids after each assignment, and repeating the procedure until the clusters no longer change appreciably between iterations.
The k-means method, developed by MacQueen (1967), is one of the most widely used non-hierarchical methods and is particularly suitable for large amounts of data. The data are scaled before cluster analysis, and the results are visualized using interactive graphics.
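
The scale-then-cluster procedure described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names (standardize, nearest, kmeans), the two-variable sample values, and the choice of starting centroids are all assumptions made for illustration. The sketch also uses the common batch variant, which recalculates centroids after each full pass over the data rather than after every single assignment.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists))

def kmeans(points, initial_centroids, max_iter=100):
    """Batch k-means: assign all points, update centroids, repeat until stable."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical spring measurements (temperature in deg C, pH); not data from the study.
samples = [[30.0, 6.5], [32.0, 6.8], [31.0, 6.6],
           [85.0, 8.9], [88.0, 9.1], [90.0, 9.0]]
scaled = standardize(samples)
# Starting centroids are picked by hand here; real analyses usually randomize them.
labels, centroids = kmeans(scaled, initial_centroids=[scaled[0], scaled[3]])
print(labels)
```

Scaling first matters because the nearest-centroid step relies on Euclidean distance: without it, a variable with a large numeric range (temperature here) would dominate one with a small range (pH).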