A Scalable System for Executing and Scoring K-Means Clustering Techniques and Its Impact on Applications in Agriculture

Golubovic, N; Krintz, C Wolski; Sethuramasamyraja, B; Liu, B.

Citation Details

We present Centaurus, a scalable, open source clustering service for K-means clustering of correlated, multidimensional data. Centaurus provides users with automatic deployment via public or private cloud resources, model selection (using the Bayesian information criterion), and data visualisation. We apply Centaurus to a real-world agricultural analytics application and compare its results to the industry-standard clustering approach. The application uses soil electrical conductivity (EC) measurements, GPS coordinates, and elevation data from a field to produce a map of differing soil zones, so that management can be specialised for each zone. We use Centaurus and these datasets to empirically evaluate the impact of considering multiple K-means variants and large numbers of experiments. We show that Centaurus yields more consistent and useful clusterings than the competing approach for zone-based soil decision-support applications where a hard decision is required.

Award ID(s): 1703560

PAR ID: 10091230

Author(s) / Creator(s): Golubovic, N; Krintz, C Wolski; Sethuramasamyraja, B; Liu, B.

Date Published: 2019-04-01

Journal Name: International Journal of Big Data Intelligence

ISSN: 2053-1397

Format(s): Medium: X

Sponsoring Org: National Science Foundation

The DOI is not currently available.
