Abstract |
Cities produce and collect massive amounts of data from sources such as transportation networks, the energy sector, smart homes, tax records, surveys, and mobile phone sensors. Citizens and municipalities who want to understand societal trends and make decisions immediately face the question of how to store and analyze the vast amount of data that their services will collect. Big data analytics is a recent technology with great potential to enhance smart city services, but analyzing urban datasets poses many challenges, data volume among them, and it is not clear how analytics methods will cope with such volume. In this paper we introduce a benchmark study called SDAbench (SMART DATA ANALYTICS Benchmark), which facilitates repeatable testing and can be easily extended to multiple methods and use cases. SDAbench is envisioned as a suite of benchmarks, each of which represents a distinct method. To date we have implemented a benchmark for two important clustering algorithms applied to smart city data, namely K-Means and Fuzzy C-Means (FCM). In the near future we envision adding other benchmarks under the SDAbench umbrella (e.g., processing/integration, classification, data reduction, visualisation, and association rule mining) and testing each one. |