Abstract
Recently there has been a considerable increase in the number of different Key-Value stores, for supporting data storage and applications on the cloud environment. While all these solutions try to offer highly available and scalable services on the cloud, they are significantly different with each other in terms of the architecture and types of the applications, they try to support. Considering three widely-used such systems: Cassandra, HBase and Voldemort; in this paper we compare them in terms of their support for different types of query workloads. We are mainly focused on the range queries. Unlike HBase and Cassandra that have built-in support for range queries, Voldemort does not support this type of queries via its available API. For this matter, practical techniques are presented on top of Voldemort to support range queries. Our performance evaluation is based on mixed query workloads, in the sense that they contain a combination of short and long range queries, beside other types of typical queries on key-value stores such as lookup and update. We show that there are trade-offs in the performance of the selected system and scheme, and the types of the query workloads that can be processed efficiently.
Similar content being viewed by others
References
Abouzeid, A., Bajda-Pawlikowski, K., Abadi, D.J., Rasin, A., Silberschatz, A.: Hadoopdb: an architectural hybrid of mapreduce and dbms technologies for analytical workloads. In: The Proceedings of VLDB Endowment, vol. 2 issue 1, pp. 922–933 (2009)
Agrawal, P., Silberstein, A., Cooper, B.F., Srivastava, U., Ramakrishnan, R.: Asynchronous view maintenance for vlsd databases. SIGMOD ’09. ACM (2009)
Aguilera, M.K., Golab, W., Golab, M.A.: A practical scalable distributed b-tree. In: Proceedings of VLDB Endow., vol. 1, pp. 598–609 (2008)
Andrzejak, A., Xu, Z.: Scalable, efficient range queries for grid information services. Peer-to-Peer Computing, pp. 33–40 (2002)
Apache CouchDB. http://couchdb.apache.org/. Accessed date Nov 2010
Apache HDFS. http://hadoop.apache.org/hdfs/. Accessed date Nov 2010
Aspnes, J., Kirsch, J., Krishnamurthy, A.: Load balancing and locality in range-queriable data structures. PODC ’04, pp. 115–124. ACM (2004)
Binnig, C., Kossmann, D., Kraska, T., Loesing, S.: How is the weather tomorrow?: towards a benchmark for the cloud. In: Proceedings of the 2nd International Workshop on Testing Database Systems, DBTest ’09, pp. 9:1–9:6. ACM (2009)
Brantner, M., Florescu, D., Graf, D.A., Kossmann, D., Kraska, T.: Building a database on s3. SIGMOD Conference, pp. 251–264 (2008)
Cassandra. http://cassandra.apache.org/. Accessed date Nov 2010
Cattell, R.: Scalable sql and nosql data stores. SIGMOD Rec. 39, 12–27 (2011)
Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. ACM Trans. Comput. Syst. 2, 4:1–4:26 (2008)
Cooper, B.F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., Jacobsen, H.-A., Puz, N., Weaver, D., Yerneni, R.: Pnuts: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1, 1277–1288 (2008)
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with ycsb. SoCC, pp. 143–154 (2010)
Ganesan, P., Bawa, M., Garcia-molina, H.:Online balancing of range-partitioned data with applications to Peer-to-Peer systems. In: VLDB, pp. 444–455 (2004)
Ganesan, P., Yang, B., Garcia-Molina, H.: One torus to rule them all: multidimensional queries in p2p systems. WebDB (2004)
Gray, J., Sundaresan, P., Englert, S., Baclawski, K., Weinberger, P.J.: Quickly generating billion-record synthetic databases. SIGMOD Rec. 23, 243–252 (1994)
Gupta, A., Agrawal, D., Abbadi, A.E.: Approximate range selection queries in peer-to-peer systems. CIDR (2003)
Hastorun, D., Jampani, M., Kakulapati, G., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: amazons highly available key-value store. In: Proceedings of SOSP, pp. 205–220 2007
HBase. http://hbase.apache.org/. Accessed date Nov 2010
Jagadish, H.V., Ooi, B.C., Vu, Q.H.: Baton: a balanced tree structure for peer-to-peer networks. In: VLDB, pp. 661–672 (2005)
Lehman, P.L., Yao, S.B.: Efficient locking for concurrent operations on b-trees. ACM Trans. Database Syst. 6(4) 650–670 (1981)
Lomet, D.: Replicated indexes for distributed data. DIS ’96, IEEE Computer Society, pp. 108–119 (1996)
MongoDB. http://www.mongodb.org/. Accessed date Nov 2010
Pavlo, A., Paulson, E., Rasin, A., Abadi, D.J., DeWitt, D.J., Madden, S., Stonebraker, M.: A comparison of approaches to large-scale data analysis. SIGMOD Conference, pp. 165–178 (2009)
Pitoura, T., Ntarmos, N., Triantafillou, P.: Replication, load balancing and efficient range query processing in dhts. EDBT, pp. 131–148 (2006)
Project Voldemort. http://project-voldemort.com/. Accessed date Nov 2010
Ramabhadran, S., Ratnasamy, S., Hellerstein, J.M., Shenker, S.: Brief announcement: prefix hash tree. PODC ’04. ACM (2004)
Sahin, O.D., Gupta, A., Agrawal, D., Abbadi, A.E.: A peer-to-peer framework for caching range queries. ICDE, pp. 165–176 (2004)
Schütt, T., Schintke, F., Reinefeld, A.: Structured overlay without consistent hashing: empirical results. CCGRID (2006)
Schütt, T., Schintke, F., Reinefeld, A.: Range queries on structured overlay networks. Computer Communications, vol. 31 (2008)
Shi, Y., Meng, X., Zhao, J., Hu, X., Liu, B., Wang, H.: Benchmarking cloud-based data management systems, In: Proceedings of the second international workshop on Cloud data management. CloudDB ’10, pp. 47–54. ACM (2010)
Vo, H.T., Chen, C., Ooi, B.C.: Towards elastic transactional cloud storage with range query support. In: The Proceedings of VLDB Endowment, vol. 3, pp. 506–517 (2010)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Pirzadeh, P., Tatemura, J., Po, O. et al. Performance Evaluation of Range Queries in Key Value Stores. J Grid Computing 10, 109–132 (2012). https://doi.org/10.1007/s10723-012-9214-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10723-012-9214-7