In this paper, we study the data staging problem, using dynamic programming (DP) techniques to optimally migrate, replicate, and cache shared data items in cloud systems, with or without practical resource constraints, while minimizing the monetary cost of transmitting and caching the data items. To this end, we follow existing cost and network models and extend the analysis to multiple data items, each with a single copy or multiple copies. Our results show that, under a homogeneous cost model, when the ratio of transmission cost to caching cost is low, a single copy of each data item can efficiently serve all user requests. In the multicopy case, we also consider the tradeoff between transmission cost and caching cost by controlling the upper bounds on the number of transmissions and copies. The upper bound can be given either on a per-item basis or on an all-item basis. We present efficient optimal solutions based on dynamic programming for all these cases, provided that the upper bound is polynomially bounded in the number of service requests and the number of distinct data items. In addition to the homogeneous cost model, we briefly discuss the problem under a heterogeneous cost model with some simple yet practical restrictions and present a 2-approximation algorithm for the general case. We validate our findings by implementing a data staging solver and conducting extensive simulation studies of the algorithms' behavior.