Interference-Aware Workload Placement for Improving Latency Distribution of Converged HPC/Big Data Cloud Infrastructures

Tzenetopoulos, Achilleas; Masouros, Dimosthenis; Xydis, Sotirios; Soudris, Dimitrios

doi:10.1007/978-3-031-04580-6_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13227)

Included in the following conference series:

International Conference on Embedded Computer Systems

Abstract

Recently, the High Performance Computing, Big Data, and Cloud Computing worlds have tended to converge in terms of workload deployment, with containerization technology acting as an enabler. In such cases of application diversity and multi-tenancy, a universal scheduler is required that satisfies end-user needs for seamless yet efficient application deployment. While the Kubernetes container orchestrator seems to be the answer that enables application-agnostic deployment, it still depends heavily on coarse system metrics for its scheduling policies, thus neglecting the performance degradation caused by resource contention in the underlying system.

In this paper, we design and implement a modular, interference-aware framework that balances incoming workloads based on the monitoring of low-level metrics. We evaluate the proposed solution over different workload mixes and co-location scenarios, showing that, compared with the state-of-the-art but interference-unaware Kubernetes scheduler, the proposed framework significantly improves the latency distribution of the converged cloud infrastructure, improving median latency by up to 27% and reducing standard deviation by up to 25%.
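To make the contrast between coarse and low-level metrics concrete, the following is a minimal sketch of interference-aware node selection. It is not the framework described in the paper: the metric names (`llc_miss_rate`, `mem_bw_utilization`), the weights, the scaling constant, and the example values are all hypothetical, chosen only to show how a placement decision based on contention metrics can differ from one based on CPU utilization alone.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical per-node snapshot; field names are assumptions."""
    name: str
    llc_miss_rate: float       # last-level cache misses per kilo-instruction (assumed unit)
    mem_bw_utilization: float  # fraction of peak memory bandwidth in use, 0..1
    cpu_utilization: float     # coarse CPU utilization, 0..1


def interference_score(m, llc_weight=0.5, bw_weight=0.5, llc_scale=50.0):
    """Combine low-level metrics into one score (higher = more contended).

    The weights and llc_scale are illustrative assumptions, not values
    taken from the paper.
    """
    llc_pressure = min(m.llc_miss_rate / llc_scale, 1.0)  # normalise to 0..1
    return llc_weight * llc_pressure + bw_weight * m.mem_bw_utilization


def pick_by_cpu(nodes):
    """Coarse policy: choose the node with the lowest CPU utilization."""
    return min(nodes, key=lambda m: m.cpu_utilization)


def pick_by_interference(nodes):
    """Interference-aware policy: choose the least contended node."""
    return min(nodes, key=interference_score)


# Made-up example values: node-a looks idle by CPU but is memory-bound.
nodes = [
    NodeMetrics("node-a", llc_miss_rate=40.0, mem_bw_utilization=0.9, cpu_utilization=0.30),
    NodeMetrics("node-b", llc_miss_rate=5.0, mem_bw_utilization=0.2, cpu_utilization=0.45),
]

print(pick_by_cpu(nodes).name)           # coarse choice
print(pick_by_interference(nodes).name)  # interference-aware choice
```

In this made-up scenario the coarse policy selects node-a because its CPU utilization is lower, while the contention score favours node-b, whose cache and memory bandwidth are far less loaded.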

Author information

Authors and Affiliations

National Technical University of Athens, Athens, Greece
Achilleas Tzenetopoulos, Dimosthenis Masouros, Sotirios Xydis & Dimitrios Soudris
Harokopio University of Athens, Athens, Greece
Sotirios Xydis

Corresponding author

Correspondence to Achilleas Tzenetopoulos.

Editor information

Editors and Affiliations

University of California, San Diego, La Jolla, CA, USA
Alex Orailoglu
Fraunhofer IESE, Kaiserslautern, Germany
Matthias Jung
Brandenburg University of Technology, Cottbus, Germany
Marc Reichenbach

About this paper

Cite this paper

Tzenetopoulos, A., Masouros, D., Xydis, S., Soudris, D. (2022). Interference-Aware Workload Placement for Improving Latency Distribution of Converged HPC/Big Data Cloud Infrastructures. In: Orailoglu, A., Jung, M., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2021. Lecture Notes in Computer Science, vol 13227. Springer, Cham. https://doi.org/10.1007/978-3-031-04580-6_8

DOI: https://doi.org/10.1007/978-3-031-04580-6_8
Published: 27 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04579-0
Online ISBN: 978-3-031-04580-6
eBook Packages: Computer Science, Computer Science (R0)
