Abstract
Program comprehension accounts for a large portion of software development costs and effort. The academic literature mainly studies the comprehension of short code snippets, but comprehension at the system level is no less important. We claim that comprehending a software system is a distinct activity that differs from code comprehension. We interviewed experienced developers, architects, and managers in the software industry and open-source community to uncover the meaning of program comprehension at the system level; later we conducted a survey to verify the findings. The interviews demonstrate, among other things, that system comprehension is largely detached from code and programming language, and includes scope that is not captured in the code. It focuses on the structure of the system on the one hand and on the flows in the system on the other, but less on the code itself. System comprehension is a continuous, unending, iterative process, which utilizes white-box and black-box approaches at different layers of the system depending on needs, and combines both bottom-up and top-down comprehension strategies. In summary, comprehending a system is not just comprehending the code at a larger scale, and it is not possible to comprehend large systems at the same level as comprehending code.
Notes
In the hardware world there is a distinction between the terms ‘architecture’ and ‘micro-architecture’, reflecting the externally visible attributes of the system vs. its internal design. Software organizations in hardware corporations sometimes adopt this terminology. In the software world, however, the term ‘architecture’ usually refers mainly to the internal design.
Acknowledgments
Many thanks to Neta Kligler-Vilenchik who provided us with invaluable guidance in the methodology of text analysis. Bareket Henle assisted with development of the initial interview plan.
Additional information
Communicated by: Federica Sarro and Foutse Khomh
This article belongs to the Topical Collection: International Conference on Program Comprehension (ICPC)
Dror Feitelson holds the Berthold Badler Chair in Computer Science. This research was supported by the ISRAEL SCIENCE FOUNDATION (grants no. 407/13 and 832/18). This paper is an invited extended version of a paper from ICPC 2019.
About this article
Cite this article
Levy, O., Feitelson, D.G. Understanding large-scale software systems – structure and flows. Empir Software Eng 26, 48 (2021). https://doi.org/10.1007/s10664-021-09938-8
DOI: https://doi.org/10.1007/s10664-021-09938-8