Abstract
Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, as with any data-mining technique, it requires sufficient training data (object usage scenarios). Existing approaches address this problem by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for object usage scenarios (OUSs). Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, the number of OUSs that can generally be collected from a run of p is no more than the number of objects used during the run. With our technique, a maximum of n times more OUSs can be achieved, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpecMiner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. In addition, accurate and complete API protocols are more likely to be achieved. Furthermore, our technique can mine API protocols for classes that are never used in the programs, which are valuable for validating software architectures, program documentation, and program understanding. Although our technique introduces some runtime overhead, the overhead is trivial and acceptable.
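The abstract does not spell out how an OUS is derived for a super-class, so the following is only a minimal sketch of one plausible reading. It assumes that an OUS is the ordered list of method names called on one object, and that an OUS for a super-class can be obtained by keeping only those calls whose method the super-class declares or inherits. The class `OusOversampler`, the `Ous` type, the `oversample` method, and the toy `Stream` hierarchy are all hypothetical names introduced for illustration; none of them is taken from ISpecMiner.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OusOversampler {

    // Toy hierarchy standing in for classes of an analyzed program (hypothetical).
    public static class Stream { public void open() {} public void close() {} }
    public static class BufferedStream extends Stream { public void flush() {} }
    public static class FileStream extends BufferedStream { public void seek() {} }

    /** Assumed OUS shape: the receiver's class plus the ordered method names called on it. */
    public static final class Ous {
        public final Class<?> type;
        public final List<String> calls;

        public Ous(Class<?> type, List<String> calls) {
            this.type = type;
            this.calls = List.copyOf(calls);
        }
    }

    /** Returns the general OUS followed by one projected OUS per proper super-class. */
    public static List<Ous> oversample(Ous general) {
        List<Ous> result = new ArrayList<>();
        result.add(general);
        // Walk up the inheritance chain, stopping before java.lang.Object.
        for (Class<?> sup = general.type.getSuperclass();
                sup != null && sup != Object.class;
                sup = sup.getSuperclass()) {
            // Names of the public methods the super-class declares or inherits.
            Set<String> visible = new HashSet<>();
            for (Method m : sup.getMethods()) {
                visible.add(m.getName());
            }
            // Assumed projection rule: keep, in order, only the calls visible in the super-class.
            List<String> projected = new ArrayList<>();
            for (String call : general.calls) {
                if (visible.contains(call)) {
                    projected.add(call);
                }
            }
            if (!projected.isEmpty()) {
                result.add(new Ous(sup, projected));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Ous general = new Ous(FileStream.class, List.of("open", "seek", "flush", "close"));
        for (Ous ous : oversample(general)) {
            System.out.println(ous.type.getSimpleName() + " " + ous.calls);
        }
    }
}
```

In this toy run, a single recorded OUS on a `FileStream` object yields two further OUSs, one for `BufferedStream` and one for `Stream`, even though no object of those two classes is ever created. Matching methods by name only is a simplification; a real implementation would presumably compare full signatures.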
Additional information
Project supported by the Scientific Research Project of the Education Department of Hubei Province, China (No. Q20181508), the Youths Science Foundation of Wuhan Institute of Technology (No. k201622), the Surveying and Mapping Geographic Information Public Welfare Scientific Research Special Industry (No. 201412014), the Educational Commission of Hubei Province, China (No. Q20151504), the National Natural Science Foundation of China (Nos. 41501505, 61502355, and 61502354), the China Postdoctoral Science Foundation (No. 2015M581887), the Key Program of Higher Education Institutions of Henan Province, China (No. 17A520040), and the Natural Science Foundation of Henan Province, China (No. 162300410177).
A preliminary version was presented at the 27th International Conference on Software Engineering and Knowledge Engineering, Pittsburgh, USA, July 6–8, 2015.
Cite this article
Chen, D., Zhang, Yd., Wei, W. et al. An oversampling approach for mining program specifications. Frontiers Inf Technol Electronic Eng 19, 737–754 (2018). https://doi.org/10.1631/FITEE.1601783
DOI: https://doi.org/10.1631/FITEE.1601783