ABSTRACT
Developers perform small-scale reuse tasks to save time and to increase the quality of their code, but due to their small scale, the costs of such tasks can quickly outweigh their benefits. Existing approaches focus on locating source code for reuse but do not support the integration of the located code within the developer's system, thereby leaving the developer with the burden of performing integration manually. This paper presents an approach that uses the developer's context to help integrate the reused source code into the developer's own source code. The approach approximates a theoretical framework (higher-order anti-unification modulo theories), known to be undecidable in general, to determine candidate correspondences between the source code to be reused and the developer's current (incomplete) system. This approach has been implemented in a prototype tool, called Jigsaw, that identifies and evaluates candidate correspondences greedily with respect to the highest similarity. Situations involving multiple candidate correspondences with similarities above a defined threshold are presented to the developer for resolution. Two empirical evaluations were conducted: an experiment comparing the quality of Jigsaw's results against suspected cases of small-scale reuse in an industrial system; and case studies with two industrial developers to consider its practical usefulness and usability issues.
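The greedy selection step described above can be illustrated with a minimal sketch. It is not Jigsaw's implementation: the function names, the string-based similarity measure (Python's `difflib`), and the default threshold value are all assumptions made for illustration, and the sketch matches flat lists of names rather than program structure.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Sequence, Tuple


def name_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def greedy_correspondences(
    reused: Sequence[str],
    target: Sequence[str],
    similarity: Callable[[str, str], float] = name_similarity,
    threshold: float = 0.8,  # assumed value; the abstract only says "a defined threshold"
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Pair reused elements with target elements, most similar pairs first.

    Returns (matched, ambiguous). `ambiguous` maps a reused element to the
    still-free targets whose similarity is at or above `threshold`; those
    cases are left for the developer to resolve.
    """
    # Score every candidate pair and visit them from most to least similar.
    scored = sorted(
        ((similarity(r, t), r, t) for r in reused for t in target),
        key=lambda item: item[0],
        reverse=True,
    )
    matched: Dict[str, str] = {}
    ambiguous: Dict[str, List[str]] = {}
    taken = set()  # targets already committed to a correspondence

    for _score, r, t in scored:
        if r in matched or r in ambiguous or t in taken:
            continue
        # Free targets that are strong candidates for this reused element.
        strong = [
            t2 for s2, r2, t2 in scored
            if r2 == r and t2 not in taken and s2 >= threshold
        ]
        if len(strong) > 1:
            # Several strong candidates: defer to the developer.
            ambiguous[r] = strong
        else:
            # Otherwise commit greedily to the best remaining candidate.
            matched[r] = t
            taken.add(t)
    return matched, ambiguous
```

Calling `greedy_correspondences(reused_names, target_names)` returns the committed correspondences together with the cases needing a developer decision. Deferred targets are not reserved in this sketch, so a later reused element may still claim one of them; a real tool would need a policy for that situation.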