Advancing Software Engineering Tasks Through the Construction of a Functionally Equivalent Method Dataset
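Project source
Principal investigator
Funded institution
Project number
Approval year
Approval date
Research period
Project level
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions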
1. A Large-Scale Investigation Into the Loss of Pull Request Data on GitHub
- Keywords:
- Software development management; Source coding; Application programming interfaces; Testing; Codes; Soft sensors; Reviews; Maintenance; Java; Information science; Empirical software engineering; pull requests; social coding; GitHub; software mining
- Tang, Bowen;Maruyama, Katsuhisa
- 《IEEE ACCESS》
- 2026
- Vol. 14
- Issue: (not listed)
- Journal article
Analyzing pull requests (PRs) on GitHub provides valuable insights that can improve software development and maintenance. Therefore, researchers must collect PRs for empirical studies when testing hypotheses and creating practical tools based on these insights. Unfortunately, using GitHub as a data source for PRs carries the risk of data loss, owing to its flexible resource management. Existing studies have indicated that data losses can occur in PRs; however, the types and impacts of these losses remain unclear. This study shares findings from our investigation, which analyzed 84,828 PRs from 30 GitHub repositories and 2,345,724 actions recorded within the PRs. It clarified how different types of data loss affected PRs and highlighted variations in the percentage of PRs affected by loss. The results showed that 54.79% of the PRs experienced some data loss. Source code loss was common, whereas the loss of user information and commits was less frequent. Most user information loss resulted from missing committers. Compared to PRs that were rejected, merged PRs were more likely to have source code losses. The source code loss rate was much lower in testing-related PRs than in those unrelated to testing. PRs that lacked files written in a programming language were more prone to commit loss. These findings help researchers better understand data loss in PRs and develop effective strategies to prevent it.
2. An empirical study on the impact of change granularity in refactoring detection
- Keywords:
- Error detection;Open systems;Coarse-grained;Commit history;Commit message;Empirical studies;Impact of changes;Open science;Refactoring detection;Refactorings;Software Evolution
- Chen, Lei;Hayashi, Shinpei
- 《Journal of Systems and Software》
- 2026
- Vol. 231
- Issue: (not listed)
- Journal article
Detecting refactorings in commit history is essential to improve comprehension of code changes during code reviews and to provide valuable information for empirical studies on software evolution. Techniques have been proposed to accurately detect refactorings at the granularity of a single commit. However, refactorings can be made over multiple commits because of their complexity or other practical development problems, which makes detection at the granularity of a single commit alone insufficient. We observe that some refactorings can only be detected at a coarser granularity, i.e., in changes conducted over multiple commits, while others can only be detected at the granularity of a single commit but not at a coarser granularity. We call these two types of refactorings coarse-grained refactorings (CGRs) and ephemeral refactorings (EPRs), respectively. We investigated the features and causes of CGRs and EPRs through an empirical study of 32 open-source Java projects and found that both commonly occur during development. In addition, we found that refactoring types related to splitting or merging classes and packages, as well as those involving modifications to the inheritance structure, tend to be CGRs, and types targeting small objects such as variables and attributes, and refactorings with context-sensitive detection criteria tend to be EPRs. The causes of CGRs and EPRs are analyzed and categorized, and the relationship between CGRs and their commit messages is also assessed. We found that about 20% of commit messages explicitly suggest the existence of CGRs. We suggest that CGRs and EPRs be valued in refactoring research and that detectors be extended to identify CGRs. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board. © 2025 The Authors
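The CGR/EPR distinction described in this abstract can be illustrated with a small set comparison. The sketch below is not the authors' tooling: it assumes a hypothetical detector has already produced refactoring descriptions for each single commit and for the combined multi-commit change, and it matches refactorings by plain string equality, which is a simplification. The `classify` function and all refactoring labels are invented for illustration.

```python
def classify(per_commit, squashed):
    """Split detected refactorings into CGR and EPR candidates.

    per_commit: list of sets, one per commit, holding refactoring
                descriptions detected at single-commit granularity.
    squashed:   set of refactoring descriptions detected when the same
                commits are analyzed as one combined change.
    """
    # Everything seen at single-commit granularity, across all commits.
    fine = set().union(*per_commit) if per_commit else set()
    return {
        # Visible only in the combined change: coarse-grained candidates.
        "CGR": squashed - fine,
        # Visible only in individual commits: ephemeral candidates.
        "EPR": fine - squashed,
    }


# Hypothetical detector output for two consecutive commits.
per_commit = [
    {"Extract Method Foo.helper", "Rename Method Foo.run -> Foo.execute"},
    {"Inline Method Foo.helper"},
]
# Hypothetical detector output for the two commits combined.
squashed = {
    "Rename Method Foo.run -> Foo.execute",
    "Move Class util.Foo -> core.Foo",
}

result = classify(per_commit, squashed)
print(sorted(result["CGR"]))
print(sorted(result["EPR"]))
```

In this toy data the method that is extracted and then inlined again disappears from the combined change, so both operations land in the EPR set, while the class move that only becomes visible across both commits lands in the CGR set.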
3. Coverage Isn’t Enough: SBFL-Driven Insights into Manually Created vs. Automatically Generated Tests
- Keywords:
- Automatic test pattern generation;Software design;Software testing;Well testing;Automated test-case generations;Automatically generated;Code coverage;Fault localization;Mutation testing;Spectra's;Spectrum-based fault localization;Test case;Testing method;Testing phase
- Shimizu, Sasara;Higo, Yoshiki
- 《26th International Conference on Product-Focused Software Process Improvement, PROFES 2025》
- 2026
- December 1, 2025 - December 3, 2025
- Salerno, Italy
- Conference paper
The testing phase is an essential part of software development, but manually creating test cases can be time-consuming. Consequently, there is a growing need for more efficient testing methods. To reduce the burden on developers, various automated test generation tools have been developed, and several studies have been conducted to evaluate the effectiveness of the tests they produce. However, most of these studies focus primarily on coverage metrics, and only a few examine how well the tests support fault localization—particularly using artificial faults introduced through mutation testing. In this study, we compare the SBFL (Spectrum-Based Fault Localization) score and code coverage of automatically generated tests with those of manually created tests. The SBFL score indicates how accurately faults can be localized using SBFL techniques. By employing SBFL score as an evaluation metric—an approach rarely used in prior studies on test generation—we aim to provide new insights into the respective strengths and weaknesses of manually created and automatically generated tests. Our experimental results show that automatically generated tests achieve higher branch coverage than manually created tests, but their SBFL score is lower, especially for code with deeply nested structures. These findings offer guidance on how to effectively combine automatically generated and manually created testing approaches. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
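For readers unfamiliar with SBFL, the following minimal sketch ranks source lines by the Ochiai suspiciousness formula, one commonly used SBFL formula. The abstract does not state which formula the study uses or how its SBFL score is computed, so the choice of Ochiai, the function name `ochiai`, and the toy coverage data are assumptions for illustration only.

```python
import math


def ochiai(coverage, outcomes):
    """Compute Ochiai suspiciousness per line.

    coverage: dict mapping test name -> set of covered line numbers.
    outcomes: dict mapping test name -> True if the test passed.
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    lines = set().union(*coverage.values()) if coverage else set()
    scores = {}
    for line in lines:
        # ef / ep: failing / passing tests that execute this line.
        ef = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores


# Toy spectrum: three tests, one of which fails.
coverage = {"t1": {1, 2}, "t2": {1, 3}, "t3": {1, 2, 3}}
outcomes = {"t1": True, "t2": False, "t3": True}

scores = ochiai(coverage, outcomes)
# Most suspicious first; ties broken by line number.
ranking = sorted(scores, key=lambda line: (-scores[line], line))
print(ranking)  # [3, 1, 2]
```

Line 3 ranks first because it is executed by the failing test and by only one passing test, whereas line 1 is executed by every test and line 2 by no failing test. A test suite that separates lines this way supports fault localization better than one where all tests cover the same lines, regardless of raw coverage.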
4. How Natural Language Proficiency Shapes Generative AI Code for Software Engineering Tasks
- Keywords:
- Codes; Natural language processing; Software engineering; Software reliability; Software development management; Python; Software measurement; Large language models
- Rojpaisarnkit, Ruksit;Fan, Youmei;Matsumoto, Kenichi;Kula, Raula Gaikovina
- 《IEEE SOFTWARE》
- 2026
- Vol. 43
- Issue 1
- Journal article
Much research has focused on prompt structure, but natural language proficiency is an underexplored factor that can influence the quality of generated code. This article investigates whether English language proficiency affects the proficiency and correctness of code generated by large language models.
5. How Much Can a Behavior-Preserving Changeset Be Decomposed into Refactoring Operations?
- Keywords:
- Behavior preservation;Refactorings
- Someya, Kota;Chen, Lei;Decker, Michael J.;Hayashi, Shinpei
- 《41st IEEE International Conference on Software Maintenance and Evolution, ICSME 2025》
- 2025
- September 7, 2025 - September 12, 2025
- Auckland, New Zealand
- Conference paper
Developers sometimes mix behavior-preserving modifications, such as refactorings, with behavior-altering modifications, such as feature additions. Several approaches have been proposed to support understanding such modifications by separating them into those two parts. Such refactoring-aware approaches are expected to be particularly effective when the behavior-preserving parts can be decomposed into a sequence of more primitive behavior-preserving operations, such as refactorings, but this has not been explored. In this paper, as an initial validation, we quantify how much of the behavior-preserving modifications can be decomposed into refactoring operations using a dataset of functionally-equivalent method pairs. As a result, when using an existing refactoring detector, only 33.9% of the changes could be identified as refactoring operations. In contrast, when including 67 newly defined functionally-equivalent operations, the coverage increased by over 128%. Further investigation into the remaining unexplained differences was conducted, suggesting improvement opportunities. © 2025 IEEE.
6. Social Media Reactions to Open Source Promotions: AI-Powered GitHub Projects on Hacker News
- Keywords:
- Artificial intelligence;Open systems;Social networking (online);Social sciences computing;Software design;Github project;Hacker news;LLM;News sources;Open source software projects;Open-source;Open-source softwares;Social media;Social media platforms;Spread of informations
- Meakpaiboonwattana, Prachnachai;Tarntong, Warittha;Mekratanavorakul, Thai;Ragkhitwetsagul, Chaiyong;Sangaroonsilp, Pattaraporn;Kula, Raula Gaikovina;Choetkiertikul, Morakot;Matsumoto, Kenichi;Sunetnanta, Thanwadee
- 《41st IEEE International Conference on Software Maintenance and Evolution, ICSME 2025》
- 2025
- September 7, 2025 - September 12, 2025
- Auckland, New Zealand
- Conference paper
Social media platforms have become more influential than traditional news sources, shaping public discourse and accelerating the spread of information. With the rapid advancement of artificial intelligence (AI), open-source software (OSS) projects can leverage these platforms to gain visibility and attract contributors. In this study, we investigate the relationship between Hacker News, a social news site focused on computer science and entrepreneurship, and the extent to which it influences developer activity on the promoted GitHub AI projects. We analyzed 2,195 Hacker News (HN) stories and their corresponding comments over a two-year period. Our findings reveal that at least 19% of AI developers promoted their GitHub projects on Hacker News, often receiving positive engagement from the community. By tracking activity on the associated 1,814 GitHub repositories after they were shared on Hacker News, we observed a significant increase in forks, stars, and contributors. These results suggest that Hacker News serves as a viable platform for AI-powered OSS projects, with the potential to gain attention, foster community engagement, and accelerate software development. © 2025 IEEE.
7. A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools
- Keywords:
- Open source software;Open systems;Tools;Bill of materials;Evaluating software;Generation tools;Material consumption;Real-world;Software bill of material;Software dependencies;Software-component;SPDX;Tool support
- Kishimoto, Rio;Kanda, Tetsuya;Manabe, Yuki;Inoue, Katsuro;Qiu, Shi;Higo, Yoshiki
- 《22nd IEEE/ACM International Conference on Mining Software Repositories, MSR 2025》
- 2025
- April 27, 2025 - April 29, 2025
- Ottawa, ON, Canada
- Conference paper
A Software Bill of Materials (SBOM) is becoming an essential tool for effective software dependency management. An SBOM is a list of components used in software, including details such as component names, versions, and licenses. Using SBOMs, developers can quickly identify software components and assess whether their software depends on vulnerable libraries. Numerous tools support software dependency management through SBOMs, which can be broadly categorized into two types: tools that generate SBOMs and tools that utilize SBOMs. A substantial collection of accurate SBOMs is required to evaluate tools that utilize SBOMs. However, there is no publicly available dataset specifically designed for this purpose, and research on SBOM consumption tools remains limited. In this paper, we present a dataset of SBOMs to address this gap. The dataset we constructed comprises 46 SBOMs generated from real-world Java projects, with plans to expand it to include a broader range of projects across various programming languages. Accurate and well-structured SBOMs enable researchers to evaluate the functionality of SBOM consumption tools and identify potential issues. We collected 3,271 Java projects from GitHub and generated SBOMs for 798 of them using Maven with an open-source SBOM generation tool. These SBOMs were refined through both automatic and manual corrections to ensure accuracy, currently resulting in 46 SBOMs that comply with the SPDX Lite profile, which defines minimal requirements tailored to practical workflows in industries. This process also revealed issues with the SBOM generation tools themselves. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.14233414). © 2025 IEEE.
8. Revisiting Method-Level Change Prediction: A Comparative Evaluation at Different Granularities
- Keywords:
- Computer software;Maintainability;Change prediction;Class level;Comparative evaluations;Comparison methods;Different granularities;Level change;Machine-learning;Maintenance efforts;Performance;Prediction techniques
- Sugimori, Hiroto;Hayashi, Shinpei
- 《32nd IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2025》
- 2025
- March 4, 2025 - March 7, 2025
- Montreal, QC, Canada
- Conference paper
To improve the efficiency of software maintenance, change prediction techniques have been proposed to predict frequently changing modules. Whereas existing techniques focus primarily on class-level prediction, method-level prediction allows for more direct identification of change locations. Method-level prediction can be useful, but it may also negatively affect prediction performance, leading to a trade-off. This makes it unclear which level of granularity users should select for their predictions. In this paper, we evaluated the performance of method-level change prediction compared with that of class-level prediction from three perspectives: direct comparison, method-level comparison, and maintenance effort-aware comparison. The results from 15 open source projects show that, although method-level prediction exhibited lower performance than class-level prediction in the direct comparison, method-level prediction outperformed class-level prediction when both were evaluated at method-level, leading to a median difference of 0.26 in accuracy. Furthermore, effort-aware comparison shows that method-level prediction performed significantly better when the acceptable maintenance effort is little. © 2025 IEEE.
9. Toward Automated Test Generation for Dockerfiles Based on Analysis of Docker Image Layers
- Keywords:
- Automatic test pattern generation;Codes (symbols);Image processing;Software testing;Automated test generations;Docker;Dockerfile;General programming;Generation techniques;Image layers;Layer;Source codes;Text file;Virtualizations
- Goto, Yuki;Matsumoto, Shinsuke;Kusumoto, Shinji
- 《29th International Conference on Evaluation and Assessment of Software Engineering, EASE 2025》
- 2025
- June 17, 2025 - June 20, 2025
- Istanbul, Turkey
- Conference paper
Docker has gained attention as a lightweight container-based virtualization platform. The process for building a Docker image is defined in a text file called a Dockerfile. A Dockerfile can be considered as a kind of source code that contains instructions on how to build a Docker image. Its behavior should be verified through testing, as is done for source code in a general programming language. For source code in languages such as Java, search-based test generation techniques have been proposed. However, existing automated test generation techniques cannot be applied to Dockerfiles. Since a Dockerfile does not contain branches, the coverage metric, typically used as an objective function in existing methods, becomes meaningless. In this study, we propose an automated test generation method for Dockerfiles based on processing results rather than processing steps. The proposed method determines which files should be tested and generates the corresponding tests based on an analysis of Dockerfile instructions and Docker image layers. The experimental results show that the proposed method can reproduce over 80% of the tests created by developers. © 2025 Copyright held by the owner/author(s).
10. Exploring an Inclusion Relation on Test Cases to Identify Unit and Integration Tests
- Keywords:
- Integration;Debugging efforts;Inclusion relation;Integration test;Line coverage;Measurement methods;Software testings;Test case;Testing efficiency;Testing process;Unit tests
- Okamoto, Ryu;Matsumoto, Shinsuke;Kusumoto, Shinji
- 《25th International Conference on Product-Focused Software Process Improvement, PROFES 2024》
- 2025
- December 2, 2024 - December 4, 2024
- Tartu, Estonia
- Conference paper
In software testing, among the various types of tests, two commonly conducted ones are unit and integration tests. Unit tests verify individual functionalities, and integration tests verify the combination of multiple functionalities. If we can identify unit/integration tests and measure them as ordinal values, such as the degree of integration-ness, we can utilize them to improve testing efficiency. However, the definitions of unit/integration are ambiguous, making it difficult to distinguish between them. To the best of our knowledge, there is currently no method for detecting this distinction. In this study, aiming to support the testing process, we will consider a measurement method for unit/integration tests. The key idea is to utilize an inclusion relation, which naturally exists among test cases. As an application of the inclusion relation, we propose a method for ordering failed tests to streamline debugging. We conducted a mutation analysis to evaluate how much our proposal reduces debugging effort compared to a naive method. The results showed that our proposal was effective in 29.7% of cases and confirmed an average reduction of 20.7% in debugging effort. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
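One way to picture an inclusion relation among test cases is through line coverage, which appears among this entry's keywords. The sketch below is an assumption-laden illustration, not the paper's method: it treats test A as including test B when A covers a strict superset of B's lines, uses the number of included tests as a stand-in for "integration-ness", and orders failed tests so that the most unit-like ones are inspected first. The function names and the coverage data are hypothetical.

```python
def includes(cov_a, cov_b):
    """True if test A's covered lines are a strict superset of test B's."""
    return cov_b < cov_a


def integration_degree(coverage):
    """Count, for each test, how many other tests it includes."""
    return {
        a: sum(1 for b in coverage if b != a and includes(coverage[a], coverage[b]))
        for a in coverage
    }


def order_failed(failed, coverage):
    """Order failed tests from most unit-like to most integration-like."""
    degree = integration_degree(coverage)
    return sorted(failed, key=lambda t: (degree[t], t))


# Hypothetical line coverage for three tests.
coverage = {
    "test_parse": {1, 2},
    "test_eval": {3, 4},
    "test_parse_and_eval": {1, 2, 3, 4, 5},
}

degree = integration_degree(coverage)
order = order_failed(["test_parse_and_eval", "test_parse"], coverage)
print(degree)
print(order)  # ['test_parse', 'test_parse_and_eval']
```

The intuition behind this ordering is that a failing test with a narrow footprint points to a smaller region of code, so examining it first may shorten debugging when a broader test fails for the same reason.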
