Recommendations for analysing and meta-analysing small sample size software engineering experiments
Authors
Barbara Kitchenham
Lech Madeyski
Abstract
Context: Software engineering (SE) experiments often have small sample sizes. This can result in data sets with non-normal characteristics, which poses problems because standard parametric meta-analysis, using the standardized mean difference (StdMD) effect size, assumes normally distributed sample data. Small sample sizes and non-normal data set characteristics can also lead to unreliable estimates of parametric effect sizes. Meta-analysis is even more complicated if experiments use complex experimental designs, such as the two-group and four-group cross-over designs that are popular in SE experiments.

Objective: Our objective was to develop a validated and robust meta-analysis method that helps to address the problems of small sample sizes and complex experimental designs without relying on data samples being normally distributed.

Method: To illustrate the challenges, we used real SE data sets. Building upon previous research, we developed a robust meta-analysis method able to deal with the challenges typical of SE experiments. We validated our method via simulations comparing StdMD with two robust alternatives: the probability of superiority (p̂) and Cliff's d.

Results: We confirmed that many SE data sets are small and that small experiments run the risk of exhibiting non-normal properties, which can cause problems for analysing families of experiments. For simulations of individual experiments and meta-analyses of families of experiments, p̂ and Cliff's d consistently outperformed StdMD in terms of negligible small sample bias. They also had better power for log-normal and Laplace samples, although lower power for normal and gamma samples. Tests based on p̂ always had better or equal power than tests based on Cliff's d, and across all but one simulation condition, p̂ Type 1 error rates were less biased.

Conclusions: Using p̂ is a low-risk option for analysing and meta-analysing data from small sample size SE randomized experiments. Parametric methods are preferable only if you have prior knowledge of the data distribution.
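For readers unfamiliar with the two robust effect sizes named in the abstract, the following is a minimal illustrative sketch, not the authors' implementation, of how the probability of superiority p̂ and Cliff's d can be estimated from two independent samples, together with the standard identity p̂ = (d + 1) / 2 that links them. The sample data are invented for illustration only.

```python
import itertools

def cliffs_d(x, y):
    # Cliff's d = P(X > Y) - P(X < Y), estimated over all len(x) * len(y) pairs.
    gt = sum(xi > yj for xi, yj in itertools.product(x, y))
    lt = sum(xi < yj for xi, yj in itertools.product(x, y))
    return (gt - lt) / (len(x) * len(y))

def p_hat(x, y):
    # Probability of superiority: P(X > Y) + 0.5 * P(X = Y),
    # i.e. the Mann-Whitney U statistic divided by len(x) * len(y).
    gt = sum(xi > yj for xi, yj in itertools.product(x, y))
    eq = sum(xi == yj for xi, yj in itertools.product(x, y))
    return (gt + 0.5 * eq) / (len(x) * len(y))

# Hypothetical outcome data for two independent groups.
treatment = [12.1, 14.3, 9.8, 15.0, 11.2]
control = [10.4, 9.9, 12.0, 8.7, 10.1]

d = cliffs_d(treatment, control)
p = p_hat(treatment, control)
print(d)  # 0.6
print(p)  # 0.8, which equals (d + 1) / 2
```

Because both statistics count only pairwise orderings rather than magnitudes, they make no distributional assumptions, which is why the paper finds them more robust than StdMD for small, possibly non-normal SE samples.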
Citation
Kitchenham, B., & Madeyski, L. (2024). Recommendations for analysing and meta-analysing small sample size software engineering experiments. Empirical Software Engineering, 29(6), Article 137. https://doi.org/10.1007/s10664-024-10504-1
| Field | Value |
| --- | --- |
| Journal Article Type | Article |
| Acceptance Date | May 24, 2024 |
| Online Publication Date | Aug 17, 2024 |
| Publication Date | 2024-11 |
| Deposit Date | Aug 23, 2024 |
| Publicly Available Date | Aug 27, 2024 |
| Journal | Empirical Software Engineering |
| Print ISSN | 1382-3256 |
| Electronic ISSN | 1573-7616 |
| Publisher | Springer |
| Peer Reviewed | Peer Reviewed |
| Volume | 29 |
| Issue | 6 |
| Article Number | 137 |
| DOI | https://doi.org/10.1007/s10664-024-10504-1 |
| Keywords | Meta-analysis · Effect size · Non-parametric · Probability of superiority · Small sample sizes · Reproducible research |
| Public URL | https://keele-repository.worktribe.com/output/887834 |
| Additional Information | Accepted: 24 May 2024; First Online: 17 August 2024. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. |
Files
Recommendations for analysing and meta-analysing small sample size software engineering experiments (1 MB)
Licence: https://creativecommons.org/licenses/by/4.0/
Publisher Licence URL: https://creativecommons.org/licenses/by/4.0/
Copyright Statement
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
You might also like
SEGRESS: Software Engineering Guidelines for REporting Secondary Studies (2022), Journal Article
Problems with Statistical Practice in Software Engineering Research (2019), Conference Proceeding
OECD Recommendation's draft concerning access to research data from public funding: A review (2021), Journal Article
The Importance of the Correlation in Crossover Experiments (2022), Journal Article