Harnessing Large Language Models for Automated Software Testing: A Leap Towards Scalable Test Case Generation
Authors
Rehan, Shaheer; Al-Bander, Baidaa; Al-Said Ahmad, Amro
Abstract
Software testing is critical for ensuring software reliability, yet test case generation is often resource-intensive and time-consuming. This study leverages the Llama-2 large language model (LLM) to automate unit test generation for Java focal methods, demonstrating the potential of AI-driven approaches to optimize software testing workflows. Our approach uses focal methods to prioritize critical components of the code, producing more context-sensitive and scalable test cases. The dataset, comprising 25,000 curated records, underwent tokenization and QLoRA quantization to facilitate training. The model was fine-tuned, achieving a training loss of 0.046. These results demonstrate the promise and feasibility of fine-tuned LLMs for test case generation, and highlight opportunities for improvement through larger datasets, advanced hyperparameter optimization, and enhanced computational resources. We conducted a human-in-the-loop validation on a subset of unit tests generated by our fine-tuned LLM, confirming that these tests effectively leverage focal methods and demonstrating the model's capability to generate more contextually accurate unit tests. The work highlights the need for novel objective validation metrics tailored specifically to test cases generated by large language models. This work establishes a foundation for scalable and efficient software testing solutions driven by artificial intelligence. The data and code are publicly available on GitHub.
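For readers who want a concrete picture of the training setup the abstract describes, the sketch below shows how QLoRA fine-tuning of Llama-2 is typically wired up with the Hugging Face transformers, peft, and bitsandbytes libraries. This is a minimal illustration under stated assumptions, not the authors' code: the base checkpoint name, the focal-method/unit-test prompt template, the LoRA hyperparameters, and the example record are all assumptions; the actual dataset and configuration are in the paper's GitHub repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint (gated on the Hub)

# 4-bit NF4 quantization with double quantization -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 ships without a pad token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; hyperparameters are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Each training record pairs a Java focal method with its unit test;
# this prompt template and example are hypothetical.
prompt = "### Focal method:\n{focal}\n\n### Unit test:\n{test}"
example = {
    "focal": "public int add(int a, int b) { return a + b; }",
    "test": "@Test public void testAdd() { assertEquals(5, calc.add(2, 3)); }",
}
inputs = tokenizer(
    prompt.format(**example), return_tensors="pt", truncation=True, max_length=512
)
```

Records tokenized this way would then be fed to a standard causal-language-modeling training loop; only the small LoRA adapter weights are updated, which is what makes fine-tuning a 7B-parameter model tractable on modest hardware.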
Citation
Rehan, S., Al-Bander, B., & Al-Said Ahmad, A. (2025). Harnessing Large Language Models for Automated Software Testing: A Leap Towards Scalable Test Case Generation. Electronics, 14(7), 1-25. https://doi.org/10.3390/electronics14071463
| Journal Article Type | Article |
| --- | --- |
| Acceptance Date | Apr 2, 2025 |
| Online Publication Date | Apr 4, 2025 |
| Publication Date | Apr 4, 2025 |
| Deposit Date | Apr 4, 2025 |
| Publicly Available Date | Apr 9, 2025 |
| Journal | Electronics |
| Electronic ISSN | 2079-9292 |
| Publisher | MDPI |
| Peer Reviewed | Peer Reviewed |
| Volume | 14 |
| Issue | 7 |
| Article Number | 1463 |
| Pages | 1-25 |
| DOI | https://doi.org/10.3390/electronics14071463 |
| Keywords | LLM; focal methods; unit testing; test case generation; Llama-2; software testing; QLoRA |
| Public URL | https://keele-repository.worktribe.com/output/1192540 |
| Publisher URL | https://www.mdpi.com/2079-9292/14/7/1463 |
Files
Harnessing Large Language Models for Automated Software Testing: A Leap Towards Scalable Test Case Generation (PDF, 855 KB)
Licence
https://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
The final version of this accepted manuscript and all relevant information related to it, including copyrights, can be found on the publisher's website.
You might also like
A Comparison of Node Detection Algorithms Over Wireless Sensor Network
(2022)
Journal Article
Attention Mechanism Guided Deep Regression Model for Acne Severity Grading
(2022)
Journal Article
Deep Learning Models for Automatic Makeup Detection
(2021)
Journal Article