"Via our empirical studies, we find that the presence of potential bugs drastically degrades the code-completion performance of high-performing Code-LLMs, with test-case pass rates dropping to below 5% across both datasets for all tested model variants."
https://arxiv.org/abs/2306.03438