Congratulations to OP on discovering three of the current frontiers of LLM research in software engineering:
1. "I uploaded a shared library's code together with the complete code that calls it": this corresponds to "Repository-Level Automatic Program Repair".
See, for example, this year's Y. Chen et al., "When Large Language Models Confront Repository-Level Automatic Program Repair: How Well They Done?," in 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), Lisbon, Portugal, 2024, pp. 459-471, doi: 10.1145/3639478.3647633.
2. "The shared library code has to be cleaned up thoroughly": this shows OP has noticed how strongly code comments affect, and can even mislead, an LLM's understanding. Several papers published in 2024 document this, e.g. this year's H. Yu et al., "CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pretrained Models," in 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE), Lisbon, Portugal, 2024, pp. 428-439, doi: 10.1145/3597503.3623322.
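3. "Or spell out exactly which functions are relevant, and only then will it look at them": this shows OP has noticed that chain-of-thought prompting and domain knowledge improve an LLM's comprehension.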
In addition, SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering may help mitigate the "it frantically optimizes functions that are never even used" behavior.
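
For point 3, here is a rough sketch of how one might "spell out which functions are relevant" automatically instead of by hand. It is only an illustration, not something taken from the papers above: it assumes both the library and the caller are Python, matches functions purely by name (so it misses library functions that are only called indirectly), and every function name in it (`defined_functions`, `called_names`, `used_library_functions`, `build_prompt`) is made up for this example.

```python
import ast


def defined_functions(source: str) -> set[str]:
    """Names of all functions defined anywhere in the given source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def called_names(source: str) -> set[str]:
    """Names that appear in call position, e.g. foo() or lib.foo()."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                names.add(func.id)
            elif isinstance(func, ast.Attribute):
                names.add(func.attr)
    return names


def used_library_functions(library_source: str, caller_source: str) -> list[str]:
    """Library functions the caller invokes directly (name-based heuristic)."""
    return sorted(defined_functions(library_source) & called_names(caller_source))


def build_prompt(library_source: str, caller_source: str, task: str) -> str:
    """Assemble a prompt that tells the model which functions matter."""
    used = used_library_functions(library_source, caller_source)
    listing = "\n".join(f"- {name}" for name in used)
    return (
        f"Task: {task}\n\n"
        "Only these library functions are called by my code; "
        "ignore everything else in the library:\n"
        f"{listing}\n\n"
        f"Library code:\n{library_source}\n"
        f"Calling code:\n{caller_source}\n"
    )
```

Calling `build_prompt(library_source, caller_source, "fix the bug in my calling code")` with the two files read in as strings gives a prompt that names only the functions in use, which is roughly what OP was doing manually.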