Publications

Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models

Published in ACL, 2026

Recommended citation

Junhao Liu, Haonan Yu, Zhenyu Yan, and Xin Zhang. 2026. Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026). Association for Computational Linguistics, San Diego, CA, USA. To appear. PDF

Guiding LLM-based Loop Invariant Synthesis via Feedback on Local Reasoning Errors

Published in TOPLAS; to be presented at OOPSLA, 2026

LLMs can produce loop invariants with correct high-level insights but invalid details due to small logical reasoning errors. LORIS builds a pipeline that first asks the LLM to generate reasoning steps and then formalizes and checks those steps with a verifier to identify the failing reasoning. By providing detailed feedback on the failure, LORIS guides the LLM toward valid loop invariants more effectively.

On a benchmark suite of 460 C programs, LORIS with GPT-4.1 solves 445 programs with a 93.1% success rate, outperforming the SOTA approaches LaM4Inv and Clause2Inv, both of which remain below 85% under the same model. Across five tested models, LORIS solves the most unique programs on all models and achieves higher success rates on four models. On an additional non-linear benchmark suite, LORIS solves 47 of 50 programs.
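The feedback loop described above can be sketched roughly as follows. This is a minimal illustration, not the LORIS implementation: the `propose`, `check_step`, and `check_invariant` callables, the `Step` type, and the feedback strings are all hypothetical stand-ins for the LLM and the verifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    claim: str  # one reasoning step, as stated by the model


# Hypothetical interfaces (assumptions, not the LORIS API):
#   propose(feedback)          -> (reasoning steps, candidate invariant)
#   check_step(step)           -> True if the formalized step passes the verifier
#   check_invariant(invariant) -> True if the invariant verifies the program
Propose = Callable[[Optional[str]], Tuple[List[Step], str]]


def synthesize(
    propose: Propose,
    check_step: Callable[[Step], bool],
    check_invariant: Callable[[str], bool],
    max_rounds: int = 5,
) -> Optional[str]:
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        steps, invariant = propose(feedback)
        # Locate the first reasoning step the verifier rejects, if any.
        failing = next((s for s in steps if not check_step(s)), None)
        if failing is None and check_invariant(invariant):
            return invariant
        # Report the local failure back to the model for the next round.
        if failing is not None:
            feedback = f"Step failed verification: {failing.claim}"
        else:
            feedback = "All steps verified, but the invariant was rejected."
    return None


# Toy stand-ins: the first proposal contains one bad step, the second is valid.
def toy_propose(feedback: Optional[str]) -> Tuple[List[Step], str]:
    if feedback is None:
        return [Step("i < n holds after the loop body")], "i < n"
    return [Step("i <= n holds after the loop body")], "i <= n"


valid_claims = {"i <= n holds after the loop body"}
result = synthesize(
    toy_propose,
    lambda s: s.claim in valid_claims,
    lambda inv: inv == "i <= n",
)
```

In this toy run the first round fails on its single reasoning step, and the feedback naming that step leads the stub to a proposal that passes both checks.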

Recommended citation

Tianchi Li, Zhenyu Yan, Junhao Liu, Peng Di, and Xin Zhang. 2026. Guiding LLM-Based Loop Invariant Synthesis via Feedback on Local Reasoning Errors. ACM Transactions on Programming Languages and Systems 48, 2, Article 8 (May 2026), 30 pages. PDF

Beer: Interactive Alarm Resolution in Bayesian Program Analysis via Exploration-Exploitation

Published in OOPSLA, 2026

Bayesian program analysis is a novel paradigm that systematically introduces probabilities into program analysis. It ranks alarms by probabilistic confidence and updates those confidences using user inspection feedback. However, existing interactive approaches greedily pick the highest-confidence alarm for inspection, which can get stuck in local optima, resulting in sub-optimal performance.

Our approach, Beer, frames interactive alarm resolution as an exploration-exploitation problem: when the analysis appears stuck, it explores by selecting alarms that approximately maximize information gain to break out of local optima more effectively. Experiments on datarace, thread-escape, and taint analyses show that Beer outperforms the greedy baseline, achieving up to 32% fewer false-alarm inspections, and also outperforms SOTA Bayesian analysis systems. This demonstrates that its exploration-exploitation strategy generalizes beyond a single analysis.
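As a rough illustration of the selection rule described above, the sketch below switches between exploiting (highest confidence) and exploring (most informative alarm). It makes two simplifying assumptions that are not taken from the paper: the binary entropy of an alarm's own confidence stands in for information gain, and "stuck" means a fixed number of consecutive false alarms. The names `pick_alarm`, `is_stuck`, and `patience` are hypothetical.

```python
import math
from typing import Dict, List


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def is_stuck(history: List[bool], patience: int = 3) -> bool:
    """Assumed heuristic: the last `patience` inspections were all false alarms."""
    return len(history) >= patience and not any(history[-patience:])


def pick_alarm(confidence: Dict[str, float], stuck: bool) -> str:
    """Choose the next alarm to show the user."""
    if stuck:
        # Explore: pick the most uncertain alarm (entropy as a crude proxy
        # for information gain; a real system would use the Bayesian model).
        return max(confidence, key=lambda a: binary_entropy(confidence[a]))
    # Exploit: the greedy choice, highest confidence first.
    return max(confidence, key=lambda a: confidence[a])


confidence = {"a1": 0.9, "a2": 0.55, "a3": 0.2}
greedy_choice = pick_alarm(confidence, stuck=False)   # "a1"
explore_choice = pick_alarm(confidence, stuck=True)   # "a2"
```

After each inspection, a full system would also update the remaining confidences from the user's answer; that step is omitted here.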

Recommended citation

Haoran Lin, Zhenyu Yan, and Xin Zhang. 2026. Beer: Interactive Alarm Resolution in Bayesian Program Analysis via Exploration-Exploitation. Proceedings of the ACM on Programming Languages 10, OOPSLA1 (April 2026), 400-426. PDF

Scaling Abstraction Refinement for Program Analyses in Datalog using Graph Neural Networks

Published in OOPSLA, 2024

CEGAR-based abstraction refinement often relies on constraint solving, which can struggle to scale to large Datalog analyses. This work uses graph neural networks to prune unhelpful abstraction parameters from Datalog derivation graphs before MaxSAT-based refinement. Pruning yields smaller constraint problems, which speeds up constraint solving, and more effective refinements, which reduces the number of refinement iterations.

Our approach is general and does not require heavy domain knowledge for different analyses. Experiments on pointer and typestate analyses show that this approach answers 2.83x and 1.5x as many queries, respectively, as the baseline on large programs. It also runs faster on programs where both approaches terminate, and its timeout frequency is about 30% of the baseline’s.

Recommended citation

Zhenyu Yan, Xin Zhang, and Peng Di. 2024. Scaling Abstraction Refinement for Program Analyses in Datalog using Graph Neural Networks. Proc. ACM Program. Lang. 8, OOPSLA2, Article 325 (October 2024), 29 pages. PDF

鄢振宇

Zhenyu Yan
