Research

My research is mainly in programming languages, with a focus on strengthening the theoretical foundations of program analysis and on integrating machine learning or large language models into traditional analysis and verification techniques.

Finer-Grained Context Sensitivity

Selective context sensitivity improves the trade-off between precision and efficiency by analyzing important parts of a program context-sensitively while analyzing other parts context-insensitively. Most existing choices are made at method-level or variable-level granularities, which are natural boundaries for context sensitivity but can be too coarse for finer precision choices.

My current work studies statement-level selective context sensitivity: a framework in which an analysis can choose which statements should be analyzed context-sensitively. The core challenge is that mixing context sensitivity and context insensitivity inside a method breaks the natural boundaries of context sensitivity, and therefore requires variables to carry facts produced under context sensitivity and context insensitivity. To efficiently resolve this issue, we propose a unified expression of both context-sensitive and context-insensitive facts, allowing them to propagate together without breaking soundness guarantees.

Learning-Based Analysis and Verification

Many practical program analyses rely on complex rules or heavily hand-designed heuristics, which require substantial domain knowledge and can limit generality. Other approaches use machine learning or LLMs in ad hoc ways, without systematically addressing interpretability or formal guarantees. I am interested in using learning-based methods systematically, so that they reduce human effort while preserving interpretable analysis structure and formal guarantees where possible.

My OOPSLA 2024 work is one example of this direction. Datalog-based analyses naturally produce derivation graphs, which can serve as structured inputs to neural models. In that work, we use derivation graphs as inputs to graph neural networks, instead of using domain knowledge to manually design input formats. To preserve soundness, our graph neural network is used only to guide abstraction refinement by pruning unhelpful abstraction parameters before constraint solving, while the refinement step itself remains grounded in a Datalog-based analysis framework. Experiments show that our approach resolves 2.83x and 1.5x as many queries as the baseline on pointer and typestate analyses, respectively, and our approach runs faster than the baseline. We also apply XAI techniques to extract reusable rules from the learned model and evaluate how well those rules explain its pruning decisions on other samples, providing global post-hoc interpretability for the learning-guided component.

I also participated in a TOPLAS 2026 work that uses verification techniques to identify errors LLMs make when generating loop invariants. By using detailed verifier feedback, our approach guides LLMs toward valid invariants more effectively.

Explainable Artificial Intelligence (XAI)

I also contribute to collaborative XAI and LLM-interpretability projects in our group, mainly by bringing PL techniques, algorithms, and PL-style formal reasoning to theoretical and algorithmic subproblems in those projects.

鄢振宇

Zhenyu Yan

Research

Finer-Grained Context Sensitivity

Learning-Based Analysis and Verification

Explainable Artificial Intelligence (XAI)