Zhenyu Yan’s Homepage

Research Interest

I am interested in applying data-driven methods (machine learning or probabilistic methods) to program analysis. Many recent program analyses rely heavily on heuristics, which limits their interpretability. I hope to combine data-driven methods with domain knowledge to develop more general and more interpretable program analyses.

Some big-picture ideas:

  • As mentioned above, some program analyses use heuristics that require extensive domain knowledge and heavy feature engineering. Replacing those heuristics with data-driven methods may provide better interpretability and even better performance while maintaining soundness. For example, heavily handcrafted feature sets (such as signatures, return types, or specific statements) could be replaced with ASTs, CFGs, or even the derivations of logic programming languages.
  • Some data-driven methods may struggle with soundness, but they may produce far fewer false positives than traditional methods. We may combine different unsound methods to create a sound analysis, or combine sound analyses with unsound but more precise analyses.

Education

  • Ph.D. in Computer Software, Peking University 2021.9 ~ 2026 (expected)
  • B.S. in Computer Software and Technology, Nanjing University 2017.9 ~ 2021

For more informal information about me, please see About Me.