Yakun Zhang

Ph.D. candidate, Software Engineering, Institute of Software, School of Electronics Engineering and Computer Science (EECS), Peking University

Email: zhangyakun@stu.pku.edu.cn

I am currently a Ph.D. candidate at Peking University under the supervision of Prof. Lu Zhang. During my master's studies, I was advised by Prof. Dan Ye and Prof. Wensheng Dou at the Institute of Software, Chinese Academy of Sciences. My research interests include software engineering, Android testing, spreadsheet analysis, and machine/deep learning.


  • Ph.D. in Software Engineering
    • Advisor: Prof. Lu Zhang
    • Institute of Software, School of Electronics Engineering and Computer Science, Peking University (09/2021-Present)
  • M.S. in Software Engineering
    • Advisors: Prof. Dan Ye, Prof. Wensheng Dou
    • Institute of Software, Chinese Academy of Sciences (09/2018-07/2021)
  • B.S. in Computer Science
    • Wuhan University (09/2014-07/2018)
    • Rank: Top 1


  • [09/2021] Awarded the Chinese Academy of Sciences Presidential Scholarship.
  • [07/2021] Awarded Outstanding Graduate of Beijing City.
  • [06/2021] Awarded Outstanding Graduate of University of Chinese Academy of Sciences.
  • [04/2021] Our full paper entitled "Semantic Table Structure Identification in Spreadsheets" was accepted by ISSTA 2021.

Research Experience

  1. Semantic Table Structure Identification in Spreadsheets, 05/2020-05/2021
    • Research problem: Spreadsheet tables are usually organized in a semi-structured way and contain complicated semantic structures. Because these semantic table structures are not documented, existing data analysis and error detection tools can hardly understand spreadsheet tables.
    • Contribution: We propose Tasi (Table structure identification) to automatically identify semantic table structures in spreadsheets. We further propose TasiError to detect spreadsheet errors based on the semantic table structures identified by Tasi.
  2. Learning to Detect Table Clones in Spreadsheets, 09/2019-05/2020
    • Research problem: Table clones in spreadsheets are common. Precisely extracting table clones can benefit data reuse, fault detection and data mining. However, table clones are not recorded during the creation of spreadsheets.
    • Contribution: We propose LTC (Learning to Detect Table Clones) to automatically detect table clones with or without structure changes. LTC achieves high precision and recall in table clone detection, significantly outperforming the state-of-the-art technique.
  3. Spatial-temporal video quality assessment (Intern project), 11/2016-12/2017
    • Research problem: This project aims to perform video quality assessment, which benefits video coding and compression.
    • Contribution: We improve the JND image quality assessment algorithm in three aspects to obtain a new video quality assessment algorithm.
  4. Infrared face recognition system based on deep learning (Research competition), 11/2016-08/2017
    • Description: This project aims to improve face recognition when cameras capture unclear faces in low-light environments at night. It can be used for field security monitoring.
    • Approach: MTCNN is used for face detection, and the infrared recovery framework is built on a GAN. The face recognition system is adapted from Google's FaceNet framework.
  5. A City Roadside Parking System (Research competition), 11/2015-08/2016
    • Description: This project aims to address the difficulty the public has in finding roadside parking spaces.
    • Approach: We use ultrasonic sensors to determine whether a parking space is occupied, and we designed a WeChat service account that lets users check nearby parking spaces at any time and reserve specific ones.

Research Interest

  • Software Engineering
  • Machine/deep learning
  • Spreadsheet programming and analysis

Publication & Patent

  1. [ISSTA 2021] Semantic Table Structure Identification in Spreadsheets. [PDF] [Data]
    Yakun Zhang, Xiao Lv, Haoyu Dong, Wensheng Dou, Shi Han, Dongmei Zhang, Jun Wei, Dan Ye.
    30th ACM SIGSOFT International Symposium on Software Testing and Analysis.
  2. [ISSTA 2020] Learning to Detect Table Clones in Spreadsheets. [PDF]
    Yakun Zhang, Wensheng Dou, Jiaxin Zhu, Liang Xu, Zhiyong Zhou, Jun Wei, Dan Ye, Bo Yang.
    29th ACM SIGSOFT International Symposium on Software Testing and Analysis.
  3. A City Roadside Parking System Based on Ultrasonic Sensors.
    Yakun Zhang, Zhiyuan Deng, Yinjie Guo, Xiaowei Zhang.
    ZL201720326838.9, 03/2017.

Awards and Certificates

  1. Outstanding Graduate of Beijing City, 07/2021
  2. Outstanding Graduate of University of Chinese Academy of Sciences, 06/2021
    • Rank: 1/120
  3. National Scholarship, 12/2020
    • Top 0.2%
  4. Model of Merit Student of University of Chinese Academy of Sciences, 06/2020
  5. Merit Student of University of Chinese Academy of Sciences, 06/2019
  6. Big Data & Computational Intelligence Contest, 09/2018-11/2018
    • Rank: 12/2444
  7. Outstanding Graduate of Wuhan University, 06/2018
  8. National University Student Information Security Competition, 11/2016-08/2017
    • National Second Prize
  9. National University Student Internet of Things Competition, 10/2015-08/2016
    • National Second Prize

Internship Experience

  • Microsoft Research Asia (MSRA), 08/2019-03/2020
    • Job: Research Intern (Full-Time), Data, Knowledge, Intelligence (DKI) Group
    • Direction: Table structure identification in spreadsheets.

Student Service

  1. President of the Student Council at the Institute of Software, Chinese Academy of Sciences, 2019-2020
  2. President of the Youth Volunteer Association at Wuhan University, 2015-2016
  3. Vice President of the Student Council at Wuhan University, 2014-2015


  • In my spare time, I love playing the pipa or the Chinese zither. I also love dancing with my friends in the dance studio.

Last modified: 2021/10/17