Introduction

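Our research focuses on distributed computing, big data, and machine learning systems. Current work targets engine systems and algorithms for emerging distributed computing needs, including large-scale message processing, distributed in-memory computing, streaming machine learning, and distributed graph data mining and analysis. The group has published more than 80 papers in leading international and Chinese journals and conferences, including VLDB, SIGMOD, ICDE, SC, ICSE, ASE, ISSTA, ICDCS, Middleware, ISSRE, ICAC, CIKM, Cluster, ICWS, JSS, Computing, Science China (中国科学), Journal of Software (软件学报), and Bulletin of the Chinese Academy of Sciences (中国科学院院刊), and has filed more than 30 invention patent applications (1 international, 28 granted, 8 transferred). The work has been funded by a National Key R&D Program project, the National Natural Science Foundation of China, the National 863 High-Tech Program, the CAS Strategic Priority Research Program (Category A), the MIIT Electronic Development Fund, the MIIT special software program, the Huawei Innovation Research Program (HIRP), and Alibaba Innovative Research (AIR). We collaborate with leading IT companies such as Huawei, Alibaba, Baidu, JD, Kingsoft, Lenovo, China National Software (中国软件), Inspur Software, Shenzhou Software (神舟软件), Aisino (航天信息), Yonyou, and China Southern Power Grid, as well as leading Chinese vendors of foundational software such as operating systems, databases, and middleware. A number of key technologies have been deployed at scale in important sectors including government, finance, transportation, electric power, and the internet.
  • We recruit PhD students, master's students, interns, and visiting students interested in research on distributed systems, big data, and machine learning systems (open year-round). Applicants should have a strong interest in research and strong hands-on skills.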

Members

Advisors

PhD Students

    • 汪钇丞, 秦政, 刘力玮, 汤磊, 梁堉

Master's Students

    • 刘祥隆, 王雨炫, 王毅, 于昊南, 王音棋, 刘薛浒, 周亦轩

Interns

    • 周富杰, 于然, 程经纬, 张宇涵, 马淞康, 吴阳阳

Research Projects

Distributed Systems Research

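    • Database system performance optimization for persistent memory
    • Relational databases are widely used to support critical scenarios such as high-frequency trading. Persistent memory, a new storage medium, combines memory-like byte addressability with disk-like non-volatility and can effectively improve system performance. We improved core protocols including multi-version concurrency control, failure recovery, and distributed consensus, and proposed improved storage structures and a storage system that stage the intermediate data produced during transaction execution. This avoids repeated persistence and costly failure recovery operations, and effectively improves database performance under highly concurrent transactional workloads.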

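    • EasyTuning: Parameter tuning for distributed database systems
    • Database parameter tuning is an important research direction in applying machine learning to database systems, yet distributed databases, as a new form of database, still lack mature tuning work. We propose an automated parameter tuning tool for distributed databases that supports tuning at two granularities: workload level and query level. By analyzing the characteristics of distributed databases, the tool selects suitable tuning parameters and uses machine learning algorithms to recommend better configurations. This effectively improves database performance in both OLTP and OLAP scenarios, raising transaction throughput and lowering query latency.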

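    • Logic bug detection for distributed graph database management systems
    • For query generation, we propose a model-driven random Gremlin query generation method and a random parameter generation strategy. For bug detection, we propose a differential testing method across multiple graph database systems and a mapping-table-based method for cross-validating results. We designed and implemented Grand, a differential testing framework for graph database systems. In an experimental evaluation on six widely used open-source graph database systems (Neo4j, OrientDB, JanusGraph, HugeGraph, TinkerGraph, and ArcadeDB), Grand effectively detected logic bugs. The results were published at ISSTA 2022.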
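The differential testing idea can be illustrated with a minimal sketch. It is not Grand's implementation: the backend callables, the `id_maps` tables, and the function name `differential_test` are hypothetical stand-ins. Each backend runs the same query, backend-specific element IDs are translated to shared IDs through a mapping table, and backends whose normalized results disagree are reported.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, List


def differential_test(
    query: str,
    backends: Dict[str, Callable[[str], List[Hashable]]],
    id_maps: Dict[str, Dict[Hashable, Hashable]],
) -> List[List[str]]:
    """Group backends by the normalized result they return for one query.

    More than one group means the backends disagree, which is a candidate
    logic bug in at least one of them.
    """
    groups: Dict[object, List[str]] = {}
    for name, run in backends.items():
        try:
            mapping = id_maps.get(name, {})
            # Translate backend-local IDs to shared IDs and ignore row order.
            rows = Counter(mapping.get(r, r) for r in run(query))
            key = ("ok", frozenset(rows.items()))
        except Exception as exc:  # a crash is also a behavioral difference
            key = ("error", type(exc).__name__)
        groups.setdefault(key, []).append(name)
    return list(groups.values())


# Toy backends (hypothetical): each one assigns its own vertex IDs.
backends = {
    "db_a": lambda q: ["a1", "a2"],
    "db_b": lambda q: ["b2", "b1"],
    "db_c": lambda q: ["c1"],  # misses one vertex
}
id_maps = {
    "db_a": {"a1": 1, "a2": 2},
    "db_b": {"b1": 1, "b2": 2},
    "db_c": {"c1": 1, "c2": 2},
}
report = differential_test("g.V().hasLabel('person')", backends, id_maps)
```

Here `db_a` and `db_b` land in one group and `db_c` in another, so the query would be flagged for manual inspection. Differential testing only shows that the systems disagree; deciding which one is wrong still needs a closer look.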

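    • Digital-twin-based IoT fault detection and diagnosis
    • For digital modeling of smart devices, we propose a digital behavior model of smart home devices based on hierarchical state automata. For locating anomalies during device operation and maintenance, we propose a combined method that pairs sudden-anomaly detection based on out-of-range pattern alarms with cumulative-anomaly detection based on comparing change trends. For the open-source smart home system Home Assistant, we designed and implemented a test-driven automated modeling tool and an anomaly localization system. In an experimental environment built with real smart home devices, we evaluated device behavior modeling and anomaly localization (under manually designed anomaly scenarios). The tools construct device behavior models efficiently and conveniently, and locate multiple kinds of anomalies effectively and accurately.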

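    • Message distribution management for extreme-scale system instances
    • Cloud service providers typically maintain container instances numbering in the millions or even tens of millions. Managing these instances and providing efficient, reliable, and flexible message distribution is a key problem that urgently needs solving. We propose a message distribution protocol for extreme-scale scenarios and decompose it into two parts: the underlying topology and consistency guarantees. For the topology, we propose a flexible, spanning-tree-based multi-round push topology that reduces per-node load and improves push performance under high load. For consistency, a fault-tolerance mechanism that bypasses unreachable nodes works together with an ACK mechanism to guarantee eventual delivery and to keep multiple partitions available during a network partition. We re-implemented and experimentally deployed the protocol in a million-instance core system at Alibaba Cloud, improving baseline performance by more than one order of magnitude. The key techniques have been integrated into NACOS, Alibaba Cloud's open-source middleware.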
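A toy simulation can make the tree-based push with bypass more concrete. It is only a sketch under assumptions not stated in the source: nodes are numbered 0..n-1 and arranged in an implicit k-ary tree, the root is reachable, and an unreachable node is detected by a missing ACK. The names `children` and `push` are hypothetical.

```python
from typing import List, Set, Tuple


def children(node: int, n: int, fanout: int) -> List[int]:
    """Children of `node` in an implicit k-ary tree over nodes 0..n-1."""
    first = node * fanout + 1
    return list(range(first, min(first + fanout, n)))


def push(n: int, fanout: int, unreachable: Set[int]) -> Tuple[Set[int], int]:
    """Simulate a multi-round push from root 0; return (acked nodes, rounds)."""
    acked = {0}
    frontier = [0]
    rounds = 0
    while frontier:
        next_frontier: List[int] = []
        for parent in frontier:
            pending = children(parent, n, fanout)
            while pending:
                node = pending.pop()
                if node in unreachable:
                    # No ACK: bypass the node and push to its children directly.
                    pending.extend(children(node, n, fanout))
                else:
                    acked.add(node)             # ACK received
                    next_frontier.append(node)  # node forwards in the next round
        if next_frontier:
            rounds += 1
        frontier = next_frontier
    return acked, rounds


acked, rounds = push(n=7, fanout=2, unreachable={1})
```

In this run node 1 is unreachable, so the root pushes to nodes 3 and 4 itself, and every reachable node still receives the message within two rounds. The fanout bounds how many peers each node contacts in the common case, which is the load-reduction effect the tree topology aims for.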

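    • EasyTuning: SSD hybrid storage management
    • EasyTuning optimizes cloud application performance in hybrid SSD+HDD storage environments. Currently, for the two mainstream classes of elastic cloud applications in virtualized environments, transactional (TPC-W) and analytical (Hadoop), EasyTuning optimizes at three orthogonal layers of the I/O software stack: the storage media layer, the hypervisor layer, and the guest OS layer. EasyTuning supports (1) at the storage media layer, adjusting the share of SSD cache used by each virtual machine; (2) at the hypervisor layer, optimized placement and dynamic migration of virtual machines to eliminate performance bottlenecks; and (3) at the guest OS layer, configuring suitable caching policies and parameters.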

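    • EasyCache: In-memory data grid system
    • EasyCache is a low-latency, scalable distributed in-memory data management system. It uses distributed in-memory computing to achieve millisecond-level end-to-end response latency, supports massive data processing and access loads, and suits transaction processing, real-time processing, and stream processing applications in scenarios such as smart cities, intelligent equipment, intelligent manufacturing, and extreme transaction processing. EasyCache supports (1) highly scalable real-time and continuous queries over streaming data, with data access pass-through that integrates memory and external storage; (2) customizable, extensible data processing pipelines, with high-performance data aggregation and high-throughput persistence; and (3) a JDBC-compatible cache access interface, with automated cache loading and updating and automatic change data capture (CDC) from the persistent database.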

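    • EasyTesting: Load testing cloud service system
    • EasyTesting provides self-service performance management that helps developers test application performance in real time, estimate system resource requirements, and diagnose bottlenecks, ensuring service quality after an application goes live. EasyTesting supports (1) load testing as a cloud service: record or customize load test scripts in an interactive front end and use cloud resources for large-scale load generation; (2) dynamic, adaptive instrumentation of monitoring probes: program analysis simplifies probe installation and configuration and automatically enables fine-grained, code-level execution tracing; (3) visual tracing of distributed applications: real-time testing and ad hoc presentation, with a visual interface that helps operations staff detect and locate problems; and (4) data-driven test process optimization: feedback-based test case optimization and dynamic load adjustment, enabling test-driven detection and localization of application performance anomalies.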

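    • EasyAPM: Distributed application performance management system
    • EasyAPM is a tool that monitors software performance and traces business processing flows while an application runs. It uses dynamic instrumentation to collect execution traces and key performance metrics from running Java applications and visualizes the collected data, giving users insight into performance metrics and resource consumption and monitoring runtime exceptions. With EasyAPM, users can obtain runtime resource consumption, transaction execution times, performance ratings, fault monitoring, error tracing, JVM resource consumption, system environment information, and more. EasyAPM is mainly used for application performance monitoring, error detection, and fault localization.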

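    • TRE4J: Multi-tenant performance management system
    • TRE4J is a performance management tool for multi-tenant systems. Its resource estimation method for multi-tenant systems became one of the three representative classes of methods in LibReDE, a system resource estimation library (LibReDE was jointly established by Ing. Samuel Kounev, chair of the SPEC Research Group; Xiaoyun Zhun, a principal engineer at VMware; Giuliano Casale of Imperial College London; and others).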

Big Data Systems and Algorithms Research

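    • Trilink: High-throughput distributed analytics engine at the hundred-million-record scale
    • To address problems of the Flink framework under heavy traffic, including load imbalance across nodes, unnecessary out-of-order data, cross-node data exchange, and inefficient connections between nodes, we extended and optimized the Flink stream processing engine to quickly identify and remove horizontal scaling bottlenecks. Results on the Yahoo Streaming Benchmark and typical industry scenarios show throughput improvements of 10x to 100x, and the engine can process and analyze over one hundred million records per second on a large cluster of servers built on domestic processors.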

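    • BridgeGC: A garbage collector friendly to big data frameworks
    • BridgeGC adopts a cross-level co-design that significantly reduces the GC overhead caused by long-lived data objects. At the framework level, BridgeGC provides framework developers with two annotations that mark the creation and release of data objects. Based on these annotations, BridgeGC tracks the life cycles of data objects and optimizes their allocation and reclamation at the GC level. At the GC level, BridgeGC uses a label-based allocator that stores data objects separately from other objects and balances the memory partitioning between them, reducing the number of GC cycles. BridgeGC further provides an efficient collector that eliminates unnecessary marking and copying of data objects during GC cycles. BridgeGC is currently integrated into OpenJDK ZGC. An evaluation with two popular big data frameworks (Flink and Spark) shows that BridgeGC reduces GC time by 23%-82% compared with ZGC.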

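    • GraphLib: Distributed graph mining algorithm library
    • By analyzing the computation patterns, data dependencies, and computation order of mainstream graph mining algorithms, we designed and implemented GraphLib, a large-scale parallel graph mining algorithm library based on the BSP model. It can be used in domains such as product recommendation, anti-fraud, and social network analysis. GraphLib contains 17 parallelized typical graph mining algorithms implemented on Flink Gelly and Spark GraphX. At the time, it was the most comprehensive large-scale graph mining algorithm library in industry and academia. It offers good performance and scalability: experiments on graphs with millions of vertices show execution times on the order of minutes and near-linear speedup.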

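    • GraphFlow: Streaming graph computation engine and algorithms
    • To meet the need for mining dynamic graph data, we designed and developed GraphFlow, a streaming graph computation engine based on state updates. The system comprises: (1) a streaming graph computation model based on state update propagation, which, when processing dynamic graph data, concurrently computes the effect of incremental changes on top of the existing graph state instead of recomputing over the whole graph; (2) a computation engine based on fine-grained distributed locks, which uses them to update state concurrently while keeping results correct; and (3) typical streaming graph algorithms, including triangle counting, vertex degree counting, shortest paths, and PageRank. Compared with traditional batch graph computation systems, GraphFlow computes and returns results in real time with high accuracy.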
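The incremental idea behind state-update-based streaming graph computation can be sketched for two of the listed algorithms, triangle counting and degree counting. This is a single-process illustration, not GraphFlow's engine: the class `StreamingGraphState` is hypothetical, and the distributed locking that the real system uses for concurrent updates is left out.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set


class StreamingGraphState:
    """Keeps graph state that is updated per edge instead of recomputed."""

    def __init__(self) -> None:
        self.adj: Dict[Hashable, Set[Hashable]] = defaultdict(set)
        self.triangles = 0

    def add_edge(self, u: Hashable, v: Hashable) -> None:
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        # A new edge (u, v) closes one triangle per common neighbor,
        # so only the neighborhoods of u and v need to be inspected.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def degree(self, u: Hashable) -> int:
        return len(self.adj.get(u, ()))


state = StreamingGraphState()
for edge in [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4)]:
    state.add_edge(*edge)
```

After the five edges arrive, `state.triangles` is 2 (triangles 1-2-3 and 1-3-4) and `state.degree(1)` is 3. Each update touches only the two endpoints' adjacency sets, which is why results can be returned as the stream arrives.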

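    • DistStream: Parallel computing framework and algorithms for streaming machine learning
    • Improving the training efficiency of streaming machine learning algorithms to meet the real-time analysis needs of today's high-speed data streams is an important research problem. To address it, we propose DistStream, a distributed computing framework for stream clustering algorithms. DistStream parallelizes stream clustering algorithms while preserving their correctness after parallelization. It raises throughput by introducing a micro-batch incremental update mode and a parallelization method based on multi-dimensional partitioning, and it proposes an order-aware incremental update mechanism to keep the micro-batch incremental update model correct. Four stream clustering algorithms (CluStream, DenStream, D-Stream, and ClusTree) have been implemented on DistStream so far. The results were published at IEEE ICDCS 2020.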
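A rough sketch of a micro-batch update with order-aware application follows. It is an illustration under simplifying assumptions, not DistStream's algorithm: records are one-dimensional, the exponential decay formula is a generic choice, and `MicroCluster` and `apply_batch` are hypothetical names. Records are first assigned against a snapshot of the model (the step that could run in parallel), then the updates are applied in timestamp order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MicroCluster:
    center: float
    weight: float
    last_update: float


def apply_batch(
    clusters: List[MicroCluster],
    batch: List[Tuple[float, float]],  # (timestamp, value) records
    radius: float,
    decay: float,
) -> List[MicroCluster]:
    # Step 1 (parallelizable): assign each record using a model snapshot.
    snapshot = [c.center for c in clusters]
    assigned: List[Tuple[float, float, Optional[int]]] = []
    for ts, x in batch:
        best: Optional[int] = None
        if snapshot:
            i = min(range(len(snapshot)), key=lambda j: abs(snapshot[j] - x))
            if abs(snapshot[i] - x) <= radius:
                best = i
        assigned.append((ts, x, best))

    # Step 2: apply updates in timestamp order so that time decay is
    # computed as it would be in a sequential, one-record-at-a-time run.
    for ts, x, best in sorted(assigned, key=lambda a: a[0]):
        if best is None:
            clusters.append(MicroCluster(center=x, weight=1.0, last_update=ts))
            continue
        c = clusters[best]
        w = c.weight * 2 ** (-decay * (ts - c.last_update))  # fade old weight
        c.center = (c.center * w + x) / (w + 1.0)
        c.weight = w + 1.0
        c.last_update = ts
    return clusters


model = [MicroCluster(center=0.0, weight=1.0, last_update=0.0)]
model = apply_batch(model, [(2.0, 10.0), (1.0, 0.5)], radius=1.0, decay=0.0)
```

The batch arrives out of order, but the record with timestamp 1.0 is applied first: it is absorbed by the existing micro-cluster, while the far-away record at timestamp 2.0 starts a new one.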

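    • Spark/Flink Benchmark: Benchmarking big data processing frameworks
    • In collaboration with the Huawei Hangzhou Research Institute, we designed and implemented a performance benchmarking framework for Apache Flink. The framework contains 8 typical stream processing applications covering five scenarios (real-time analysis of e-commerce shopping, stock trading, vehicle speed, and agricultural temperature control, plus analysis of online ad clicks) as well as the basic stream processing operators. It has been used for test analysis of the real-time stream computing service on the Huawei Cloud platform. On the reliability of stream processing systems, we studied a distributed-cache-based failure recovery method that stores asynchronous snapshots and state information in a distributed cache, achieving fast and precise recovery of intermediate data and state. This work was published at Internetware 2017.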

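    • Reliability assurance for big data systems: out-of-memory error analysis and diagnosis, and garbage collection research
    • We have produced multiple results on the reliability of big data processing frameworks. On analyzing out-of-memory errors in MapReduce applications, we studied 123 real-world out-of-memory errors in Hadoop and Spark applications and summarized their causes and fixes; the results were published at ISSRE 2015. On diagnosing MapReduce out-of-memory errors, we built Mprof, a diagnosis tool for Hadoop MapReduce; the results were published in The Journal of Systems and Software in 2018. On Spark application reliability, we designed and implemented a reliability testing framework for Spark; the work was published at the ICDCS 2017 Joint Cloud Computing (JCC) workshop. On garbage collection for big data systems, we experimentally analyzed the performance and reliability of current JVM garbage collection algorithms under typical big data applications, systematically identified 5 typical memory usage patterns of big data applications, found that 3 of them cause severe performance problems or even out-of-memory errors in GC algorithms, and proposed adaptive garbage collection optimizations; this work was published at VLDB 2019.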

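Collaborations

    • We have established strong project collaborations with major Chinese IT companies including Huawei, Alibaba, Baidu, JD, Tencent, and Kingsoft Cloud, and we recommend outstanding graduates for internships and jobs at these companies.
    • We collaborate closely with researchers abroad and regularly invite scholars such as Prof. Feng Qin of OSU and Prof. Tian Guo of WPI to visit.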

Publications

Books

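    • Lijie Xu et al., 《大数据处理框架Apache Spark设计与实现》 (Design and Implementation of the Big Data Processing Framework Apache Spark), Publishing House of Electronics Industry (电子工业出版社), 2020-07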

Journals

    • Chen Xu, Xiaoping Du, Xiangtao Fan, Gregory Giuliani, Zhongyang Hu, Wei Wang, Jie Liu, Teng Wang, Zhenzhen Yan, Junjie Zhu, Tianyang Jiang & Huadong Guo. Cloud-based storage and computing for remote sensing big data: a technical review. International Journal of Digital Earth, 15:1, 1417-1445
    • Hui Li, Shuping Ji, Hua Zhong, Wei Wang, Lijie Xu, Zhen Tang, Jun Wei, Tao Huang. LPW: An Efficient Data-Aware Cache Replacement Strategy for Apache Spark. Sci China Inf Sci, 2023, 66(1): 112104.
    • Zhen Tang, Wei Wang, Lei Sun, et al. IO Dependent SSD Cache Allocation for Elastic Hadoop Applications. Sci. China Inf. Sci. (2018) 61: 050104.
    • Lijie Xu, Wensheng Dou, Feng Zhu, Chushu Gao, Jie Liu, and Jun Wei. Characterizing and Diagnosing Out of Memory Errors in MapReduce Applications. The Journal of Systems and Software (JSS), Volume 137, March 2018, Pages 399-414.
    • Xiulei Qin, Wei Wang, Wenbo Zhang, Jun Wei, Xin Zhao, Hua Zhong, Tao Huang. PRESC2: efficient self-reconfiguration of cache strategies for elastic caching platforms. Computing 96(5): 415-451 (2014).
    • 汪钇丞, 曾鸿斌, 许利杰, 王伟, 魏峻, 黄涛. 面向大数据处理框架的JVM优化技术综述. 软件学报, 2023,34(1):463-488
    • 唐震, 王伟, 黄宇, 李艳林, 纪树平, 宋傲, 魏峻, 黄涛. 面向大规模集群的柔性配置更新推送方法. 中国科学: 信息科学, 2020, 50: 1645-1664
    • 钟华, 刘杰, 王伟. 科学大数据智能分析软件的现状与趋势. 中国科学院院刊,2018,33(8):812-817.
    • 唐震, 吴恒, 王伟, 魏峻, 黄涛. 虚拟化环境下面向多目标优化的自适应SSD缓存系统. 软件学报,2017,28(8):1982-1998.
    • 王彦士, 王伟, 刘朝辉, 魏峻, 黄涛. 支持透明集成的数据缓存机制. 计算机研究与发展, 2015, 52(4):907-917.
    • 王伟, 黄涛, 魏峻, 钟华, 宋云奎. 面向多租户Web应用的性能隔离方法. 中国科学:信息科学, 2013, 43(1):45-59.
    • 秦秀磊, 张文博, 王伟, 魏峻, 赵鑫, 黄涛. 面向云端Key/Value存储系统的开销敏感的数据迁移方法. 软件学报, 2013(6):1403-1417.
    • 秦秀磊, 张文博, 魏峻, 王伟, 黄涛. 云计算环境下分布式缓存技术的现状与挑战. 软件学报, 2013, 24(1):50-66.

Conferences

    • Yicheng Wang, Wensheng Dou, Yu Liang, Yi Wang, Wei Wang, Jun Wei, Tao Huang. Evaluating Garbage Collection Performance Across Managed Language Runtimes, 47th IEEE/ACM International Conference on Software Engineering (ICSE 2025)
    • Jiansen Song, Wensheng Dou, Yu Gao, Ziyu Cui, Yingying Zheng, Dong Wang, Wei Wang, Jun Wei, Tao Huang. Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database Construction, 50th International Conference on Very Large Data Bases (VLDB 2024)
    • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, and Ce Zhang. Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems, 50th International Conference on Very Large Data Bases (VLDB 2024)
    • Yingying Zheng, Wensheng Dou, Lei Tang, Ziyu Cui, Yu Gao, Jiansen Song, Liang Xu, Jiaxin Zhu, Wei Wang, Jun Wei, Hua Zhong, Tao Huang. Testing Gremlin-Based Graph Database Systems via Query Disassembling. 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)
    • Shuping Ji, Zhen Tang, Wei Wang, Hui Li, Jianguo Yao, Hans-Arno Jacobsen. Ripple: Large-scale Service and Configuration Management in the Cloud. 25th International Middleware Conference (Middleware 2024)
    • Xiaochen Sun, Wei Wang, Tao Huang. How to Fit the SCC Algorithm Efficiently into Distributed Graph Iterative Computation. COMPSAC 2024
    • Xiaochen Sun, Wei Wang*, Tao Huang. Efficient Multi-network Community Search Method for Distributed Graph Iterative Computation. COMPSAC 2024
    • GraphFlow: A Fast and Accurate Distributed Streaming Graph Computation Model. ICPADS 2024
    • Yingying Zheng, Wensheng Dou, Lei Tang, Ziyu Cui, Jiansen Song, Ziyue Cheng, Wei Wang, Jun Wei, Hua Zhong, Tao Huang. Differential Optimization Testing of Gremlin-Based Graph Database Systems, 17th IEEE International Conference on Software Testing, Verification and Validation (ICST 2024)
    • Wensheng Dou, Ziyu Cui, Qianwang Dai, Jiansen Song, Dong Wang, Yu Gao, Wei Wang, Jun Wei, Lei Chen, Hanmo Wang, Hua Zhong, Tao Huang. Detecting Isolation Bugs via Transaction Oracle Construction, 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023)
    • Jiansen Song, Wensheng Dou, Ziyu Cui, Qianwang Dai, Wei Wang, Jun Wei, Hua Zhong, Tao Huang. Testing Database Systems via Differential Query Execution, 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023)
    • Rui Yang, Yingying Zheng, Lei Tang, Wensheng Dou, Wei Wang, Jun Wei. Randomized Differential Testing of RDF Stores, 45th IEEE/ACM International Conference on Software Engineering (ICSE Demo 2023)
    • Liwei Liu, Wei Chen, Tao Wang, Wei Wang, Guoquan Wu, Jun Wei. Generating Scenario-Centric TAP rules for Smart Homes by Mining Historical Event Logs, IEEE International Conference on Web Services (ICWS 2023)
    • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, and Ce Zhang. 2022. In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle. In Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22). Association for Computing Machinery, New York, NY, USA, 1286–1300.
    • Ziyu Cui, Wensheng Dou, Qianwang Dai, Jiansen Song, Wei Wang, Jun Wei, Dan Ye. Differentially Testing Database Transactions for Fun and Profit, 37th IEEE/ACM International Conference on Automated Software Engineering (ASE 2022)
    • Yingying Zheng, Wensheng Dou, Yicheng Wang, Zheng Qin, Lei Tang, Yu Gao, Dong Wang, Wei Wang, Jun Wei. Finding Bugs in Gremlin-Based Graph Database Systems via Randomized Differential Testing, 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2022)
    • Feng Zhu, Lijie Xu, Gang Ma, Shuping Ji, Jie Wang, Gang Wang, Hongyi Zhang, Kun Wan, Mingming Wang, Xingchao Zhang, Yuming Wang, Jingpin Li. An Empirical Study on Quality Issues of eBay's Big Data SQL Analytics Platform, ICSE 2022
    • Shuping Ji, Hans-Arno Jacobsen. A-tree: a dynamic data structure for efficiently indexing arbitrary boolean expressions. In: Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21). 2021. p.817-829.
    • Shijian Li, Oren Mangoubi, Lijie Xu, Tian Guo. Sync-Switch: Hybrid Parameter Synchronization for Distributed Deep Learning. The 41st IEEE International Conference on Distributed Computing Systems (ICDCS 2021)
    • Kai Kang, Lijie Xu, Wei Wang, Guoquan Wu, Jun Wei, Wei Shi, Jizhong Li. A Hierarchical Automata Based Approach for Anomaly Detection in Smart Home Devices, IEEE iThings 2020, pp. 1-8
    • K. Zhang, Y. Fang, Y. Zheng, H. Zeng, L. Xu and W. Wang, GraphLib: A Parallel Graph Mining Library for Joint Cloud Computing, 2020 IEEE International Conference on Joint Cloud Computing, Oxford, United Kingdom, 2020, pp. 9-12
    • Lijie Xu, Xingtong Ye, Kai Kang, Tian Guo, Wensheng Dou, Wei Wang, and Jun Wei. DistStream: An Order-Aware Distributed Framework for Online-Offline Stream Clustering Algorithms. The 40th IEEE International Conference on Distributed Computing Systems (ICDCS 2020).
    • Hui Li, Dong Wang, Tianze Huang, Yu Gao, Wensheng Dou, Lijie Xu, Wei Wang, Jun Wei, Hua Zhong. Detecting Cache-Related Bugs in Spark Applications, 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2020)
    • Xiaochen Sun, Xingtong Ye, Kai Kang, Lijie Xu, Wei Wang, Lv Lv. BSP-Based Strongly Connected Component Algorithm in Joint Cloud Computing. SOSE 2019.
    • Lijie Xu, Tian Guo, Wensheng Dou, Wei Wang, and Jun Wei. An Experimental Evaluation of Garbage Collectors on Big Data Applications. The 45th International Conference on Very Large Data Bases (VLDB 2019), pages 570-583.
    • Shijian Li, Robert J Walls, Lijie Xu, Tian Guo. Speeding up deep learning with transient servers. 2019 IEEE International Conference on Autonomic Computing (ICAC 2019), pages 125-135
    • Zhen Tang, Heng Wu, Lei Sun, Zhongshan Ren, Wei Wang, Wei Zhou, Liang Yang. Transaction-aware SSD Cache Allocation for the Virtualization Environment. SOSE 2018:174-179.
    • Yingying Zheng, Lijie Xu, Wei Wang, Wei Zhou, Ying Ding. A Reliability Benchmark for Big Data Systems on JointCloud. ICDCS Workshops 2017: 306-310.
    • Zhen Tang, Wei Wang, Yu Huang, Heng Wu, Jun Wei, Tao Huang. Application-centric SSD Cache Allocation for Hadoop Applications. Internetware 2017: 1-10.
    • Yingying Zheng, Wei Wang, Lijie Xu, Zhen Tang, Zhongshan Ren, Jun Wei, Dan Ye. Fast and Precise recovery in Stream processing based on Distributed Cache. Internetware 2017: 19:1-19:6.
    • Lijie Xu, Wensheng Dou, Feng Zhu, Chushu Gao, Jie Liu, Hua Zhong, Jun Wei. A Characteristic Study on Out of Memory Errors in Distributed Data-Parallel Applications. ISSRE 2015: 518-529.
    • Zhezhe Chen, James Dinan, Zhen Tang, Pavan Balaji, Hua Zhong, Jun Wei, Tao Huang, and Feng Qin. MC-Checker: Detecting Memory Consistency Errors in MPI One-Sided Applications. SC 2014: 499-510
    • Lijie Xu, Jie Liu, and Jun Wei. FMEM: A Fine-grained Memory Estimator for MapReduce Jobs. ICAC 2013:65-68.
    • Lijie Xu, Jie Liu, and Jun Wei. MapReduce Framework Optimization via Performance Modeling. IPDPS PhD Forum:2506-2509.
    • Wei Wang, Zhaohui Liu, Yong Jiang, Xinchen Yuan, Jun Wei. EasyCache: a transparent in-memory data caching approach for internetware. Internetware 2014: 35-44.
    • Qiang Gao, Wei Wang, Guoquan Wu, Xuan Li, Jun Wei, Hua Zhong. Migrating Load Testing to the Cloud: A Case Study. SOSE 2013: 429-434.
    • Wei Wang, Xiang Huang, Xiulei Qin, Wenbo Zhang, Jun Wei, Hua Zhong. Application-Level CPU Consumption Estimation: Towards Performance Isolation of Multi-tenancy Web Applications. IEEE CLOUD 2012: 439-446.
    • Xiulei Qin, Wenbo Zhang, Wei Wang, Jun Wei, Xin Zhao, Tao Huang. Optimizing data migration for cloud-based key-value stores. CIKM 2012: 2204-2208.
    • Xiulei Qin, Wenbo Zhang, Wei Wang, Jun Wei, Xin Zhao, Tao Huang. Towards a Cost-Aware Data Migration Approach for Key-Value Stores. CLUSTER 2012: 551-556.
    • Xiulei Qin, Wei Wang, Wenbo Zhang, Jun Wei, Xin Zhao, Tao Huang. Elasticat: A load rebalancing framework for cloud-based key-value stores. HiPC 2012: 1-10.
    • Shuping Ji, Wei Wang, Chunyang Ye, Jun Wei, Zhaohui Liu. Constructing a data accessing layer for in-memory data grid. Internetware 2012: 15:1-15:7.
    • Wei Wang, Xiang Huang, Yunkui Song, Wenbo Zhang, Jun Wei, Hua Zhong, Tao Huang. A Statistical Approach for Estimating CPU Consumption in Shared Java Middleware Server. COMPSAC 2011: 541-546.

Patents

  • 一种基于Gremlin的图数据库系统的查询拆解测试方法及装置,202410030905.7
  • 一种面向遥感即时计算的细粒度并行调度方法及系统,202310866366.6
  • 一种面向动态负载的数据库在线参数调优方法及装置,202310733756.6
  • 一种面向大数据流式处理系统的动态负载均衡方法及装置,202310787374.1
  • 一种面向大数据处理框架的高效半自动垃圾回收方法和系统,202310739729.X
  • 面向云原生集群智能运维的自适应控制方法及装置,2023104188697
  • 一种面向Spark的自动缓存方法及装置,202310363669.6
  • 面向关系型数据库的悲观模式下的事务差分测试方法及装置,20220726
  • 面向关系型数据库中SQL语句执行的自动化测试方法及装置, 2022110462533
  • 一种Python领域知识图谱构建方法, 2022108272526
  • 一种FaaS服务运行环境自动构建方法, 202110055222.3
  • 一种面向微服务治理的数据发布-订阅方法和系统, 202011578199.8
  • 一种基于模块度的分布式社区发现方法, 202011622834.8
  • 一种面向社交网络的分布式用户聚类方法, ZL2020115782168
  • 一种面向Spark的基于数据感知的缓存替换方法及系统, ZL2020115257540
  • 一种面向大数据处理框架的GC自适应调节方法及装置, ZL202011472196.6
  • 一种基于多维数据立方体的数据处理方法及电子装置,202010842774.4
  • 一种一致性级别可控的自适应数据同步方法和系统, ZL201910903210.4
  • 一种面向大规模流数据的分布式聚类方法及系统, 201910795304.4
  • 一种基于状态更新传播的流式图计算方法及系统, ZL201810721794.9
  • 一种大数据流处理框架的性能基准测试系统及方法,ZL201810461515.X
  • 一种基于Spark SQL的分布式全文检索系统及方法,ZL201710269870.2
  • 一种云应用导向的固态盘缓存管理方法及系统,ZL201611127232.9
  • 一种基于混合存储的流式数据自适应持久化方法及系统,ZL201610197157.7
  • 一种基于内存数据网格的流式数据处理程序错误的数据溯源定位方法,ZL201610186177.4
  • 一种基于内存数据网格的实时流式数据处理失效恢复系统及方法,ZL201610186150.5
  • 一种面向内存数据网格的分布式事务保障方法,ZL201310415370.7
  • 一种面向分布式系统性能测试的测试资源管理方法,ZL201310376714.8
  • 基于机器学习的分布式缓存策略自适应切换方法及系统,ZL201110167018.7
  • 一种软件生产线构造方法及系统,ZL201010279066.0
  • 一种Web应用细粒度性能建模方法及其系统,ZL201010275216.0
  • 一种面向Web应用宿主平台的资源供给方法,ZL201010578793.7
  • 一种Web应用性能异常侦测方法,ZL200910079404.3
  • 线程池容量自适应调节方法及应用服务器并发控制方法,ZL200810119285.5
  • 一种基于模型同步的软件工具集成方法,ZL200810119280.2
  • 一种面向应用服务器的资源敏感性能优化方法及其系统,ZL200810119278.5

Graduates

PhD

    • Yingying Zheng (Advisors: Wei Wang, Wensheng Dou), Graduated in 2024, First employment: ISCAS, Beijing. PhD dissertation: Detecting Logic Bugs in Graph Database Systems.
    • Hui Li (Advisor: Hua Zhong), Graduated in 2022, First employment: ISCAS, Beijing. PhD dissertation: Research on Cache Optimization for Spark.
    • Zhen Tang (Advisor: Tao Huang), Graduated in 2019, First employment: ISCAS, Beijing. PhD dissertation: Workload-aware Performance Optimization of SSD Caches for Cloud Applications.
    • Xiulei Qin (Advisor: Tao Huang), Graduated in 2013, First employment: Glodon, Beijing. PhD dissertation: Research on the Key Technologies of PaaS-Oriented Distributed Caching Services.

Master's

    • Mingchao Wu, Graduated in 2024, First employment: Tencent, Shenzhen
    • Hongjian Yang, Graduated in 2024, First employment: Alibaba, Hangzhou
    • Hongbin Zeng, Graduated in 2023, First employment: CMB, Shenzhen
    • Xiang Fang, Graduated in 2023, First employment: CTYun, Chengdu
    • Xingchen Chen, Graduated in 2023, First employment: CETC, Chengdu
    • Jiman Du, Graduated in 2023, First employment: Guangxi
    • Jincheng Jia, Graduated in 2023, First employment: Harbin
    • Yi Zou, Graduated in 2022, First employment: NIO, Shanghai
    • Jinglong Li, Graduated in 2022, First employment: CSSC, Kunming
    • Shengjie Chen, Graduated in 2022, First employment: YONYOU, Beijing
    • Ao Song, Graduated in 2021, First employment: Alibaba, Beijing
    • Yange Fang, Graduated in 2021, First employment: Zhejiang, Hangzhou
    • Kai Zhang, Graduated in 2021, First employment: FreeWheel, Beijing
    • Bingcheng Yuan, Graduated in 2021, First employment: YuanFuDao, Beijing
    • Kai Kang, Graduated in 2020, First employment: Tencent, Beijing
    • Xingtong Ye, Graduated in 2020, First employment: Microsoft, Beijing
    • Jiaqi Shao, Graduated in 2019, First employment: Baidu, Beijing
    • Jiaxuan Hu, Graduated in 2019, First employment: Alibaba, Hangzhou
    • Wenting Shen, Graduated in 2018, First employment: Alibaba, Hangzhou
    • Wei Zhao, Graduated in 2018, First employment: China Merchants Bank, Shenzhen
    • Chongrui Liu, Graduated in 2018, First employment: SenseTime, Beijing
    • Yingying Zheng, Graduated in 2017, First employment: ISCAS, Beijing
    • Caizheng Liu, Graduated in 2017, Studying in ICTCAS for PhD degree
    • Shikai Duan, Graduated in 2017, First employment: Alibaba, Hangzhou
    • Yong Jiang, Graduated in 2016, First employment: Antfin, Hangzhou
    • Xinchen Yuan, Graduated in 2016, First employment: JD, Beijing
    • Xiaoran Wang, Graduated in 2015, First employment: Xiaomi, Beijing
    • Tienan Chen, Graduated in 2015, First employment: Yuanfudao, Beijing
    • Zhaohui Liu, Graduated in 2015, First employment: Microsoft, US
    • Xinsheng Yang, Graduated in 2014, First employment: Microsoft, US
    • Yanshi Wang, Graduated in 2014, First employment: Alibaba, Hangzhou
    • Shuping Ji, Graduated in 2013, First employment: Microstrategy, Beijing
    • Xuan Li, Graduated in 2013, First employment: Microsoft, Beijing
    • Xin Zhao, Graduated in 2013, First employment: Microstrategy, Beijing

Bachelor's

    • Ziyue Wang, Graduated in 2024, UCAS, studying in ISCAS for Master degree
    • Fujie Zhou, Graduated in 2024, UCAS, studying in ISCAS for Master degree
    • Ran Yu, Graduated in 2024, TJU, studying in ISCAS for Master degree
    • Jingwei Cheng, Graduated in 2024, USTB, studying in ISCAS for Master degree
    • Haonan Yu, Graduated in 2023, THU, studying in ISCAS for Master degree
    • Yinqi Wang, Graduated in 2023, UESTC, studying in ISCAS for Master degree
    • Yiran Guo, Graduated in 2023, CQUPT, studying in ISCAS for Master degree
    • Tinghui Gui, Graduated in 2023, UCAS
    • Yu Liang, Graduated in 2022, SCU, studying in ISCAS for Master degree
    • Yi Wang, Graduated in 2022, CCNU, studying in ISCAS for Master degree
    • Qin Yang, Graduated in 2022, UCAS, studying in ICT for Master degree
    • Mingchao Wu, Graduated in 2021, BUPT, studying in ISCAS for Master degree
    • Hongjian Yang, Graduated in 2021, SCU, studying in ISCAS for Master degree
    • Xianglong Liu, Graduated in 2021, UCAS, studying in ISCAS for Master degree
    • Zhengpin Qian, Graduated in 2021, UCAS, studying in UCAS for Master degree
    • Jiashan Li, Graduated in 2021, UCAS
    • Hongbin Zeng, Graduated in 2020, UCAS, studying in ISCAS for Master degree
    • Jinglong Li, Graduated in 2019, UCAS, studying in ISCAS for Master degree
    • Yicheng Wang, Graduated in 2019, UCAS, studying in ISCAS for Master degree
    • Xinyu Zhou, Graduated in 2019, DLUT, studying in ISCAS for Master degree
    • Lingyun Shi, Graduated in 2019, BJTBU
    • Bingcheng Yuan, Graduated in 2018, UCAS, studying in ISCAS for Master degree
    • Muzi Qu, Graduated in 2018, UCAS, studying in ISCAS for Master degree
    • Kai Zhang, Graduated in 2018, UCAS, studying in ISCAS for Master degree
    • Keren Zhu, Graduated in 2018, TJU, studying in USC for Master degree
    • Kai Kang, Graduated in 2017, JLU, studying in ISCAS for Master degree
    • Xingtong Ye, Graduated in 2017, NJTU, studying in ISCAS for Master degree
    • Shixiong Li, Graduated in 2016, HUST, studying in ISCAS for Master degree
    • Chongrui Liu, Graduated in 2015, BUAA, studying in ISCAS for Master degree
    • Yingying Zheng, Graduated in 2014, JLU, studying in ISCAS for Master degree
    • Kai Ren, Graduated in 2013, USTC, studying in ISCAS for Master degree