Jitai Hao

About Me

My research interests include Efficient AI/LLMs and unified multimodal understanding & generation models. I am currently a Ph.D. student at Harbin Institute of Technology (Shenzhen), supervised by Prof. Jun Yu. Before that, I obtained my Bachelor's and Master's degrees in Computer Science from Shandong University in 2022 and 2025, respectively, under the supervision of Prof. Zhaochun Ren.

News

[Feb 2026] Our new paper Uni-X on mitigating modality conflict for unified multimodal models has been accepted to ICLR 2026 (Poster)!

[Sep 2025] Our new paper on efficient knowledge distillation for LLMs has been accepted to NeurIPS 2025 (Spotlight)! We propose Low-Rank Clone (💖LRC💖), an innovative and efficient pretraining method for SLMs. LRC matches or even surpasses the performance of SOTA models trained on trillions of tokens, while using only about 10B-20B tokens.

Publications

DeltaKV: Residual-Based KV Cache Compression via Long-Range Similarity

Jitai Hao, Qiang Huang, Yaowei Wang, Min Zhang, Jun Yu
arXiv 2026
TL;DR: Motivated by the pronounced long-range similarity in KV representations, we propose DeltaKV. It encodes semantic representations as residuals relative to historical references, reducing KV memory to 29% without discarding any tokens, while achieving near-lossless performance on SCBench and AIME and a 2× throughput gain.
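As a toy illustration of the residual-encoding idea (not the paper's implementation — all names here are hypothetical), the sketch below builds synthetic "KV" vectors that drift slowly around an early reference vector, then stores them as residuals against that reference. The residuals are much smaller in magnitude than the raw vectors, which is what makes low-precision residual storage attractive:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "KV" vectors with long-range similarity: each vector drifts only
# slightly around an early reference vector.
ref = rng.normal(size=64)
kv = ref + 0.05 * rng.normal(size=(8, 64))

# Encode each vector as a residual relative to the historical reference.
residuals = kv - ref
reconstructed = ref + residuals   # exact here; a real scheme would quantize residuals

# Residual magnitudes are far smaller than the raw vectors.
print(np.linalg.norm(residuals, axis=1).mean())
print(np.linalg.norm(kv, axis=1).mean())
```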

Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

Jitai Hao*, Hao Liu*, Xinyan Xiao, Qiang Huang, Jun Yu
ICLR 2026 (Poster)
TL;DR: Uni-X adopts an X-shaped architecture with modality-specific two ends and a shared middle, mitigating gradient conflicts between modalities in unified multimodal models. A 3B Uni-X matches or surpasses 7B models, achieving 82.0 on GenEval.
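A minimal sketch of the "separated ends, shared middle" layout (a toy linear stand-in, not the actual model; the weight names are hypothetical): each modality owns its first and last blocks, while the trunk is shared, so cross-modal interference is confined to the middle:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32

# Shared trunk plus per-modality (input_block, output_block) pairs.
shared = rng.normal(size=(d, d)) / np.sqrt(d)
ends = {m: (rng.normal(size=(d, d)) / np.sqrt(d),
            rng.normal(size=(d, d)) / np.sqrt(d))
        for m in ("text", "image")}

def forward(x, modality):
    w_in, w_out = ends[modality]        # modality-specific two ends
    return (x @ w_in) @ shared @ w_out  # only `shared` mixes modality gradients

out = forward(rng.normal(size=d), "text")
print(out.shape)
```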

A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone

Jitai Hao, Qiang Huang, Hao Liu, Xinyan Xiao, Zhaochun Ren, Jun Yu
NeurIPS 2025 (Spotlight)
TL;DR: We propose Low-Rank Clone (LRC), an innovative method to significantly improve model training efficiency. With only about 10B-20B tokens, LRC can match or even surpass the performance of SOTA models like Qwen3 and Llama3, which are trained on trillions of tokens.
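As a toy sketch of the cloning idea (the projection matrices `P_in`/`P_out` are hypothetical stand-ins, not the paper's parameterization): instead of learning student weights from scratch, low-rank projections tie a small student weight matrix to a large teacher one:

```python
import numpy as np

rng = np.random.default_rng(0)
d_teacher, d_student = 512, 128

# Toy teacher weight matrix and a pair of low-rank projections.
W_teacher = rng.normal(size=(d_teacher, d_teacher)) / np.sqrt(d_teacher)
P_in = rng.normal(size=(d_student, d_teacher)) / np.sqrt(d_teacher)
P_out = rng.normal(size=(d_teacher, d_student)) / np.sqrt(d_student)

# The student's weights are "cloned" through the projections, so training
# only has to fit the small projections rather than full-size weights.
W_student = P_in @ W_teacher @ P_out
print(W_student.shape)
```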

OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs

Jitai Hao*, Yuke Zhu*, Tian Wang, Jun Yu, Xin Xin, Bo Zheng, Zhaochun Ren, Sheng Guo
ICLR 2025
TL;DR: Exploiting a novel observation of inter-layer attention similarity, OmniKV dynamically selects the most important context tokens, significantly improving the efficiency and performance of LLMs on long-context tasks while reducing computational cost.
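A toy sketch of the selection idea (illustrative only; all names are hypothetical): because attention patterns are similar across layers, the tokens rated highest at one "filter" layer can be reused as the context subset for later layers, instead of recomputing full attention everywhere:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, k = 16, 4

# Toy attention scores produced at a single filter layer.
scores = rng.random(seq_len)
top_idx = np.argsort(scores)[-k:]        # dynamically selected context tokens

full_kv = rng.normal(size=(seq_len, 8))  # stand-in for one layer's KV cache
pruned_kv = full_kv[top_idx]             # later layers attend only to this subset
print(pruned_kv.shape)
```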

MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Jitai Hao, Weiwei Sun, Xin Xin, Qi Meng, Zhumin Chen, Pengjie Ren, Zhaochun Ren
ACL 2024
TL;DR: MEFT is a memory-efficient fine-tuning method. It reduces memory usage during fine-tuning by introducing a sparse adapter, making it more feasible and efficient to fine-tune large models.
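A minimal sketch of a sparse adapter (a toy stand-in, not MEFT's implementation; the weight names are hypothetical): a wide adapter bottleneck where only the top-k activated neurons participate per input, so most adapter parameters are untouched on any given step:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_adapter, k = 64, 1024, 32

# Wide adapter weights; only a small active slice is used per input.
W_down = rng.normal(size=(d_model, d_adapter)) / np.sqrt(d_model)
W_up = rng.normal(size=(d_adapter, d_model)) / np.sqrt(d_adapter)

x = rng.normal(size=d_model)
h = np.maximum(x @ W_down, 0.0)   # ReLU activations in the adapter
active = np.argsort(h)[-k:]       # keep only the k most active neurons
out = x + h[active] @ W_up[active]  # sparse up-projection + residual
print(out.shape)
```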

Multi-Defendant Legal Judgment Prediction via Hierarchical Reasoning

Yougang Lyu*, Jitai Hao*, Zihan Wang, Kai Zhao, Shen Gao, Pengjie Ren, Zhumin Chen, Fang Wang, Zhaochun Ren
EMNLP 2023 Findings
TL;DR: This paper investigates predicting legal judgment outcomes for cases involving multiple defendants through hierarchical reasoning, aiming to capture the complex entity relationships and reasoning chains among defendants to improve prediction accuracy.

Internship Experience

  • Baidu (Mar. 2025 - Oct. 2025), Research Intern
  • Ant Group (May 2024 - Oct. 2024), Research Intern

Awards & Honors

  • National Scholarship, First-Class Academic Scholarship, etc.
  • ACM/ICPC Asia Regional Contest, Silver Medal (x2)
  • National First Prize, University Student Software Innovation Competition (OPPO Cup)