Curriculum Vitae
General Information
Full Name | Zheng Gao |
Contact | woshigaozheng [at] gmail [dot] com |
Research Interests | Large Language Models, Natural Language Processing, Graph Mining |
Languages | English, Chinese |
Education
- 2015 - 2020
Ph.D. in Information Science
Indiana University Bloomington, United States
- Minor in Computer Science
- Advised by Prof. Xiaozhong Liu
- 2013 - 2015
M.S. in Information Science
University of Pittsburgh, United States
- 2009 - 2013
B.M. in Information Management and System
Shanghai International Studies University, China
Experience
- 02/2023 - Present
Senior Algorithm Engineer
Ant Group
- Trained Ant Group's in-house 10B- and 65B-parameter large language models (LLMs). Supported the supervised fine-tuning stage to improve LLM reasoning capability.
- Led the implementation and maintenance of the company-level LLM evaluation pipeline, which contained 50+ public/private datasets and 20+ evaluation metrics. It served as the default evaluation toolkit for all major Ant Group Artificial General Intelligence (AGI) teams.
- Implemented LLM agents to support two internal Ant Group human-resources use cases: meeting arrangement, and hiring and resignation (dimission) handling.
- 06/2020 - 01/2023
Applied Scientist
Amazon Alexa AI
- Built Natural Language Understanding (NLU) pipelines for Alexa to support customer utterance interpretation.
- Expanded Alexa NLU pipelines from English-speaking regions to other language regions by incorporating contextual signals.
- Developed third-party (3P) skill recommendation for fallback utterances to improve customer experience.
- 06/2019 - 09/2019
Data Scientist Intern
Amazon Alexa AI
- Applied deep language models and state-of-the-art clustering methods to extract influential text patterns from user requests.
- Built an automated pipeline with Spark and shell scripts to train models on multiple data sources within Alexa's restricted environment, replacing existing manual annotation.
- 02/2018 - 03/2019
NLP Research Intern
Alibaba DAMO Academy / AI Lab
- Generated product review summaries from users' consecutive behaviors by leveraging dynamic matrix factorization, deep reinforcement learning (policy gradient), and a sequence-to-sequence model (neural machine translation) with attention.
- Proposed an end-to-end pairwise ranking model with transfer learning techniques to detect communities in targeted sparse graphs.
- Detected multilevel anomalies in high-dimensional dynamic user logs via an adversarial autoencoder and attention-based hierarchical representation learning.
Services
-
Conference Reviewer
- Annual Meeting of the Association for Computational Linguistics (ACL 2024)
- iConference (2023, 2024)
- ACM International Conference on Web Search and Data Mining (WSDM 2023, 2024)
- AAAI Conference on Artificial Intelligence (AAAI 2022, 2023, 2024)
- International Workshop on Deep Learning Practice for High-Dimensional Sparse Data (DLP-RecSys 2023; DLP-KDD 2020, 2021)
- Workshop on Information Extraction from Scientific Publications (WIESP-AACL 2022)
- The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
- China Conference on Knowledge Graph and Semantic Computing (CCKS 2022)
- Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE 2022)
- IEEE International Conference on Multimedia and Expo (ICME 2022)
- The Web Conference (WWW 2019, 2020)
- ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018, 2022)
- IEEE International Conference on Big Data (BigData 2020, 2022)
- Joint Conference on Digital Libraries (JCDL 2021, 2022)
- International Workshop on Knowledge Graph (IWKG-KDD 2020)
- Workshop on Scholarly Document Processing (SDP-NAACL 2021, SDP-COLING 2022)
- International Conference on Information Systems (ICIS 2021)
- China Conference on Information Retrieval (CCIR 2021)
-
Journal Reviewer
- Data Intelligence (2022)
- The Social Science Journal (2022)
- Journal of Informetrics (JOI 2021)
- Computers in Industry (2021)
- Journal of the Association for Information Science and Technology (JASIST 2019, 2021)
- PeerJ Computer Science (2020)
- PLoS ONE (2020, 2021)
- BMC Bioinformatics (2019, 2020, 2022)
- Social Network Analysis and Mining (SNAM 2019, 2020, 2021)
- Medical Science Monitor (2019)
- ACM Transactions on Computing for Healthcare (2020)
-
Funding Reviewer
- Amazon Research Awards (ARA 2022)
-
Administrative Service
- Chair of the Doctoral Student Association (DSA), Department of Information and Library Science, Indiana University Bloomington (2016 - 2018)
Honors and Awards
- 2018 - 2019
- Clayton A. Shepherd Scholarship, Indiana University Bloomington
- 2015 - 2018
- T’ung-li Yuan Memorial Fellowship, Indiana University Bloomington