Experience

Research, product, and entrepreneurial work.

2025.02 - 2026.01

Leader

Read extensively across psychology, family education, team management, project management, product design, biographies, content operations, and user operations, building a comprehensive business perspective.
Assembled a cross-functional team covering full-stack development, algorithm research, product design, user operations, and content operations, strengthening my leadership skills.
Iteratively optimized multifaceted AI-native SOPs covering market research, user modeling, product iteration, development collaboration, content production, user interviews, and community operations, using cutting-edge AI products to improve team collaboration and project management efficiency.
Iterated continuously on user personas, journeys, pain points, needs, and solutions, settling the product's market positioning and MVP and strengthening rapid iteration and optimization capabilities.
Strengthened full-stack development skills and deepened my understanding and practical use of future programming paradigms (i.e., vibe coding).

2022.09 - 2024.09

Main Contributor

To address the high resource consumption of multi-user personalization scenarios, introduced the MoS and PRoLoRA algorithms, reducing trainable parameters severalfold compared to vanilla LoRA.
To address underperformance on multi-modal tasks, proposed the ProReason framework, which decouples visual and reasoning capabilities and leverages LLMs to enhance reasoning, improving multi-modal performance.
Proposed the QSpec technique, which shares quantized weights and the KV cache within a speculative decoding framework, achieving lossless inference acceleration in low-resource scenarios (e.g., edge devices).
Constructed Cantonese evaluation datasets for common knowledge, factual generation, and complex reasoning tasks, and compared the performance of 35 open-source and closed-source LLMs.
Extracted audio from 1,939 hours of video of dysarthric speech, transcribed it with offline automatic speech recognition (ASR) services, and finalized the transcripts through manual revision.
Implemented MoChA and Whisper models for end-to-end ASR, designed enhancement modules for dysarthric speech, expanded the vocabulary, and post-trained the models to improve Cantonese ASR performance.
Trained a Transformer-based acoustic model and a HiFi-GAN vocoder for text-to-speech (TTS) on the "aidatatang_200zh" corpus, and injected customized voice characteristics with a speaker encoder.

2021.09 - 2022.12

Sole Major Contributor

Set up 14 sound source positions on the 4th floor of the Chow Yei Ching Building at HKU and collected 1,984 multi-channel microphone recordings, totaling about 500 minutes. Preprocessed the recordings with framing, windowing, filtering, normalization, and other techniques.
Investigated and tested traditional and learning-based noise suppression and speech separation algorithms, ultimately selecting the NSNet2 model to denoise the preprocessed speech signals.
Extracted GCC-PHAT and MFCC features and designed an ensemble-learning-inspired parallel module to enhance RD3Net for sound source localization (SSL), improving accuracy from 88.2% to 93.6%. In real-world scenarios, combined with the noise suppression module above, RD3Net performed satisfactorily, supporting basic indoor voice navigation.
Used the A2C and D3QN reinforcement learning algorithms to fine-tune the model online, achieving the expected results in both simulated and real environments.
Designed a low-computation traditional SSL algorithm based on time difference of arrival (TDOA) that accurately detects near-field sound source positions for user fall alerts.

2020.03 - 2020.04

Team Leader

Studied data mining and mathematical modeling methods, deepening my understanding of data processing and scenario modeling.
Analyzed rating and review data for products on Amazon, identified key patterns and relationships, designed indicators, explored potential feature designs, and formulated e-commerce sales strategies to improve product reputation and sales.
Organized the team for efficient preparation, assigned tasks sensibly, collected literature, designed the modeling scheme, and wrote the final paper.

2019.07 - 2019.07

Main Contributor

Collaborated on the "Heaven's Scrutiny" project, a basic criminal arrest system based on face recognition and skeleton detection.
Designed the feature extraction module, optimized the AlphaPose algorithm, and integrated the modules built by all team members.
Ranked first in the course evaluation system, earned recognition from the judging panel, and won both the "Best Team" and "Best Individual" titles.
Completed high-intensity learning tasks and projects, developing team spirit and the ability to work under high pressure.

2017.09 - 2020.07

Organized a range of class activities, strengthening my organization and coordination skills.
Communicated proactively with teachers and classmates, improving my communication skills.
Handled multiple class and personal responsibilities at the same time, developing self-management skills.