Experience
Research, product, and entrepreneurial work.
Family Education
2025.02 - 2026.01 Leader
- Extensively read books (spanning psychology, family education, team management, project management, product design, biographies, content operations, user operations, etc.), establishing a comprehensive business perspective.
- Assembled a cross-functional team covering full-stack development, algorithm research, product design, user operations, and content operations, further polishing my leadership qualities.
- Iteratively optimized multifaceted AI-native SOPs covering market research, user modeling, product iteration, development collaboration, content production, user interviews, and community operations, leveraging cutting-edge AI products to multiply team collaboration and project management efficiency.
- Continuously iterated on user personas, journeys, pain points, needs, and solutions, and determined the product's market positioning and MVP, strengthening rapid iteration and optimization capabilities.
- Comprehensively enhanced full-stack development capabilities, and deepened my understanding and proficient application of emerging programming paradigms (i.e., vibe coding).
Smart Speaker
2022.09 - 2024.09 Main Contributor
- Targeting the high resource consumption of multi-user personalization scenarios, introduce the MoS and PRoLoRA algorithms, reducing the number of trainable parameters severalfold compared to vanilla LoRA.
- Targeting the underperformance on multi-modal tasks, propose the ProReason framework to decouple visual and reasoning capabilities and leverage LLMs to enhance reasoning, improving multi-modal performance.
- Propose the QSpec technique to share quantized weights and KV Cache in a speculative decoding framework, achieving lossless inference acceleration in low-resource scenarios (e.g., edge devices).
- Construct Cantonese evaluation datasets for common knowledge, factual generation and complex reasoning tasks, and compare the performance of 35 open-source and closed-source LLMs.
- Extract audio from 1939 hours of video of dysarthric speech, transcribe it with offline automatic speech recognition (ASR) services, and finalize the transcripts through manual revision.
- Implement MoChA and Whisper models for end-to-end ASR, design enhancement modules for dysarthric speech, expand vocabulary, and post-train models for improved Cantonese ASR performance.
- Train a Transformer-based acoustic model and HiFi-GAN vocoder for text-to-speech (TTS) based on the "aidatatang_200zh" corpus, and inject customized voice characteristics into it with a speaker encoder.
Smart Robotic Walker - Sound Source Localization
2021.09 - 2022.12 Only Major Contributor
- Set up 14 sound source positions on the 4th floor of the Chow Yei Ching Building at HKU, and collect 1984 multi-channel microphone recordings, totaling about 500 minutes. Preprocess them with framing, windowing, filtering, normalization, and other techniques.
- Investigate and test traditional and learning-based noise suppression and speech separation algorithms, and finally select the NSNet2 model for noise reduction of the preprocessed speech signals.
- Extract GCC-PHAT and MFCC features, and innovatively design an ensemble learning-inspired parallel module to enhance RD3Net for sound source localization (SSL), improving the accuracy from 88.2% to 93.6%. In real-world scenarios, combined with the above noise suppression module, RD3Net performs satisfactorily, facilitating basic indoor voice navigation.
- Utilize the A2C and D3QN reinforcement learning algorithms to fine-tune the model online, achieving the expected results in both simulated and real environments.
- Design a low-computation traditional SSL algorithm based on TDOA, accurately detecting near-field sound source positions for user fall alerts.
Mathematical Contest in Modeling
2020.03 - 2020.04 Team Leader
- Learn data mining and mathematical modeling methods, deepening the understanding of data processing and scenario modeling.
- Analyze the rating and review data of products on Amazon, identify key patterns and relationships, design indicators, explore potential function designs, and formulate e-commerce sales strategies to promote product reputation and sales.
- Organize our team for efficient preparation, assign tasks reasonably, collect literature, design the modeling scheme, and write the final paper.
AI Summer Experience at the National University of Singapore
2019.07 - 2019.07 Main Contributor
- Collaborate to complete the "Heaven's Scrutiny" project, a basic criminal arrest system based on face recognition and skeleton detection.
- Design the feature extraction module, optimize the AlphaPose algorithm, and integrate the modules of all team members.
- Rank first in the course evaluation system, earn recognition from the judging panel, and win both the "Best Team" and "Best Individual" titles.
- Complete high-intensity learning tasks and projects, cultivating team spirit and the ability to work under high pressure.
Monitor
2017.09 - 2020.07
- Organize various class activities, strengthening organization and coordination skills.
- Communicate with teachers and classmates proactively, improving communication skills.
- Handle multiple class and personal responsibilities at the same time, cultivating self-management skills.