Experience

Research, product, and entrepreneurial work.

Family Education

2025.02 - 2026.01

Leader

  • Read extensively across psychology, family education, team management, project management, product design, biographies, content operations, user operations, and related fields, building a comprehensive business perspective.
  • Assembled a cross-functional team covering full-stack development, algorithm research, product design, user operations, and content operations, strengthening my leadership skills.
  • Iteratively optimized multifaceted AI-native SOPs covering market research, user modeling, product iteration, development collaboration, content production, user interviews, and community operations, leveraging cutting-edge AI products to multiply the efficiency of team collaboration and project management.
  • Continuously iterated on user personas, journeys, pain points, needs, and solutions, and determined the product's market positioning and MVP, strengthening my rapid iteration and optimization capabilities.
  • Strengthened full-stack development capabilities, and deepened my understanding and practical command of emerging programming paradigms (e.g., vibe coding).

Smart Speaker

2022.09 - 2024.09

Main Contributor

  • Targeting the high resource consumption of multi-user personalization scenarios, introduce the MoS and PRoLoRA algorithms, reducing the number of trainable parameters multiple-fold compared to vanilla LoRA.
  • Targeting underperformance on multi-modal tasks, propose the ProReason framework to decouple visual and reasoning capabilities and leverage LLMs to enhance reasoning, improving multi-modal performance.
  • Propose the QSpec technique to share quantized weights and KV Cache in a speculative decoding framework, achieving lossless inference acceleration in low-resource scenarios (e.g., edge devices).
  • Construct Cantonese evaluation datasets for common knowledge, factual generation and complex reasoning tasks, and compare the performance of 35 open-source and closed-source LLMs.
  • Extract audio from 1939 hours of video of dysarthric speakers, transcribe it with offline automatic speech recognition (ASR) services, and finalize the transcripts through manual revision.
  • Implement MoChA and Whisper models for end-to-end ASR, design enhancement modules for dysarthric speech, expand vocabulary, and post-train models for improved Cantonese ASR performance.
  • Train a Transformer-based acoustic model and HiFi-GAN vocoder for text-to-speech (TTS) based on the "aidatatang_200zh" corpus, and inject customized voice characteristics into it with a speaker encoder.

Smart Robotic Walker - Sound Source Localization

2021.09 - 2022.12

Sole Major Contributor

  • Set up 14 sound source positions on the 4th floor of the Chow Yei Ching Building at HKU, and collect 1984 multi-channel microphone recordings, totaling about 500 minutes. Preprocess the recordings with framing, windowing, filtering, normalization, and other techniques.
  • Investigate and test traditional and learning-based noise suppression and speech separation algorithms, and finally select the NSNet2 model for noise reduction of the preprocessed speech signals.
  • Extract GCC-PHAT and MFCC features, and innovatively design an ensemble learning-inspired parallel module to enhance RD3Net for sound source localization (SSL), improving the accuracy from 88.2% to 93.6%. In real-world scenarios, combined with the above noise suppression module, RD3Net performs satisfactorily, facilitating basic indoor voice navigation.
  • Use the A2C and D3QN reinforcement learning algorithms to fine-tune the model online, achieving the expected results in both simulated and real environments.
  • Design a low-computation traditional SSL algorithm based on TDOA, accurately detecting near-field sound source positions for user fall alerts.

Mathematical Contest in Modeling

2020.03 - 2020.04

Team Leader

  • Learn data mining and mathematical modeling methods, deepening the understanding of data processing and scenario modeling.
  • Analyze the rating and review data of products on Amazon, identify key patterns and relationships, design indicators, explore potential function designs, and formulate e-commerce sales strategies to promote product reputation and sales.
  • Organize our team for efficient preparation, assign tasks reasonably, collect literature, design the modeling scheme, and write the final paper.

AI Summer Experience at the National University of Singapore

2019.07 - 2019.07

Main Contributor

  • Collaborate to complete the "Heaven's Scrutiny" project, a basic criminal arrest system based on face recognition and skeleton detection.
  • Design the feature extraction module, optimize the AlphaPose algorithm, and integrate the modules built by all team members.
  • Rank first in the course evaluation, earn recognition from the judging panel, and win both the "Best Team" and "Best Individual" titles.
  • Complete high-intensity learning tasks and projects, cultivating team spirit and the ability to work under high pressure.

Class Monitor

2017.09 - 2020.07

  • Organize class activities, strengthening organization and coordination skills.
  • Communicate proactively with teachers and classmates, improving communication skills.
  • Handle class and personal responsibilities simultaneously, cultivating self-management skills.