I craft tech, communities, and the future with AI research.
𝗡𝗔𝗠𝗘: Haoli Yin
𝗔𝗚𝗘: 22
𝗗𝗘𝗚𝗥𝗘𝗘: M.S. in Computer Science, B.S. in Computer Science and minor in Mathematics
𝗚𝗥𝗔𝗗𝗨𝗔𝗧𝗜𝗢𝗡: May 2025
about me
Research Interests: data curation, multimodal ML, and scaling (both ML systems and research).
what i’m up to now
- Member of Technical Staff @ DatologyAI
- Researcher, Data Engineer, or anything for the mission
- Scrolling X/Twitter
- Lifting, Badminton, Horror Movies
Maybe the purpose of life is to just be alive; we inject meaning to keep us living.
publications
UniCat: Crafting a Stronger Fusion Baseline for Multimodal Re-Identification
- Jennifer Crawford, Haoli Yin, Luke McDermott, Daniel Cummings
- Accepted at NeurIPS 2023 UniReps Workshop
- Achieved state-of-the-art results (by a significant margin) on several multimodal re-identification benchmark datasets
GraFT: Gradual Fusion Transformer for Multimodal Re-Identification
- Haoli Yin*, Jiayao (Emily) Li*, Eva Schiller*, Luke McDermott, Daniel Cummings
- *Work done during Summer 2023 Research Scientist Internship at Modern Intelligence
- Borderline Accept reviews at WACV 2024
Digital Staining of Unpaired White and Blue Light Cystoscopy Videos for Bladder Cancer Detection in the Clinic
GitHub | Short Paper | Poster
- Shuang Chang, Haoli Yin, Kristen R Scarpato, Amy N Luckenbaugh, Sam Chang, Christian Bolenz, Maximilian C Kriegmair, Nikolaos C Deliolanis, Soheil Kolouri, Audrey Bowden
- Accepted for poster presentation in the MIDL 2023 Short Paper track
- Manuscript for publication in progress
SpecReFlow: A Specular Reflection Restoration Framework using Flow-Guided Video Completion
- Haoli Yin, Rachel Eimen, Daniel Moyer, Audrey Bowden
- Submitted to SPIE Journal of Medical Imaging
- Delivered 20-minute oral talk at SPIE Photonics West 2023 Conference
Prostate Lesion Detection and Salient Feature Assessment using Zone-based Classifiers
- Haoli Yin, Nithin Buduma
- Selected as one of the top 10 papers at the Summer STEM Institute
experience
DatologyAI
Member of Technical Staff | June 2024 - Present
- Orchestrated a data pipeline at multi-billion-sample scale, curating image-text multimodal datasets that speed up CLIP pretraining by >10x to reach the same performance as the uncurated raw-data baseline, and by >2x compared with CLIPScore filtering.
- Ported the OpenCLIP repository for internal use, enabling multi-node, multi-GPU distributed training with SLURM, and implemented a comprehensive eval suite. Monitored CLIP pretraining in WandB and managed artifact storage with AWS S3.
- Fine-tuned multimodal Data Filtering Networks for improved scoring and curation of multimodal pretraining datasets.
- Led a synthetic data generation project leveraging vLLM, cutting pretraining time by 40% and improving average eval performance by up to 50%.
Come work with me and learn more here!
Modern Intelligence
AI Research Scientist Intern | Jan 2023 - Dec 2023
- Authored and developed “GraFT: Gradual Fusion Transformer for Multimodal Re-Identification,” which reduced transformer model size by 62%, achieved state-of-the-art performance on multimodal vehicle re-identification benchmarks, and introduced a novel multimodal contrastive learning objective, thereby establishing a unique market differentiator for the company.
- Led 3-day sprints 🏃‍♀️, swiftly evolving ideas into fully realized experiments with the intern team and catalyzing project momentum 🚀.
- Engineered a robust, modular PyTorch infrastructure, harnessing Lightning Fabric for multi-GPU training to supercharge model training speed by 400% 🚄, advancing overall project timeline.
- Authored a custom job scheduler with a user-centric interface, optimizing load balancing and training run management, improving resource utilization by 40% 🛠️.
Work with us at modernintelligence.ai/careers
Bowden Biomedical Optics Lab
Research Assistant | Nov. 2021 - Present
- Spearheaded 🚀 deep learning research on specular reflection restoration in white-light endoscopy videos as first author, achieving state-of-the-art results with an ASPP model 🥇 (92.8% Dice score, 52.3% sensitivity increase over U-Net models) and a flow-guided video completion pipeline 🎥 leveraging optical flow estimation 🌊 and vision transformer models 🤖 (16.8% PSNR and 10.1% SSIM improvements over spatial inpainting methods).
- Project areas include 3D hollow organ model reconstruction 🧠, video artifact restoration 🎞️, and a current project using GAN models 🎨 for semantically aware modality transfer to enhance the sensitivity of bladder cancer detection 🚨.
Visit us at lab.vanderbilt.edu/bowdenlab
Yoomi Health
Machine Learning Engineer (Contract) | Sept. 2022 - Feb. 2023
- Spearheaded research initiatives as the third hire at a pre-seed physical therapy startup, delivering a state-of-the-art 💪 pose estimation model with 🔥 98% mAP using the EfficientFormerV2 transformer backbone for 📱 mobile optimization.
- Pioneered efficient in-browser 💻 edge deployment of the core 2D pose estimation model using TF.js and int8 quantization, achieving real-time inference and driving significant improvements in ⚡️ speed and performance.
- Watch as our featured experience was demoed to Mark Cuban 🦈 for $46k in pre-seed funding 💰.
Visit our site at yoomi.health.
Lynntech Inc.
Data Science Research Intern | May 2022 - Aug. 2022
- Designed and implemented a 🚀 GPU-accelerated state estimation engine using C++ and MAGMA (a GPU linear algebra library built on CUDA), reducing runtime by 21.8% compared to the MATLAB baseline.
- Developed over 20 GPU-accelerated linear algebra utility functions 💪 with unit tests using C++.
- Validated the effectiveness of 20+ adversarial ML patterns 🕵️‍♀️ on 30+ state-of-the-art PyTorch classification models using Anaconda, Jupyter Notebook, NumPy, OpenCV, and Pandas.
Learn more at lynntech.com/about
honors
- Neo Scholar Finalist (2024)
- Named one of 150 Neo Scholar Finalists out of a pool of over 1,000 talented engineers and researchers. See Neo Scholars Website.
- Goldwater Scholarship (2023)
- Named one of 413 Scholars out of an estimated pool of over 5,000 college sophomores and juniors, highlighting exceptional aptitude and potential for a research career in natural sciences, engineering, and mathematics. See Goldwater Scholarship Website.
- Google CS Research Mentorship Program (2023)
- Accepted to a three-month program that matches students with Google mentors and peers to support their pursuit of computer science research pathways. See Google CSRMP Website.
- Cornelius Vanderbilt Scholarship (2021 - Present)
- Awarded a full-tuition academic merit scholarship given to top candidates who demonstrate academic achievement, intellectual promise, and leadership qualities, continuing the mission of Vanderbilt's founder to unite people and ideas across the world. See CV Scholarship Website.
- Equitable Excellence $10k Scholarship (2021)
- Recognized as a recipient of the Equitable Excellence Scholarship, a flagship program of Equitable Foundation, providing development opportunities and support to empower students' future plans and make a positive impact in their communities. See Website.
- National Merit Scholar (2021)
- Coca-Cola Scholarship Semifinalist (2020)
- Chosen as one of 1,609 semifinalists from a pool of 99,403 applicants (1.6% selection rate) based on superior leadership, community service, and academic excellence.
- Science Olympiad National Medalist (2019-2021)
- USA Biology Olympiad Semifinalist (2018-2020)
technical skills
- Programming Languages:
- Proficient in Python and C++
- Knowledge of MATLAB, Java, JavaScript, SQL
- Prompt Engineering as a Programming Language (I’ll call this PEaaPL)
- ML/DL: PyTorch, OpenCV, Pandas, NumPy, Matplotlib, Optuna
- Web Dev: HTML, CSS, Gradio, HuggingFace
- MLOps: Hydra, Git, Ubuntu Linux, Weights & Biases
leadership
VandyHacks
Sponsorship Assistant Director | Aug. 2021 - Aug. 2023
- 🎯 Spearheaded the coordination of cold calls and led the management of existing sponsor relations to exceed our goal of raising $80,000 in funding for Vanderbilt's fall hackathon event.
- 🤝 Represented the VandyHacks organization at conferences and networking events, increasing lead generation by 40% while building strong partnerships with potential sponsors for future events.
- 🔍 Continually sought new and creative ways to secure funding and sponsorship to ensure the success and growth of VandyHacks.
Vanderbilt Commodore Orchestra
Viola Section Leader | Aug. 2021 - Nov. 2024
- 💪 Scheduled and led viola section practices, fostering a collaborative environment that improved the overall quality of performances.
- 🎵 Provided motivation and coaching to slower members, ensuring that all members were able to play to their full potential.
- 👥 Collaborated with fellow section leaders to enhance the overall cohesion and excellence of the orchestra.