San Francisco, CA · Member of Technical Staff at Datology

Building multimodal systems where benchmarks can change the next decision.

At Datology, I work on evaluation, curation, and training systems for multimodal models. I want benchmark results, pipeline choices, and training outcomes to agree.

Field note

Good research infrastructure shortens the distance between a data choice and a trustworthy comparison.


DatBench

A benchmark designed to make VLM evaluation more discriminative and decision-useful.

Read paper

Haoli Yin

Research engineer focused on multimodal curation, evaluation, and training systems.

Current work

What I'm working on now

Three lanes: evaluation, curation, and training systems. The point is to make the next experiment clearer and harder to fool.

Current lane
01

Decision-useful VLM evaluation

Designing benchmarks and scoring paths that are selective enough to inform curation and model choices, not just report a leaderboard.

Current lane
02

Multimodal curation and data quality

Working on embeddings, filtering, and dataset export paths for large image-text corpora, with attention to alignment and redundancy.
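As a sketch of the redundancy side of this work, here is a minimal greedy near-duplicate filter over embedding cosine similarity. The function name and threshold are illustrative assumptions, not Datology's actual pipeline:

```python
import numpy as np

def near_duplicate_mask(embeddings: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Greedy near-duplicate filter: keep an item only if its cosine
    similarity to every previously kept item is below `threshold`.

    Hypothetical sketch; production curation would use approximate
    nearest-neighbor search rather than this O(n^2) scan.
    """
    # Normalize rows so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    keep: list[int] = []
    for i, vec in enumerate(normed):
        # Keep the item if nothing already kept is too similar to it.
        if not keep or np.max(normed[keep] @ vec) < threshold:
            keep.append(i)
    mask = np.zeros(len(embeddings), dtype=bool)
    mask[keep] = True
    return mask
```

The greedy order matters: the first occurrence of a duplicate cluster survives and later near-copies are dropped, which is the usual convention for curation passes.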

Current lane
03

Training and launch systems

Building the dataloading, evaluation, and multi-node launch infrastructure that makes multimodal iteration fast enough to matter.

Public artifact

A smaller public tool, but the same bias toward practical research infrastructure.

Benchmark Dataloader


A small benchmarking harness for multimodal dataloaders, built to surface throughput bottlenecks before they become expensive training-time surprises.

GitHub
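The core measurement such a harness makes can be sketched in a few lines: time a fixed number of batches after a warmup period, so one-time startup cost (worker spawn, first decode) does not distort the steady-state rate. This is a generic illustration, not the repo's actual API:

```python
import time
from typing import Iterable

def benchmark_throughput(loader: Iterable, num_batches: int = 100, warmup: int = 5) -> float:
    """Return steady-state batches/sec for an iterable dataloader.

    Illustrative sketch: skips `warmup` batches, then times the next
    `num_batches` (or fewer, if the loader is exhausted first).
    """
    it = iter(loader)
    for _ in range(warmup):
        next(it)  # discard warmup batches
    start = time.perf_counter()
    seen = 0
    for _ in range(num_batches):
        try:
            next(it)
        except StopIteration:
            break
        seen += 1
    elapsed = time.perf_counter() - start
    return seen / elapsed
```

Comparing this number across worker counts or decode settings is what surfaces a bottleneck before it shows up as idle accelerators at training time.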