Ayaan Haque
Hi! I'm Ayaan Haque, a 19 y/o research scientist at Luma AI. I'm also a student at UC Berkeley studying EECS, where I'm advised by Angjoo Kanazawa. I work on generative 3D research. Recently, I worked on Genie,
Luma's text-to-3D foundation model, which can generate high-fidelity 3D objects in seconds.
In the past, I've worked on self-supervised and unsupervised representation learning. I've previously interned at Samsung SDSA, and got my research career jumpstarted back in high school with the Wang Group at Stanford.
At Berkeley, I'm a part of Machine Learning @ Berkeley and the Nerfstudio team.
In a past life, I was a builder and hacker (I'm an MLH Top-50 Hacker!), and now I'm exploring deep-tech startups. Other than that, I enjoy writing, watching/playing sports, eating out with friends,
and just having a good time. My ongoing goal and dream:
Twitter  / 
Email  / 
Google Scholar  / 
Github  / 
Resume  / 
Medium  / 
LinkedIn
|
Learning about learning
|
Updates
- [October 2023] Gave oral talk on Instruct-NeRF2NeRF at ICCV in Paris!
- [July 2023] Instruct-NeRF2NeRF accepted to ICCV 2023 (Oral)!
- [May 2023] Joining Luma AI, a Series A startup building the future of 3D!
- [Mar 2023] Released new pre-print Instruct-NeRF2NeRF!
|
Research
Since I've worked on a wide variety of topics (and am still exploring new topics), I've split my publications into currently relevant works and previous works.
Only relevant works are listed below. For a full list (and chronological list) of my papers,
visit my Google Scholar.
|
|
Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
Ayaan Haque,
Matthew Tancik,
Alexei A. Efros,
Aleksander Holynski,
Angjoo Kanazawa
UC Berkeley
ICCV, 2023 (Oral Presentation)
Project Page /
ArXiv /
Oral /
Code
We propose a method for editing NeRF scenes with text instructions. Given a NeRF of a scene and the collection
of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to
iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that
respects the edit instruction.
|
|
Self-Supervised Contrastive Representation Learning for 3D Mesh Segmentation
Ayaan Haque,
Hankyu Moon,
Heng Hao,
Sima Didari,
Jae Oh Woo,
Patrick Bangert
Samsung SDS Research America
AAAI, 2023
ArXiv /
Poster
We introduce self-supervised MeshCNN, or SSL-MeshCNN, a novel mesh-specialized contrastive learning method to
perform downstream segmentation with limited labeled data. We create an augmentation policy tailored for meshes,
enabling the network to learn efficient visual representations through contrastive pre-training.
|
|
Deep Learning for Suicide and Depression Identification with Unsupervised Label Correction
Ayaan Haque*1,
Viraaj Reddi*1,
Tyler Giallanza2
Saratoga High School1, Princeton University2
ICANN, 2021
Project Page /
ArXiv /
Teaser Video /
Poster /
Code /
Blog
We propose SDCNL to address the unexplored problem of distinguishing between depression and more severe suicidal
tendencies using web-scraped data. Our method introduces a novel unsupervised label correction
step to remove inherent noise in web-scraped data, combined with a deep-learning classifier
based on pre-trained transformers.
|
|
MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images
Ayaan Haque*1,
Abdullah-Al-Zubaer Imran*2,3,
Adam Wang2,
Demetri Terzopoulos3,4
Saratoga High School1, Stanford University2, University of California, Los Angeles3, VoxelCloud Inc.4
IEEE ISBI, 2021
Project Page /
ArXiv /
Poster /
Presentation /
Code /
Blog
We introduce MultiMix, a joint semi-supervised classification and segmentation model
employing a confidence-based augmentation strategy for semi-supervised classification
along with a novel saliency bridge module that guides segmentation and provides explainability
for the joint tasks.
|
|
EC-GAN: Low-Sample Classification using Semi-Supervised Algorithms and GANs
Ayaan Haque
Saratoga High School
AAAI, 2021 (Best Student Abstract Finalist, Oral Presentation)
Project Page /
ArXiv /
Oral /
Poster /
Presentation /
Code /
Blog
We propose EC-GAN, which combines a Generative Adversarial Network with a classifier to leverage artificial GAN
generations to increase the size of restricted, fully-supervised datasets using semi-supervised algorithms.
Mentored by Microsoft Postdoc and Princeton University PhD Jordan T. Ash.
|
|
Noise2Quality: Non-Reference, Pixel-Wise Assessment of Low Dose CT Image Quality
Ayaan Haque1, 2,
Adam Wang2,
Abdullah-Al-Zubaer Imran2
Saratoga High School1, Stanford University2
SPIE Medical Imaging (SPIE), 2022
Project Page /
Paper /
Presentation /
Poster /
Code
We propose Noise2Quality (N2Q), a novel, self-supervised IQA model which predicts SSIM Image Quality maps from
low-dose CT. We propose a self-supervised regularization task of dose-level estimation, creating a
multi-task framework to improve performance.
|
|
Window Level is a Strong Denoising Surrogate
Ayaan Haque1, 2,
Adam Wang2,
Abdullah-Al-Zubaer Imran2
Saratoga High School1, Stanford University2
MICCAI MLMI, 2021
Project Page /
ArXiv /
Poster /
Code /
Blog
We introduce SSWL-IDN, a novel self-supervised window-level prediction surrogate task for CT denoising. The surrogate task is
directly relevant to the downstream denoising task, yielding improved performance over recent methods.
|
|
Generalized Multi-Task Learning from Substantially Unlabeled Multi-Source Medical Image Data
Ayaan Haque1, 2,
Abdullah-Al-Zubaer Imran2,3,
Adam Wang2,
Demetri Terzopoulos3,4
Saratoga High School1, Stanford University2, University of California, Los Angeles3, VoxelCloud Inc.4
MELBA, 2021
Project Page /
Journal Page /
Paper /
Code
We expand upon MultiMix (in ISBI 2021). Our extended manuscript contains a detailed
explanation of the methods, saliency map visualizations from multiple datasets, and
quantitative (performance metrics tables) and qualitative (mask predictions, Bland-Altman
plots, ROC curves, consistency plots) results.
|
|
Research Scientist at Luma AI
Palo Alto, CA
May 2023 - Present
- Neural rendering startup building the future of 3D content creation, raised $20 million Series A
- Released Genie, a text-to-3D foundation model which can generate high-fidelity 3D objects in seconds
|
|
Research Intern at Berkeley AI Research (BAIR)
Kanazawa AI Lab (KAIR), Berkeley, CA
Dec 2022 - Present
- Working on NeRFs and diffusion models
|
|
Research Intern at Samsung SDSA
AI Research Group, San Jose, CA
June 2022 - September 2022
- Proposed "SSL-MeshCNN", a novel self-supervised algorithm for segmenting non-uniform, irregular 3D meshes
- Introduced new SimCLR-inspired stochastic augmentation policy for mesh-specialized contrastive learning
- Matched accuracy of fully-supervised training (90.50%) with just 67% of labels on benchmark datasets
- Wrote paper accepted to AAAI 2023, available on ArXiv
|
|
Research Intern at Stanford
Wang Group in RSL, Stanford, CA
July 2020 - June 2022
- Worked on learning from limited labeled data for clinical imaging tasks using unsupervised, self-supervised, and semi-supervised techniques
- Developed research skills by running hundreds of experiments, writing papers, preparing supplementals, and writing rebuttals
|
|
Software Engineering Intern at Openwater Accelerator
Internal SWE Team, Menlo Park, CA
August 2020 - December 2020
- Accelerator providing early-stage startups with software and human capital instead of direct funding (closed as of late 2021)
- Developed a Waitlist API to be sold to portfolio companies in the program, letting companies establish
waitlists for their products to build a market
- Used React.js, MongoDB, Flask and other web dev/backend tech, integrated Stripe payment features and referral features, and wrote documentation
|
Activities
Outside of research, I enjoy building practical applications in both competitive and casual formats.
|
|
Hackathons
Team Captain
May 2019 - June 2022
- Team Captain of 5 total members (shoutout Viraaj, Adithya, Ishaan, and Sajiv)
- Created numerous projects (listed in Projects section)
- 33x Award Winner, 9x First Place, 22x Top 3, $10,000+ in earnings
- Chosen for MLH Top 50 Hackers Class of 2021, one of five high schoolers
|
|
MSET Robotics Team 649
Software Team
August 2018 - April 2021
- FRC Robotics Software Team, ML-Specialist (FTC Captain 9th Grade)
- Worked with AI/ML for in-game object detection, used predictive models for shot selection, worked on shooter
trajectory modeling, and wrote documentation
- Awards: 2021 Skills Competition Finalist in Carbon Group; 2021 Engineering Excellence Award; CalGames 2019 Finalist; Chezy Champs 2019 Semi-Finalist
|
Projects
I've listed just a few of my favorite projects; the rest are available on my Github. On Github, I have 500+ commits and ~300 stars across all my repositories.
Check out this cool commit graph, and check this out for my Github stats.
|
|
Nerfstudio
A collaboration friendly studio for NeRFs
Jan 2023 - Present
Website /
Github
Contribute (in small parts) to a large-scale open-source project. Helped implement research methods (Instruct-NeRF2NeRF, CLIP-based NeRF)
in the main repository.
Stack: Python, PyTorch
|
|
SuiSense
Using Artificial Intelligence to distinguish between suicidal and depressive messages
June 2020 - Dec 2020
Website /
Demo /
Github /
Devpost /
Medium Article /
Research Paper (SDCNL)
SuiSense is a progressive web application that uses Artificial Intelligence (AI) and Natural Language Processing
(NLP) to distinguish between depressive and suicidal phrases and help concerned friends and family determine whether
their struggling loved one is on the path to suicide. SuiSense uses an implementation of SDCNL.
Awards: 4th Place Congressional App Challenge 2020; 2nd Place @ GeomHacks 2020; HM @ MLH Summer League SHDH 2020
Stack: Python, HTML, CSS, JavaScript, TensorFlow, PyTorch, BERT, Flask, PythonAnywhere, Pandas, scikit-learn
|
|
Drishti Smartphone Retinal Camera System
CAD Files and Implementation of Drishti's Retinal Camera System Prototype
June 2021 - Sep 2021
Github (CAD Files) /
Assembly Guide /
Instructional Guide /
Documentation /
Technical Blog /
Drishti Website
This mobile, on-the-go system is designed for clinics in Bangladesh to screen patients for
Diabetic Retinopathy (DR) using a smartphone camera with a retinal attachment. The purpose
of this rig is to allow precise positioning of the smartphone to any patient's left and right
eye such that the images can be efficiently fed into Drishti's AI algorithms for DR diagnosis.
The system is completely adjustable for all head sizes. It is made of readily available
components that can be purchased at many local hardware stores, and is designed for low-cost
fabrication. All assembly tools are common household tools or easily purchasable/rentable
from a local hardware store. The 3D-printed components can be printed on low-end machines
and with cheap PLA filament. We designed this system to be completely collapsible, such that
it can fit into a standard-size backpack.
Stack: SolidWorks, Hardware Materials
|
|
Tickbird
Streamlined prescription analysis for visually impaired patients (Available on the App Store)
September 2019 - June 2020
Website /
App Store /
Demo /
Github /
Slides /
Devpost
Tickbird is an advanced Swift mobile app based on the TesseractOCR neural network framework allowing visually impaired patients
to aurally understand their prescriptions or the labels on their pill bottles in order to gain independence and avoid
the prospect of lethal miscommunication regarding necessary medicines from their doctors. Moreover, the app's smart
profiling feature not only finds the nearest pharmacy containing the user's prescription, but it also uses AI/ML
algorithms to detect and set notifications for the times the user has to take or refill their medicine.
Awards: 2x Award Winner @ OmniHacks 2019; App Store April 2020, 1000+ Impressions
Stack: Swift, Xcode, iOS, Firebase, TesseractOCR, Ruby
|
Writing
I write on Medium (semi-regularly) to share my thoughts with the world. Here are a few of my favorite Medium articles that I have written.
|
|
In Response to “What’s the F-ing Point?”
No Publication, will not profit off this story
October 6th, 2021
A response to an article discussing our purpose in this world combined with a discussion of my own purpose
This article is a response to my friend's article, where
he discusses critiques of our Saratoga society. In my article, I respond to his ideas and then share my own story of finding my purpose in life.
|
|
How Five High-Schoolers Won $9.5K From Hackathons in One Summer
Better Programming
August 28th, 2020
Coding, winning prizes, and proving ourselves
Authored by Ayaan Haque, Adithya Peruvemba, Viraaj Reddi, Sajiv Shah, and Ishaan Bhandari
This article recounts the journey of my team, Haleakala Hacksquad, and how we became great hackers.
|
|