Research
(2026) Explainability-Steered Deep Generative Models for Biological Sequence Design
Developed explainability-guided VAE and diffusion frameworks for protein and DNA design that incorporate attribution signals directly into the generative process, achieving a 62% improvement in sample efficiency over standard generative baselines. (under submission at NeurIPS ‘26)
(2026) Explainability-Driven Optimization under Limited Feedback for Biological Sequences
Developed IDEAS, an attribution-guided evolutionary learning framework that integrates explainability into active learning-based black-box optimization of biological sequences, improving sample efficiency by 19% across benchmarks. (ICML ‘26, ICLR ‘26 MLGenX)
(2026) Interpretable Model for Temporal Attribution in Time-Series Data
Developed TimeSliver, an interpretable deep learning model that integrates raw and symbolically binned time-series data to capture temporal interactions and compute temporal attribution scores, achieving an 11% improvement over state-of-the-art explainable methods. (ICLR ‘26)
(2025) Interpretable Model for Monomeric Attribution in Protein Sequences
Developed COLOR, an interpretable deep learning model that transforms higher-dimensional protein sequences into a lower-dimensional interpretable representation to estimate the contribution of each monomer to a given property. COLOR achieves 22% higher explainability than existing gradient- and attention-based methods. (ICLR ‘25 MLGenX, ACS JCIM ‘25)
(2025) EMG-to-Text Conversion with LLMs
Developed a LLaMA-3-based model to convert surface electromyography (EMG) signals, which capture muscle activations, into text. On a closed-vocabulary task, the model achieves approximately 20% lower word error rate (WER) than specialized baselines. (ACL ‘25 Main)
(2024) Predictive Model for Spider Silk Mechanical Properties
Developed an interpretable feature-based deep learning framework to predict the properties of spider silk and identify important motifs in a data-constrained setting. Showed that using B-factor as a motif descriptor improves prediction performance by 15% compared to traditional descriptors such as hydrophobicity and charge. (Nature Communications Materials ‘24)
(2023) B-Factor Prediction in Proteins
Developed a many-to-many LSTM model to predict the B-factor (atomic flexibility) of alpha-carbon atoms in proteins, achieving a 30% improvement over the CNN-based state of the art. Analysis revealed that atoms within 15 Å contribute most to B-factor values. (Cell Patterns ‘23)
(2023) Audio-Based Emotion Prediction
As part of an ACM Multimedia Challenge, developed an emotion prediction model using an audio foundation model. Found that using HuBERT-Large as the audio backbone improved performance by 4% over wav2vec. (ACM Multimedia ‘23)
(2023) Person Identification from Biosignals
As part of the ICASSP ‘23 Challenge, developed a multichannel CNN-based deep learning model with a late fusion strategy to identify individuals from multivariate temporal biosignals, boosting accuracy from 62% to 91.36% and securing 3rd place. (ICASSP ‘23)
