Professor Charles L. Brooks III (Director of Biophysics) and his colleagues Xinqiang Ding (a former member of the Brooks research group and former graduate student in Bioinformatics at UM, now a postdoc at MIT) and Zhengting Zou (a research fellow in EEB) have published a study in Nature Communications titled “Deciphering protein evolution and fitness landscapes with latent space models” (Nature Comm, 10, 5633 (2019); https://www.nature.com/articles/s41467-019-13633-0#Abs1).

In the study, they apply variational autoencoders (VAEs) to infer key relationships between protein sequences, using the latent space representation that a VAE “learns” when it is trained on protein multiple sequence alignments. They also show that sequence-function relationships can be built by combining the latent space representation with functional annotations of the sequences it represents, which makes it possible to predict new sequences with a desired functional activity.
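
For readers unfamiliar with the idea, the following is a minimal sketch of how an aligned sequence might be mapped into a latent space by the encoder half of a VAE. It is not the authors' implementation: the one-hot encoding over 20 amino acids plus a gap character, the single linear layer, the two-dimensional latent space, the toy alignment, and all function names are assumptions made for illustration. The weights are random and untrained, and the decoder and training loop are omitted.

```python
import math
import random

# Assumed alphabet: 20 standard amino acids plus the alignment gap character.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot(seq):
    """Flatten an aligned sequence into a binary vector of length len(seq) * 21."""
    vec = []
    for ch in seq:
        col = [0.0] * len(ALPHABET)
        col[ALPHABET.index(ch)] = 1.0
        vec.extend(col)
    return vec


def init_layer(n_out, n_in, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def encode(x, params, rng):
    """Return (z, mu, log_var), sampling z with the reparameterization trick."""
    mu = linear(*params["mu"], x)
    log_var = linear(*params["log_var"], x)
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    return z, mu, log_var


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), the VAE regularizer."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


# Hypothetical toy alignment; every row has the same length.
msa = ["MKV-LA", "MKVGLA", "MRV-LS"]
rng = random.Random(0)
n_in = len(msa[0]) * len(ALPHABET)
params = {"mu": init_layer(2, n_in, rng), "log_var": init_layer(2, n_in, rng)}

for seq in msa:
    z, mu, log_var = encode(one_hot(seq), params, rng)
    print(seq, [round(v, 3) for v in mu], round(kl_to_standard_normal(mu, log_var), 4))
```

In a trained model, the mean vector for each sequence would serve as its coordinates in the latent space, and functional annotations could then be attached to those coordinates to relate sequence to function.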