Physically constrained autoencoder-assisted Bayesian optimization for refinement of high-dimensional defect-sensitive single crystalline structure
Joseph Oche Agada, Andrew McAninch, Haley Day, Yasemin Tanyu, Ewan McCombs, Seyed M. Koohpayeh, Brian H. Toby, Yishu Wang, Arpan Biswas
Physical properties and functionalities of materials are dictated by global crystal structures and local defects. Establishing a structure–property relationship therefore requires not only the crystallographic symmetry but also quantitative knowledge of the defects. Here, we present a hybrid machine learning (ML) framework that integrates a physically constrained variational autoencoder (pc-VAE) with different Bayesian optimization (BO) methods to systematically accelerate and improve crystal structure refinement with resolution of defects. We chose the pyrochlore-structured Ho2Ti2O7 as a model system and employed the GSAS-II package both to benchmark crystallographic parameters from traditional least-squares refinement and to generate training data. However, the function space of these material systems is highly non-linear, so optimizers such as traditional least-squares refinement tend to become trapped in local minima. In addition, these naïve methods provide little knowledge of the overall function space, which is essential in large, time-consuming explorations for identifying multiple potential regions of interest. Thus, we explore the high-dimensional structure parameters of defect-sensitive systems via pretrained pc-VAE-assisted Bayesian optimization and Sparse Axis-Aligned Subspace Bayesian Optimization (SAASBO). The pc-VAE, designed and trained on physically plausible Ho2Ti2O7 structure models, projects high-dimensional diffraction data consisting of thousands of independently measured diffraction orders into a low-dimensional latent space while enforcing scaling invariance and physical relevance of the latent space.
In this proposed closed-loop autonomous exploration, we aim to minimize the χ² error, a squared L2-norm misfit, between experimental and simulated diffraction patterns, separately in the real and latent spaces, thereby steering the refinement toward a potential optimum in the parameter space of crystal structures. We compared the results of several methods: pc-VAE-assisted BO, BO without pc-VAE assistance, and Rietveld least-squares refinement. The results show that the methodology can be generalized to other complex materials where ultra-precise determination of structural defects is needed to reveal subtle structure–property relationships, highlighting a new paradigm for integrating crystallography with ML to accelerate the discovery and characterization of magnetic materials.
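As a minimal illustration of the two objectives named above, the following Python sketch computes a χ² misfit in the real (intensity) space and a squared L2 distance in a latent space. It is not the implementation used in this work: the function names, the uncertainty array `sigma`, and in particular `make_toy_encoder` (a fixed random linear projection standing in for a trained pc-VAE encoder) are assumptions introduced only for illustration.

```python
import numpy as np


def chi_square(observed, simulated, sigma):
    """Sum of squared residuals between two patterns, weighted by uncertainties."""
    residual = (np.asarray(observed) - np.asarray(simulated)) / np.asarray(sigma)
    return float(np.sum(residual ** 2))


def latent_distance(encode, observed, simulated):
    """Squared L2 distance between the latent codes of two patterns."""
    diff = encode(np.asarray(observed)) - encode(np.asarray(simulated))
    return float(np.sum(diff ** 2))


def make_toy_encoder(n_points, latent_dim, seed=0):
    """Hypothetical stand-in for a trained encoder (assumption, not the pc-VAE).

    The pattern is divided by its Euclidean norm before a fixed linear
    projection, so an overall intensity scale factor does not change the code.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(latent_dim, n_points))

    def encode(pattern):
        scaled = pattern / np.linalg.norm(pattern)  # remove overall scale
        return weights @ scaled

    return encode
```

In a closed-loop setting, either function could serve as the scalar objective that an optimizer evaluates for each candidate set of structure parameters, with `simulated` produced by a diffraction simulation for that candidate.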