Integrating machine learning with density functional tight binding: method development and applications in material simulations
Publication date
2025-12-12
Authors
Sun, Wenbo
Supervisor
Reviewers
Abstract
Theoretical computational methods capable of predicting electronic properties have become increasingly important in materials research, providing fundamental insights and guiding material design. Among these, Density Functional Theory (DFT) is widely adopted for its accuracy and broad applicability. However, the high computational cost of ab initio methods limits their scalability to large systems. Density Functional Tight Binding (DFTB), an approximate method derived from DFT, offers a favorable balance between accuracy and efficiency, making it suitable for large-scale simulations. Achieving high performance with DFTB, however, depends on the quality of its parameterization. Recent developments in machine learning (ML) present opportunities to enhance DFTB, enabling more accurate and efficient simulations.
This thesis explores two strategies for integrating ML into DFTB. The first approach involves optimizing the distance-dependent two-center integrals of DFTB parameterizations using ML within an automatic gradient-tracking framework. To support this, we have developed TBMaLT (PyTorch-based Tight-Binding Machine Learning Toolkit), an open-source framework that enables flexible construction and optimization of ML-enhanced tight-binding models. The framework is applied to defective periodic systems of silicon and silicon carbide, with the density of states (DOS) selected as the target property for optimization. The results demonstrate that the DFTB Hamiltonian and overlap matrices can be fine-tuned through backpropagation to achieve DOS predictions approaching DFT-level accuracy. Importantly, the integration of ML preserves access to additional electronic properties, such as projected DOS, Mulliken populations, and band structures. These properties are examined to ensure they remain physically meaningful and consistent following ML-based optimization. The model’s transferability and scalability are further validated on systems with varying sizes and defect types not included in the training data.
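The gradient-based refinement of tight-binding parameters against a DOS target can be illustrated with a deliberately small toy model. The sketch below is not TBMaLT and makes several assumptions: a one-dimensional single-orbital chain with dispersion E_k = 2 t cos(k) stands in for the DFTB Hamiltonian, the single hopping parameter `t` plays the role of a two-center integral, the "reference" DOS is generated by the same model with a known `T_REF`, and the derivative of the loss is written out by hand where a framework such as PyTorch would obtain it by automatic differentiation. All names and numerical values are hypothetical.

```python
import math

SIGMA = 0.3                                            # Gaussian broadening (arbitrary units)
K_POINTS = [2 * math.pi * j / 48 for j in range(48)]   # k-point mesh of the toy chain
ENERGIES = [-3.0 + 0.1 * i for i in range(61)]         # energy grid for the DOS


def gaussian(x):
    """Normalised Gaussian of width SIGMA."""
    return math.exp(-0.5 * (x / SIGMA) ** 2) / (SIGMA * math.sqrt(2 * math.pi))


def dos_and_grad(t):
    """Broadened DOS of the chain E_k = 2 t cos(k) and its derivative w.r.t. t."""
    dos, grad = [], []
    for e in ENERGIES:
        d = g = 0.0
        for k in K_POINTS:
            x = e - 2 * t * math.cos(k)
            w = gaussian(x)
            d += w
            # d/dt gaussian(e - E_k) = gaussian(x) * x / SIGMA^2 * dE_k/dt
            g += w * x / SIGMA ** 2 * 2 * math.cos(k)
        dos.append(d / len(K_POINTS))
        grad.append(g / len(K_POINTS))
    return dos, grad


def loss_and_grad(t, target):
    """Mean squared DOS error and its derivative w.r.t. t."""
    dos, ddos = dos_and_grad(t)
    n = len(ENERGIES)
    loss = sum((d - r) ** 2 for d, r in zip(dos, target)) / n
    grad = sum(2 * (d - r) * g for d, r, g in zip(dos, target, ddos)) / n
    return loss, grad


def fit_hopping(t_start, target, lr=0.2, steps=200):
    """Plain gradient descent on the hopping parameter."""
    t, history = t_start, []
    for _ in range(steps):
        loss, grad = loss_and_grad(t, target)
        history.append(loss)
        t -= lr * grad
    return t, history


T_REF = -1.0                                  # assumed "reference" hopping
target_dos, _ = dos_and_grad(T_REF)           # stands in for a DFT DOS
t_fit, losses = fit_hopping(-0.8, target_dos)
print(f"fitted t = {t_fit:.4f}, final loss = {losses[-1]:.2e}")
```

In a realistic setting the parameter vector would hold many distance-dependent integral values, the eigenvalues would come from a generalized eigenproblem involving the overlap matrix, and the gradient would be propagated through that solver rather than derived analytically.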
The second approach employs Bayesian optimization to refine atomic parameters for generating electronic terms, aimed at applications in organic photovoltaics (OPVs). Two sets of parameters are constructed based on the B3LYP and CAM-B3LYP functionals, covering the elements H, C, N, O, F, S, and Cl, which are key components in OPV donor and acceptor molecules. Benchmarking against a dataset of 12 representative OPV systems demonstrates good agreement with DFT reference calculations for ground-state properties, including optimized geometries and frontier orbital energies. Additionally, excited-state properties of monomers and donor–acceptor dimers are investigated using real-time time-dependent DFTB. Charge-transfer excitations are observed in the dimer systems, and the influence of alkyl side chains on the photoinduced charge-transfer process is investigated.
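The Bayesian-optimization loop behind such a parameter search can be sketched in a few lines. The example below is an illustration under stated assumptions, not the procedure used in the thesis: `objective` is a hypothetical one-dimensional stand-in for "error of a DFTB result relative to its DFT reference" as a function of a single atomic parameter, and the surrogate (a Gaussian process with an RBF kernel) and acquisition function (expected improvement) are common textbook choices that the abstract does not specify.

```python
import math


def objective(p):
    """Hypothetical error of a DFTB result vs. its reference for parameter p."""
    return (p - 3.2) ** 2


def rbf(a, b, length=1.0):
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_sub(low, b):
    """Solve low @ y = b."""
    y = []
    for i, b_i in enumerate(b):
        y.append((b_i - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def backward_sub(low, y):
    """Solve low.T @ x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_gp(xs, zs, noise=1e-4):
    """Factorise the kernel matrix and precompute the weights K^-1 z."""
    kmat = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
            for i, a in enumerate(xs)]
    low = cholesky(kmat)
    return low, backward_sub(low, forward_sub(low, zs))


def predict(xs, low, alpha, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(low, k_star)
    var = max(1.0 - sum(v_i * v_i for v_i in v), 1e-12)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement for a minimisation problem."""
    u = (best - mean) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2)))
    return (best - mean) * cdf + std * pdf


def bayes_opt(grid, start_idx, n_iter=8):
    tried = list(start_idx)
    ys = [objective(grid[i]) for i in tried]
    for _ in range(n_iter):
        # Standardise the observations before fitting the surrogate.
        mu = sum(ys) / len(ys)
        sd = math.sqrt(sum((y - mu) ** 2 for y in ys) / len(ys))
        zs = [(y - mu) / sd for y in ys]
        xs = [grid[i] for i in tried]
        low, alpha = fit_gp(xs, zs)
        best = min(zs)
        scores = {i: expected_improvement(*predict(xs, low, alpha, x), best)
                  for i, x in enumerate(grid) if i not in tried}
        nxt = max(scores, key=scores.get)     # most promising untried candidate
        tried.append(nxt)
        ys.append(objective(grid[nxt]))
    j = min(range(len(ys)), key=ys.__getitem__)
    return grid[tried[j]], ys[j]


GRID = [2.0 + 0.05 * i for i in range(81)]    # candidate parameter values
best_p, best_err = bayes_opt(GRID, [0, 40, 80])
print(f"best parameter = {best_p:.2f}, error = {best_err:.4f}")
```

The appeal of this kind of loop for parameterization work is that each objective evaluation can be expensive (here it would involve generating electronic terms and running benchmark calculations), so the surrogate is used to decide where the next evaluation is most worthwhile.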
Together, these two approaches show the potential of integrating ML with physically grounded computational methods, offering new pathways for developing efficient and accurate simulation tools for complex materials systems.
Keywords
DFTB; DFT; Machine learning; Method development; Material simulation; OPV
Institution
Department
Document type
Dissertation
Language
English
Files
Name
Integrating machine learning with density functional tight binding.pdf
Size
27.63 MB
Format
Adobe PDF
Checksum
(MD5):c0103394f93c14db19fc1d38db3f228d
