Chemistry - How does optimizing the molecular orbital coefficients in CASSCF improve its multi-reference capabilities?
Solution 1:
I think you may be confusing how dynamical and static correlation are treated by different methods. Also, CASSCF by itself is not a multi-reference method.
CI in general is able to describe both dynamical and static correlation (FCI does, at least). What treats dynamical correlation (but not static) is a truncation scheme based on the degree of excitation, for example CISD or, in a similar way, CCSD. These are called single-reference methods because they generate configurations from a single reference wave function.
To deal with static correlation you can, for example, use the active-space approach (CASCI) or select configurations by hand (since you might not need too many here).
The term multi-reference refers to treating both dynamic and static correlation by generating excitations starting from multiple configurations. One first does a calculation for static correlation (usually MCSCF or CASSCF). Then a second calculation adds dynamical correlation by using the configurations from the first calculation (all or only the most important ones) as multiple reference points (e.g. MRCI-SD or MRCC-SD).
Optimizing the MO coefficients in an MCSCF calculation (e.g. CASSCF) is about optimizing the one-electron basis for your specific CI problem. Since you include more than just one configuration, the HF orbitals are not optimal anymore. In principle you could combine MCSCF with CISD as well, but CISD usually generates too many configurations, which makes such an approach infeasible.
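As a rough sketch of what "optimizing both sets of coefficients" means, one common textbook way to write the MCSCF wave function is shown here. The symbols are my own choice of notation, not something from the question: $C_I$ are the CI coefficients, $|I\rangle$ the configurations, $\kappa_{pq}$ the orbital-rotation parameters, and $\hat{E}_{pq}$ the spin-summed one-electron excitation operators.

$$|\Psi(\boldsymbol{\kappa}, \mathbf{C})\rangle = e^{-\hat{\kappa}} \sum_I C_I\, |I\rangle, \qquad \hat{\kappa} = \sum_{p>q} \kappa_{pq}\,\bigl(\hat{E}_{pq} - \hat{E}_{qp}\bigr)$$

$$E_\text{MCSCF} = \min_{\boldsymbol{\kappa},\,\mathbf{C}} \frac{\langle \Psi | \hat{H} | \Psi \rangle}{\langle \Psi | \Psi \rangle}$$

In this notation, CASCI corresponds to keeping $\boldsymbol{\kappa} = 0$ and minimizing over $\mathbf{C}$ only, while Hartree-Fock corresponds to a single term in the sum and minimizing over $\boldsymbol{\kappa}$ only.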
TL;DR
The optimization of the orbitals in MCSCF does not directly improve the multi-reference capabilities. But it improves the description of static correlation, so you get a better starting point for your multi-reference calculation.
Maybe this explains better the need to optimize both sets of coefficients:
[...] The presence of several important configurations poses a difficult challenge for ab initio electronic-structure theory. The single-configuration Hartree-Fock approximation, by its very construction, is incapable of representing systems dominated by several configurations. By the same argument, methods designed to improve on the Hartree-Fock description by taking into account the effects of dynamical correlation, such as the coupled-cluster and Møller-Plesset methods, are also not suitable for such systems. Furthermore, to carry out a CI calculation 'on top of the Hartree-Fock calculation' is problematic as well since the Hartree-Fock model is inappropriate as an orbital generator: the orbitals generated self-consistently in the field of a single electronic configuration may have little or no relevance to a multiconfigurational system.
An obvious solution to the multiconfigurational problem is to carry out a CI calculation where the orbitals are variationally optimized simultaneously with the coefficients of the electronic configurations, thereby ensuring that the orbitals employed in the wave function are optimal for the problem at hand and do not introduce a bias (towards a particular configuration) in the calculations. This approach is referred to as the multiconfigurational self-consistent field (MCSCF) method. [...] it may be used either as a wave function in its own right (for a qualitative description of the electronic system) or as an orbital generator for more elaborate treatments of the electronic structure.
Taken from the introduction to the MCSCF chapter in T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic-Structure Theory. New York: Wiley, 2000.
Solution 2:
The distinction between static and dynamic correlation is not well-defined.[1] The distinction is only sensible with respect to a single-particle picture, i.e. viewing the many-electron wavefunction as built up from single electrons. Let's start with some notes about configuration interaction (CI). The idea of configuration interaction is to express the wavefunction as a linear combination of Slater determinants (simple wavefunctions in which each orbital is doubly occupied, singly occupied, or unoccupied). Most CI methods require that all of the Slater determinants are built from the same orbitals (although NOCI, or non-orthogonal CI, expansions are sometimes used), so the different configurations differ only in the occupation of their orbitals. This gives two sets of parameters to describe the CI wavefunction: the CI expansion coefficients and the orbitals in the Slater determinants. In methods like CIS, CISD, CASCI, etc. the orbitals are not optimized self-consistently. In so-called MCSCF (multi-configuration self-consistent field) methods, the orbitals are optimized as well (this is computationally relatively expensive, and the optimization is much more difficult than with standard SCF methods).
The biggest CI wavefunction one can construct (in a given basis set) expands the wavefunction in all possible configurations that can be obtained from all possible occupations of all of the orbitals. This is called Full CI, and it is exact (in a given basis) in that there is no wavefunction lower in energy than the FCI wavefunction. The FCI wavefunction is also invariant to orbital changes: because all possible configurations are considered, changing the orbitals just changes the expansion coefficients, but not the total energy (or any observable derived from the density matrices). FCI is too expensive to calculate for almost all molecules, so we make approximations. One such approximation is the family of active-space methods (such as CAS). In CAS, a set of orbitals and electrons is termed "active" and the FCI problem is solved within these active orbitals, while the remaining orbitals are treated at the mean-field level (i.e. as in HF).
These methods account for static correlation, usually described as the correlation energy due to low-lying excited states, as occurs in bond breaking and in some transition-metal systems. The main hallmark of static correlation is natural orbital occupation numbers that differ significantly from 0, 1, or 2. Dynamic correlation is the "other" part of the correlation, usually described as correlation due to instantaneous fluctuations of the electrons. MP2 and coupled-cluster are ways of recovering that dynamic correlation for a single reference. Standard multi-configurational methods like CASCI/CASSCF principally recover static correlation, but unless the active space contains all of the electrons and orbitals, they will not recover all of the dynamic correlation without other tricks. Enter the CASPT2/NEVPT2 methods, which seek to capture the missing dynamic correlation via perturbation theory on top of the CASSCF wavefunction, or the MR-CI methods, which bring in more determinants by considering excitations out of some multi-reference expansion (e.g. one obtained from a CASSCF calculation).[2]
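To make the occupation-number hallmark concrete, here is a minimal sketch using a two-site Hubbard model as a stand-in for minimal-basis $\ce{H2}$. The model, the parameters `t` and `U`, and the function names are my own assumptions for illustration, not real molecular integrals. Increasing `U/t` plays the role of stretching the bond: the natural occupation numbers move from roughly (2, 0) towards (1, 1).

```python
import numpy as np

def dimer_ground_state(t, U):
    """Exact ground state of a two-site Hubbard model with one up and one down electron.

    The returned coefficient matrix c[p, q] is the amplitude for the spin-up
    electron on site p and the spin-down electron on site q.
    """
    h = np.array([[0.0, -t], [-t, 0.0]])   # one-electron hopping matrix
    eye = np.eye(2)
    # One-electron part acting on the (up site, down site) product basis.
    H = np.kron(h, eye) + np.kron(eye, h)
    H[0, 0] += U                           # both electrons on site a
    H[3, 3] += U                           # both electrons on site b
    energies, vectors = np.linalg.eigh(H)
    return energies[0], vectors[:, 0].reshape(2, 2)

def natural_occupations(c):
    """Eigenvalues of the spin-summed one-particle density matrix, largest first."""
    gamma = c @ c.T + c.T @ c              # up-spin part + down-spin part
    return np.sort(np.linalg.eigvalsh(gamma))[::-1]

# Hypothetical parameter scan: larger U/t mimics a more stretched bond.
for U in (0.0, 2.0, 8.0, 32.0):
    energy, c = dimer_ground_state(1.0, U)
    print(f"U/t = {U:5.1f}  E = {energy:8.4f}  occupations = {natural_occupations(c)}")
```

The occupation numbers come from the density matrix, so they do not depend on which orthonormal orbitals were used to expand the wavefunction.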
On to your question about what the orbital optimization step accomplishes in CASSCF. The orbitals that are fed into a CASSCF calculation to define the active space are often single-reference orbitals. They represent some mean-field approximation to the wavefunction, which may or may not be any good depending on the system. The CAS procedure properly allows fractional orbital occupation numbers, so the orbital optimization adjusts the active space to select the best orbitals for this less approximate wavefunction, which has the flexibility needed to describe the system correctly.
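A toy illustration of that point, under loud assumptions: a three-site Hubbard-type model with made-up parameters, two electrons (one up, one down), a CAS(2,2)-like active space, and a brute-force grid scan over two rotation angles standing in for a real orbital optimizer. This is not how production CASSCF codes work, and all names are hypothetical. It only shows that the active-space energy depends on the orbitals and that rotating them can bring it closer to the full CI value.

```python
import numpy as np

def two_electron_hamiltonian(h, U):
    """Hamiltonian for one up and one down electron with on-site repulsion U."""
    n = h.shape[0]
    eye = np.eye(n)
    H = np.kron(h, eye) + np.kron(eye, h)
    for p in range(n):
        H[p * n + p, p * n + p] += U       # both electrons on the same site
    return H

def cas_energy(H, C_active):
    """Lowest energy when both electrons are restricted to the active orbitals."""
    T = np.kron(C_active, C_active)        # product basis of active orbitals
    return np.linalg.eigvalsh(T.T @ H @ T)[0]

def givens(n, i, j, theta):
    """Rotation by theta mixing orbitals i and j."""
    R = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    R[i, i] = c
    R[j, j] = c
    R[i, j] = -s
    R[j, i] = s
    return R

# Made-up model parameters (assumption, not fitted to any molecule).
h = np.array([[0.0, -1.0, 0.0],
              [-1.0, 0.5, -1.0],
              [0.0, -1.0, 1.0]])
H = two_electron_hamiltonian(h, U=4.0)

# Starting orbitals: eigenvectors of h, a crude stand-in for mean-field orbitals.
_, C0 = np.linalg.eigh(h)

e_fci = np.linalg.eigvalsh(H)[0]           # all three orbitals active
e_casci = cas_energy(H, C0[:, :2])         # fixed orbitals, lowest two active

# Scan rotations that mix each active orbital with the inactive third one.
angles = np.linspace(-np.pi / 2, np.pi / 2, 61)
e_casscf_like = min(
    cas_energy(H, (C0 @ givens(3, 0, 2, a) @ givens(3, 1, 2, b))[:, :2])
    for a in angles
    for b in angles
)

print("CASCI (fixed orbitals):     ", e_casci)
print("after orbital rotation scan:", e_casscf_like)
print("full CI:                    ", e_fci)
```

By the variational principle the scanned energy can never fall below the full CI energy, and because the scan includes the unrotated orbitals (up to rounding) it can never be meaningfully worse than the fixed-orbital result.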
Another important note: do not look at the configurations' coefficient weights to determine if the system has a lot of static correlation or not. Changing the orbitals will change the coefficients dramatically. Consider the simplest case, $\ce{H2}$. If the orbitals are the $1s$ on $\ce{H}_a$ and the $1s$ on $\ce{H}_b$, the CAS(2,2) wavefunction will have two configurations of equal weight (where the electrons are on $\ce{H}_a$ or $\ce{H}_b$). If you switch to the orbitals being the $\sigma$ and $\sigma^*$, the CAS(2,2) energy will be identical but the two configurations will now have very unequal weight (it will be almost 100% doubly occupied $\sigma$ and very little of the $\sigma^*$). So you can't just look at the weights.
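The same effect can be seen numerically in a small sketch. I again use a two-site Hubbard model with hypothetical parameters `t` and `U` as a stand-in for minimal-basis $\ce{H2}$ (an assumption, not real $\ce{H2}$ integrals). The Hamiltonian is diagonalized once in the localized site basis and once in the $\sigma$/$\sigma^*$ basis: the energies agree, while the configuration weights look completely different. In this model the localized basis spreads the weight over four determinants (two covalent and two ionic) rather than two.

```python
import numpy as np

t, U = 1.0, 1.0                            # hypothetical model parameters

# Hamiltonian in the localized basis; index = 2 * (up site) + (down site).
h = np.array([[0.0, -t], [-t, 0.0]])
eye = np.eye(2)
H_site = np.kron(h, eye) + np.kron(eye, h)
H_site[0, 0] += U                          # both electrons on site a
H_site[3, 3] += U                          # both electrons on site b

# Orthogonal change of orbitals: columns are sigma and sigma*.
C = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
T = np.kron(C, C)
H_mo = T.T @ H_site @ T                    # same operator, sigma/sigma* basis

def ground_state(H):
    """Return the lowest energy and the squared CI coefficients (weights)."""
    energies, vectors = np.linalg.eigh(H)
    return energies[0], vectors[:, 0] ** 2

e_site, w_site = ground_state(H_site)
e_mo, w_mo = ground_state(H_mo)

site_labels = ["a(up) a(dn)", "a(up) b(dn)", "b(up) a(dn)", "b(up) b(dn)"]
mo_labels = ["sigma sigma", "sigma sigma*", "sigma* sigma", "sigma* sigma*"]

print("energy, localized basis:   ", e_site)
print("energy, sigma/sigma* basis:", e_mo)
for label, w in zip(site_labels, w_site):
    print(f"  {label:14s} weight = {w:.3f}")
for label, w in zip(mo_labels, w_mo):
    print(f"  {label:14s} weight = {w:.3f}")
```

The weights are a property of the orbital choice, not of the state itself, which is why occupation numbers of the natural orbitals are the safer diagnostic.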
[1] Ramos-Cordoba, E.; Salvador, P.; Matito, E. Separation of dynamic and nondynamic correlation. Phys. Chem. Chem. Phys., 2016, 18, 24015-24023. DOI: 10.1039/C6CP03072F
[2] Szalay, P. G.; Müller, T.; Gidofalvi, G.; Lischka, H.; Shepard, R. Multiconfiguration Self-Consistent Field and Multireference Configuration Interaction Methods and Applications. Chem. Rev., 2012, 112 (1), 108-181. DOI: 10.1021/cr200137a