Chemistry - What to do with (large) imaginary frequencies for constrained minimum structures?

Solution 1:

The first thing I noticed is that you said:

The two gradient tolerances have been loosened slightly in order to converge the calculations. By using the default gradient tolerances, the calculations simply do not, I have found (especially the maximum gradient).

At first sight, the structure of your molecule should not lead to serious problems, but the above is a bad sign and makes the results unreliable (or at least doubtful). The constraints must be taken into account in the calculation of the vibrational frequencies for them to make sense.

Probably no one can give you a foolproof series of steps to follow. My suggestions are below.

What to do/check

  • If the comments are right and Orca does not take the fixed atoms into account when calculating the vibrational frequencies, then you need to change software for this task, because in that case the results are not meaningful. (This also answers the question "Is it "safe" to disregard the frequencies that mainly involved fixed atoms?") The simplest way to check this is to look at the number of degrees of freedom reported in (or deduced from) the output. I am referring specifically to vibrational degrees of freedom, because low-frequency modes may be treated like rotations. I doubt that Orca lacks this basic functionality; if it has it, I bet it is described in their user manual.

  • The steps you are taking are reasonable. My guess is that you may be far from a true minimum, in some unstable geometry, and that is why you have problems when the accuracy settings change. You said: "Should I converge my geometry tighter to come closer to the true minimum? ". Yes, that is always advisable for getting good frequencies. So I would use not only tighter geometry convergence criteria but also the TIGHT convergence option for the SCF part.

  • Along the same lines, you can start with a modern semi-empirical method, or even a simpler DFT functional, and then do the fine-tuning with your target DFT method.

  • I agree with the comments that you are not using the best basis set out there, but I do not think that this is the problem here.

  • If you want to rule out odd behavior, you can try another functional, but I do not think that this is the problem (B3LYP is known to perform well for compounds containing the elements involved here).
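
The degrees-of-freedom check from the first bullet can be sketched in a few lines of Python. This is only a minimal sketch, not something Orca provides: the function name and its arguments are hypothetical, and it assumes the usual counting rules, i.e. 3N - 6 vibrational modes for a free nonlinear molecule (3N - 5 if linear) and 3 modes per unfrozen atom when frozen atoms are excluded from the Hessian (which in turn assumes that enough atoms are fixed to pin overall translation and rotation). Programs differ in how they print modes, so treat the numbers as a guide for comparison with the output.

```python
def expected_vibrational_modes(n_atoms, n_frozen=0, linear=False):
    """Return the number of vibrational modes one would expect.

    Assumptions (not taken from any program's documentation):
    - no frozen atoms: 3N - 6 modes (3N - 5 for a linear molecule);
    - frozen atoms excluded from the Hessian: 3 modes per free atom,
      provided enough atoms are frozen to pin translation and rotation.
    """
    if n_atoms < 2:
        raise ValueError("need at least two atoms")
    if not 0 <= n_frozen < n_atoms:
        raise ValueError("n_frozen must be between 0 and n_atoms - 1")
    if n_frozen == 0:
        return 3 * n_atoms - (5 if linear else 6)
    return 3 * (n_atoms - n_frozen)


# Hypothetical example: 40 atoms, 4 of them held fixed.
print(expected_vibrational_modes(40))              # unconstrained: 114
print(expected_vibrational_modes(40, n_frozen=4))  # constrained:   108
```

If the output lists the unconstrained number of modes for a job with fixed atoms, the constraints were probably not considered in the frequency calculation.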

In brief: I would check whether Orca computes the frequencies correctly. If not, change the software. If it does, discard those results and try to converge the structure with cheaper methods (if you are short of computational resources) from different initial geometries. Store those results and start new geometry optimizations and vibrational frequency calculations from the converged structures. For this last step, use the tight convergence criteria for the SCF, normal convergence criteria for the geometry, and the target basis set (for the energy and its derivatives).

Good luck

Note:

Once I start computing reaction paths, I plan to try different functionals and basis sets to see if this give drastically different results.

Why do you expect that something like this would happen? You can save time by switching to a slightly larger basis set and a better functional once you have the B3LYP/6-31G(d,p) results. It should not take too much time.


EDIT DUE TO COMMENTS

This answer generated some discussion about whether or not to use B3LYP.

In that respect, my viewpoint is as follows:

  • B3LYP is not the most accurate functional (there should be no doubt about that).
  • It is reasonably good for geometries in systems like yours. Geometry errors due to a small basis set can be larger than those due to B3LYP, or at least of the same order.
  • While there are functionals that give better geometries, you must also consider how large a basis set those functionals need in order to be the better option. I mean, with 5$\zeta$ basis sets they may be better, but not necessarily with a small basis set (and maybe not even then).
  • The above discussion could be very long, but in any case you are using approximations whose effect is (in my opinion) much larger than the errors due to B3LYP, for example the permittivity value, the geometry of the constrained parts, etc.
  • Also, even if you could perform a full CI/CBS calculation, you should note that the simulated system is different from the real system, so extremely exact results for the simulated system may not improve your knowledge about reality.
  • So, if the functionals are accurate enough, what is important here is obtaining properly converged results for the chosen functional/basis set. Functionals can be more or less problematic. B3LYP should not be problematic for your system. My guess as to the cause is what I said in the answer above.
  • Because of the above, I think that B3LYP is reasonable and sufficient for your case.

Solution 2:

Large imaginary frequencies, when you are looking for a local minimum, usually indicate that your guessed starting geometry is not good. Magnitudes larger than $10i~\pu{cm-1}$ should only be observed for transition states, and such a structure is only meaningful if it has exactly one of these modes.
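
As a rough illustration of this rule of thumb, the following Python sketch sorts a list of frequencies into the cases described above. It assumes the common output convention that imaginary modes are printed as negative numbers; the function name is hypothetical, the cutoff of 10 cm^-1 is simply the value quoted above, and the example frequencies are made up.

```python
def classify_stationary_point(frequencies_cm1, noise_threshold=10.0):
    """Classify a stationary point from its vibrational frequencies.

    frequencies_cm1: frequencies in cm^-1, with imaginary modes given
        as negative numbers (assumed output convention).
    noise_threshold: magnitude in cm^-1 below which an imaginary mode
        is treated as possible numerical noise (value from the text).
    """
    imaginary = [f for f in frequencies_cm1 if f < 0.0]
    large = [f for f in imaginary if abs(f) > noise_threshold]
    if not imaginary:
        return "minimum"
    if not large:
        return "minimum (small imaginary modes, possibly noise)"
    if len(large) == 1:
        return "first-order saddle point (transition-state candidate)"
    return "higher-order saddle point (neither minimum nor TS)"


# Made-up frequencies, for illustration only.
print(classify_stationary_point([35.2, 120.0, 1650.3]))
print(classify_stationary_point([-450.0, -120.0, 35.2]))
```

For a constrained minimum search, anything other than the first two outcomes suggests going back to the geometry rather than interpreting the frequencies.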

When you are fixing atom coordinates, you are (in a practical sense) employing an external potential. If the program you are using does not account for that constraint, it is not suited for these kinds of optimisations (see the note at the end).
At the very least, your thermochemistry will be wrong. If you are aiming for anything more than a rough guesstimate, this is not good enough (and if you are content with that, why bother with frequencies?).

In general there are a few things you can do to improve convergence. One of the most important, and often underrated, is the SCF convergence. If you have a sloppily converged SCF, then the gradients will also be sloppy, which leads to poor performance of the geometry optimiser. Tighten it to 1E-7 to achieve better results.
Numerical noise is often a problem in calculations; make sure that you are using a tight (tighter, the tightest) grid for DFT calculations. In principle the same as above applies: if it is too coarse, your SCF will converge sloppily, etc.
This should at least let you converge the geometry with the default cutoffs (gradient, displacement). If you then still have imaginary frequencies, you should use path-following algorithms to get rid of them; if they are not numerical noise, this should not be a problem.
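
A much cruder stand-in for a path-following algorithm is to nudge the geometry along the imaginary mode and reoptimise from there. The sketch below shows only that displacement step, and it is not a path-following method itself. The function name, the step size, and the idea of zeroing the mode on frozen atoms are illustrative assumptions; it also assumes the mode is available as Cartesian displacements in the same units as the coordinates.

```python
import math


def displace_along_mode(coords, mode, step=0.1, frozen=()):
    """Shift a geometry along a normal mode, leaving frozen atoms alone.

    coords, mode: lists of (x, y, z) tuples, one entry per atom.
    step: total displacement length, in the units of coords (assumed).
    frozen: indices of atoms whose positions must not change.
    """
    if len(coords) != len(mode):
        raise ValueError("coords and mode must have the same length")
    frozen = set(frozen)
    # Zero the mode on frozen atoms so the constraints are preserved.
    masked = [(0.0, 0.0, 0.0) if i in frozen else tuple(vec)
              for i, vec in enumerate(mode)]
    norm = math.sqrt(sum(c * c for vec in masked for c in vec))
    if norm == 0.0:
        raise ValueError("mode has no components on free atoms")
    return [tuple(x + step * d / norm for x, d in zip(atom, vec))
            for atom, vec in zip(coords, masked)]


# Made-up two-atom example; atom 0 is held fixed.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
imag_mode = [(1.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(displace_along_mode(start, imag_mode, step=0.5, frozen=[0]))
```

Trying both `step` and `-step` and reoptimising each gives two candidate structures; the lower-energy one is the better starting point.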

Use dispersion corrections. There is no excuse for not using them (except missing parameters, but then use a different method).

In my experience, solvent models can drastically influence convergence; in many cases they lead to oscillations. Try optimising without the solvent model first, then turn it back on later.

You have stated that this is part of an enzyme. My first guess on seeing these large imaginary modes was that you chose to cut the moieties at the wrong position, substituting more demanding groups with hydrogen atoms. While this is common practice for many computational models, if you have problems with the molecular structure, this is a sign that you cut away something important. Increase your cut-out; don't replace with hydrogen, but with methyl instead (if in doubt, fix those, too).

Leaving something out of the calculation can lead to all sorts of artefacts. You should have a look at a methodology that can describe the system as a whole. There are various applications developed just for this case. Something along the lines of DFTB+ might do wonders for your system. Another possibility is using QM/MM, and ORCA is capable of that.
If you want to describe reactions in solution, you will need to look at these options sooner rather than later. Aqueous media are especially challenging.

B3LYP is a terrible functional. This is probably more often true than not. It is not only one of the slowest performers (for hybrid functionals), it has also been shown to produce unreliable results (just search B3LYP failure). Starting with a hybrid functional is overkill for this system size in any case. Always use a non-hybrid functional (BP86-D3(BJ), B97D3, M06-L, etc.) first; it saves a lot of time, resources, and brain cells. Work your way up Jacob's ladder, not down.
Use def2- basis sets. Thanks to the close cooperation with Ahlrichs' group, many of the great features developed there (re: TURBOMOLE) are readily implemented in ORCA. This includes the extremely well-performing basis sets, which already come with specially prepared auxiliary basis sets for density fitting (RI).

TL;DR: Tighten SCF; tighten grid; use dispersion; use a cheaper functional; use better basis; increase system size (you can fix adjacent moieties).
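
To make the TL;DR concrete, here is a small Python helper that assembles an ORCA-style input from these choices. It is only a sketch under stated assumptions: the helper and its parameters are hypothetical, the keywords (`BP86`, `D3BJ`, `def2-SVP`, `TightSCF`, `Opt`, `Freq`) and the Cartesian constraint syntax `{ C n C }` are written from ORCA's usual input conventions and should be checked against the manual for your version, and the grid keyword is left out because its name depends on the ORCA version. Whether the frequency job then honours the constraints is exactly the point that needs checking.

```python
def build_orca_input(xyz_lines, frozen_atoms=(), functional="BP86",
                     basis="def2-SVP", dispersion="D3BJ",
                     charge=0, multiplicity=1):
    """Assemble an ORCA-style input string (illustrative sketch only).

    xyz_lines: atom lines such as "O 0.000 0.000 0.000".
    frozen_atoms: 0-based indices of atoms to fix in Cartesian space.
    """
    keywords = ["!", functional, dispersion, basis,
                "TightSCF", "Opt", "Freq"]
    lines = [" ".join(keywords)]
    if frozen_atoms:
        # Cartesian constraints block (syntax assumed, check the manual).
        lines.append("%geom")
        lines.append("  Constraints")
        for index in frozen_atoms:
            lines.append("    { C %d C }" % index)
        lines.append("  end")
        lines.append("end")
    lines.append("* xyz %d %d" % (charge, multiplicity))
    lines.extend(xyz_lines)
    lines.append("*")
    return "\n".join(lines) + "\n"


# Placeholder geometry (water), used only to show the layout.
water = ["O  0.000  0.000  0.000",
         "H  0.000  0.757  0.587",
         "H  0.000 -0.757  0.587"]
print(build_orca_input(water, frozen_atoms=[0]))
```

Starting from a cheap non-hybrid setup like this and only later swapping `functional` and `basis` for the target level follows the "up the ladder" advice above.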


Note: If the program does not correct for these constraints, then the derived gradients won't be good enough to achieve convergence.