How to catch proof errors during self-study?

You can post some (not all) proofs here with the proof-verification tag. It would be helpful if you flagged the few particular places where you were in doubt.

If an old professor of yours is willing to give you some of their time now and then, go for it.

One suggestion: rather than learning the basics from the bottom up, start with something you really want to know for its own sake and work backwards through the prerequisites as necessary. You will probably discover that you need a lot more linear algebra than you thought, and a lot less functional analysis.

Finally, six years is a long time to study all alone. Good grad schools do support their students. Consider applying sooner.


To answer your first question, about how you can catch errors during self-study: I think you need to have others check your proofs. The history of mathematics contains numerous alleged proofs by well-known mathematicians that were later shown to be incomplete or wrong. So I think you need to find a community of researchers, online or not, with whom you can exchange ideas.

As it happens, these days I talk to many people about my upcoming B.Sc. thesis, which is going to be about machine learning. What follows is what my professors and more senior students have told me, and I don't claim that it's the best possible approach. So please keep that in mind.

I think the starting point is to get a copy of The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman. As a more advanced text to supplement it, you can use Pattern Recognition and Machine Learning by C. Bishop. You probably already know this, or may even have better suggestions for this part.

After reading these two books, you can read Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Once you start reading it, you will be surprised to see how little you need to know to get through the chapters.

You need to take a course in stochastic processes. Engineering students take this course too. If you can, take it from the engineering department, because they usually avoid measure theory and, depending on the lecturer, you may also learn some things about signals and systems along the way.

If you want to take the rigorous path, you will need to learn measure theory first. Then you'll be able to understand stochastic calculus rigorously. Last semester, I took a course in stochastic processes from the computer engineering department. You may be surprised to learn that most computer engineers know little about the rigorous treatment of the material they work with every day. A book that engineers use for a more or less mathematical treatment is Gallager's Stochastic Processes, which is a terrible book in my opinion. It doesn't satisfy mathematicians, nor does it explain the beautiful intuitions that engineering sometimes offers.

One advantage of the rigorous path is that you also get to learn about other fields, such as financial mathematics. The rigorous approach is helpful when you want to define things like conditional expectation and the Radon-Nikodym derivative. But after all, I think it's not wise to spend too much time on 'abstractions'.
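To give a flavour of what the rigorous definition looks like, here is the standard measure-theoretic statement, written out as a sketch. The symbols ($\Omega$, $\mathcal{F}$, $\mathcal{G}$, $P$, $X$, $Y$, $\nu$) are my own notation and do not come from any of the books mentioned here.

```latex
% Setting: a probability space (\Omega, \mathcal{F}, P), an integrable random
% variable X, and a sub-sigma-algebra \mathcal{G} \subseteq \mathcal{F}.
%
% The conditional expectation E[X \mid \mathcal{G}] is any \mathcal{G}-measurable,
% integrable random variable Y such that
\[
  \int_G Y \, dP \;=\; \int_G X \, dP
  \qquad \text{for all } G \in \mathcal{G}.
\]
% Existence comes from the Radon-Nikodym theorem: the signed measure
\[
  \nu(G) \;=\; \int_G X \, dP, \qquad G \in \mathcal{G},
\]
% is absolutely continuous with respect to P restricted to \mathcal{G}, so one can take
\[
  E[X \mid \mathcal{G}] \;=\; \frac{d\nu}{d\,P|_{\mathcal{G}}},
\]
% which is unique up to P-almost-sure equality.
```

The engineering-style treatment usually works with densities and conditional densities instead, which is enough for most of the machine learning material above.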

You need to spend a lot of time on programming. Learn Python or R, preferably. You need to learn about Markov chain Monte Carlo methods. You may also need to learn about the calculus of variations at some point. Overall, the list of things that you can learn is endless. You may like to learn differential geometry in order to understand information geometry, which is more theoretical than practical. Some knowledge of physics, such as thermodynamics, can also be helpful when you study things like the Boltzmann machine. Again, I would like to emphasize that many of the recent advances in neural networks and deep learning do not really require advanced (abstract) mathematics. Some linear algebra, a good understanding of probability theory, some experience with matrix calculations as in The Matrix Cookbook, and the kind of creativity that engineers have are enough to start your journey. Once you have started and chosen your final destination, you will acquire the knowledge you need along the way.
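As a small illustration of how little machinery a first Markov chain Monte Carlo experiment needs, here is a minimal sketch of a random-walk Metropolis sampler using only the Python standard library. The target (a standard normal), the function names `log_target` and `metropolis`, and the parameters `step`, `x0` and `seed` are all illustrative choices of mine, not taken from any of the books above.

```python
import math
import random


def log_target(x):
    # Unnormalized log-density of a standard normal (illustrative target).
    return -0.5 * x * x


def metropolis(log_density, n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis with a symmetric uniform proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Propose a move drawn uniformly from [x - step, x + step].
        proposal = x + rng.uniform(-step, step)
        log_accept = log_density(proposal) - log_density(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = proposal
        # On rejection the chain stays put, and the current state is recorded again.
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis(log_target, 50_000)
    kept = draws[5_000:]  # discard an initial burn-in segment
    mean = sum(kept) / len(kept)
    var = sum((d - mean) ** 2 for d in kept) / len(kept)
    print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

For this target the printed mean and variance should land roughly near 0 and 1. Changing `step` and watching how the estimates and the acceptance behaviour change is a good first exercise before moving on to the samplers in a library.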