From Basics to Breakthroughs: How Fresh PhDs in Machine Learning Contribute to World-Class Research

There isn't a need for a lengthy explanation. Two self-explanatory words answer your question: Moore's law.

The Power of Moore's Law and Rapid Knowledge Expansion

The same dynamic holds for many intellectual ventures. In machine learning research, rapid advancements driven by Moore's law and the exponential growth of knowledge mean that recent PhD graduates have access to the latest tools, techniques, and datasets. This accelerates their ability to contribute to cutting-edge research.

A Bottom-Up Approach: The Limits of Deep Understanding

In high school and my early years of college, I took a very bottom-up view of understanding concepts, whether it was stochastic processes, abstract or linear algebra, or algorithms and data structures. I would not accept conclusions at face value and often spent days trying to work through proofs and exploring tangentially related fields, only to produce very few tangible results.

For instance, after spending two weeks glancing through papers on cryptographic hash functions, I encountered Ed25519 keys in the context of pushing code to a GitLab repository. Ed25519 is a signature scheme built on elliptic-curve cryptography, yet issuing a few commands and pasting the public key was all it took to authenticate; the practical simplicity overshadowed the theoretical depth.
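That workflow can be sketched in a few OpenSSH commands. The email address and paths below are placeholders, and I use a scratch directory so nothing existing is overwritten; in practice you would paste the printed public key into GitLab's SSH Keys settings page:

```shell
# Work in a scratch directory so we don't clobber any existing keys
keydir="$(mktemp -d)"

# Generate an Ed25519 key pair (no passphrase here, for brevity)
ssh-keygen -t ed25519 -C "you@example.com" -f "$keydir/id_ed25519" -N ""

# Print the public key; paste this into GitLab under Preferences > SSH Keys
cat "$keydir/id_ed25519.pub"
```

No understanding of elliptic-curve mathematics is required at any step; the key format and the paste-into-settings flow are the entire interface.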

While this bottom-up approach can build a strong understanding of the basics, it has distinct disadvantages if the goal is to move beyond mere understanding. You must absorb a huge amount before you can contribute or build anything, and given the explosion of human knowledge, you might not be able to fully master a discipline in a lifetime. In practice, this can mean exhausted research grants in academia or termination in corporate settings.

Embracing Abstractions: A Path Forward

Look, a cool machine learning algorithm might seem complex if you see it in its most generalized and elegant form, often built on concepts from topology and metric spaces. However, you don't need to read a topology textbook to leverage the algorithm in your projects. Understand its limitations, but realize that with effort you can discover new use cases or extensions that neither the authors nor anyone else has considered.

It took me a while to change my mindset and adopt a more pragmatic approach. Now I understand that it's crucial to grasp the basics and read classic survey papers to get a feel for the field's evolution, but you don't need to rebuild everything from first principles in your conceptual framework. Be comfortable taking existing results as a given and focus on applying them rather than delving too deeply into their origins. In a developer's terminology, think of each established theorem as another layer of abstraction that hides complexity. It's okay to work with the abstraction if you are comfortable applying it and understand its use cases.

Abstractions in Machine Learning Research

In the field of machine/deep learning, the abstractions you will use generally come from two fronts:

1. The conceptual framework of why an algorithm works: If you delve deep enough, you'll eventually need to understand tensors and the topology of metric spaces. If you truly want to explore these, you might need to do a PhD in a mathematics department at an institution like Princeton or Oxford. However, if your goal is to apply the algorithm at a practical level, you can do so without mastering these underlying concepts.

2. How the algorithm works with your code or library: If you keep digging in the other direction, you'll eventually hit low-level hardware architecture. Again, if you are truly committed to exploring this, you should pursue a PhD in hardware or VLSI, not a typical machine learning program.
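A minimal sketch of working at the first level of abstraction, assuming scikit-learn and NumPy are installed: we apply a k-nearest-neighbors classifier to toy data without deriving any of its metric-space theory. The library's fit/predict interface hides the distance computations, index structures, and tie-breaking rules entirely.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Two well-separated toy clusters in a 2-D feature space
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# The library is the abstraction layer: we state *what* we want
# (3-nearest-neighbor classification), not *how* it is computed
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Query points near each cluster center are assigned to that cluster
print(clf.predict([[0.0, 0.0], [5.0, 5.0]]))  # → [0 1]
```

Knowing when this abstraction breaks down (high-dimensional data, non-Euclidean similarity) matters far more for applied work than re-deriving the underlying theory.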

As a researcher in the highly mathematical and commercial field of machine learning, you must be comfortable working with abstractions and unknowns on both fronts, standing on the shoulders of the giants who came before you. Changing your point of view transforms the past half century of progress from a barrier into an asset you can utilize.

Conclusion

The key to contributing to world-class machine learning research as a fresh PhD graduate lies in embracing abstractions and being comfortable with applying them without needing to fully understand all the underlying complexities. This pragmatic approach allows you to leverage the latest tools, techniques, and datasets, making significant contributions to the field.