June 17, 2020
Over the years, a number of friends have asked me how they can get started in NLP and ML research. While I am no expert in CS curriculum design, I made this transition myself coming out of college (I majored in ECE, with a focus on computer hardware and high-power energy systems). This guide is a summary of how to get started in NLP/ML research. Following this guide will not make you an expert - that would require a formal education and years of practice. Rather, it aims to help you acquire enough experience to quickly pursue work in this area (e.g. working with a research lab).
This guide is for someone who wants to pursue NLP/ML research, but does not have the prerequisite experience. For example:
This guide aims to get you from knowing linear algebra and programming to being productive on a research project. The fastest way to be productive is to contribute code. At the time of writing, the majority of NLP/ML code is written in Python. If you know how to program but are not proficient in Python, I would recommend going through Google's Python Class, which is aimed at people who already know how to code and is taught by Nick Parlante.
At this point, we can go one of two ways. First, we learn neural networks, which we apply to toy tasks such as MNIST to get a sense of how to implement them and how they work. Second, we learn a problem domain such as NLP, identify its subproblems, and discuss how to solve these subproblems. I tend to favor the second approach, because it emphasizes the problem instead of the method du jour.
For NLP, Dan Jurafsky's book Speech and Language Processing provides a nice overview of the important problems. I suggest that you read only the chapters essential to your problem of interest. For example, if you would like to work in dialogue research, then you can probably skip chapters such as information extraction, word sense, etc. These are important topics, but you should quickly get to a point where you can make valuable contributions to the project instead of initially attempting to understand all sub-areas of NLP. Some important chapters regardless of your research topic include:
I encourage you to work through some of the problems in this book, which will test your understanding of the chapters. In particular, you can try implementing the algorithms discussed in the book in Python and running them on real text data.
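As one illustration of this kind of exercise, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing, a topic covered in the book's n-gram chapter. The three-sentence corpus and the function names are made up for this example; on real text you would read and tokenize a corpus instead.

```python
from collections import Counter


def count_ngrams(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is wrapped in <s> ... </s> boundary markers.
    """
    unigram_counts, bigram_counts = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigram_counts.update(padded)
        bigram_counts.update(zip(padded, padded[1:]))
    return unigram_counts, bigram_counts


def bigram_prob(prev, word, unigram_counts, bigram_counts, vocab):
    """Estimate P(word | prev) with add-one smoothing."""
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + len(vocab))


# Toy corpus, invented for illustration only.
corpus = [
    "i like green eggs".split(),
    "i like ham".split(),
    "sam likes ham".split(),
]
unigrams, bigrams = count_ngrams(corpus)

# Words that can be predicted: everything except the start marker.
vocab = sorted(w for w in unigrams if w != "<s>")

p_like = bigram_prob("i", "like", unigrams, bigrams, vocab)  # seen bigram
p_ham = bigram_prob("i", "ham", unigrams, bigrams, vocab)    # unseen bigram
print(f"P(like | i) = {p_like:.2f}, P(ham | i) = {p_ham:.2f}")
```

The smoothed probabilities for a given context still sum to one over the vocabulary, which is a useful sanity check when you write this kind of code yourself.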
PyTorch is currently the most popular ML framework in academic research due to its simplicity and flexibility. Once you understand the basics of neural networks (e.g. layers, backpropagation), you can treat PyTorch as an automation tool that performs automatic differentiation. Deep Learning with PyTorch: A 60 Minute Blitz provides a quick introduction to PyTorch. You can proceed with this section in parallel with the NLP section.
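To make the "automatic differentiation" point concrete, here is a small sketch that compares gradients computed by PyTorch with gradients worked out by hand. The one-parameter linear model and all numeric values are arbitrary choices for illustration.

```python
import torch

# A tiny "model": prediction = w * x + b, scored with a squared-error loss.
x = torch.tensor(2.0)
target = torch.tensor(7.0)
w = torch.tensor(1.5, requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)

prediction = w * x + b
loss = (prediction - target) ** 2

# Backpropagation: PyTorch fills in w.grad and b.grad for us.
loss.backward()

# The same gradients derived by hand with the chain rule:
#   dloss/dw = 2 * (prediction - target) * x
#   dloss/db = 2 * (prediction - target)
error = prediction.item() - target.item()
manual_grad_w = 2 * error * x.item()
manual_grad_b = 2 * error

print("autograd:", w.grad.item(), b.grad.item())
print("by hand: ", manual_grad_w, manual_grad_b)
```

For a real network the hand derivation quickly becomes tedious, which is exactly the work that `loss.backward()` automates.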
Now you should be able to conduct your own experiments. Pick some problems of interest and try to write up a model for them in PyTorch. Here are some example tasks that should not require an expensive GPU:
The links above point directly to raw data. There are many frameworks available on top of PyTorch. Should you use them? Generally, I recommend against using other people's frameworks when learning, because it is helpful to understand the details. For example:
By coding these details, you will gain intuition on why things work and where things go wrong. As a result, you should always write your own preprocessing and your own training loop.
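As a rough sketch of what writing your own preprocessing and training loop can look like, the example below builds a vocabulary by hand, pads the inputs, and trains a small bag-of-embeddings classifier. The four labeled sentences, the model, and the hyperparameters are all made up for illustration; a real experiment would also need a held-out evaluation set.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy, invented data: (text, label) with 1 = positive, 0 = negative.
examples = [
    ("a great and fun movie", 1),
    ("great acting and a fun plot", 1),
    ("a dull and boring movie", 0),
    ("boring plot and dull acting", 0),
]

# --- Preprocessing, written by hand ---
PAD, UNK = "<pad>", "<unk>"
vocab = {PAD: 0, UNK: 1}
for text, _ in examples:
    for token in text.split():
        vocab.setdefault(token, len(vocab))


def encode(text, max_len):
    """Map tokens to ids, truncate to max_len, and right-pad with PAD."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in text.split()][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


max_len = max(len(text.split()) for text, _ in examples)
inputs = torch.tensor([encode(text, max_len) for text, _ in examples])
labels = torch.tensor([float(label) for _, label in examples])


class BagOfWordsClassifier(nn.Module):
    """Average the embeddings of non-pad tokens, then apply a linear layer."""

    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.output = nn.Linear(dim, 1)

    def forward(self, token_ids):
        mask = (token_ids != 0).unsqueeze(-1).float()
        summed = (self.embedding(token_ids) * mask).sum(dim=1)
        mean = summed / mask.sum(dim=1).clamp(min=1.0)
        return self.output(mean).squeeze(-1)  # one logit per example


model = BagOfWordsClassifier(len(vocab))
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.BCEWithLogitsLoss()

# --- Training loop, written by hand ---
losses = []
for epoch in range(200):
    optimizer.zero_grad()           # clear gradients from the previous step
    logits = model(inputs)          # forward pass
    loss = loss_fn(logits, labels)  # compare logits with the gold labels
    loss.backward()                 # backpropagate
    optimizer.step()                # update the parameters
    losses.append(loss.item())

with torch.no_grad():
    predictions = (model(inputs) > 0).long().tolist()
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}, predictions {predictions}")
```

Details such as masking the padding tokens before averaging are easy to overlook when a framework handles them for you, and they are a common source of subtle bugs.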
Once you have successfully done some experiments on your own, you are ready to contribute to research projects.
The easiest way to find a research project is through emailing researchers (e.g. PhD students) at your local institution.
What should you expect? In large research labs, it is unlikely that you will work with the professor directly. Instead, you will likely work with a PhD student on their project, and occasionally meet with the professor together with the PhD student. Once you have shown a capacity for independent work, you may lead projects with an advisor.
What opportunities should you look for? Because you are just getting started, you should look for projects that are well-defined. Open-ended projects can be more impactful, but they often do not work out, even when the researcher is experienced. Starting with well-scoped projects will give you the experience and confidence to work on more complex, open-ended problems. On a related note, you may want to find experienced mentors (e.g. senior PhD students) who know how to scope the project for you.
Here are some additional tools you might find helpful when experimenting with your own data and models:
That's it! I hope you found this helpful.