As if we needed more evidence that machine learning is making its way out of the lab and into the hands of “regular” developers and their applications, along comes PyTorch, an open-source Python package developed at Facebook for building, training, and testing neural networks, with a focus on deep learning and high performance.
PyTorch (brand-spanking new and still in beta at this writing) is descended from Torch, an older machine learning framework that Facebook has used heavily. PyTorch, however, is a ground-up redesign around Python, so in practice the similarities pretty much begin and end with the name.
PyTorch: The 10,000-Foot View
So what’s so great about PyTorch? Without diving too deeply into programming esoterica, here are a few things:
- PyTorch is Python-centric, designed for deep integration in Python code instead of being an interface to a library written in some other language.
- PyTorch is designed to leverage the math processing power and acceleration libraries of graphical processing units (GPUs), giving it blazing speed.
- With memory optimization built in, PyTorch does its work with minimal resource overhead.
- PyTorch supports dynamic neural networks—that is, the network’s behavior can be changed programmatically at run-time. This gives PyTorch a major advantage over other machine-learning frameworks. (More on this later.)
- Perhaps best of all to developers who are just joining the machine-learning party, it’s easy to learn and use.
On the Ground: How PyTorch Compares
How does PyTorch compare with other machine learning frameworks?
The main difference between PyTorch and other frameworks, such as TensorFlow, is its support for dynamic neural networks. TensorFlow and similar approaches treat the neural network as a static object: the computation graph is defined once up front, and changing the model's behavior means rebuilding the graph from scratch. With PyTorch, the network is defined by ordinary code that runs on every pass, so it can be tweaked on the fly at run-time, making it easier to experiment with and optimize the model.
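To make that concrete, here is a minimal sketch (assuming PyTorch is installed; the `DynamicNet` module and its layer sizes are invented for illustration). The number of hidden-layer passes is decided at run-time, per input, using plain Python control flow, which is something a static, define-once graph cannot express directly:

```python
import torch
from torch import nn

class DynamicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_linear = nn.Linear(4, 8)
        self.hidden_linear = nn.Linear(8, 8)
        self.output_linear = nn.Linear(8, 1)

    def forward(self, x):
        h = torch.relu(self.input_linear(x))
        # Ordinary Python control flow: the graph is rebuilt on each
        # forward pass, so the loop count can depend on the input itself.
        for _ in range(int(x.sum().abs()) % 3 + 1):
            h = torch.relu(self.hidden_linear(h))
        return self.output_linear(h)

model = DynamicNet()
out = model(torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 1])
```

Because the loop is just Python, the same model can take one hidden pass for one batch and three for the next, with gradients flowing through whichever path actually ran.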
Another major difference lies in how developers go about debugging. Effective debugging with TensorFlow requires a special-purpose debugger (tfdbg) to examine what the network's nodes are computing at each step. Because a PyTorch model is ordinary Python code, it can be debugged with any of the widely available Python debugging tools, such as the standard library's pdb.
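A quick sketch of what that looks like in practice (assuming PyTorch; the `TinyNet` module is invented for illustration). Uncommenting the pdb line drops you into an interactive session in the middle of a forward pass, where you can inspect live tensor values just as you would any other Python variable:

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3, 2)

    def forward(self, x):
        h = self.fc(x)
        # import pdb; pdb.set_trace()  # pause here: inspect h, h.shape, h.mean()
        return torch.sigmoid(h)

net = TinyNet()
y = net(torch.randn(5, 3))
print(y.shape)  # torch.Size([5, 2])
```

No special tooling, no separate session type: breakpoints, print statements, and IDE debuggers all work because the forward pass is just a Python method call.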
The last major advantage of PyTorch is the ease with which it can distribute computational work among multiple CPU or GPU cores. Although this parallelism can be done in other machine-learning tools, it’s much easier in PyTorch.
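As a rough illustration of how little ceremony this takes (assuming PyTorch; the model architecture and batch size are invented), wrapping a model in `nn.DataParallel` splits each input batch across the visible GPUs, and the same code falls back to a single CPU or GPU when only one device is present:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# With two or more GPUs, DataParallel splits each batch across them
# automatically; otherwise the plain model runs on whatever is available.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

batch = torch.randn(64, 10).to(device)
out = model(batch)
print(out.shape)  # torch.Size([64, 1])
```

The training loop itself doesn't change; the parallelism lives entirely in the one-line wrapper and the `.to(device)` calls.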
Despite its advantages, PyTorch does have some shortcomings. It still hasn’t had an official version 1.0 release, so it’s not quite stable enough for real production work, whereas TensorFlow and similar frameworks have more miles on them and thus better support, more thorough documentation, and larger developer communities. TensorFlow also comes with TensorBoard, a highly capable visualization tool for building the model graph (a “map” of the neural network) and various data representations that are specialized for machine learning. PyTorch doesn’t have anything like this yet, so developers will need to rely on one of the many existing Python data visualization tools.
As it stands now, and for the foreseeable future as it moves from beta to production, PyTorch appears to be best suited for drastically shortening the design, training, and testing cycle for new, purpose-built neural networks. This brings developers closer to “real” machine-learning development, somewhere between the arcane experimentation going on in computer science labs and tools such as Core ML, which provide predefined models that developers can use but not easily modify. That said, PyTorch is not an end-to-end machine learning development tool; deploying an actual application to servers, workstations, or mobile devices requires converting the PyTorch model into another framework such as Caffe2.
The Future of PyTorch
Given its advantages, PyTorch represents a significant step forward in the evolution of machine learning development tools. Even if it never gets much further than it is now, it should serve as an inspiration to other machine-learning frameworks to add features to simplify and shorten the neural-network design cycle. Ultimately, by making neural-network design more accessible to more developers, it will bring us closer to one or more machine-learning “killer apps,” the lack of which has held artificial intelligence back for so long.
Look for further development and refinement of PyTorch and other machine-learning development tools, then buckle up for an exciting ride. At AndPlus, we’re looking forward to it, and hope we can bring you along to see how machine learning can address your automation needs.