Amazon’s VP of machine learning, Swami Sivasubramanian, offered a keynote on machine learning during the second week of Amazon Web Services’ re:Invent conference.
During the keynote, Sivasubramanian announced that SageMaker, the company's platform for building and training machine learning models, will now automatically divide a large neural network into parts and distribute them across several computers.
The technique is known as model parallelism, and it typically takes considerable engineering effort to pull off. The new capability is part of Amazon's plan to bring machine learning to more people, rather than confining it to the small group of data scientists who have the skills to build such systems.
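If the feature follows the way SageMaker's distributed training options are usually configured, turning it on should amount to a few lines in the training job definition. The sketch below is only an illustration using the SageMaker Python SDK's distribution setting for its model parallelism library; the entry point script, IAM role, instance sizes, and partition counts are placeholder assumptions, not details given in the keynote.

```python
# Hypothetical sketch: launching a SageMaker training job with the
# model parallelism library enabled, so the service splits the
# network into partitions spread across GPUs and instances.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                                # placeholder training script
    role="arn:aws:iam::111122223333:role/SageMakerRole",   # placeholder IAM role
    instance_type="ml.p3.16xlarge",                        # multi-GPU instance (assumption)
    instance_count=2,
    framework_version="1.6.0",
    py_version="py36",
    distribution={
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    "partitions": 4,    # number of pieces the model is split into
                    "microbatches": 8,  # micro-batches pipelined across partitions
                    "optimize": "speed",
                },
            }
        },
        "mpi": {"enabled": True, "processes_per_host": 8},
    },
)

# Kick off training against data already staged in S3 (placeholder path).
estimator.fit("s3://my-bucket/training-data/")
```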
Bringing machine learning to the masses
Sivasubramanian recalled his early days at AWS, when the company launched services like S3 storage. Machine learning, he suggested, is at a similar point in its development: AI is having an AWS moment of its own and could soon become far more broadly available.
He posed the question: how can AWS bring machine learning to the large and growing base of developers and analysts?
Training large deep learning models can take weeks, even with a team of scientists; Sivasubramanian said Amazon can now do the same work in hours. As an example, he cited the training of T5, a large neural network based on the Transformer natural language processing architecture developed by Google, in three hours.
Neural networks just got easier
Sivasubramanian said that Amazon used distributed training to cut the average time it takes to train large deep learning networks by 40%.
With the aid of distributed parallelism, he said, Amazon was able to train not just T5 but also another sizeable neural network, Mask R-CNN, a convolutional network used in object detection.
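For comparison, and under the same assumptions about the SageMaker Python SDK, the data parallel flavor of distributed training, in which every machine holds a full copy of the model and batches are split across workers, is typically enabled with an even smaller configuration change. The sketch below is again illustrative; the script name, IAM role, instance settings, and dataset path are placeholders rather than details from the keynote.

```python
# Hypothetical sketch: the data-parallel counterpart, where each GPU
# keeps a full copy of the model and gradients are synchronized
# across workers at every training step.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train_mask_rcnn.py",                      # placeholder training script
    role="arn:aws:iam::111122223333:role/SageMakerRole",   # placeholder IAM role
    instance_type="ml.p3dn.24xlarge",                      # multi-GPU instance (assumption)
    instance_count=4,
    framework_version="1.6.0",
    py_version="py36",
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

estimator.fit("s3://my-bucket/object-detection-data/")  # placeholder dataset location
```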
Other capabilities were also introduced, including tools to make preparing data for machine learning models easier.