LinkedIn is releasing its tool TonY as open source. The tool makes it possible to connect the machine learning framework TensorFlow to data stored in Apache Hadoop.
Google launched the open source software library TensorFlow in 2015. It is intended to make it easier for developers to design, build and train deep learning models. Hadoop is a distributed software framework that handles the storage and processing of big data.
LinkedIn built TonY because it relies more and more on deep neural networks to power some of the features on its website, including the news feed and smart replies. Many of these features were created with TensorFlow, which had no reliable way of connecting to Hadoop clusters.
TensorFlow itself did support distributed training, a technique used to process large data sets such as those stored in Hadoop. LinkedIn's problem was that this process has to be set up by hand, which is a substantial task and one that most data scientists cannot do themselves.
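To give a sense of what setting this up by hand involves: one common way to configure distributed TensorFlow is to give every process a `TF_CONFIG` environment variable that lists all machines in the cluster and states which role that particular process plays. The sketch below is only an illustration of that bookkeeping, not LinkedIn's code. The helper `build_tf_config` and the host names are hypothetical.

```python
import json


def build_tf_config(worker_hosts, ps_hosts, task_type, task_index):
    """Build the JSON string one process would export as TF_CONFIG.

    Hypothetical helper for illustration; it only assembles the
    cluster description and does not start any training.
    """
    cluster = {"worker": list(worker_hosts), "ps": list(ps_hosts)}
    if task_type not in cluster:
        raise ValueError("unknown task type: %r" % task_type)
    if not 0 <= task_index < len(cluster[task_type]):
        raise ValueError("task index out of range: %r" % task_index)
    return json.dumps(
        {"cluster": cluster, "task": {"type": task_type, "index": task_index}}
    )


# Assumed host names and ports, purely for illustration.
workers = ["worker0.example.com:2222", "worker1.example.com:2222"]
parameter_servers = ["ps0.example.com:2222"]

# Every process needs its own variant of the configuration.
for index in range(len(workers)):
    print(build_tf_config(workers, parameter_servers, "worker", index))
```

Each machine needs its own copy of this configuration with the right role and index, and the host list has to be updated whenever the cluster changes. That is the kind of manual work a job launcher can take over.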
TonY is meant to automate that task. The software works much like the way MapReduce lets you run Apache Pig or Apache Hive scripts on Hadoop. It offers a number of features to improve distributed training for neural networks, including GPU scheduling for better resource management and support for TensorBoard.
The tool is now being released as open source so that others interested in running distributed machine learning on Hadoop can use and contribute to the project. TonY is available for download via GitHub.