Twitter has built an in-house tool, Rezolus, for detecting performance anomalies and peaks in network usage.
Twitter uses Rezolus to measure workloads and runtime performance and to gather data for network optimization, writes Brian Martin, site reliability engineer at Twitter, in a blog post.
Martin further explained that Twitter built Rezolus to observe the performance of its systems in detail and on a very fine time scale. The makers of Rezolus regularly ran into performance anomalies in the network that lasted only a few seconds, but they could not find the cause. The monitoring systems in use until then sampled too infrequently relative to the duration of those anomalies.
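The effect of a low measurement frequency can be illustrated with a small sketch. This is not Rezolus code; the sample values, the 90 percent threshold and the function name summarize are all assumptions made for illustration. It shows how a burst of a few seconds almost disappears in a one-minute average while staying clearly visible in per-second data.

```python
def summarize(per_second, threshold):
    """Compare a coarse one-minute average with per-second detail."""
    minute_average = sum(per_second) / len(per_second)
    peak = max(per_second)
    # Offsets (in seconds) at which utilization reached the threshold.
    burst_seconds = [t for t, v in enumerate(per_second) if v >= threshold]
    return minute_average, peak, burst_seconds


# Hypothetical utilization samples (percent), one per second for a minute.
samples = [20.0] * 60
for t in range(30, 33):  # assumed 3-second burst
    samples[t] = 100.0

minute_average, peak, burst_seconds = summarize(samples, threshold=90.0)
print(f"one-minute average: {minute_average:.1f}%")
print(f"per-second peak:    {peak:.1f}%")
print(f"seconds at or above threshold: {burst_seconds}")
```

With these made-up numbers the one-minute average comes out at 24 percent, which looks unremarkable, while the per-second view shows three consecutive seconds at 100 percent.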
Problems of a few seconds
“At one point, several services saw their success rate steadily worsen for a few minutes,” explains Martin. “These services all turned out to be slowed down by a back-end service. The team responsible for that service saw nothing on its monitoring systems that explained what happened during the minutes of the malfunction. They did know that throttling is determined on a smaller time scale, so they began to suspect that the problems occurred within a time span of less than a minute.”
By deploying Rezolus on the service in question, Martin’s team was able to find out exactly what had happened. Martin also reports that Twitter has decided to make Rezolus open source so that other companies can benefit, since others probably face similar network problems. Twitter also hopes this will help build a larger community around Rezolus.