What is latency and why is it crucial for music? A window into the history of FarPlay.

Have you ever considered that the teaching of music, and indeed music in general, was one of the fields hit hardest by the COVID lockdowns? Closed venues and cancelled concerts were just the tip of the iceberg. Musicians lost not only the ability to perform but also the ability to teach, rehearse, and in many cases make music at all. We learned to do a lot of things remotely: you can hold staff standups and business meetings, or teach math, over Zoom. But you cannot play music together.

If you are surprised by this statement, you are not alone. It came as a huge surprise to many musicians when they started looking for remote-collaboration tools in 2020: they simply found it impossible to keep time while playing together remotely. Why? The short answer is latency. Delivering an audio signal from one end to the other takes time. We can tolerate that delay when we speak (even though it causes plenty of awkward moments in video calls), but for playing music it is a deal-breaker. Research has repeatedly shown that the maximum one-way latency musicians can tolerate without losing time sync is about 30-50 ms. That is the time a sound wave takes to travel roughly 10-17 meters, which is exactly why large orchestras need a conductor: without visual gestures serving as synchronization signals, musicians spread over an area of that size (or larger) could not keep time.

But the minimum latency achievable with video-conferencing apps is about 100 ms, two to three times that threshold, and that is under perfect network conditions between nearby locations. In more typical cases it climbs to 200-400 ms and beyond.
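To make these numbers concrete, here is a quick back-of-the-envelope calculation in Python (my own illustration; the latency figures are the ones quoted above):

    # How far does sound travel in air during a given delay?
    SPEED_OF_SOUND_M_PER_S = 343.0  # in air at room temperature

    for latency_ms in (30, 50, 100):
        distance_m = SPEED_OF_SOUND_M_PER_S * latency_ms / 1000
        print(f"{latency_ms} ms delay = standing {distance_m:.0f} m apart")

    # 30 ms -> 10 m, 50 ms -> 17 m: the edge of what an ensemble can handle.
    # 100 ms -> 34 m: nobody keeps time at that distance without a conductor.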

Physical distance plays a critical role too. Signals propagate through networks (copper and fiber alike) at about two-thirds the speed of light, roughly 200 km per millisecond. That sounds fast, but bear in mind that an IP route consists of many hops, and each one adds processing and queueing delay, which in practice nearly doubles the ideal theoretical latency. For example, a typical round-trip ping time between Europe and the US East Coast is about 100 ms, meaning the network alone contributes an extra 50 ms of one-way latency.
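If you like to see the arithmetic spelled out, here is a minimal sketch (the New York-Paris distance is my own example figure; the rest comes from the paragraph above):

    # Ideal vs. observed one-way network latency.
    FIBER_SPEED_KM_PER_MS = 200.0  # ~2/3 the speed of light in a vacuum

    def ideal_one_way_ms(distance_km: float) -> float:
        """Theoretical minimum one-way delay over a straight fiber path."""
        return distance_km / FIBER_SPEED_KM_PER_MS

    ideal = ideal_one_way_ms(5800)  # New York-Paris, ~5,800 km great-circle
    observed = 100 / 2              # half of a typical 100 ms round-trip ping
    print(f"ideal: {ideal:.0f} ms, observed: {observed:.0f} ms")
    # ideal: 29 ms, observed: 50 ms -- routing detours and per-hop
    # processing roughly account for the difference.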

In the spring of 2020 I got a call from a friend, a sound engineer: “Anton, musicians desperately need low-latency audio conferencing.” I had spent years grappling with latency during my career in IP telephony, so I knew a lot about the difficulties and the seemingly insurmountable hurdles. I said, “Forget it, man, it’s impossible.” But we kept searching and eventually found that in the early 2000s Chris Chafe at Stanford University had started a project called JackTrip aimed at exactly these issues. The idea was to remove as much extra latency as possible from the communication path by optimizing I/O, buffering, and signal processing. You cannot change the speed of light or the global network architecture, but there is still quite a bit of room for optimization. JackTrip made it possible to play rhythmic music over the Internet at distances of up to 500-1000 km.
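Buffering is a good example of where that room for optimization lives. Every audio buffer has to be filled before it can be sent, so the buffer size puts a hard floor under the latency. Here is a rough sketch (my own illustration, not JackTrip code; 48 kHz is simply a common professional sample rate):

    # Time needed to fill one audio buffer before it can be sent.
    SAMPLE_RATE_HZ = 48_000  # a common professional audio rate

    def buffer_latency_ms(frames_per_buffer: int) -> float:
        """Milliseconds of delay contributed by one buffer of audio."""
        return frames_per_buffer / SAMPLE_RATE_HZ * 1000

    for frames in (1024, 256, 64):
        print(f"{frames:>4} frames -> {buffer_latency_ms(frames):.1f} ms")
    # 1024 frames -> 21.3 ms, 256 -> 5.3 ms, 64 -> 1.3 ms:
    # shrinking the buffers on both ends recovers tens of milliseconds
    # across the capture, network, and playback stages.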

The one problem with JackTrip was its complexity. It was written for research, not for everyday musicians. It had no user interface beyond a UNIX-style command line. It connected directly to the remote peer by IP address and port, which meant configuring your network router to open ports. And its internal audio I/O relied on the JACK audio server, a standard tool on Linux whose setup on macOS or Windows feels like a nightmare even for professionals. Very few musicians managed to overcome all these hurdles and actually use JackTrip for their music.

One of JackTrip’s early adopters and enthusiasts was Dan Tepfer, an NYC-based jazz pianist and composer who has also done several projects combining music and technology (check out his Natural Machines to see what I mean). Since the spring of 2020 he has used JackTrip in his weekly Monday Livestreams to play remote duos and trios with other musicians.

I started working with Chris and Dan in the summer of 2020. Together with Dan, I developed a Broadcast Output feature for JackTrip that dramatically improved the timing accuracy and audio quality of streams from JackTrip sessions. But the main obstacle to JackTrip’s adoption by musicians, its complexity, remained untouched. So in early 2021 Dan and I decided to start a new project focused on usability and a musician-friendly interface. That’s how FarPlay was born.

Initially we planned to reuse the JackTrip code (which is open source under the MIT license). But as the work progressed I found its architecture poorly suited to our goals: it had been designed as a UNIX command-line tool, not as a general-purpose library. I ended up writing a completely new implementation from scratch, dropping the JACK dependency, which in some cases allowed even lower latency. Dan designed a GUI based on his experience using JackTrip for live streaming and teaching others how to use it. We launched a closed pilot in August 2021 and an open beta in October 2021.
