In celebration of the recent WebRTC page, I decided to collect some stories from using WebRTC through the years.
The first time I worked with audio/video encodings and RTP was in the Xbox One days, when I was working on a number of things for Xbox SmartGlass, some of which eventually became the stream-from-console functionality.
There was little in the way of negotiation (although there was some, to be sure), and I remember learning a lot about various encoding profiles and what hardware worked well with which ones (the iPad was a novel thing back then and we could only push the processors so much, and hardware decode had a big latency hit that we just couldn't take for games).
I remember my 'a-ha!' moment when I finally understood how a Fourier Transform could do its magic and preserve information - I was almost skipping around the office telling everyone how I finally felt comfortable with that part of the code.
And of course there were all the discussions around whether we should spend bandwidth on more forward error correction, bump up quality, or try to relieve network pressure, and around how to smoothly fix broken frames. All of that helped me frame so many future real-time network problems in terms of various kinds of tradeoffs.
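To make the forward error correction side of that tradeoff concrete, here is a minimal sketch in Python of the simplest scheme I know of: one XOR parity packet per group of equal-sized packets. This is a generic illustration, not what we shipped, and the names (`make_parity`, `recover`, `overhead`) are made up for the example. The point is that a larger group costs less bandwidth but can still only repair a single loss.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet for a group (assumes equal-length packets)."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Fill in at most one missing packet (marked None) using the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only one loss per group")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


def overhead(group_size: int) -> float:
    """Extra bandwidth spent on parity, as a fraction of the media data."""
    return 1 / group_size


packets = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(packets)
restored = recover([packets[0], None, packets[2]], parity)
```

With a group of 3 the parity costs about a third of extra bandwidth; with a group of 10 it costs a tenth, but two losses in the same group are then unrecoverable.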
"The H.264 book" for me was Wiley's The H.264 Advanced Video Compression Standard, 2nd Edition, which has amazing walk-throughs of the lifecycle of streams, frames, encodings, and great illustrations and pictures to follow along.
Years later, after spending a few years working on HLSL, I landed in the Mixed Reality group (in the last days when it was still known as Analog).
The MixedReality-WebRTC libraries were very popular for all sorts of prototypes and early experiments, and that finally got me learning more about how SDP worked, along with "proper" ICE (before that, I had mostly done STUN-based implementations, without even knowing the proper names for the techniques I was using).
"The WebRTC book", aka WebRTC - APIs and RTCWEB Protocols of the HTML5 Real-Time Web, was something I had picked up earlier when playing around with browser-based implementations, so thankfully I was able to more or less hit the ground running.
Another project I worked on was about scaling out the system to allow many people to find each other and communicate in various ways. I learned a lot more about SDP here, with heterogeneous clients connecting with each other and having to "properly" negotiate the desired settings.
For this part of the project, a fun aspect was maintaining a C# tool for quick tests and diagnostics layered over native libraries (something that I had done in the past, for example with the HLSL .NET WinForms tool).
The other very interesting thing was getting into services and such, and leveraging Janus as well as building other tools around it to dynamically scale and connect nodes.
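As a rough illustration of the kind of negotiation heterogeneous clients go through, here is a small Python sketch that pulls the codecs out of the `a=rtpmap` lines of an SDP offer and keeps the ones the answering side supports, in the offerer's order. The sample offer and the helper names (`codecs`, `negotiate`) are made up for the example, and a real offer/answer exchange also has to deal with per-media sections, `fmtp` parameters, directions and more, so treat this as a simplified sketch.

```python
import re
from typing import List

# Matches lines such as "a=rtpmap:111 opus/48000/2" and captures
# the payload type, the codec name and the clock rate.
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)", re.MULTILINE)


def codecs(sdp: str) -> List[str]:
    """List codecs in an SDP blob as NAME/RATE, in order of appearance."""
    return [f"{name.upper()}/{rate}" for _pt, name, rate in RTPMAP.findall(sdp)]


def negotiate(offer: str, answerer_supported: List[str]) -> List[str]:
    """Keep the offered codecs the answerer supports, in the offerer's order."""
    supported = {c.upper() for c in answerer_supported}
    return [c for c in codecs(offer) if c in supported]


# Hypothetical offer, trimmed down to the lines that matter here.
OFFER = "\r\n".join([
    "v=0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
    "a=rtpmap:111 opus/48000/2",
    "a=rtpmap:0 PCMU/8000",
    "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
    "a=rtpmap:96 VP8/90000",
    "a=rtpmap:102 H264/90000",
])

agreed = negotiate(OFFER, ["H264/90000", "opus/48000"])
```

Here a client that only handles H.264 and Opus ends up agreeing on those two, and VP8 and PCMU drop out of the session.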
Yet another fun adventure was trying to port the library to a new platform, in one of the early efforts to enable UWP or UWP-like environments (honestly I don't quite recall the details).
The fun parts here were mostly around figuring out how to properly configure audio codecs and get them well initialized, and picking the correct CPU instruction set. The target processor, if I recall correctly, was kind of mid-way between two classic vectorized instruction sets, so I had to touch up by hand which set of helpers would be used with which set/level of vectorized instructions.
The other challenge for me, possibly less fun in retrospect but quite fascinating at the time, was getting familiar with Google's build system and how to touch things up so we'd get the right flags passed around. I remember this was at a very low level, and I ended up doing something horrible like sticking in some flags right next to the literal with the compiler name as a way of forcing certain flags/settings, which is absolutely the wrong thing to do for the build system but allowed me to maintain a diff patch that was on the order of a handful of localized lines.
Funnily enough, the browser aspect of WebRTC is something that I never worked on a lot.
The standards-based foundation was quite solid in practice, however, and I ended up using browsers as testing and validation platforms in multiple scenarios, but I never built something intended to be released to the public with a browser endpoint in mind.
Happy real-time communications!