[00:00.000 --> 00:08.160] This is actually my first time at FOSDEM. [00:08.160 --> 00:12.520] I've been meeting to come here for many, many years, [00:12.520 --> 00:15.720] and Saul Lorenzo been bothering me that I should come here and speak. [00:15.720 --> 00:18.360] So I'm glad, actually, I finally made it. [00:18.360 --> 00:23.960] If you, I run a blog, WebRC hacks for developers. [00:23.960 --> 00:28.880] I got a lot of guest authors and I hope readers in the audience here. [00:28.880 --> 00:33.800] You might also recognize me if you watch YouTube videos about WebRC stuff. [00:33.800 --> 00:37.560] I do an event series and videos at Cranky Geek. [00:37.560 --> 00:43.880] But today, I'm here to talk about some trends I did by [00:43.880 --> 00:46.360] analyzing GitHub and similar repositories. [00:46.360 --> 00:47.080] I'll talk about that in a second. [00:47.080 --> 00:48.640] But before we start, [00:48.640 --> 00:52.600] what are some things that would be nice to know what's going on in the tree? [00:52.600 --> 00:56.240] I'm an analyst like to watch this stuff and write about it. [00:56.240 --> 00:58.800] So one, is the community still growing? [00:58.800 --> 01:00.440] What are some of the interesting projects? [01:00.440 --> 01:02.160] What are some of the new trends? [01:02.160 --> 01:06.080] Are people using things like WebCodex yet or not? [01:06.080 --> 01:08.720] So how do you go about doing that? [01:08.720 --> 01:12.360] Well, I came up with a, I have a methodology. [01:12.360 --> 01:17.120] It's largely based on BigQuery and there's a bunch actually, [01:17.120 --> 01:22.680] a bunch of providers give data to BigQuery or make their datasets public available there. [01:22.680 --> 01:23.680] So you can grab that. [01:23.680 --> 01:25.720] GitHub is definitely the best one, [01:25.720 --> 01:28.560] basically any time, any public repo, [01:28.560 --> 01:32.360] any time you do any kind of commit pull request issue that all gets [01:32.360 --> 01:36.400] sent and stored in a BigQuery database. [01:36.400 --> 01:40.000] There's actually some other datasets that are cool in there too. [01:40.000 --> 01:43.480] I've used in the past and I actually use Stack Overflow in this one. [01:43.480 --> 01:48.000] And I often cross-reference that with other sources like the Chrome platform status is [01:48.000 --> 01:51.760] a good way to see what actual APIs are being used, [01:51.960 --> 01:54.280] at least in the Chrome world. [01:55.640 --> 01:58.240] So you get all that data. [01:58.240 --> 02:03.560] I actually like to use Colab, which is just a Google's hosted [02:03.560 --> 02:07.440] Jupyter notebook type ecosystem to do the analysis and follow things. [02:07.440 --> 02:11.520] And then when I get frustrated about coding and doing stuff like [02:11.520 --> 02:17.200] making charts with Python, then I fall back to Excel for quick analysis. [02:17.200 --> 02:19.960] So some of the hard things about this and important to keep in mind about [02:19.960 --> 02:21.480] the analysis, you can't see it there. [02:21.480 --> 02:26.080] But all this is really based on regular expressions and pulling out [02:26.080 --> 02:34.240] keywords to identify a repo as WebRTC or something as one term or another. [02:34.240 --> 02:37.920] So that obviously has some flaws because there'd be false hits in there and [02:37.920 --> 02:39.400] you have to be careful with your selection. [02:39.400 --> 02:41.760] So a lot of the time I spend is actually just going through looking at [02:41.760 --> 02:43.400] the data to make sure it's not junk. [02:43.400 --> 02:46.240] And I found in the past, there's a lot of bots. [02:46.240 --> 02:50.000] So you gotta remove those things and these data sets themselves aren't perfect. [02:50.000 --> 02:52.040] There's always missing data or junk data. [02:52.040 --> 02:54.320] So keep that in mind. [02:54.320 --> 02:58.440] But I've been doing this occasionally more frequently now. [02:58.440 --> 03:01.840] Since 2015, I've been tweaking the methodology along the way. [03:01.840 --> 03:06.640] It's held up so far, but always open to feedback. [03:06.640 --> 03:08.000] So let's just dive in. [03:08.000 --> 03:09.800] How are we doing here in WebRTC? [03:11.120 --> 03:13.640] So this just looks since 2019. [03:13.640 --> 03:16.840] I don't think it's a big surprise to anyone. [03:16.840 --> 03:19.520] But there was a peak during the pandemic. [03:19.520 --> 03:23.920] So you can see here, it's actually April of that year. [03:23.920 --> 03:27.480] And this particular graph shows distinct users, right? [03:27.480 --> 03:32.400] So anybody like anyone on GitHub, just based on unique GitHub IDs per [03:32.400 --> 03:35.960] period in there, so that was over 10,000 in April. [03:35.960 --> 03:38.760] But if you look here, actually, we're not doing so bad. [03:38.760 --> 03:40.760] This is bad data, I was missing that month. [03:41.320 --> 03:47.640] We're not actually doing so bad here, but we're only about 60% of past peak. [03:47.640 --> 03:52.040] So it's still pretty above where we were before the pandemic. [03:53.800 --> 03:58.120] But another thing I find actually very interesting is also because you can look [03:58.120 --> 04:00.080] at these events, who's actually contributing, right? [04:00.080 --> 04:05.280] And you can look at the blog, I'll have some links to more on the methodology. [04:05.280 --> 04:08.560] But essentially, people doing pull requests, pull requests reviews. [04:09.560 --> 04:12.680] That sort of thing that counts as a contributor, right? [04:12.680 --> 04:17.920] And actually contributors, if you believe me, are actually up more than ever. [04:19.000 --> 04:22.840] So first, one interesting point here you can see. [04:22.840 --> 04:24.800] So there was a peak number of users in April, but [04:24.800 --> 04:27.880] actually the peak contributions, it was in actually May. [04:27.880 --> 04:30.040] And maybe that makes sense, like people during the pandemic, [04:31.520 --> 04:37.080] got an issue in WebRTC, start looking at it, they maybe could star a repo, [04:37.600 --> 04:40.880] get involved, but then it takes them a few weeks actually before they actually [04:40.880 --> 04:44.320] contribute, start adding things into that repo. [04:44.320 --> 04:49.200] But as I said here, our most recent peak here was actually in August. [04:49.200 --> 04:52.240] And we're not actually too far off for the rest of the year looking at that. [04:52.240 --> 04:57.440] So I want to look into see what was going on and compare these two peaks. [04:57.440 --> 05:03.120] So one is this is actually from the August peak, I dug into the sea. [05:03.120 --> 05:07.120] Where's some of the repos that we're actually having, I had the most activity. [05:07.120 --> 05:10.840] And one of them, very popular WebRTC project is Coturn, [05:10.840 --> 05:15.080] the open source kind of stone and turn server. [05:15.080 --> 05:18.800] And actually one of my co-workers Gustavo took over that project. [05:18.800 --> 05:21.120] So I asked him what was going on here, what happened, [05:21.120 --> 05:26.040] why do we have such a big spike in this and wrote a whole blog post about it. [05:26.040 --> 05:31.360] But essentially, that project was kind of stale, not a lot of activity for [05:31.360 --> 05:38.120] a while, he and Pavel took that over and then started really get the community [05:38.120 --> 05:40.960] involved and there's a huge spike and things like that. [05:40.960 --> 05:47.160] So then I wondered, is basically is this, sorry, [05:47.160 --> 05:51.720] is the other peaks in August in here because of spikes? [05:51.720 --> 05:53.680] Or is there a lot of regular activity? [05:53.680 --> 06:00.280] You can see here, interestingly, you see things like AdGuard is always high in there. [06:00.280 --> 06:03.560] Like people actually plan to block WebRTC and its usage, right? [06:03.560 --> 06:07.960] But they have a lot of activity every month around that. [06:07.960 --> 06:12.800] But actually it wasn't actually, you see some commonality, but some difference here. [06:12.800 --> 06:18.280] And, sorry, but when you actually look at the distributions and [06:18.280 --> 06:23.160] the change between April of 2020 and August of 2022, [06:23.160 --> 06:26.760] the actual distributions across the top 10, top 20, top 100, [06:26.760 --> 06:29.320] they're actually not a whole lot different. [06:29.320 --> 06:31.640] So what does this all mean? [06:31.640 --> 06:35.040] It's like actually the WebRTC development actually is not really getting a lot more [06:35.040 --> 06:36.080] concentrated. [06:36.080 --> 06:37.440] You can look for a given period of time. [06:37.440 --> 06:39.640] Obviously some projects are doing more popular and [06:39.640 --> 06:42.400] doing well, have more activity than others. [06:42.400 --> 06:46.400] But overall, it's not like we're consolidating down to a few projects, right? [06:46.400 --> 06:52.120] It's the same kind of more equal distribution that's existed at least for [06:52.120 --> 06:52.680] several years now. [06:53.520 --> 06:59.360] So another data set, and this is actually a new one I hadn't looked at before, [06:59.360 --> 07:01.640] is Stack Overflow. [07:01.640 --> 07:05.840] So I zoomed in to take a look at that. [07:05.840 --> 07:08.920] And that's to see if this follows a similar pattern. [07:08.920 --> 07:12.640] Now bear in mind compared to the previous charts, this goes back all the way to [07:12.640 --> 07:15.160] 2012, so it's a much longer data period. [07:15.160 --> 07:20.360] And you can see here, this is comments on Stack Overflow questions and [07:20.560 --> 07:23.240] actually the questions themselves. [07:23.240 --> 07:28.480] And unfortunately, you can't see the font too much of answers within Stack [07:28.480 --> 07:32.960] Overflow, but it essentially looks very similar to the questions side of things. [07:32.960 --> 07:37.360] And you can see very similar here, peak in April of 2020. [07:39.480 --> 07:45.680] Also, unlike the GitHub analysis, this actually shows a peak and [07:45.680 --> 07:50.800] is here also in April of 2022. [07:50.800 --> 07:54.600] I didn't have a chance to dig into to see what was driving that particular peak [07:54.600 --> 07:55.720] this year. [07:55.720 --> 07:59.480] But overall, I think you can see it's a little bit harder compared to the other [07:59.480 --> 08:04.000] one, but we're still generally up compared to prior to the pandemic in terms of [08:04.000 --> 08:04.880] questions and answers. [08:04.880 --> 08:07.720] And actually, it's a pretty good sign that there's a lot of activity there. [08:09.640 --> 08:15.160] And I also just took a look to see as a percentage of all the questions on [08:15.160 --> 08:19.560] Stack Overflow, what percentage of them had at least something that mentioned [08:19.560 --> 08:21.640] WebRTC or one of these terms? [08:22.640 --> 08:25.440] And very surprising, actually, it's actually very high. [08:27.120 --> 08:33.120] So it's something like one in, during the pandemic, [08:33.120 --> 08:37.400] it was one out of every 1400 questions on Stack Overflow had something that [08:37.400 --> 08:40.640] mentioned WebRTC, which that seems like quite a bit because I still consider [08:40.640 --> 08:43.560] WebRTC kind of a very niche sort of thing. [08:43.560 --> 08:49.680] And even if you look today, just in the last data point in this one is November, [08:49.680 --> 08:55.720] at that point it was still one in, I'm sorry, it was one in 900 during the peak [08:55.720 --> 08:58.560] of April of 2020. [08:58.560 --> 09:01.880] It's still one in 1400 today, which was still actually very high. [09:01.880 --> 09:04.400] So you can see, you can interpret this two ways. [09:04.400 --> 09:07.760] One, WebRTC is very confusing and people have a lot of questions. [09:07.760 --> 09:10.240] So you need to comment on it, or you can also see there's a lot of people involved. [09:10.240 --> 09:12.960] I think both are good, right? [09:15.460 --> 09:15.960] But yeah. [09:17.760 --> 09:22.920] So also very interesting that can we look at this data set to understand [09:22.920 --> 09:26.600] development trends, what's going on in the market. [09:26.600 --> 09:29.720] And one of the very interesting things I always like to look at is what are some of [09:29.720 --> 09:33.840] the language trends, programming languages that people are using. [09:33.840 --> 09:38.360] And this is a jumble and hard to see, so let's actually zoom in. [09:38.360 --> 09:43.320] So one of the ones I've been trying to track for a while is JavaScript versus TypeScript. [09:43.320 --> 09:50.760] I've been delaying, converting to TypeScript and I'm kind of wondering, do I need to, is [09:50.760 --> 09:54.240] it time for me to really switch over or can I wait a little bit longer? [09:54.240 --> 09:57.840] You can see here, well, obviously TypeScript's been getting more popular. [09:57.840 --> 10:03.160] We just, you know, just in December reached the 50-50 point, right? [10:03.160 --> 10:06.720] Where, you know, all these repos where TypeScript's half. [10:06.720 --> 10:09.480] So I think I'm probably behind it and need to switch. [10:13.600 --> 10:19.560] So there's also, at this conference, a bunch of exciting new languages that are coming out. [10:19.560 --> 10:24.080] So I wanted to zoom in and kind of take a look to see what's going on with them. [10:24.080 --> 10:27.920] So, you know, Go, Kotlin and Rust in particular. [10:27.920 --> 10:32.920] So I will say one of the challenges, I didn't get any chance to filter this out, but [10:32.920 --> 10:37.400] this Go jump from November to December is some bots. [10:37.400 --> 10:39.720] So that's just bot activity. [10:39.720 --> 10:44.680] So you can, yeah, I thought originally maybe it's just Christmas and Go developers don't [10:44.680 --> 10:45.680] have anything better to do. [10:45.680 --> 10:49.200] So over the holidays, you know, they're just programming a lot and starting a lot of new [10:49.200 --> 10:50.200] repos. [10:50.200 --> 10:53.120] That wasn't, it was actually, it was some bugs. [10:53.120 --> 10:55.240] But you can see here, you know, steadily increasing. [10:55.240 --> 11:01.600] It's not a huge, huge spike for these other, these other two, but it is going up. [11:01.600 --> 11:07.360] But as a new language that's getting popular, I guess you'd expect to see more of that. [11:07.360 --> 11:12.440] So in addition to languages, also there's a bunch of the new APIs, some that were referenced [11:12.440 --> 11:13.680] earlier today. [11:13.680 --> 11:20.360] So Insertable Streams is one such API, and that's actually two sub-APIs, a media stream [11:20.360 --> 11:23.840] track processor and track generator. [11:23.840 --> 11:29.760] First took a look on Chrome status, actually credit to Fippo, you know, Phil Hankey for [11:29.760 --> 11:35.480] having a, he built a custom viewer of the Chrome status information that you can see [11:35.480 --> 11:37.880] or so compare them both at the same time. [11:37.880 --> 11:41.200] You can see that, you know, these are actually peaking, you know, quite a bit towards the [11:41.200 --> 11:42.200] end of the year going up quite a bit. [11:42.200 --> 11:46.440] So I was curious, like, who, you know, can we see our open source repos actually using [11:46.440 --> 11:49.040] these or is it somebody else? [11:49.040 --> 11:55.720] And looking at it, you know, there's a big spike here, but it doesn't look like much. [11:55.720 --> 11:57.240] So what's going on there? [11:57.280 --> 12:01.920] Zoom in a little bit more, and again, apologies, it's really small, but like that initial spike, [12:01.920 --> 12:10.200] a lot of that was just pure standards related activity of the W3C repos and WebKit and others [12:10.200 --> 12:13.400] that are just basically adopting, you know, adopting those APIs in the first place. [12:13.400 --> 12:15.440] So you see a big jump. [12:15.440 --> 12:20.280] After that, it's really kind of hit or miss, and I was, I mean, I love working with Insertable [12:20.280 --> 12:22.520] Streams, you know, let's see, do a lot of fun stuff. [12:22.840 --> 12:26.120] So hoping to see more, but it's kind of just spotty. [12:26.120 --> 12:28.440] So, you know, going back to the Chrome status, what does that mean? [12:28.440 --> 12:34.520] Well, at least people that are using it are probably someone like Google Meet, sort of [12:34.520 --> 12:36.400] thing that don't have public repos, right? [12:36.400 --> 12:41.520] Or there's something else outside of the public GitHub data set that's driving that usage. [12:43.800 --> 12:46.640] So another one is Web Codex. [12:46.640 --> 12:47.320] It's another one. [12:47.320 --> 12:48.680] This one doesn't have quite the same peak. [12:48.720 --> 12:54.640] It's a little bit, you know, Web Codex is not quite as far along, but they're still driving up. [12:54.640 --> 12:57.320] You want to see if there's something going on here too. [12:57.320 --> 13:02.000] And again, you see gradual increase, not a ton, except for this one spike. [13:02.000 --> 13:08.560] And this one spike, again, was largely related to, you know, the initial standards release [13:08.560 --> 13:13.040] of WebKit and W3C type repos and related once to deploy that. [13:13.040 --> 13:17.680] So we see some uptick, but nothing all that significant yet. [13:19.680 --> 13:27.320] And for the last section, I was also wondering, is WebRC winning? [13:27.320 --> 13:31.760] Like, we don't talk a whole lot about WebRC having competition so much anymore, at least I haven't. [13:31.760 --> 13:34.800] But in the early days, it was always, you know, WebRC versus SIP. [13:34.800 --> 13:37.000] And is it, you know, should SIP, you know, those SIP type developers, [13:37.000 --> 13:39.600] that ecosystems, should they shift over to WebRTC? [13:39.600 --> 13:43.840] And we haven't seen that a whole lot, but I think in reality, there still is competition. [13:43.840 --> 13:48.560] And that is certainly during the pandemic, you know, well, it's Zoom, right? [13:48.560 --> 13:52.160] And I actually presented this a couple of years ago at Dan's conference, [13:52.160 --> 13:57.160] an interesting fact where, you know, there was a month in time where Zoom was worth more [13:57.160 --> 14:00.640] than the seventh largest Analyze put together, at least our market capitalization, [14:00.640 --> 14:02.440] which is still insane when you think about it, right? [14:02.440 --> 14:03.800] But, you know, that was the reality. [14:03.800 --> 14:09.360] So I did want to check to see if that's still the case, and it's not, right? [14:09.360 --> 14:14.720] So, yeah, Zoom is, you know, using next to the same data point and just extending it out, [14:14.720 --> 14:18.920] you know, a little bit further, you know, Zoom's down near 80% where they were back [14:18.920 --> 14:20.920] in February of 2020. [14:20.920 --> 14:23.280] The airlines, though, aren't actually doing all that much better, right? [14:23.280 --> 14:27.600] So still relative, Zoom's not doing some bad relative to the airlines, [14:27.600 --> 14:29.400] at least those same seven. [14:29.400 --> 14:37.120] But anyway, Zoom, not quite what it was, but they still really are competition, right? [14:37.120 --> 14:40.480] And particularly because Zoom now has released a Zoom SDK, [14:40.480 --> 14:42.120] and they have a Web version of this SDK. [14:42.120 --> 14:46.320] So as a developer, you do have a choice. [14:46.320 --> 14:49.520] Hey, I want to go build a real-time communications application. [14:49.520 --> 14:52.600] You can choose to use the WebRC and, you know, all the vendors in the ecosystem, [14:52.600 --> 14:55.040] or you can go to choose Zoom for this. [14:55.040 --> 14:57.920] And I was curious, in Zoom's marketing, it's a lot. [14:57.920 --> 15:02.800] I'll probably have a post on WebRC hacks with football, hopefully in a few weeks [15:02.800 --> 15:05.720] that, you know, where Zoom's been promoting the benefits of this. [15:05.720 --> 15:11.560] And it's a, I'll go into why that's not completely true. [15:12.000 --> 15:12.640] During the post. [15:12.640 --> 15:17.400] But I wanted to see our developers actually choosing Zoom over, [15:17.400 --> 15:22.920] or at least mentioning the Zoom SDK over WebRC. [15:22.920 --> 15:27.440] It was going to take me a while to dig into all this on the GitHub analysis. [15:27.440 --> 15:30.160] It wasn't clear, so I didn't include that part yet. [15:30.160 --> 15:32.000] But on Stack Overflow, it's pretty actually clear. [15:32.000 --> 15:37.000] There's a distinct Zoom SDK tag or label that they have there. [15:37.000 --> 15:38.960] And you can see here, actually, at least for now, [15:38.960 --> 15:42.360] WebRC is still way more popular than the Zoom SDK. [15:44.200 --> 15:45.840] Two minutes, okay. [15:45.840 --> 15:48.120] And actually, I am done. [15:48.120 --> 15:53.000] So I guess what we've learned here, I mean, part of it is what are your expectations here? [15:53.000 --> 15:55.880] I didn't necessarily go into any expectations other than I was interested [15:55.880 --> 16:00.680] to see what are some of the trends and can we find or like learn things about new APIs [16:00.680 --> 16:01.760] or new repos. [16:01.760 --> 16:05.240] And I do go through the list, actually, is interesting to see new projects. [16:05.240 --> 16:06.640] Didn't have time to fit all that stuff in there. [16:06.640 --> 16:09.360] But again, you can reference some of the blog posts on this. [16:09.360 --> 16:11.840] But overall, my impression of WebRC is still doing pretty well. [16:11.840 --> 16:13.520] Obviously, it's not pandemic well. [16:13.520 --> 16:18.400] But given the circumstances, we're better than it was before the pandemic. [16:18.400 --> 16:21.760] We have more developers involved and it seems that developers that are involved, [16:21.760 --> 16:24.080] it is a lot of measures to say that they're more mature. [16:24.080 --> 16:25.000] They're better developers, right? [16:25.000 --> 16:27.480] They're contributing more than in the past. [16:27.480 --> 16:28.720] And I think that's a great thing.