[Session chair] Last but not least, it's Christophe Massiot, but first of all I would like to thank him for his work organizing the dev room all day. Thank you. FOSDEM is a volunteer event. Christophe will be talking about distributing multicast channels to third parties: a case study with open source software and virtualization / SR-IOV.

[Christophe Massiot] Thank you Kieran, and thank you also for co-organizing the dev room with me. If next year there are more volunteers, I think we would be happy to have them.

So, I'm Christophe Massiot. I've been in the broadcast business for quite some time now, and I run a company called EasyTools. One of the purposes of this company is to distribute linear channels: either we create them, or we get them from satellite, from abroad, or in data centers, and we deliver them somewhere in the world, to network operators and so on. So one of the critical parts of doing that is knowing how to distribute multicast to other people.

How does it usually work? First I must clarify what a linear channel is, from my point of view at least. It's basically a transport stream: MPEG-TS over UDP or RTP, depending on your religion, both exist, and generally multicast. Usually you put seven TS packets inside one UDP frame and you're done. And it's a continuous stream, all the time.

How do you exchange these kinds of streams? Usually you take a rack in a well-known point of exchange; in Paris there is one called Telehouse, it's very popular, he knows it. So you have your rack, you put your switch in, and you buy cross-connects from your data center to your peers. Your peers can be your sources, the people who provide your streams, or they can be the places you distribute to: the distributors, the operators and so on.

For safety reasons you will want to have each source and each operator in a different VLAN, so that they don't see each other, for confidentiality, and also because some content is a little bit sensitive. So the question is: how do you copy a multicast channel from a source VLAN to a destination VLAN? It sounds stupid and perfectly easy to answer, but it's not so easy.

There is a pure network solution. Some switches have a feature called multicast service reflection; that's what Cisco provides, and I'm pretty sure there is an equivalent on Juniper and that kind of thing.
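As an aside, a quick way to see what such a linear channel looks like on the wire is tcpdump; the VLAN interface, group and port below are placeholders, not values from the talk:

    # Hypothetical interface, group and port; adjust to your own VLAN and channel.
    tcpdump -ni eth0.210 -c 10 'udp and dst host 239.1.2.3 and dst port 1234'
    # Each datagram should report a UDP length of 7 x 188 = 1316 bytes,
    # or 1328 bytes if the stream is wrapped in RTP (12-byte RTP header).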
Multicast service reflection, though, is not so widely available; lots of switches don't handle it, only a small range of them. Basically you can type a command to say "copy this multicast address to this destination", and so you copy from one VLAN to another VLAN. But there are good chances that your switches do not support it. And you cannot handle complex use cases with it either: some operators want RTP, some don't want RTP, and you cannot remove RTP that way. Some operators also want a specific PID, a specific service ID, a different service name and so on, which of course you cannot do either. So generally you end up adding devices on top of your switch; you have to plug in a device that does that kind of job.

The broadcast solution, the most popular one, is what we call the DCM. It's a very popular product, initially from Cisco, now from a company called Synamedia. It's a hardware appliance with electronics inside, and it does all this switching work; it can even transcode, actually, with some cards. It does a lot of things, but it's also very expensive, of course.

But we have an open source alternative. It's actually something I wrote maybe 15 years ago, called DVBlast. I've given a lot of talks about it, but in this use case it also helps. Originally it was written as a DVB demultiplexer: you have a satellite or DTT card, you want to receive a transponder and split each channel onto its own multicast address; that was the original goal of DVBlast. But with the -D option you can also read from a multicast channel, and in the arguments you can also give the IP address of the interface you want to read it from, so basically which VLAN you will read it from. So that reads a multicast stream from a specific VLAN.

There is a configuration file associated with DVBlast, and you can put as many lines as you want, so as many distributors as you want. On each line you just put the multicast address you want to send the stream to, and you can also, optionally, give the address of the interface you want to send it from. The VLANs you have on your switch appear as network interfaces on your server, so you can decide to which VLAN you send, and on which multicast address. You have a number of options to turn RTP on or off, you can remap PIDs, the SID, the service name, and you can even spoof the source address, which is very useful in case your peer wants to do IGMPv3: in that case they ask you to send from a specific source address, and that's the easiest way to comply.
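A minimal sketch of that kind of setup follows. The option names (/ifaddr=, /srcaddr=) and the layout of the configuration lines are recalled from memory rather than taken from the talk, so treat every address and field below as a placeholder and check the DVBlast documentation for the exact syntax:

    # Sketch only: addresses are hypothetical and the config format is simplified.
    cat > dvblast.conf <<'EOF'
    239.20.0.1:5000/srcaddr=10.0.20.2  1  1
    239.30.0.1:5000                    1  1
    EOF
    # Read the channel from the source VLAN (selected by its interface address)
    # and fan it out to the destinations listed in dvblast.conf:
    dvblast -D @239.10.0.1:5000/ifaddr=10.0.10.2 -c dvblast.conf

Each configuration line is meant to stand for one distributor: a destination multicast address on that distributor's VLAN, plus per-line options for RTP, remapping or source spoofing.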
So: problem solved, end of talk, thank you very much. Well, not quite; I want to add a little something. You could simply run hundreds of DVBlast instances on one server, but for several reasons I wanted to do some virtualization.

Why virtualize? Because I have different customers, and each customer has different channels. Some of them carry adult content, some of them children's content, and maybe I don't want to mix that on the same server. And I don't have the money for multiple servers at Telehouse, because it's expensive. So for client isolation it's required. Also, some of my clients have direct access to their VM, because that's the service I sell, so again I don't want them to see the streams of their competitors.

So I used Proxmox. It's a very nice distribution based on Debian, essentially a big front end over KVM; very useful. In Proxmox, you bridge each of the VLANs you get from the switch, and in your guest virtual machine each one appears as a network interface with a virtio driver. The virtio driver is the most optimized one; you can also emulate a real network card, but it's much slower, which is why everybody uses virtio.

So everything works fine, end of talk, thank you very much for coming. There is just one little problem: the morning you get called by one of your big customers who says, "I have discontinuities on what you send me." Discontinuities, the thing everybody in broadcast dreads. So you start another DVBlast on the IP address in question, and you see nothing; everything is fine. But they insist, and another customer complains. So what you do is rack another server (this is a real use case, by the way, which is why I'm telling you my life story), you listen to what the first server is sending, and then you see it: a lot of discontinuities indeed.

You dig a little, and you find that the virtio driver (I don't know if it's on the guest side or the host side) reorders some packets. Probably there are several queues inside the driver, and if a packet is unlucky and lands in a queue that for some reason doesn't run for a while, it gets pushed out late. Now the network people will tell me that there is no guarantee on the order of UDP packets; that's the specification, and that's true. But we are in an industry that relies on it, because if you don't have RTP, there is absolutely no way to reorder your UDP transport stream.
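One thing worth checking in that situation, sketched below, is how many queues the virtio NIC exposes inside the guest and whether forcing a single combined queue changes the behaviour; the interface name is a placeholder, and since it is not clear whether the reordering happens on the guest or the host side, this is a diagnostic step rather than a guaranteed fix:

    # Inside the guest: show how many queues the virtio NIC currently uses...
    ethtool -l ens18
    # ...and force a single combined queue so all packets share one queue.
    ethtool -L ens18 combined 1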
So I had to find a solution. I cannot tell my customer, "no, you have to use RTP and I don't care"; I cannot, you cannot.

The first workaround I found is to use another driver. It's also a driver designed for virtualized systems, by VMware, but it's supported by Proxmox, KVM and so on: VMXNET3. Presumably this one only has one queue, and it solves the problem. The only issue is that it uses about 30% more CPU, and this server does nothing else. I already have 64-core AMD EPYC servers that are not full, but they are more than 50% loaded already, so there is a real need to optimize.

That's why I started looking at alternatives, at some clever ideas I found on the net, and one of them, which you have probably heard of, is called SR-IOV. So what is SR-IOV? It's a feature of some network cards; not all network cards have it. In a normal installation the network card is owned by the host, you have some kind of software switch handled by the virtualizer, and from there virtual interfaces go to each of the virtual machines. In an SR-IOV setup, the network card creates new PCI devices, each one a distinct PCI device, and with a feature called VT-d on Intel (there is an equivalent on AMD) you can dedicate a PCI device to a virtual machine. That way the virtual machine talks directly to the PCI device, without anything going through the host. So that looks quite interesting. These new devices are called virtual functions, VFs, and in the VM you just need a VF driver, which is included in Linux anyway.

In my case I used Intel cards. They are not the only ones doing SR-IOV, but things may be a little different with other cards; not all cards have the same features.

Now, using SR-IOV is a little bit tricky. First, it requires support from the motherboard, the CPU, the BIOS itself, and of course the network card, as I was just saying. You have to enable a number of features in the BIOS: the IOMMU, but also ACS and ARI, which is a kind of PCI Express routing mechanism; lots of things. We spent a couple of hours, I think, just setting it up. My personal advice would be to upgrade the drivers, the Intel drivers, to the latest version, and the card firmware (there is a firmware running on the card) to the latest version supported by the driver.
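On the host side, besides the BIOS settings, the IOMMU usually also has to be enabled on the kernel command line. A minimal sketch for an Intel machine booting with GRUB; the flags shown are the commonly documented ones (use amd_iommu=on on AMD) and are not taken from the talk:

    # /etc/default/grub : enable the IOMMU (Intel example)
    GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt"
    # Regenerate the GRUB configuration and reboot:
    update-grub
    # After the reboot, check that the IOMMU actually came up:
    dmesg | grep -i -e DMAR -e IOMMU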
I once skipped that when I started my tests, and I ended up with half of my VLANs working and the other half not working, with absolutely no way to know what was going on; I just had to reboot. And it was in production, so I had to guess what had happened.

Creating the virtual functions is actually quite easy: it's just an echo into a /sys device, and with my Intel card I can create up to 64 virtual functions, so typically up to 64 VMs. The passthrough is very easy in Proxmox, just a menu. Potentially, each VM then has access to all the VLANs; that can also be a drawback, because it sees all the traffic and it can send packets on any VLAN, so you have to trust your clients a little bit, and it may not be suited to every situation. But it's quite convenient, because you don't have to create a bridge every time you add a new VLAN: you just create an 802.1Q network interface inside your VM and that's all, so it's actually easier to manage.

So everything looks perfectly fine, end of the talk? Not yet. I talked about sending to VLANs, but receiving multicast from a VLAN is another problem. Initially it looks like it works, so you put it in production, maybe a bit early.

How does a network card handle multicast? First I should maybe remind you how it works. You take your multicast IP address and translate it into a multicast MAC address: the end is the same and the beginning is simply dropped. Then you tell the card: every packet that arrives for this MAC address, I want it. That's how it works on any normal PC; it's called a MAC filter and it's quite a standard feature.

The trick is that on these cards the number of MAC filters is limited, and the way they limit it is a little bit stupid: they take the total number of MAC filters the card has and divide it by the number of VFs you created, so by 64. In the end, according to my calculations, the limit is around 100 multicast addresses per VF. A hundred may sound like a lot, but I have hundreds of multicast streams in my network, so you may well reach it. You could think about segmenting your virtual machines so as not to go above the threshold, but it's still a dangerous game, because of a behaviour of the Intel driver: if you reach the limit, it fails silently. Well, you do get a message in the kernel log, but nobody reads that.
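A sketch of those two steps; the interface names, VF count and addresses are placeholders, and the sysfs path is the standard Linux one rather than anything Intel-specific:

    # On the host: create 64 virtual functions on the physical port enp59s0f0.
    echo 64 > /sys/class/net/enp59s0f0/device/sriov_numvfs

    # Inside a guest that received one VF (seen here as ens16): add an 802.1Q
    # interface for VLAN 210 instead of creating a bridge per VLAN on the host.
    ip link add link ens16 name ens16.210 type vlan id 210
    ip link set ens16.210 up

    # For reference, the IP-to-MAC mapping mentioned above keeps the low 23 bits
    # of the group address: 239.1.2.3 maps to 01:00:5e:01:02:03.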
So you will try again, and try again, and try again, and after just a few attempts, five or so, the kernel decides that your VM has gone crazy and stops talking to it. You will still receive your multicast, but if you have any other command to send to the card, like creating a new VLAN (which can happen: you get a new distributor), you can't. You have to reboot your VM; fortunately, not the host. So it's not that practical, and to be honest, I have a patch in all my installations that disables that behaviour, that disables using the MAC filter at all, actually, because I have found it impractical in reality.

[Audience] Is there a need for the MAC filtering at all, given that with modern switches you are not going to get any multicast traffic unless you do an IGMP join?

So the question is: do we need the MAC filter at all? The thing is, you will receive approximately two or three gigabits of traffic, and maybe your VM does not need all of that. So you have to decide which multicast groups you want; actually, that's my next point.

My first idea for a workaround was to put everything in promiscuous mode. Promiscuous mode means what it says: your VM will receive all the traffic that is received by the network card. It looks like a good idea, but it dramatically increases CPU usage, because out of maybe two gigabits per second of data you may only need 200 megabits per second, and the kernel has to filter out the rest. So your kernel does a lot of work, and from what I've measured, the gain you got from going from virtio to SR-IOV, you lose it right there.

There is another problem: imagine you have two gigabits per second on your network and 20 virtual machines. The network card will send 20 times two gigabits per second to your virtual machines, which means 40, and 40 is the limit of the card. At that point you start losing packets randomly; again silently, you will not know what happened. And of course you only find out in production, because when you first start with one, two, three VMs it works perfectly, so you say "yes, I have my solution", then you put the full load on it and one day it just stops working.

Activating promiscuous mode itself is actually quite easy; again, it's an echo into a /sys file.
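For reference, a sketch of what that can look like with plain iproute2; the speaker mentions a sysfs knob, so these are assumed equivalents rather than his exact commands, and on Intel VFs promiscuous or multicast-promiscuous mode typically also requires the VF to be marked as trusted on the host:

    # On the host: allow VF 3 of the physical port to enter promiscuous mode.
    ip link set dev enp59s0f0 vf 3 trust on
    # Inside the guest that owns this VF: accept all traffic seen by the card.
    ip link set dev ens16 promisc on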
So I found a second workaround, which is a little bit better, and I'm using it in production. It's a feature, maybe only on Intel cards, I don't know if it exists on other brands, called VLAN mirror. Basically it tells the card to send all the traffic belonging to a VLAN to a particular virtual function, to a particular VM. So it's the same kind of promiscuous feature, but for a single VLAN. That is actually not bad practice, because most people who have ever done multicast, when they have a backbone, put all their multicast addresses in the same VLAN, maybe with different address ranges, maybe not, and expect the receiver at the other end to pick whichever multicast address it wants. This approach forces you to have a different VLAN per customer, which is actually not a bad idea. But there is one drawback: a given VLAN can only be sent to one VM. So if you have, say, a big broadcaster sending you channels and several VMs need those channels, you cannot use that solution, because only one of them will be able to read from that VLAN.

So I have a third workaround: just go back to good old virtio. After all, why not? The packet reordering was only on TX, not on RX, so it works. It even has an additional benefit, because the bridge in Proxmox has IGMP snooping most of the time, so you only receive the multicast addresses you subscribe to. It's actually a good solution, but it means you end up with some interfaces to read packets from and other interfaces to send packets to, which is a bit of a mess; still, it's a good compromise. That's my current compromise: VLAN mirror or this one, depending on the nature of the VLAN I have to read from.

So all is good in the best of all possible worlds, end of the talk, thank you very much? Not quite. There is another topic I haven't mentioned yet: what if I want to read a multicast stream coming from another VM on the same server? That doesn't work, because when you send through SR-IOV, the traffic just goes out to the switch. As far as I know (maybe some of you have a solution), there is no way to get the traffic back into the network card and into another virtual machine. You can work around it with different VLANs, but if you want to read from the same VLAN, then you cannot.
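As an aside, the mixed setup described above, virtio for receiving through a bridge with IGMP snooping and a passed-through VF for sending, can be expressed with Proxmox's qm tool roughly as follows; the VM ID, bridge name and PCI address are of course placeholders:

    # Hypothetical VM 201: one virtio NIC on a per-VLAN bridge for receiving...
    qm set 201 --net1 virtio,bridge=vmbr210
    # ...and one SR-IOV virtual function passed through for sending.
    qm set 201 --hostpci0 0000:3b:02.1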
For that, there is another solution, with the Intel card again: a feature called egress mirror. You can make it so that everything output on virtual function number one is also mirrored to virtual function number seven. So virtual function number seven sits on the receiving side and virtual function number one on the transmitting side. That actually works, and I also use it in production.

So, conclusions. Multicast in a virtualized environment is no picnic, actually, and I'm surprised you don't find many papers about it. I have struggled with this literally for years, with a number of problems in production, because you only see the problems in production, you only see them under load. So it has been a little bit tiring. Thank you very much for listening, and if you have any questions...

[Audience] I'm guessing you tried this, but in virtio there are ways to ask it to be dumber, by not doing some of the things it does; you could probably have asked it to simply keep the packets in order.

You say that in virtio there are ways to ask it to be dumber? I'm not sure I've actually tested that. I'm pretty sure it would have the same effect as using VMXNET3, and you would probably see an increase in CPU consumption. In any case I wanted to go to SR-IOV for other reasons too, because I wanted my VMs to talk directly to the network card. But I think it was better practice than virtio. Yes, James?

[Audience] Can you go back to the third slide, please?

Yeah. Okay, there was no question there.

[Audience] AWS provides EC2 instances with SR-IOV.

With SR-IOV? Okay, so he says AWS provides instances with SR-IOV. I guess in the cloud you don't have that kind of problem, because usually you do SRT, and SRT doesn't care if it needs to reorder packets; it will.

[Audience] Did you consider either disabling multi-queue on the host, or forcing each VM onto a different queue, so that by definition you would not be reordering?

Forcing each VM onto a different queue, I'm not sure that works. Disabling queueing would probably work, but with additional CPU usage. As I said, I have 64-core EPYC servers which are pretty big, and they are not full, but...

[Audience] It should be possible to pin; if you use XPS CPUs you can pin a process to a queue.
In the end I don't know if you can do that; I'm not sure whether the inversion happens on the host side or on the guest side.

[Audience] Disable multi-queue on your host...

Yeah.

[Audience] ...and in your guest.

Yeah. But yes. Next question?

[Audience] Why did you want to use SR-IOV in the first place? What other use cases are you thinking about with SR-IOV?

The question is why I wanted to use SR-IOV. Speed, speed is the first argument, actually, but also some kind of clean design. Maybe I'm a little bit of a purist, and I wanted my VMs to have direct access, physical access I mean, to the network. Also, if you really want the list of arguments: the traditional way of doing it, VLAN bridging, works, but in Proxmox, and probably in other environments, you create a bridge for each VLAN, and so you have a network interface for each VLAN. Maybe there is a way to avoid that, but I didn't look too much into it. It also means you hit a limit, because I think you are limited to 32 network interfaces per VM, and that limit is low enough to matter. Otherwise you have to switch to a VLAN-aware bridge, but considering the number of problems I have had without it, I thought maybe I will not go there. I haven't tried it, to be honest.

[Audience] You probably don't see papers about this because most people who do it deliver over HTTP.

Exactly. As you say, SR-IOV is mainly used by people who do HTTP or something TCP-based.

[Audience] And have you considered using SR-IOV for non-networking things?

Ah, you are asking whether I'm considering SR-IOV for non-networking things. I suppose you mean GPUs, maybe?

[Audience] Maybe. My question is whether you envision other places where it would help. GPU is almost the obvious one, but have you ever thought of using it for other things?

GPUs we have actually tried. I'm not sure the ones we had supported SR-IOV, but...

[Audience] Only the really expensive ones.

Yeah, they have features like VT-g, I think. But in our experience it didn't work well. We would like to be able to share a GPU among several instances, so that you could decode or encode on different VMs and so on. The virtual machines we have at Telehouse don't do that, because Telehouse is not the place to do transcoding; it's too expensive. But fundamentally, yes, I would be interested in having that for GPUs as well. At least GPUs; if you have any other ideas, tell me.
[Audience] That's the whole thought behind my question. Just mentioning: there is support for SR-IOV on some GPUs for decode, and they are expensive, but it's just landing in Proxmox and QEMU, so you can actually play with that if you have to.

[Audience] Could you use this for SDI or NDI passthrough, to process directly?

You don't need SR-IOV for that. The question was: can you use this for SDI or NDI passthrough? Well, NDI is network. But SDI, I think we tried it, and you can pass through a Blackmagic device, for instance; you pass it through with VT-d. With VT-d I have already passed through DVB tuners, and you can probably pass through DVB-ASI, probably SDI, even though we don't do that on a regular basis.

[Audience] Have you tried a different connector on a different VM?

You have four inputs, but a different connector on a different VM would not be possible with the Blackmagic driver, because it's seen as one device, one PCI device, so it's not possible. If you design your own card, maybe. Siwan, you may want to answer that? No?

[Audience] Yes, we could do it, yes. There's no technical reason against it.

Yes. Gaff?

[Audience] Any container technology?

Could we use containers? The thing is, as you know, we have our own software, and our software is delivered as a disk image, so it can only run as a virtual machine. The idea of using containers is not a bad one. There is less isolation, though, than with a virtual machine. But it's something I would like to try in the medium term, yes, because there have been a lot of improvements in the past years regarding networking in that space, and you can probably have a direct macvlan interface, some kind of physical interface that the container gets directly. Yes, so he says we can probably have a direct macvlan in the container, as a direct physical interface.

I think we are running out of time. Wow, oh, so...