[00:00.000 --> 00:12.280]  Hi. Welcome to my talk, "So you want to build a deterministic networking system: a gentle
[00:12.280 --> 00:17.760]  introduction to time-sensitive networking". Just out of interest, how many of you have
[00:17.760 --> 00:23.720]  heard of TSN or time-sensitive networking so far? That's quite a few for a networking
[00:23.720 --> 00:30.720]  session. That's great. How many of you have already worked with that? Not so many. Okay.
[00:30.720 --> 00:38.200]  You will after that talk. Yeah. Who am I? I think I'm a former system engineer. I worked
[00:38.200 --> 00:44.680]  a lot with time-sensitive networking and its predecessors. I also took part in standardization.
[00:44.680 --> 00:50.160]  So I also did some of that. And since last summer, I have worked as a kernel developer at
[00:50.160 --> 00:57.160]  Pengutronix. That's a German Linux consulting and support company. We have roughly 7,600
[00:57.160 --> 01:04.240]  patches in the kernel. And we also do consulting for real-time networking among many other
[01:04.240 --> 01:11.720]  things. And by the way, we're hiring, of course. Now, to what we will look into today: we will
[01:11.720 --> 01:19.600]  look into applications. I will give you some examples of why you would want to do
[01:19.600 --> 01:27.240]  real-time data transport over a network, what the implications
[01:27.240 --> 01:32.800]  of that are, and what the requirements of these applications are. We will look into the basic
[01:32.800 --> 01:38.520]  building blocks. So sorry for the folks who already know about that. And we will talk
[01:38.520 --> 01:46.080]  a bit about which Linux user space and kernel components are used in building these applications.
[01:46.080 --> 01:51.760]  And I will sum up the state of the union a bit. And then, just as an announcement in
[01:51.760 --> 01:56.760]  advance, there are some bonus slides where I will give some more details and some references
[01:56.760 --> 02:02.320]  to open-source projects already working with TSN. So if you're interested in that, just
[02:02.320 --> 02:08.560]  download the slides from the Penta and, well, check out the links. I also included an example
[02:08.560 --> 02:14.560]  of how to basically glue together a stage box, so a transport system for audio data
[02:14.560 --> 02:21.320]  over the network. That won't make it into the talk because it has been shortened to half
[02:21.320 --> 02:29.280]  an hour. So the example I will focus on today is audio video bridging. So if you want to
[02:29.280 --> 02:35.760]  transport real-time data over a network for an application such as this talk, you want
[02:35.760 --> 02:40.720]  to have as small a jitter buffer as possible to reduce latency in the
[02:40.720 --> 02:47.000]  system because if you transport data over a traditional network, packets could get dropped.
[02:47.000 --> 02:53.280]  So you have to resend them or you have to make sure that somehow, magically, interfering
[02:53.280 --> 02:59.960]  traffic doesn't do you any harm. And that usually involves quite large jitter buffers
[02:59.960 --> 03:05.480]  up to several seconds. And if I talk now and you hear me from stage and you hear me from
[03:05.480 --> 03:09.360]  the PA four seconds after that, that would be quite annoying. So you want to cut that
[03:09.360 --> 03:20.040]  down to as low as possible transmission latency, overall end-to-end latency. Of course, for
[03:20.040 --> 03:27.880]  TSN, which started as audio video bridging or AVB as a standard, they came across the
[03:27.880 --> 03:33.680]  fact that this technology could also be useful for quite some other applications. Most of
[03:33.680 --> 03:38.720]  the customers do like machine control stuff with that. So if you have a large production
[03:38.720 --> 03:45.360]  line and you want to transmit data between your PLC and your servo drives or your robot
[03:45.360 --> 03:53.880]  arms and stuff, you also want to make sure that your control data arrives in time at
[03:53.880 --> 04:00.920]  the actuator or your sensor data is read in by a certain point in time. And that's quite
[04:00.920 --> 04:06.680]  important to keep that timing. Same holds, of course, for aerospace and automotive and
[04:06.680 --> 04:12.960]  railways and stuff. I won't go into these applications today because we're, as I said,
[04:12.960 --> 04:19.080]  short on time. The first requirement of said applications is that you need to establish
[04:19.080 --> 04:24.800]  a common time base in the network. That's due to the fact that measuring time
[04:24.800 --> 04:30.680]  in computers is basically hooking up a hardware counter to a crystal oscillator.
[04:30.680 --> 04:37.160]  These crystal oscillators tend to have frequency drift over time, especially with temperature.
[04:37.160 --> 04:42.400]  And due to the different switch on points in time, you also have quite large offsets.
[04:42.400 --> 04:50.360]  So if you start one device, say at 12 o'clock and the other at 1 p.m., they have one hour
[04:50.360 --> 04:58.840]  of offset in there. So you want to make sure that all your network devices have a common
[04:58.840 --> 05:08.680]  notion of how fast time passes and a common sense of what time it is. Because
[05:08.680 --> 05:13.800]  lots of scheduling decisions for networking traffic may depend on timing. Also, for some
[05:13.800 --> 05:18.840]  applications such as the audio example, you also would like to regenerate your audio sampling
[05:18.840 --> 05:26.120]  clocks. So basically in order not to introduce any additional degradation in audio quality,
[05:26.120 --> 05:34.080]  you want to make sure that your sampling clocks of your ADC and DAC run basically in lockstep.
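A minimal sketch of why the sampling clocks need to run in lockstep, assuming two free-running crystals whose frequencies differ by a fixed amount. The 48 kHz sample rate, the 50 ppm mismatch and the function name `sample_slip` are illustrative assumptions, not figures from the talk.

```python
def sample_slip(sample_rate_hz: float, mismatch_ppm: float, seconds: float) -> float:
    """Return how many samples two unsynchronized sampling clocks drift apart.

    mismatch_ppm is the relative frequency difference between the ADC clock
    and the DAC clock in parts per million (an assumed, constant value).
    """
    return sample_rate_hz * mismatch_ppm * 1e-6 * seconds


if __name__ == "__main__":
    # Hypothetical numbers: 48 kHz audio, 50 ppm mismatch, one minute of audio.
    slip = sample_slip(48_000, 50, 60)
    print(f"ADC and DAC drift apart by about {slip:.0f} samples per minute")
```

With a jitter buffer of only a few samples, a slip of this size would run the buffer empty or full within seconds, which is why the sampling clocks are regenerated from the shared network time.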
[05:34.080 --> 05:39.320]  And that is why you want to make sure that your time is distributed evenly. And the way
[05:39.320 --> 05:46.120]  that this is done usually in networks is just shown basically in this old style picture.
[05:46.120 --> 05:52.360]  You elect a so-called master clock. So basically that's the best clock reference in your network
[05:52.360 --> 05:58.600]  or the most stable clock reference in your network. And then basically you compare all
[05:58.600 --> 06:05.320]  other clocks to that clock reference and they have to adjust their local time to that reference
[06:05.320 --> 06:11.840]  time. It's basically just as those three gentlemen do in that picture. I like that comparison
[06:11.840 --> 06:17.920]  because you find a lot of analogies in the standards to just the way that works with
[06:17.920 --> 06:28.880]  like pocket watches. And if you look into that, you will find that basic idea quite
[06:28.880 --> 06:36.400]  useful to keep in mind. Now the other thing we want to have guaranteed is as I already
[06:36.400 --> 06:42.800]  said, bounded transmission latency. So if we go across the transmission of a data stream
[06:42.800 --> 06:48.480]  in the network, so that's what the standard calls a talker at the left. And that's what
[06:48.480 --> 06:54.440]  the standard calls bridges. Usually, as we're dealing with layer two, those are Ethernet switches.
[06:54.440 --> 06:59.440]  And on the right, that's what the standard calls a listener. You could also call them a source
[06:59.440 --> 07:07.600]  and a sink. But the standard talks about talkers and listeners. And the packet goes from bridge
[07:07.600 --> 07:15.280]  to bridge along its path across the network. And each switch, or bridge, basically has an
[07:15.280 --> 07:21.200]  ingress queue and a switch fabric and an egress queue. That's due to the fact that you can
[07:21.200 --> 07:27.680]  only transmit one packet out of a certain network port at a time. If
[07:27.680 --> 07:32.880]  another packet arrives at another port for that destination port, you have to store it.
[07:32.880 --> 07:37.400]  And you have to wait until the last transmission is done. And then you can transmit the next
[07:37.400 --> 07:43.040]  packet. And this introduces what's called the residence time in each switch. So even
[07:43.040 --> 07:49.160]  if you have a perfect path through the network without any additional interfering
[07:49.160 --> 07:54.840]  traffic, you add a little time at each step your payload packet travels through the network.
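A rough sketch of how that per-hop time adds up on an otherwise idle path, assuming store-and-forward bridges. The function names, the fixed per-bridge residence time and the choice to ignore cable propagation delay are simplifying assumptions for illustration only.

```python
def serialization_delay_s(frame_bytes: int, link_bps: float) -> float:
    """Time needed to clock one frame onto a link of the given bit rate."""
    return frame_bytes * 8 / link_bps


def path_latency_s(frame_bytes: int, link_bps: float,
                   bridges: int, residence_s: float) -> float:
    """End-to-end latency from talker to listener without interfering traffic.

    Assumes every bridge receives the whole frame before forwarding it, so
    the frame is serialized once per link (bridges + 1 links in total), and
    that each bridge adds the same residence time. Propagation delay on the
    cables is ignored.
    """
    links = bridges + 1
    return links * serialization_delay_s(frame_bytes, link_bps) + bridges * residence_s


if __name__ == "__main__":
    # Hypothetical path: 125-byte frames, 100 Mbit/s links, 2 bridges,
    # 5 microseconds of residence time per bridge.
    latency = path_latency_s(125, 100e6, bridges=2, residence_s=5e-6)
    print(f"end-to-end latency: {latency * 1e6:.1f} microseconds")
```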
[07:54.840 --> 08:00.120]  So if our audio starts here, it's a bit later when it arrives here, and a bit later when
[08:00.120 --> 08:07.240]  it arrives there, and so on and so forth. So that's fine, as long as you have no interfering traffic
[08:07.240 --> 08:12.760]  because if you have additional interfering traffic, and that might be because we of course
[08:12.760 --> 08:17.560]  want to use our audio on converged networks. So we want to use the same network for say
[08:17.560 --> 08:25.720]  our live PA system and for our internet connection. And we want to download a large file
[08:25.720 --> 08:33.680]  because we want to download a presentation recording from FOSDEM. And basically that's
[08:33.680 --> 08:41.920]  where this entity comes in and creates a large amount of traffic here.
[08:41.920 --> 08:47.680]  This will cause the packet here to be delayed until it's sent out of the egress port. And
[08:47.680 --> 08:54.120]  basically it won't arrive in time. And if we go for jitter buffers as small as possible,
[08:54.120 --> 08:59.880]  that's a problem because we have a buffer underrun at the listener side. And basically
[08:59.880 --> 09:04.840]  we have audio dropouts in the audio case, or we have stalling motors in the industrial
[09:04.840 --> 09:11.000]  control case. That's something we have to avoid under any circumstances. So basically
[09:11.000 --> 09:18.680]  something we want to have is quality of service. And so the picture, of course, you're professional
[09:18.680 --> 09:22.760]  networking engineers, so you don't need that picture, but the picture I like to use for
[09:22.760 --> 09:29.000]  that is a bus lane in the street because also the bus runs in a more or less isochronous
[09:29.000 --> 09:39.600]  way. So you send those buses or packets down the lane and the way not to be hindered by
[09:39.600 --> 09:46.440]  the interfering traffic there is just basically to introduce a priority lane. And that is
[09:46.440 --> 09:54.080]  what we also use in networks basically when we introduce quality of service measures.
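The bus-lane picture can be sketched as a strict-priority egress queue, where the highest-priority frame waiting at a port always leaves first. This is only an illustration of the idea, not how any particular switch or Linux qdisc is implemented; the class name, the priority values and the frame labels are made up for the example.

```python
import heapq
from itertools import count


class StrictPriorityQueue:
    """Egress queue that always releases the highest-priority frame first."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, priority: int, frame: str) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), frame))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def transmit_order(frames):
    """Return the order in which (priority, frame) pairs leave the port."""
    queue = StrictPriorityQueue()
    for priority, frame in frames:
        queue.enqueue(priority, frame)
    return [queue.dequeue() for _ in range(len(queue))]


if __name__ == "__main__":
    # Two best-effort download frames are already waiting when audio arrives.
    waiting = [(0, "download-1"), (0, "download-2"), (3, "audio-1")]
    print(transmit_order(waiting))
```

Note that a frame that is already being transmitted still has to finish, so the priority lane bounds the waiting time rather than removing it.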
[09:54.080 --> 09:59.560]  Another thing we need for at least some of these applications is link layer redundancy.
[09:59.560 --> 10:06.120]  So imagine if there's a mixing desk right in the back and we run a network link back
[10:06.120 --> 10:12.200]  there and someone just trips over that link, rips out the cable, or maybe it's a fiber
[10:12.200 --> 10:17.320]  link and someone stomps on the fiber link, bad things happen. And basically our stream
[10:17.320 --> 10:24.920]  is over. We don't want to have that. So we want to introduce means of having redundancy
[10:24.920 --> 10:32.080]  schemes there. Basically you can think of it as a real-time capable, real-time healing,
[10:32.080 --> 10:41.440]  no-waiting-time, spanning-tree-ish thing you want to have. The standard spanning
[10:41.440 --> 10:46.640]  trees don't quite cut it for these kinds of applications. So we have to introduce other
[10:46.640 --> 10:52.120]  stuff there. We have some other application requirements there. They're not so important
[10:52.120 --> 11:00.880]  so I leave them out for now. Now, what kernel and user space components
[11:00.880 --> 11:08.520]  do we have to implement that? We will look into what the TSN components are later, or
[11:08.520 --> 11:13.600]  what the TSN standards are, because that's basically just numbers and letters. So for
[11:13.600 --> 11:20.160]  time synchronization, especially in TSN, we use gPTP. That's a flavor of the Precision
[11:20.160 --> 11:28.000]  Time Protocol, the generalized Precision Time Protocol, which you can think of as standard
[11:28.000 --> 11:35.960]  PTP, IEEE 1588, boiled down to layer 2. So of course we're dealing with raw Ethernet frames
[11:35.960 --> 11:41.800]  so we can't use UDP for transport and it also has some other quirks but they're not
[11:41.800 --> 11:47.040]  too important right there. And the way we do that with the Linux kernel, we have the hardware
[11:47.040 --> 11:53.280]  timestamping units and the PTP hardware clocks. That's basically the interface to
[11:53.280 --> 12:00.880]  hardware clocks in your Ethernet MAC or PHY. And the user space component to run all the remaining
[12:00.880 --> 12:05.560]  stuff is linuxptp. That's basically the way it works and it works quite well. You
[12:05.560 --> 12:12.120]  can achieve down to several nanoseconds precision from point to point with that. For traffic
[12:12.120 --> 12:17.520]  shaping, that's the quality of service measure we want to employ. The kernel has the tc
[12:17.520 --> 12:26.200]  subsystem and usually if you configure that manually you use iproute2, or netlink if you want
[12:26.200 --> 12:33.840]  to do that programmatically and that's basically the way it works and we will look into a bit
[12:33.840 --> 12:41.280]  of detail later. For network management, so basically if you have to reserve a data flow
[12:41.280 --> 12:46.640]  from a talker to a listener, that's where it gets a bit sketchy because that's of course
[12:46.640 --> 12:53.120]  user space daemons and there aren't many. There's also a problem because there are several ways
[12:53.120 --> 13:00.240]  of doing that. The traditional way, or AVB style, the initial implementation, used the
[13:00.240 --> 13:10.360]  so-called Stream Reservation Protocol. The modern way, especially for pre-calculated or pre-engineered
[13:10.360 --> 13:18.840]  networks, is using YANG/NETCONF extensions, and there are some daemons for that but support
[13:18.840 --> 13:24.800]  for the TSN extensions is not too great. So if you're into that, that's quite a nice
[13:24.800 --> 13:35.720]  thing to work on. For the real-time data packetization, that's mostly user space. Of course you want
[13:35.720 --> 13:44.360]  to use some kernel features like the ETF qdisc and XDP to have as low overhead as possible
[13:44.360 --> 13:50.920]  and to make sure that your transmission is sent out as isochronously as possible and
[13:50.920 --> 13:58.560]  you want to use offloading for that and then there's some very application-specific user
[13:58.560 --> 14:05.960]  space components. So for audio-video stuff, you can use the GStreamer plugins and for
[14:05.960 --> 14:13.280]  industrial control, I'd recommend using the open62541 OPC UA implementation. That's not quite
[14:13.280 --> 14:20.880]  finished yet but it's a good starting point at least. And for the link layer redundancy,
[14:20.880 --> 14:29.080]  that's what PCR and FRER are for. Basically the standards have been finished for one or two years.
[14:29.080 --> 14:35.640]  There's not much hardware supporting that yet and you really want to have hardware offloading
[14:35.640 --> 14:42.440]  for that. So you're basically down to proprietary vendor stacks at the moment. There are efforts
[14:42.440 --> 14:49.840]  to put stuff mainline but they are not quite there yet. But stuff is coming and that's
[14:49.840 --> 15:01.680]  the good thing with that. So I think one slide is missing there, which is not too big a problem.
[15:01.680 --> 15:09.520]  Yes, one slide is missing. So basically the stuff, how to put stuff together with TSN,
[15:09.520 --> 15:19.120]  I will summarize it without a slide. With TSN we have gPTP, that's IEEE 802.1AS for
[15:19.120 --> 15:27.160]  the IEEE standard fetishists here in the room. And traffic shaping, the basic standard stuff
[15:27.160 --> 15:34.520]  is the credit-based shaper but there are more time-aware shapers available right now. They
[15:34.520 --> 15:41.240]  are basically making more efficient use of your network and the way that works is basically
[15:41.240 --> 15:50.560]  reserving bandwidth along your data flow path in your network. Network management,
[15:50.560 --> 15:59.880]  again, that's a bit application-specific. So the audio video and professional audio
[15:59.880 --> 16:08.040]  video stuff is still using the stream reservation protocols and for the payload, as I already
[16:08.040 --> 16:16.360]  said, that's really, really application-specific. And for redundancy we use PCR and FRER, usually.
[16:16.360 --> 16:22.320]  There are some exceptions to that, especially for professional audio video. PCR and FRER
[16:22.320 --> 16:26.880]  were unstandardized when those standards were written so there are some proprietary
[16:26.880 --> 16:34.360]  or not proprietary but some other redundancy schemes where you basically send two different
[16:34.360 --> 16:44.000]  streams and try to separate your networks via means of VLANs usually and try to force
[16:44.000 --> 16:52.040]  different data paths through the network. Basically nowadays you want to go PCR and FRER whenever
[16:52.040 --> 17:00.040]  your hardware supports that. So state of the union, the hard stuff is already done. So
[17:00.040 --> 17:07.440]  there are already implementations in the kernel, there are user space daemons available. That's
[17:07.440 --> 17:14.400]  again the stuff that's difficult to get right. So if you want to implement those standards,
[17:14.400 --> 17:22.400]  first of all you have to read tons of paper. I did that for an employer, took me two years.
[17:22.400 --> 17:27.960]  So that's really hard to get right. And the good thing is that that is already implemented,
[17:27.960 --> 17:35.600]  you just have to use it and you have to use the right knobs. For some stuff like gPTP and
[17:35.600 --> 17:42.080]  traffic shaping you really want hardware offloading: for gPTP you have to use it, for traffic
[17:42.080 --> 17:48.960]  shaping you want to use it. You have to bear in mind that your network
[17:48.960 --> 17:57.280]  gear has to explicitly support gPTP and traffic shaping. So that's about the reservation and basically
[17:57.280 --> 18:06.560]  making sure that your traffic shaping is applied properly. That's not true for every hardware,
[18:06.560 --> 18:13.680]  especially not for commodity hardware. And bear in mind that sometimes configuration
[18:13.680 --> 18:20.880]  especially for traffic shaping can be quite tricky. As I said, I have added bonus slides
[18:20.880 --> 18:27.960]  to the presentation. I will check that they have the right slides in there later on or
[18:27.960 --> 18:34.960]  just contact me. And the point is, especially credit-based shapers can be really, really
[18:34.960 --> 18:40.120]  tricky to set up properly and to make sure that you reserve the bandwidth you want because
[18:40.120 --> 18:46.440]  you want to have the remaining bandwidth to be available for best effort traffic. So
[18:46.440 --> 18:51.520]  the idea is that you can use like say 70% of your link for your audio video stuff and
[18:51.520 --> 18:56.680]  still have like 30% of your gigabit link, which is what we're usually dealing with for
[18:56.680 --> 19:05.440]  like audio video, available for just best-effort network management traffic and whatsoever.
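As a hedged sketch of what reserving 70% of a gigabit link means in numbers, the function below derives the four credit-based shaper parameters from the reserved bandwidth, following the formulas as I understand them from the tc-cbs manual page (slopes in kbit/s, credits in bytes). The frame sizes are assumptions picked for the example, the function name `cbs_params` is made up, and the result covers a single traffic class only; check the values against the manual page and your hardware before using them.

```python
import math


def cbs_params(idleslope_kbps: int, port_rate_kbps: int,
               max_frame_bytes: int, max_interference_bytes: int) -> dict:
    """Derive credit-based shaper parameters for one traffic class.

    idleslope_kbps         bandwidth reserved for the class
    port_rate_kbps         line rate of the egress port
    max_frame_bytes        largest frame sent by the class (assumed)
    max_interference_bytes largest frame that can block the class (assumed)
    """
    sendslope = idleslope_kbps - port_rate_kbps
    hicredit = math.ceil(max_interference_bytes * idleslope_kbps / port_rate_kbps)
    locredit = math.floor(max_frame_bytes * sendslope / port_rate_kbps)
    return {
        "idleslope": idleslope_kbps,
        "sendslope": sendslope,
        "hicredit": hicredit,
        "locredit": locredit,
    }


if __name__ == "__main__":
    # 70% of a gigabit link, hypothetical 200-byte stream frames,
    # hypothetical 1500-byte interfering best-effort frames.
    for name, value in cbs_params(700_000, 1_000_000, 200, 1500).items():
        print(f"{name} {value}")
```

The remaining 300,000 kbit/s in this example stay available for best-effort traffic, which is the split described above.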
[19:05.440 --> 19:11.680]  So you really want to make sure your shapers are configured the Right Way (TM). And it's
[19:11.680 --> 19:19.880]  quite hard to tweak the right knobs in iproute2. So there are good examples and
[19:19.880 --> 19:25.600]  I'd strongly recommend reading the docs on that. There's also a link to the TSN Read
[19:25.600 --> 19:32.600]  the Docs for Linux. It's quite a good starting point for getting into that whole topic. And
[19:32.600 --> 19:47.680]  yeah, basically I think that's it. Do you have any questions? Any questions here?
[19:47.680 --> 19:55.760]  Thanks for this. What's the highest speed Ethernet implementation of this you've seen?
[19:55.760 --> 20:04.320]  Have you seen anything beyond like 10GbE for example? I have seen a 10 gig implementation
[20:04.320 --> 20:11.960]  for that. As far as I recall, the standards have some limitations with respect to
[20:11.960 --> 20:21.720]  how you communicate your bandwidth requirements and they're a bit capped. I'm sure and I know
[20:21.720 --> 20:27.280]  that they are working on that for future revisions of the standards because of course now faster
[20:27.280 --> 20:36.000]  links are becoming available more and more. Most applications for TSN like the control
[20:36.000 --> 20:42.840]  stuff or the AV stuff are running on 100 megabit links still. You want to go to gigabit links
[20:42.840 --> 20:51.720]  because you can achieve quite a bit lower end to end latencies on faster links. But
[20:51.720 --> 20:59.080]  I personally haven't seen anything faster than 10 gig so far. But I'd be interested
[20:59.080 --> 21:06.520]  to do so. Do you have happy stories or real users
[21:06.520 --> 21:13.400]  that have put this in production and can you tell more about this? Yeah, so if you want
[21:13.400 --> 21:20.200]  to check that out you can just Google for Milan and TSN which is the professional audio
[21:20.200 --> 21:26.280]  video stuff and they just before Covid started, shortly before Covid started they ran the
[21:26.280 --> 21:33.720]  Rammstein concert in Munich over a TSN system. It's a really large system with several video
[21:33.720 --> 21:41.720]  walls and several like hundreds or thousands of audio streams and pyrotechnics and light
[21:41.720 --> 21:46.920]  control and stuff all in the same network converged. So that's the largest installation
[21:46.920 --> 21:57.000]  for live audio I know of and I think that's quite a good story to tell. I was curious
[21:57.000 --> 22:03.040]  if you had the chance to play around with Synchronous Ethernet as well. I haven't looked
[22:03.040 --> 22:18.360]  into that too deeply yet so I can't tell you too much about that.
[22:18.360 --> 22:26.240]  You mentioned XDP. Are you aware of any applications of XDP in that area? To be honest I haven't
[22:26.240 --> 22:32.760]  seen them and I will start working on some of them for a customer project in just a few
[22:32.760 --> 22:40.800]  weeks probably. The idea is that basically because it's layer 2 you don't have much
[22:40.800 --> 22:48.840]  network stack above the hardware layer. So if you can cut some of the Linux networking
[22:48.840 --> 22:55.360]  stack because you don't use it anyway, you work on raw sockets anyway, you could just
[22:55.360 --> 23:05.200]  cut some of that out and try to achieve lower latencies in your basically Linux stack there.
[23:05.200 --> 23:10.600]  At the next FOSDEM I can probably give you a talk on that.
[23:10.600 --> 23:14.920]  This is probably a big question but how do you go about debugging this sort of stuff
[23:14.920 --> 23:21.280]  so like setting it up or if you think there's a problem, how do you go about finding problems?
[23:21.280 --> 23:29.760]  That's actually a bit of a pain point and you have to know at least a bit what sane
[23:29.760 --> 23:37.560]  values for like path delays for the PTP and stuff are and one of the most useful debugging
[23:37.560 --> 23:43.280]  tools I've found so far is a good ethernet switch because it will give you like output
[23:43.280 --> 23:50.520]  for your stream reservations, it will give you output for your PTP or GPTP. You can also
[23:50.520 --> 24:00.320]  like sniff traffic with wiretaps basically and analyze it in Wireshark or Scapy or whatever
[24:00.320 --> 24:06.080]  your tool of choice is. That works best to be honest for 100 megabit links because you
[24:06.080 --> 24:12.000]  can use passive taps. It doesn't work that great for gigabit links because it violates
[24:12.000 --> 24:19.760]  some of the standards a bit. You can also use like mirror ports on switches to exfiltrate
[24:19.760 --> 24:27.400]  traffic but basically it's a more manual approach to debugging. And if anyone is interested,
[24:27.400 --> 24:35.440]  I'd like to get in touch, so just write me an email, to start a community-based project
[24:35.440 --> 24:44.760]  of automated analysis of TSN networks basically because I think it's something we really really
[24:44.760 --> 24:52.040]  need especially for people who aren't that deep into the standards and we need to make
[24:52.040 --> 24:59.520]  sure that we can basically have a one-click check and setup and can at least tell from a tool
[24:59.520 --> 25:06.240]  whether what you're doing looks okay-ish or not, but I'm not aware of any project
[25:06.240 --> 25:13.480]  so far so I'd like to start but I'm not too experienced in how to start such a project
[25:13.480 --> 25:19.920]  so if you're experienced in that or are interested in that just write me an email, get in touch
[25:19.920 --> 25:33.240]  and maybe we can set up something. Any more questions? That's the last one.
[25:33.240 --> 25:44.600]  You mentioned some protocols for link redundancy. Can they also be used for node redundancy?
[25:44.600 --> 25:56.200]  I'm not entirely sure. I would have to look something up. I think basically it should
[25:56.200 --> 26:03.600]  work because it's about the data path so if one node drops out basically that would work
[26:03.600 --> 26:07.000]  as well but it won't work for the endpoints, so for the talker or the listener of course
[26:07.000 --> 26:15.120]  it won't work but for nodes in the middle of your graph that would probably work.
[26:15.120 --> 26:17.000]  Okay thank you very much again for your presentation.
[26:17.000 --> 26:27.000]  Thank you.