[00:00.000 --> 00:27.080] Okay, please have a seat, we have to begin, do as you can, all right, hi everybody again, [00:27.080 --> 00:32.560] we meet Volker Kosser, he will explain us what he's doing on Caddy E and he's come from Germany [00:32.560 --> 00:49.520] and we're very pleased to welcome him. Go on. Thank you. Okay, so yeah, I'll talk a bit about [00:49.520 --> 01:00.760] how we use public transport information in Caddy itinerary. So what is this? Caddy is a big open [01:00.760 --> 01:09.560] source community, so not a transport operator for once here. We do all kinds of stuff, you can [01:09.560 --> 01:15.840] find us in the K building on the second level to look at a few things we do and one of the things [01:15.840 --> 01:26.640] we do is an transport assistance app called itinerary. So in that you can import any kind of [01:26.640 --> 01:33.760] travel related things like flights, train trips, bus trips, hotel reservations, event tickets, etc. [01:33.760 --> 01:39.320] and that is then grouped together and put into a timeline so you have all the relevant information [01:39.320 --> 01:45.680] at hand when you need them and we augment that with whatever might be helpful along the way, [01:45.680 --> 01:52.280] like the weather forecast as the obvious example. Since we don't really have a lot of time, I'll [01:52.280 --> 01:58.240] have to dive right into what we do with public transport data, you'll see some of the features [01:58.240 --> 02:08.280] along the way. So the first problem is we need to actually understand where you want to go, [02:08.280 --> 02:14.640] ideally without you having to enter that manually but by reusing documents or material you already [02:14.640 --> 02:24.520] have. In the best case scenario, that material has machine readable annotations about your trip, [02:24.520 --> 02:31.640] there's something that Gmail has been promoting but outside of airlines I think in Europe at least [02:31.640 --> 02:38.440] we have only seen that for Flix, bus and train line so none of the major railway operators for [02:38.440 --> 02:46.400] example have that. But there is a second best thing and that is the ticket barcodes. Most, [02:46.400 --> 02:53.880] not all of them but luckily most contain some information about the trip and especially in [02:53.880 --> 03:01.440] international use they are somewhat standardized so we actually have a chance to understand what's [03:01.440 --> 03:11.960] in them. The probably most well-known one is the one from airline boarding passes that is a single [03:11.960 --> 03:16.280] standard that works globally so that is the absolute best case scenario, only one thing we [03:16.280 --> 03:23.840] have to implement. For railways we don't have that luxury but the European Railway Agency has at [03:23.840 --> 03:30.080] least defined a few standards that are in use in Europe for international travel and in some [03:30.080 --> 03:41.600] countries also domestically. The complexity of those standards varies greatly. The airline [03:41.600 --> 03:46.720] boarding passes for example that is a simple ASCII string that is almost human readable, [03:46.720 --> 03:55.400] that's as easy as it gets. The latest iteration from the European Railway Agency for the international [03:55.400 --> 04:04.360] tickets here, the flexible content barcode, that is 2,000 lines of ASN1 specification defining [04:04.360 --> 04:12.000] 300 or so mostly optional fields with some unaligned packed encoding representation so awesome to [04:12.000 --> 04:23.400] debug but extremely powerful. That's the ultimate other end of complexity then. Just because it [04:23.400 --> 04:29.800] is standardized doesn't automatically mean this is also all openly available. Again, [04:29.800 --> 04:34.600] the European Railway Agency is the good example here, they have that on the website. If something [04:34.600 --> 04:43.600] is missing you ask them, they put it on the website, perfect. Some of the other organizations ask you [04:43.600 --> 04:50.680] for unreasonable amounts of money to get a PDF or require you to be a member and for that you [04:50.680 --> 05:00.000] need to be an airline or railway agency which we are not. Some of those systems have cryptographic [05:00.000 --> 05:05.280] signatures which we usually don't care about because we only care where you travel not if the [05:05.280 --> 05:12.600] ticket is actually valid but in one case the 44E ticket used in some areas in Germany in Luxembourg [05:12.600 --> 05:21.120] the signature and the content is somewhat intermixed so we actually need to decode that and just [05:21.120 --> 05:28.360] because something is called a public key doesn't mean it's actually public on the website. In this [05:28.360 --> 05:34.720] case we got lucky. Extensive internet search found a hundred page PDF in a location where [05:34.720 --> 05:39.400] probably shouldn't have been containing a screenshot where we found an URL pointing to an [05:39.400 --> 05:48.640] ill-app server on which we found the keys so it can be quite messy to work with this stuff. Most [05:48.640 --> 05:57.200] of the standards have operator specific extensions, those of course are not documented. For the final [05:57.200 --> 06:07.720] point is there anyone from Tranitalia here? Too bad I have questions for them. Then of course [06:07.720 --> 06:17.400] there's also a set of proprietary codes where our only option is reverse engineering. For that we [06:17.400 --> 06:25.520] rely on donations of sample tickets because I mean everything we do is very much focused on [06:25.520 --> 06:34.680] privacy so once on your own device we never get your actual tickets so we need them donated [06:34.680 --> 06:44.680] right to work with them. There were ones listed here for those we have more or less understanding [06:44.680 --> 06:51.040] some we get enough out of it to work already for some we can barely prove that there is [06:51.040 --> 06:56.160] actually travel relevant data in there but we have no way of decoding that. For me the most [06:56.160 --> 07:02.120] frustrating one is SBB because that is a fairly comprehensive format we understand most of it [07:02.120 --> 07:09.080] apart from the daytime fields and without of that it is pretty much useless right so if there's [07:09.080 --> 07:17.000] anyone here from SBB who has hints or information on how those tickets work I would be very interested. [07:17.000 --> 07:27.520] Then once we actually know where you're going and we have that in the timeline we augment that with [07:27.520 --> 07:38.560] real-time public transport information. The most obvious example is delays and disruptions, [07:38.560 --> 07:45.640] cancellations, platform changes, that kind of stuff right so we notify you about that. Another [07:45.640 --> 07:54.360] thing we do is filling gaps in the itinerary right so I to get here I book a train from Berlin [07:54.360 --> 08:01.160] to Brussels but I actually need to go from my home to the station then take the train and then in [08:01.160 --> 08:07.920] Brussels somehow get from the station to my hotel with using the respective local public transport [08:07.920 --> 08:16.080] so and that that is something we we can fill in automatically. And then the third thing is when [08:16.080 --> 08:22.400] you miss a connection right we offer you to to find alternatives for getting to the same destination. [08:22.400 --> 08:32.400] In order to implement that kind of stuff we kind of need to get to that data and there's [08:32.400 --> 08:39.680] unfortunately not a single global service that gives us to us right so we need to query many many [08:39.680 --> 08:45.560] different sources depending on where we currently are which backend can actually provide us this [08:45.560 --> 08:55.280] information. So we have a bit of an abstraction layer over all those sources which basically [08:55.280 --> 09:03.440] offers three basic operations searching for locations by by name or coordinate searching [09:03.440 --> 09:08.640] for arrival and departures at a specific stop and searching for journeys from from A to B. [09:08.640 --> 09:17.320] And on top of that we then build the the higher level features. In terms of supported backends [09:17.320 --> 09:25.760] that is basically three different categories the fully open source ones those are the easiest [09:25.760 --> 09:35.160] ones to work with like Navisha, OpenTrip planner. Motors is still missing on that list simply because [09:35.160 --> 09:42.760] there is currently no production deployment we have access to as soon as there is one we'll add [09:42.760 --> 09:51.600] that as well. Second category is things where the protocol is at least documented like the [09:51.600 --> 09:59.600] open journey planner used in Switzerland and the third one the most annoying ones to work with [09:59.600 --> 10:09.160] is the proprietary legacy backends. But just having the protocols of course is not enough we also [10:09.160 --> 10:17.240] need to know where exactly are those and the respective services for that. For that there is [10:17.240 --> 10:24.120] the transport API which is three that's a collaboration with others having that same [10:24.120 --> 10:32.400] problem like Janice and that is basically a collection of machine readable information about [10:32.400 --> 10:38.880] those those services. Both where exactly do I need to connect there which protocol do they use [10:38.880 --> 10:48.200] specific parameters I need to use but also information like the coverage area because [10:48.200 --> 10:56.320] in for most of those services that is kind of implied right if I have the Belgian transport app [10:56.320 --> 11:04.120] right the scope of that is implicit. Navisha is the exception that actually has API for querying [11:04.120 --> 11:12.880] this but if I want to pick the right back end right I of course need that information. Very [11:12.880 --> 11:21.480] similar problem all of what you see here is what journey query would describe as metro line one [11:21.480 --> 11:31.240] but the signage is very very different depending on where you are and the signage is something [11:31.240 --> 11:37.880] that is very prominent locally right so if I I should show the right thing in the app in order [11:37.880 --> 11:46.800] to to help the user to to find the right thing. But this isn't this isn't really unique right so [11:46.800 --> 11:53.680] finding the right logo is somewhat tricky. What we do there is we get the logo and the colors [11:53.680 --> 12:00.880] and all of that information from Wikidata. The Wikidata entry is linked to an OpenStreetMap [12:00.880 --> 12:06.400] route relation from that we get the geographic bounding box and the combination of geographic [12:06.400 --> 12:14.320] bounding area name and mode of transport is mostly unique and that is then good enough to to find [12:14.320 --> 12:25.520] the right logos. Okay then a few more things we integrate one is available rental vehicles so [12:25.520 --> 12:33.200] rental bikes electric kick scooters that kind of stuff. What you maybe can see in the screenshot [12:33.200 --> 12:41.640] here is a few available kick scooters some shown in green some shown in yellow the yellow ones are [12:41.640 --> 12:51.360] those with a remaining range of less than five kilometers. All of this is is coming from GBFS [12:51.360 --> 13:01.560] that is a nicely developing open standard for for that kind of information and it is very actively [13:01.560 --> 13:09.120] evolving. Just one or two years ago we wouldn't have that level of detail available so that's a [13:09.120 --> 13:17.840] very nice example of open standards and open sourcing in that field. Coverage for that is [13:17.840 --> 13:25.080] somewhat biased towards Europe and North America though. I know that those systems exist in Asia [13:25.080 --> 13:31.800] as well but I have no idea if they if they use GBFS as well or if there's any other system so [13:31.800 --> 13:41.640] again something where I would be interested in in information. Another thing we integrate on the [13:41.640 --> 13:49.960] train station maps is the real-time status of elevators and escalators so I think in this case [13:49.960 --> 13:56.400] they're all shown in green so they are actually functional. This is of course something very [13:56.400 --> 14:04.240] relevant if you're traveling say with heavy luggage a stroller or in a wheelchair. The [14:04.240 --> 14:12.720] data source for that is accessibility cloud that is the backend behind realmap.org. That's [14:12.720 --> 14:18.280] also free software and they aggregate these kind of information from from many different sources. [14:18.280 --> 14:30.880] There is a bit of a coverage bias towards Germany so similar data from other countries would be [14:30.880 --> 14:46.440] more than welcome. Another thing where we have a coverage problem is train coach layouts. I think [14:46.440 --> 14:52.320] there's currently two or three countries where we are getting this. Still is widely different [14:52.320 --> 15:03.000] data models so it's not quite clear yet how we best abstract that and that is also somewhat [15:03.000 --> 15:11.240] relevant on especially only the long distance trains which can get up to 400 meters so you [15:11.240 --> 15:19.320] want to know where exactly you need to go on a platform especially if you're in a hurry. One [15:19.320 --> 15:25.640] challenge there is that the especially in the countries where we have that open street map [15:25.640 --> 15:32.320] doesn't contain many of the platform section informations and that is the key to match those [15:32.320 --> 15:38.840] two data sets together to have the the proper train layout displayed correctly on the actual [15:38.840 --> 15:45.160] station map. If you think further towards say indoor navigation in a train station like that is [15:45.160 --> 15:53.000] kind of relevant. Pushing this topic even further would be to also show insights of the train. [15:53.000 --> 16:00.640] At least Deutsche Bahn has very detailed PDFs for human consumption of the of the interior [16:00.640 --> 16:10.920] but there is currently to my knowledge no machine readable format say like OSM for trains and that [16:10.920 --> 16:18.040] is again relevant for accessibility for example right so I need to know which parts I can go [16:18.040 --> 16:27.320] to and which parts I can't go to and then the the last part that is very very recent a lot of [16:27.320 --> 16:38.880] work on that happened just yesterday is using the onboard APIs on trains. So if you connect to [16:38.880 --> 16:44.240] the onboard Wi-Fi there is often some kind of portal page showing you information about the [16:44.240 --> 16:52.840] current trip that's powered by some API that we can use as well. Typically this gives you [16:52.840 --> 16:59.320] current position speed and heading and information about the journey with delays on on each stop. [16:59.320 --> 17:06.680] Just showing that is of course the easiest way to to integrate that but the real value comes [17:06.680 --> 17:14.040] when when we use that for higher level features again for example checking if you're on the [17:14.040 --> 17:20.920] right train might seem obvious but if you're traveling in the country where you don't speak [17:20.920 --> 17:29.760] the local language or in case of a multi-set train that splits up along the way. Zugteilung [17:29.760 --> 17:38.200] in Hamm as we say in German right it's it's quite helpful if the software double checks that same [17:38.200 --> 17:43.840] for detecting if we have arrived yet that is something very very easy to realize for the [17:43.840 --> 17:54.120] human but it's actually surprisingly tricky for for the software to know. Yeah so all of these [17:54.120 --> 18:03.320] things I've shown you are not tied to the app specifically but are available as reusable libraries [18:03.320 --> 18:10.480] and for example next cloud is using the ticket data extraction in their email client so you [18:10.480 --> 18:20.200] can automatically add calendar entries for your ticket when you get them by email and I think [18:20.200 --> 18:27.760] there is much more that can be built on top of all this. I mean the itinerary app is basically for [18:27.760 --> 18:33.440] the irregular explicitly booked kind of travel but doesn't touch the commute use case at all. [18:33.440 --> 18:42.600] If you happen to know about any kind of relevant APIs or data sets or have the documentation for [18:42.600 --> 18:50.440] those or for ticket formats like we would be very very much interested same if you have travel [18:50.440 --> 18:58.240] documents past present or future that you are willing to donate to develop the extractor on [18:58.240 --> 19:04.080] that we are happy to take those as well. Yeah thank you. [19:04.080 --> 19:24.080] You're talking about getting live train data from the train. Can you do it from the location of the phone? [19:24.080 --> 19:39.440] The position information we get on the train is essentially GPS just with a GPS receiver on [19:39.440 --> 19:44.880] the train. In theory you could do that from the phone and the problem is that reception [19:44.880 --> 19:53.240] inside a metal train is somewhat limited so you usually get better results by using the API for [19:53.240 --> 19:58.680] that but it is essentially GPS data you get there so it's it's the same. [19:58.680 --> 20:28.240] Yeah that is an annoyingly complicated topic. [20:28.240 --> 20:37.360] The modality is awfully undefined. I mean there's neither a technical nor like a product level [20:37.360 --> 20:44.440] definition on what is a subway or a metro or a tram and it can be all kinds of hybrid things. [20:44.440 --> 20:55.960] That is one of the metadata we carry from Vicky data alongside the logos and so on so if in doubt [20:55.960 --> 21:05.440] we use that but even that is there is some loss in there. I mean there's some cities where you [21:05.440 --> 21:12.680] have trams that go on long distance railway outside of the city and yeah I mean we will [21:12.680 --> 21:20.440] never be able to capture these extreme special cases that a region or operator specific app [21:20.440 --> 21:30.520] can capture. So I mean that is the price we pay for that abstraction right and the one app that [21:30.520 --> 21:40.880] works everywhere approach. One question you showed the data about the scooters and how it has [21:40.880 --> 21:51.040] you said it's getting better and the coverage is better. What is driving this improvement? Is that regulation or why it's getting better? [21:51.040 --> 22:00.040] That is a good question I don't know for sure. I know that in some cities it is regulation so if you [22:00.040 --> 22:08.040] want to permit to operate your rental system in that city right you are required to publish your [22:08.040 --> 22:20.200] feeds as GBFS and we then happily consume that. I think another part is that somehow started very [22:20.200 --> 22:26.680] early by some US cities requiring that to to give out the permits for those systems and then that [22:26.680 --> 22:34.040] kind of became the standard mode of operation for those services right. So if you get in very [22:34.040 --> 22:41.520] early that works. I don't think there is like national or EU-wide regulation so this is usually [22:41.520 --> 22:43.680] something that differs from city to city. [22:43.680 --> 23:06.480] Regarding on demand traffic some of the routing engines have that in their results so we can show [23:06.480 --> 23:15.200] that but we currently have nothing regarding actively booking things on demand or otherwise [23:15.200 --> 23:23.040] because that is something where there is there's practically no API available for external users. [23:23.040 --> 23:32.280] I don't think the the railway operators or the especially the inverse with the private operators [23:32.280 --> 23:41.040] they they give that to to smaller users like us if they give it to to anyone at all. [23:41.040 --> 23:48.920] You mentioned computing what about the case when I don't yet have a ticket but I want to make the [23:48.920 --> 23:54.520] journey so for instance in Germany I had a bank card 100. Is there a possibility already to enter [23:54.520 --> 24:00.960] that somehow? We also have like a general route search so you just specify where you want to go [24:00.960 --> 24:08.720] and it offers you depending on where you start right the options from Deutsche Bahn or S&CF or [24:08.720 --> 24:16.240] wherever you are and then you can add that to the timeline as well. So there is the the ability to [24:16.240 --> 24:21.320] do manual entry for for that scenario but that would be quite cumbersome to do this every day [24:21.320 --> 24:27.560] for your commute right. So there you would want something that I know I usually go to your office [24:27.560 --> 24:33.880] between 8 and 9 in the morning so inform me if there any if there's any deviation on my usual [24:33.880 --> 24:37.640] route but not necessarily make me enter this right. [24:37.640 --> 25:02.480] You mean the checks for delays yeah that is polling there is none of those services we use [25:02.480 --> 25:07.920] has a push service that we could use. Okay.