[00:00.000 --> 00:10.440] Thank you for coming to my talk, I'm very glad that the room is so packed, so I hope [00:10.440 --> 00:13.360] that this will be of interest to you. [00:13.360 --> 00:19.360] So my talk is named OpenStreetMap, one geographic database to rule them all, mapping the railway [00:19.360 --> 00:22.320] network for the public with the public. [00:22.320 --> 00:29.920] And I will focus on OpenStreetMap and open data related topics for OSRD, which is an [00:29.920 --> 00:36.560] open source project developed by SNSF, the French Railway Company, which is part of [00:36.560 --> 00:38.080] the Open Rail Foundation. [00:38.080 --> 00:45.800] So there are many information about this here on the panel. [00:45.800 --> 00:51.640] Just a few reminders about why the railway company should invest in open data. [00:51.640 --> 00:58.880] I think you are all convinced that open data is the way to go for all of projects. [00:58.880 --> 01:03.800] But inside the railway companies, it's not always that obvious. [01:03.800 --> 01:09.920] So we want long distance trains across Europe, so we can construct together the transport [01:09.920 --> 01:12.600] network of the future on rails. [01:12.600 --> 01:19.080] We want to do European cooperation because we have railway infrastructure managers in [01:19.080 --> 01:25.160] all European countries that have the same needs, and yet we are still paying for different [01:25.160 --> 01:29.600] software providers for the same tools and the same data. [01:29.600 --> 01:36.320] And of course, we want free competition to prove that all of the train operators we work [01:36.320 --> 01:38.040] with are treated the same. [01:38.040 --> 01:45.000] So if we share the same source code and the same data, we can ensure that. [01:45.000 --> 01:50.560] I will dive into the specific need of OSRD, which is our project. [01:50.560 --> 01:55.960] Of course, you may have different data needs, so I will focus on these. [01:55.960 --> 02:01.360] If any in the room have other experience with other types of data, I will be very happy [02:01.360 --> 02:03.920] to discuss with you. [02:03.920 --> 02:11.440] So in OSRD, we have four main features, pass-finding or route compatibility check is to find a [02:11.440 --> 02:15.560] train pass in the European railway network. [02:15.560 --> 02:21.320] And in time calculation is to calculate the time that the train will take to go from point [02:21.320 --> 02:28.000] A to point B, conflict detection is to ensure that the train will not run into another train [02:28.000 --> 02:34.640] during its route, and short-term train planning is to add a new train into the timetable at [02:34.640 --> 02:36.160] the last minute. [02:36.160 --> 02:41.720] Maybe you were lucky to hear my colleague Elwa this morning talk about this topic. [02:41.720 --> 02:47.080] So to do these four features, we need a lot of data, tracks, geometry and topology at [02:47.080 --> 02:55.680] track level and not line level signals, switches, routes and detectors, which are kind of technical [02:55.680 --> 03:04.560] objects, electrification of the tracks, loading gauge, speed limits, slopes, curves, real-time [03:04.560 --> 03:11.160] position of trains, and stations can be useful for display use. [03:11.160 --> 03:16.080] So I've detailed the needs for each of the features, but what you can remind is that [03:16.080 --> 03:22.480] we need a lot of data, which is all geographic and in high quality. [03:22.480 --> 03:28.120] So the goal of this study and what I will show you today is we want to find and compare [03:28.120 --> 03:34.480] European level open data to choose the best source for our needs at OSRD, but also maybe [03:34.480 --> 03:40.600] for your needs if you're working with the same data needs in your projects. [03:40.600 --> 03:47.360] I've compared four data sources, the RIMF or Registrar of Infrastructure is a data source [03:47.360 --> 03:52.240] provided by the Agency for Railways of the European Union. [03:52.240 --> 04:00.520] Inspire is a European directive that's ensure to share geographic data across Europe. [04:00.520 --> 04:06.880] Then we can find open data platforms of infrastructure managers, but there are one data platform for [04:06.880 --> 04:13.240] each company, so it can be quite confusing to find the good data and of course they all [04:13.240 --> 04:15.440] use different formats. [04:15.440 --> 04:29.040] And finally, OpenStreetMap, which is as you all know, I hope, collaborative database of [04:29.040 --> 04:32.600] geographic feed data, and it feeds all of our needs. [04:32.600 --> 04:37.680] We want open data, we want a data model which is consistent across Europe so that we don't [04:37.680 --> 04:41.480] have to change the parameter of our tool in each country. [04:41.480 --> 04:46.320] We want a data model that can evolve if we want to add a new feature. [04:46.320 --> 04:55.840] Of course, we need English documentation, easy data access, and a wide data perimeter. [04:55.840 --> 04:57.720] Let's try to access some data. [04:57.720 --> 05:06.160] So here I am on the Inspire website, I can find a broken link in a mixed language. [05:06.160 --> 05:11.520] Another example of Inspire data, which is supposed to have good metadata. [05:11.520 --> 05:18.080] Here you can see the link to access the data, which is in the middle of the page, so very [05:18.080 --> 05:20.320] easy to find. [05:20.320 --> 05:27.080] And finally, another example, I could go on and on about this, but this is a page in, [05:27.080 --> 05:33.480] I think, Swedish, but it cannot be translated nor copy and paste in any translator. [05:33.480 --> 05:37.640] So you have to click and download the data, hoping for the best. [05:37.640 --> 05:42.920] This is not to blame the people that have created these pages, but just to share that [05:42.920 --> 05:49.920] finding open data can be very time consuming and very difficult, especially if you, as [05:49.920 --> 05:55.280] me, don't talk all the European languages. [05:55.280 --> 05:59.440] Then once you have downloaded the data, we can try to assess data quality. [05:59.440 --> 06:04.400] For example, this is the railway network in Italy that I've downloaded from the Inspire [06:04.400 --> 06:05.400] dataset. [06:05.400 --> 06:12.360] And as you can see, there's supposedly a railway tunnel that links Tivita Vecchia and [06:12.360 --> 06:13.600] Sardinia. [06:13.600 --> 06:17.800] So I was very surprised by that. [06:17.800 --> 06:23.560] I checked on the official RFI website, which is the Infrastructure Manager for Italy. [06:23.560 --> 06:28.560] And in the official website, we cannot find this underwater tunnel. [06:28.560 --> 06:37.040] So of course, I was not allowed to travel across all Europe to check all the data quality [06:37.040 --> 06:41.320] that I've downloaded, so, yes? [06:41.320 --> 06:51.640] In some place, it is true, but there it is not. [06:51.640 --> 06:58.720] So first question we want to ask is for all the open data sources that I've found, are [06:58.720 --> 07:02.320] they compatible with OpenStreetsMap? [07:02.320 --> 07:08.520] In many cases, this is the case, but unfortunately, for the Creative Commons license, we must [07:08.520 --> 07:14.160] ask the provider if the attribution in OpenStreetsMap is good enough. [07:14.160 --> 07:23.440] So this can take more time, and it's not as easy as other type of licenses. [07:23.440 --> 07:28.400] So if you publish open data, it's important to check if the license is compatible with [07:28.400 --> 07:30.440] OSM. [07:30.440 --> 07:34.600] And as you can see, unfortunately, there are still many European countries where I have [07:34.600 --> 07:37.600] found no open data source at all. [07:37.600 --> 07:43.080] So maybe it's because I don't speak the language, but still, it's problematic. [07:43.080 --> 07:47.240] Then I've done a little quantitative comparison of the data I've found. [07:47.240 --> 07:54.640] So this is a comparison of track length total for one country, so by country and by source. [07:54.640 --> 08:00.880] As you can see, I have found data on OpenStreetsMap for all of the European countries, but not [08:00.880 --> 08:06.360] an open data source that is not OSM for all countries. [08:06.360 --> 08:11.280] And even more, what we can see on the graph is that in every country, the OpenStreetsMap [08:11.280 --> 08:18.240] data shows more tracks than the open data data. [08:18.240 --> 08:23.360] So even if there is open data available, it seems that the OpenStreetsMap data is more [08:23.360 --> 08:26.760] complete. [08:26.760 --> 08:37.200] Then I tried to design an indicator to see if all the useful data was available for OSRD [08:37.200 --> 08:38.200] needs. [08:38.200 --> 08:42.120] So you can see the same data needs that I've presented before. [08:42.120 --> 08:45.120] And I have classified them by necessity. [08:45.120 --> 08:52.440] So we require tracks and signals to make OSRD run, and then the other data are optional, [08:52.440 --> 08:57.360] which means if we have them, this is good, and we will have a better result. [08:57.360 --> 09:03.400] But if we don't have them, we can still run our tool and have partial results. [09:03.400 --> 09:12.200] So I've designed an indicator, which is good if we have the two required data and two optional [09:12.200 --> 09:19.840] data or more, then an OK indicator if we have part of the required data. [09:19.840 --> 09:28.400] The required indicator can be one and a half if we have partial data. [09:28.400 --> 09:33.600] It's quite complicated, but I have shared the full methodology on the blog, and I will [09:33.600 --> 09:36.440] send you the link after, so don't worry. [09:36.440 --> 09:42.920] What you have to remember is that this indicator will give you an overview of if the available [09:42.920 --> 09:47.120] data can be used for OSRD needs. [09:47.120 --> 09:49.480] So what are the results of this study? [09:49.480 --> 09:52.920] First, what we can do is open data. [09:52.920 --> 09:57.680] Unfortunately, as you can see, the map is not so green. [09:57.680 --> 10:05.440] So there are a few countries where you can do OK or poor implementation of OSRD using [10:05.440 --> 10:09.840] open data, excluding OpenStreetMap. [10:09.840 --> 10:14.400] And then we can see the map for the OpenStreetMap data. [10:14.400 --> 10:15.400] It's better. [10:15.400 --> 10:19.480] It's not that better, but it's better. [10:19.480 --> 10:27.120] So there are many countries that were read in the first map that are now green, and there [10:27.120 --> 10:31.520] are many countries that were gray that are now red. [10:31.520 --> 10:34.760] So it's not that good, but it's better. [10:34.760 --> 10:40.760] What we can see is that OpenStreetMap is the database we should use and improve because [10:40.760 --> 10:45.280] it's currently the best standard across Europe. [10:45.280 --> 10:51.920] So as I've said, you can look at the full data and methodology on our blog. [10:51.920 --> 10:59.200] So there is the detailed analysis for each country, as well as the sources for each [10:59.200 --> 11:00.920] open data set that I found. [11:00.920 --> 11:08.680] So if you're interested in one country, specifically, you can check out this. [11:08.680 --> 11:14.400] I'd like to thank the people that have done the icons for this presentation, and also [11:14.400 --> 11:20.240] a special thanks for the QGIS community that have allowed me to make the maps and most [11:20.240 --> 11:21.520] of the analysis. [11:21.520 --> 11:27.640] So maybe if there are QGIS developers there, thank you so much for your work. [11:27.640 --> 11:31.320] And finally, if you want to contact us, there are emails. [11:31.320 --> 11:35.160] You can learn more about the OSRD project on our website. [11:35.160 --> 11:39.160] You can chat with us, and if you're a railway company, you might be interested in joining [11:39.160 --> 11:42.960] the Open Rail Foundation, so let us know. [11:42.960 --> 11:43.960] Thank you for listening. [11:43.960 --> 11:52.560] Thank you.