[00:00.000 --> 00:09.480] So, thank you. I'm Aurora from the University of Murcia. [00:09.480 --> 00:12.120] And I'm Michael from the University of Passau. [00:12.120 --> 00:16.320] And we will be presenting the NGI Search and OpenWebSearch.EU projects, [00:16.320 --> 00:22.960] two sister initiatives for a paradigm change in open search and discovery on the internet. [00:22.960 --> 00:27.640] We will have a common introduction and then each of us will delve into the project that [00:27.640 --> 00:29.960] we are involved in. [00:29.960 --> 00:33.960] A short disclaimer before we start: in the last two days at FOSDEM, we have heard a lot [00:33.960 --> 00:37.440] about personal lifetime projects. [00:37.440 --> 00:40.400] This is quite different because it's not personal. [00:40.400 --> 00:43.880] There are European institutions involved. [00:43.880 --> 00:49.520] And it's not lifetime; these projects are just starting up, have nice ideas [00:49.520 --> 00:52.800] and require your attention and contribution. [00:52.800 --> 01:01.200] So, NGI Search is a European project that will welcome entrepreneurs, tech geeks, developers [01:01.200 --> 01:08.040] and socially engaged people that have challenging ideas about the way that we will search and [01:08.040 --> 01:11.080] discover information and data on the internet. [01:11.080 --> 01:19.880] So basically, we will find projects that are focusing on several topics that I will explain [01:19.880 --> 01:26.320] in a minute and that are compliant with the European values of openness, transparency, [01:26.320 --> 01:28.400] privacy and trust. [01:28.400 --> 01:32.280] The applicants can be natural persons and organizations and they can apply individually [01:32.280 --> 01:37.720] or as a consortium of three members of a team. [01:37.720 --> 01:43.960] OpenWebSearch.EU is a project funded by the EU and has the goal of developing an open European [01:43.960 --> 01:46.160] infrastructure for web search.
[01:46.160 --> 01:53.120] There are several research and computing centers involved and the project started in September [01:53.120 --> 01:56.920] last year and has a time frame of three years. [01:56.920 --> 02:02.240] It's quite interesting that it's the first project which is funded by the EU and has [02:02.240 --> 02:06.760] the goal of developing Europe's own web index. [02:06.760 --> 02:10.680] And for those who don't know, a web index is a data structure which allows fast [02:10.680 --> 02:13.040] access to web data. [02:13.040 --> 02:22.840] It is the fundamental core of all current search engines and it enables the development [02:22.840 --> 02:31.240] of all kinds of web search and web data retrieval services. [02:31.240 --> 02:34.400] So for us, let me present the project partners. [02:34.400 --> 02:39.680] We are Linknovate, Aarhus University, the University of Murcia, OW2 and FundingBox. [02:39.680 --> 02:43.480] These are some of our faces. [02:43.480 --> 02:47.680] We will organize five different open calls and we will welcome these innovative projects. [02:47.680 --> 02:54.680] We will provide them financial support of up to 150,000 euros, and we will not only [02:54.680 --> 02:59.440] provide financial but also technical, business and innovation support, and the projects are [02:59.440 --> 03:03.160] expected to take 12 months. [03:03.160 --> 03:11.400] The first open call is already closed, the evaluation is ongoing, and it was a technology-based [03:11.400 --> 03:19.160] call, meaning that we were looking for products, finalized products. [03:19.160 --> 03:24.440] And then we will select nine to 10 projects in the topics that you see on the screen: [03:24.440 --> 03:30.600] voice-based assistants, NLP, semantic analysis, social computing and data visualization. [03:30.600 --> 03:36.680] But let me delve a little bit more into the second open call topics, which may have caught your [03:36.680 --> 03:37.680] attention.
[03:37.680 --> 03:39.800] I hope some of them resonate with you. [03:39.800 --> 03:44.880] The second open call will open in April, the first of April, and it's a little bit more [03:44.880 --> 03:52.760] research oriented, but please have a look because we have plenty of space for everything. [03:52.760 --> 03:54.320] And let me explain each topic. [03:54.320 --> 03:58.520] The first one is power cognitive search by reinforcement learning. [03:58.520 --> 04:04.960] There we look for mechanisms of self-learning and all kinds of algorithms that can contribute [04:04.960 --> 04:11.480] to a reinforcement learning system that is able to learn from the interactions how [04:11.480 --> 04:16.280] to choose the right data and algorithms in order to make any search more relevant to [04:16.280 --> 04:19.640] its objective. [04:19.640 --> 04:23.400] Then the second topic is machine-based data, Internet of Things data. [04:23.400 --> 04:29.440] There we look for algorithms for search and pattern discovery that are adapted or that [04:29.440 --> 04:38.720] can adapt to IoT characteristics, which means we will look for edge computing, for algorithms [04:38.720 --> 04:45.200] that can deal with time series, with events, with different locations and so on. [04:45.200 --> 04:47.520] We also have AI-based taxonomies. [04:47.520 --> 04:53.320] We look for the automatic creation and expansion of taxonomies with machine-based [04:53.320 --> 04:59.160] and machine-interpretable semantics by using AI techniques. [04:59.160 --> 05:03.760] Basically we want to model the interdependency among new concepts so that we can adapt to [05:03.760 --> 05:08.200] the dynamicity of the data. [05:08.200 --> 05:11.280] The next topic is network analysis. [05:11.280 --> 05:18.680] Basically we want to find other ways to create knowledge graphs, which are interlinked networks [05:18.680 --> 05:21.520] of distributed resources that can be searched.
[05:21.520 --> 05:27.360] The objective can be to make them more scalable, to test [05:27.360 --> 05:35.080] the quality of the created models, to create different semantics, to make them compliant [05:35.080 --> 05:39.160] with the diversity and dynamicity of data and so on. [05:39.160 --> 05:44.840] Then we have AI-based search tools and content generators. [05:44.840 --> 05:49.880] As you know, nowadays AI-based content generators are quite popular. [05:49.880 --> 05:58.680] Here I have some names that have been coming up at FOSDEM many times: ChatGPT, GitHub Copilot, [05:58.680 --> 05:59.680] Copy.ai. [05:59.680 --> 06:04.880] So we welcome projects that can help evaluate the privacy of these kinds of tools. [06:04.880 --> 06:09.960] We also look for research on the logical gaps in them as well as the completeness of their [06:09.960 --> 06:11.560] answers. [06:11.560 --> 06:14.320] Then we have the topic of ethics in search and discovery. [06:14.320 --> 06:22.480] We want to have projects that specifically focus on testing that the developments in [06:22.480 --> 06:26.880] search and discovery are compliant with human rights, that they are not biased against [06:26.880 --> 06:38.240] minorities or by gender, so that they promote equality, but also that they are private [06:38.240 --> 06:43.920] and take care of data confidentiality. [06:43.920 --> 06:48.480] And finally we have new ways of discovering and accessing information, which is a broad [06:48.480 --> 06:51.960] topic so that we don't leave anything behind. [06:51.960 --> 06:52.960] How are we going to do that? [06:52.960 --> 06:57.640] We are going to provide technology mentoring and advice on technologies to access, store [06:57.640 --> 07:04.680] and manage data and also on algorithms and platforms, tailored to the projects.
[07:04.680 --> 07:09.880] We are going to leverage the ReachOut beta testing platform, a platform for beta testing [07:09.880 --> 07:13.440] campaigns. Then we are going to link the projects to different standards and promote [07:13.440 --> 07:18.800] conversations with the different foundations that are related to them. [07:18.800 --> 07:21.240] We will have a program for market readiness. [07:21.240 --> 07:31.360] We will create workshops for pitch training so that they can try to pitch their solution [07:31.360 --> 07:36.040] or their project to potential investors and users. [07:36.040 --> 07:41.000] There will be business modeling advice and coaching, and as for innovation, we will give information [07:41.000 --> 07:47.920] about which kind of open source license they can use, how to manage it and [07:47.920 --> 07:56.040] how to make their research and their solutions reproducible, so following open science principles. [07:56.040 --> 08:02.080] So to conclude my part: in the next three years there will be five open calls that will [08:02.080 --> 08:08.240] challenge the way that we search and discover information on the Internet. [08:08.240 --> 08:14.200] We have synergies with the NGI mission of looking for a more human-centric Internet, [08:14.200 --> 08:19.920] and these are our core values: open source, contributions to the wider community, collaboration [08:19.920 --> 08:25.440] and open science principles, as well as transversal challenges that include sustainability and [08:25.440 --> 08:26.440] equality. [08:26.440 --> 08:30.200] I give the floor to my colleague. [08:30.200 --> 08:31.320] Thanks. [08:31.320 --> 08:36.840] So now switching to the second sister project, OpenWebSearch.EU.
[08:36.840 --> 08:44.440] As I said before, there are several computing and research centers involved in the project [08:44.440 --> 08:49.880] and it's good this way because all of these have different competences, like universities [08:49.880 --> 08:55.440] or businesses, and this underlines one specific point about the project. [08:55.440 --> 09:02.200] In contrast to the web index which is operated by Google, for example, this project does not [09:02.200 --> 09:10.960] follow a centralized approach but tries to build up a web index collaboratively, [09:10.960 --> 09:16.800] distributed among several European institutions; otherwise it would just not be possible to [09:16.800 --> 09:24.360] build up a web index from scratch. [09:24.360 --> 09:29.160] I will keep the motivation short because it should be clear why it is a good idea to have [09:29.160 --> 09:36.160] your own web index in Europe. The first thing is the imbalance in the search engine market. [09:36.160 --> 09:46.680] There are four big global web indexes out there from big tech companies, and the dominance [09:46.680 --> 09:52.800] of these companies has a lot of negative effects on the critical infrastructure of web [09:52.800 --> 09:59.120] search, so to say, and the solution is to strive for more plurality. [09:59.120 --> 10:07.800] Another thing is that web data is a driver for innovation, and an example, just one [10:07.800 --> 10:13.840] out of many examples, that has drawn a lot of attention lately is the training of large [10:13.840 --> 10:25.440] language models such as ChatGPT, which is based on web data.
[10:25.440 --> 10:33.680] The project goals are developing the core of an open web index. Two remarks on this [10:33.680 --> 10:43.320] point: this will be done with open source and open configuration, and it is not expected [10:43.320 --> 10:50.360] that at the end of the three years of the project there will be a production-ready index, [10:50.360 --> 11:00.480] but the goal is to have an index which contains at least 50% of all text web pages and [11:00.480 --> 11:06.440] that can be worked on and that shows as a prototype that it is possible to create it [11:06.440 --> 11:12.760] collaboratively, distributed among several institutions. [11:12.760 --> 11:18.640] Another goal is to build an ecosystem around the index and make it publicly available in [11:18.640 --> 11:20.240] this way. [11:20.240 --> 11:25.840] As I said before, it serves as a feasibility study in some way to show that it is possible [11:25.840 --> 11:32.640] to do it collaboratively, and that is why along the way a network should be established [11:32.640 --> 11:38.400] among European infrastructure partners. [11:38.400 --> 11:45.800] The overall vision of the project is to give innovators, businesses and [11:45.800 --> 11:55.960] researchers open access to web data, to enable them to build new business ideas, for example, or to [11:55.960 --> 12:02.980] work on their ideas on web search and web analysis. [12:02.980 --> 12:11.280] As for the NGI Search project, there are also ways to contribute to OpenWebSearch.EU through [12:11.280 --> 12:17.720] third-party calls. There will be three public calls with a fixed amount of funding, and the [12:17.720 --> 12:22.840] first of them focuses on the legal aspects of the project.
[12:22.840 --> 12:29.680] There are two tracks. The first one focuses on contributions on legal, business and social [12:29.680 --> 12:36.560] aspects of web search in general, and the second track focuses on the legal compliance of the [12:36.560 --> 12:44.540] crawling, so the acquisition of the data from the internet, which is then stored in the index. [12:44.540 --> 12:50.200] This call will open on the first of March this year. [12:50.200 --> 12:58.000] There are also other opportunities to contribute, which are covered by the upcoming calls, and [12:58.000 --> 13:06.360] this coarse overview of the architecture should just give you a hint of where these contributions [13:06.360 --> 13:16.120] can be located. So if you have an idea on how to develop a search and discovery application, [13:16.120 --> 13:27.480] this can be done, for example, as a vertical search engine on top of the web index infrastructure, [13:27.480 --> 13:34.120] or it can be a content analysis method which enriches the data which is then stored in [13:34.120 --> 13:39.200] the index. [13:39.200 --> 13:46.680] So as a conclusion, the OpenWebSearch.EU project wants to open up web search and strives for [13:46.680 --> 13:56.920] more plurality, to give new business ideas a chance and new alternative search engines [13:56.920 --> 14:05.960] a chance. Therefore the project partners collaborate and build up a European open web index, which [14:05.960 --> 14:09.640] is then publicly available. [14:09.640 --> 14:15.520] For researchers, innovators and businesses there is the possibility to contribute, either [14:15.520 --> 14:24.160] by developing your own business model which builds upon the open web index or by getting involved [14:24.160 --> 14:27.520] in one of the three public calls. [14:27.520 --> 14:32.560] The first one of them, as I said, opens on the first of March. [14:32.560 --> 14:54.960] So thanks for your attention, and we are open for questions.
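Editorial note: the talk describes a web index as a data structure that allows fast access to web data. A common textbook way to picture this is an inverted index, which maps each term to the pages that contain it. The following is only a minimal sketch under that assumption; the function names (`tokenize`, `build_index`, `search`) and the example URLs are hypothetical and are not taken from the OpenWebSearch.EU project, whose actual index design is not described in the talk.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: term -> set of URLs containing that term.

    `pages` is a dict mapping a URL to the page text (a stand-in for crawled data).
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every term of the query (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy corpus standing in for crawled web pages.
pages = {
    "https://example.org/a": "Open web search in Europe",
    "https://example.org/b": "Recipes for European bread",
    "https://example.org/c": "An open index for web data",
}
index = build_index(pages)
print(sorted(search(index, "open web")))
# → ['https://example.org/a', 'https://example.org/c']
```

In this picture, a vertical search engine of the kind mentioned in the talk could be approximated by restricting `pages` to one subject area before indexing, and a content analysis method would add further fields to each page before it is stored. Both remarks are illustrative assumptions, not a description of the project's architecture.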