[00:00.000 --> 00:13.640] Hi, everyone. Well, I'm very impressed to have such a large audience for such a small [00:13.640 --> 00:21.080] tool. But, well, I'm Beatrice. I work at the French Media Lab. And today I'm going to [00:21.080 --> 00:29.640] present PIMI, which is a tool to study image propagation. The Media Lab is a lab where [00:29.640 --> 00:39.520] social scientists try to, among other things, study the traces that people leave online. [00:39.520 --> 00:46.880] And for now, they are quite well equipped with tools to study text. But when they ask [00:46.880 --> 00:59.920] me, OK, how can I study meme propagation? I'm still struggling to give them answers.

[00:59.920 --> 01:05.720] So what does it mean to study meme propagation? It means being able to recognize that [01:05.720 --> 01:16.240] some parts of an image are copied or partially copied. So what this tool does is very [01:16.240 --> 01:25.760] simple. It's able to create clusters of images and group together images that are total or [01:25.760 --> 01:31.680] partial copies of each other. It's able to deal with image transformations, so if the [01:31.680 --> 01:40.800] image is cropped or zoomed. And it's able to adapt to the corpus characteristics. So [01:40.800 --> 01:45.520] it will try to make the best of your data set, depending on the number of images you [01:45.520 --> 01:55.600] have or the type of images you have.

What PIMI is not able to do is to cluster semantically [01:55.600 --> 02:02.160] similar images. So it's not the tool that you are going to use if you want to create [02:02.160 --> 02:09.840] clusters of cats and clusters of dogs or, I don't know, find images of violence versus [02:09.840 --> 02:18.960] images of peace. And it's not able to do face recognition. So again, you will not [02:18.960 --> 02:26.960] be able to make clusters of pictures of Elizabeth II versus clusters of images [02:26.960 --> 02:40.800] of Emmanuel Macron.

What you could imagine doing, and we could also imagine working together [02:40.800 --> 02:49.960] if you are a researcher working on those subjects, is to study the propagation of memes on social [02:49.960 --> 02:56.920] networks, as I was saying. But you could also study the usage of press agency photos [02:56.920 --> 03:03.960] in a press corpus, or stock photos as well. You could also study the dissemination of [03:03.960 --> 03:13.000] fake news based on image montages. Or you could study the editorial choices between different [03:13.000 --> 03:23.600] media, depending on whether they use the same images or not.

So let me do a quick demo of [03:23.600 --> 03:48.600] how it looks for now. It's not on the screen. Okay, let's forget about that. [03:48.600 --> 04:17.600] I'm very sorry. [04:17.600 --> 04:33.600] Well, I'll try to make it work. Okay, well, it's still not showing all the clusters.

[04:33.600 --> 04:40.240] So we create clusters of images. This is a data set that was created by the French [04:40.240 --> 04:49.640] Inria and that presents degradations applied to images. So they take an original picture [04:49.640 --> 04:56.000] and they apply some filters, or they crop the images, to see if we are able to group the [04:56.000 --> 05:03.560] images together. So we can see that we have pretty correct results on that data set.
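To give an idea of the kind of degradations such an evaluation data set contains (crops, filters, zooms applied to an original picture), here is a minimal sketch that produces degraded copies with the Pillow library. It is purely illustrative: the file names are placeholders and this is not how the Inria data set itself was built.

```python
# Illustrative only: generate degraded copies of an original picture,
# similar in spirit to the crops and filters applied in the evaluation data set.
# "original.jpg" is a placeholder path, not a file from the talk.
from PIL import Image, ImageFilter

original = Image.open("original.jpg")
width, height = original.size

# Crop the central 60% of the picture (a partial copy).
crop_box = (int(width * 0.2), int(height * 0.2), int(width * 0.8), int(height * 0.8))
original.crop(crop_box).save("copy_cropped.jpg")

# Apply a blur filter (a degraded total copy).
original.filter(ImageFilter.GaussianBlur(radius=3)).save("copy_blurred.jpg")

# Zoom: crop, then resize back to the original dimensions.
original.crop(crop_box).resize((width, height)).save("copy_zoomed.jpg")
```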
[05:03.560 --> 05:10.880] And these are our results on some images that we collected ourselves on Twitter using Elon [05:10.880 --> 05:20.120] Musk as a query. And so we tried to cluster those images. So as you can see, we have images [05:20.120 --> 05:30.520] of Elon Musk. We are able to group together some images that are crops of others. So this [05:30.520 --> 05:38.680] is probably the source image of the montage that has been done here. But we can also see [05:38.680 --> 05:46.560] that we have some problems with the tool. For example, here we have an image in which two [05:46.560 --> 05:55.440] images have been assembled together, and so we create a cluster that actually mixes two images. [05:55.440 --> 06:25.400] But well, that's the state of the tool for now. And now I'll try to come back to my slides.

[06:25.400 --> 06:33.200] Okay, so how does it work? For people who work in computer vision, I'm probably going [06:33.200 --> 06:40.160] to say some things that are quite basic, but I'll try to make it clear for people who [06:40.160 --> 06:47.440] do not do computer vision. So it is not based on colors at all. It uses the grayscale [06:47.440 --> 06:58.360] version of the images. And it tries to detect points of interest on a picture. And then it uses these [06:58.360 --> 07:10.680] local key points as vectors. And then those vectors are indexed in a database that is [07:10.680 --> 07:21.120] able to perform some very quick similarity search.

[13:10.680 --> 13:37.480] of the tool. As I say, there is that problem of parts of images that create clusters that [13:37.480 --> 13:44.880] are bigger than they should be. So our plan is to be able to detect images that are actually [13:44.880 --> 13:50.880] those links between two clusters: so to be able to detect that this image actually [13:50.880 --> 14:00.720] contains two images, and to be able to deal with parts of images. And also what we would [14:00.720 --> 14:08.320] like to do is to show images in their context, to be able to show the tweets that contain [14:08.320 --> 14:15.440] those images, or Instagram posts, et cetera. Or at least to show additional metadata to [14:15.440 --> 14:24.200] the users. And we would also like to show you the graph of image similarities, because [14:24.200 --> 14:35.720] without it the clusters that result from that graph are not really interpretable. And to improve [14:35.720 --> 14:49.560] our tool, we need your use cases, because for now we have those two or three databases. But [14:49.560 --> 14:57.680] we would be very glad to do some partnerships with other researchers to improve the tool.

[14:57.680 --> 15:04.960] Thank you very much for your attention. If you want to look at the slides, we have the [15:04.960 --> 15:13.880] references to all the images used and to the papers of the algorithms used by PIMI. I'm [15:13.880 --> 15:27.560] open for questions. We had a bit of trouble with the sound stream, but it's back on now. [15:27.560 --> 15:41.560] So yeah, you should repeat the questions. Okay, I'll do. Yes.

So thank you very much [15:41.560 --> 16:08.120] for that. We'll try to find similarities. Oh, sorry, I have to repeat the question. [16:08.120 --> 16:22.960] So the question was, if I understand well, how to reproduce that use case, not on images, [16:22.960 --> 16:35.880] but on other types of documents that would have, I guess, some features, 3D counterparts for instance. [16:35.880 --> 16:44.920] And I'd say, well, as long as you can, like, represent your data in the shape of vectors, [16:44.920 --> 16:51.840] then you're ready to use Faiss to, like, do some search for nearest neighbors in [16:51.840 --> 16:58.320] your database. And then you can go for the whole pipeline, create some graphs, find communities [16:58.320 --> 17:08.480] in the graph, and go for it. But I'm not sure PIMI is your tool; but, well, the architecture [17:08.480 --> 17:18.080] of PIMI could be, of course, a model.
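As a rough sketch of that pipeline (grayscale images, local key points turned into vectors, and a database doing quick nearest-neighbor search), the fragment below computes local descriptors and indexes them with Faiss. The talk does not name the exact detector or index type, so SIFT via OpenCV, the flat L2 index and the placeholder file names are assumptions; this is not PIMI's actual implementation.

```python
# Minimal sketch, not PIMI's actual code: local key point descriptors
# computed on grayscale images, indexed for fast nearest-neighbor search.
import cv2
import faiss
import numpy as np

# Placeholder file names, for illustration only.
paths = ["image_a.jpg", "image_b.jpg", "image_c.jpg"]

sift = cv2.SIFT_create()
descriptors = []   # one (n_keypoints, 128) array per image
image_ids = []     # which image each descriptor comes from

for i, path in enumerate(paths):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # colors are not used
    _, desc = sift.detectAndCompute(gray, None)
    if desc is not None:
        descriptors.append(desc.astype(np.float32))
        image_ids.extend([i] * len(desc))

all_desc = np.vstack(descriptors)
index = faiss.IndexFlatL2(all_desc.shape[1])  # exact L2 search over the 128-d vectors
index.add(all_desc)

# For every descriptor of the first image, find its nearest neighbors in the
# whole collection; matches pointing to other images suggest partial copies.
distances, neighbors = index.search(descriptors[0], 5)
matched = sorted({paths[image_ids[j]] for row in neighbors for j in row})
print("images sharing key points with", paths[0], ":", matched)
```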
And then you can go for the whole pipeline, create some graphs, find communities [16:58.320 --> 17:08.480] in the graph, and go for it. But I'm not sure Pini is your tool, but, but, well, the architecture [17:08.480 --> 17:18.080] of Pini could be, of course, a model. Yes. Is there any project current or you're completely [17:18.080 --> 17:26.240] ongoing that Media Labs has used before, or is it still largely in development? It is [17:26.240 --> 17:35.280] largely in the development. Sorry, I repeat the question. So are there some projects at [17:35.280 --> 17:50.200] the Media Lab that are currently using Pini? And the response is no. Yes. Sorry, can you [17:50.200 --> 18:06.480] consider any other ways that can be considered? Yes. Have you considered other ways of presenting [18:06.480 --> 18:13.040] picture similarity or using picture similarity, or the types of image similarity, if I'm [18:13.040 --> 18:27.800] here in the Sunwell? Well, I'd say that that was what I was saying in my second slide. There [18:27.800 --> 18:37.560] are other types of image similarity, for example, semantical similarity. And, well, maybe in [18:37.560 --> 18:48.400] a few months, if we have like a robust architecture, we could maybe include some other types of [18:48.400 --> 19:00.360] vectorization of images. But for now, well, there are already tools that do that. Like, [19:00.360 --> 19:10.680] there is something called Clip Server that helps you find similar images from clip vectors [19:10.680 --> 19:26.080] that are like semantical vectors. So you could use that tool. It's great. Yes. Yes. [19:26.080 --> 19:53.960] So the question is, is the tool really able to distinguish the thing that is of interest [19:53.960 --> 20:04.120] to us, the fact that we are talking about a dog? So the tool is only able to find partial [20:04.120 --> 20:11.280] copies in an image. So the tool would probably be able to say that all those images contain [20:11.280 --> 20:19.680] the same parts of face of a dog. So it would probably be able to group all those images [20:19.680 --> 20:26.760] together. The problem is that if there are other images in the database that contain [20:26.760 --> 20:33.200] the rest of the images, then they would probably also be grouped in the same cluster. So that's [20:33.200 --> 20:43.080] why what we are currently doing about parts of images would let us improve the cluster [20:43.080 --> 20:53.880] so that it's purified from the rest of the images. And we could have a cluster of the [20:53.880 --> 21:07.200] face of that specific dog and then a cluster of that taco in the second cluster. Yes. [21:07.200 --> 21:13.200] What kind of clusterization do you use on the graph? Well, for now, we have the best [21:13.200 --> 21:23.440] result with, excuse me, what kind of clusterization do you use on the graph? For now, we have [21:23.440 --> 21:31.600] our best result using pure connected components. So actually, the specification we do on the [21:31.600 --> 21:39.360] graph to reduce the number of links between images is enough to have separated connected [21:39.360 --> 21:46.240] components in the graph. And so we take each connected component and it's our cluster. [21:46.240 --> 21:55.200] What we would like to do is to try to mix with some Luvain community detection, but actually [21:55.200 --> 22:21.120] for now, it's not the thing that works best. Yes. [22:21.120 --> 22:32.200] I'm not sure I understand the question. Can you try to rephrase it? Okay. 
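As a minimal sketch of that answer, the fragment below builds a toy similarity graph in which nodes are images and edge weights count matched key points, prunes weak links as a stand-in for the sparsification step, and then reads the clusters off the connected components, with Louvain communities for comparison. The graph values, the threshold and the use of networkx are assumptions for illustration, not PIMI's actual implementation.

```python
# Illustrative graph clustering: nodes are images, edge weights count matched
# key points. All values, and the pruning threshold, are invented.
import networkx as nx
from networkx.algorithms.community import louvain_communities

graph = nx.Graph()
graph.add_weighted_edges_from([
    ("img_1", "img_2", 42),   # many shared key points: likely copies
    ("img_2", "img_3", 35),
    ("img_3", "img_4", 2),    # weak link accidentally bridging two groups
    ("img_4", "img_5", 50),   # a separate group of copies
])

# Crude sparsification: drop links below an arbitrary weight threshold,
# so that the connected components separate into coherent groups.
MIN_WEIGHT = 5
graph.remove_edges_from([
    (u, v) for u, v, w in graph.edges(data="weight") if w < MIN_WEIGHT
])

# Each connected component of the pruned graph is one cluster.
print("clusters:", [sorted(c) for c in nx.connected_components(graph)])

# For comparison: Louvain community detection on the same graph.
print("Louvain:", [sorted(c) for c in louvain_communities(graph, weight="weight", seed=0)])
```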
Yes. [22:21.120 --> 22:32.200] I'm not sure I understand the question. Can you try to rephrase it? Okay. What things [22:32.200 --> 22:47.080] are you looking at to improve the model? Well, there are many things we are looking [22:47.080 --> 22:59.160] at. For now, mainly, we look at techniques to do a better graph sparsification in order [22:59.160 --> 23:13.640] to find more coherent clusters. We are not so much working on the local descriptors part [23:13.640 --> 23:43.520] of the tool for now.

Have you considered using the direct links [23:43.520 --> 23:55.640] to the Twitter images or social media images online? Did I repeat everything? Well, yes. [23:55.640 --> 24:02.760] We would like people to be able to see images in their context because, actually, they won't [24:02.760 --> 24:08.880] understand what's happening if they just have images. They need to see, okay, why was this [24:08.880 --> 24:19.640] image published? Who answered, et cetera. This would probably mean that we need to add [24:19.640 --> 24:26.040] at least the links to the posts, or maybe some kind of visualization of them.

[24:26.040 --> 24:42.440] We have a bit of time here. Any more questions? We can take one or two. If not, we can switch [24:42.440 --> 24:43.440] quietly on to the next session. [24:43.440 --> 25:00.420] Thank you.