[00:00.000 --> 00:13.640] Hi, everyone. Well, I'm very impressed to have such a large audience for such a small [00:13.640 --> 00:21.080] tool. But, well, I'm Beatrice. I work at the French Media Lab. And today I'm going to [00:21.080 --> 00:29.640] present PIMI, which is a tool to study image propagation. The Media Lab is a lab where [00:29.640 --> 00:39.520] social scientists try to, among other things, study the traces that people leave online. [00:39.520 --> 00:46.880] And for now, they are quite well equipped with tools to study text. But when they ask [00:46.880 --> 00:59.920] me, OK, how can I study meme propagation? I'm still struggling to give them answers.

[00:59.920 --> 01:05.720] So what does it mean to study meme propagation? It means being able to recognize that [01:05.720 --> 01:16.240] some parts of an image are copied or partially copied. So what this tool does is very [01:16.240 --> 01:25.760] simple. It's able to create clusters of images and group together images that are total or [01:25.760 --> 01:31.680] partial copies of each other. It's able to deal with image transformations, so if the [01:31.680 --> 01:40.800] image is cropped or zoomed. And it's able to adapt to the corpus characteristics. So [01:40.800 --> 01:45.520] it will try to make the best of your data set, depending on the number of images you [01:45.520 --> 01:55.600] have or the type of images you have.

What PIMI is not able to do is to cluster semantically [01:55.600 --> 02:02.160] similar images. So it's not the tool that you are going to use if you want to create [02:02.160 --> 02:09.840] clusters of cats and clusters of dogs or, I don't know, find images of violence versus [02:09.840 --> 02:18.960] images of peace. And it's not able to do face recognition. So again, you will not [02:18.960 --> 02:26.960] be able to make clusters of pictures of Elizabeth II versus clusters of images [02:26.960 --> 02:40.800] of Emmanuel Macron.

What you could imagine doing, and we could also imagine working together [02:40.800 --> 02:49.960] if you are a researcher working on those subjects, is to study the propagation of memes on social [02:49.960 --> 02:56.920] networks, as I was saying. But you could also study the usage of press agency photos [02:56.920 --> 03:03.960] in a press corpus, or stock photos as well. You could also study the dissemination of [03:03.960 --> 03:13.000] fake news based on image montages. Or you could study the editorial choices between different [03:13.000 --> 03:23.600] media, depending on whether they use the same images or not.

So let me do a quick demo of [03:23.600 --> 03:48.600] how it looks for now. It's not on the screen. Okay, let's forget about that. [03:48.600 --> 04:17.600] I'm very sorry. [04:17.600 --> 04:33.600] Well, I'll try to make it work. Okay, well, it's still not showing all the clusters.

[04:33.600 --> 04:40.240] So we create clusters of images. This is a data set that was created by the French [04:40.240 --> 04:49.640] Inria and that presents degradations applied to images. So they take an original picture [04:49.640 --> 04:56.000] and they apply some filters, or they crop the images, to see if we are able to group the [04:56.000 --> 05:03.560] images together. So we can see that we have pretty correct results on that data set.
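To give an idea of the kind of degradations such an evaluation data set contains (crops, filters, zooms applied to an original picture), here is a minimal sketch that produces degraded copies with the Pillow library. It is purely illustrative: the file names are placeholders and this is not how the Inria data set itself was built.

```python
# Illustrative only: generate degraded copies of an original picture,
# similar in spirit to the crops and filters applied in the evaluation data set.
# "original.jpg" is a placeholder path, not a file from the talk.
from PIL import Image, ImageFilter

original = Image.open("original.jpg")
width, height = original.size

# Crop the central 60% of the picture (a partial copy).
crop_box = (int(width * 0.2), int(height * 0.2), int(width * 0.8), int(height * 0.8))
original.crop(crop_box).save("copy_cropped.jpg")

# Apply a blur filter (a degraded total copy).
original.filter(ImageFilter.GaussianBlur(radius=3)).save("copy_blurred.jpg")

# Zoom: crop, then resize back to the original dimensions.
original.crop(crop_box).resize((width, height)).save("copy_zoomed.jpg")
```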
[05:03.560 --> 05:10.880] And these are our results on some images that we collected ourselves on Twitter using Elon [05:10.880 --> 05:20.120] Musk as a query. And so we tried to cluster those images. So as you can see, we have images [05:20.120 --> 05:30.520] of Elon Musk. We are able to group together some images that are crops of others. So this [05:30.520 --> 05:38.680] is probably the source image of the montage that has been done here. But we can also see [05:38.680 --> 05:46.560] that we have some problems with the tool. For example, here we have an image in which two [05:46.560 --> 05:55.440] images have been assembled together, and so we create a cluster that actually mixes two images. [05:55.440 --> 06:25.400] But well, that's the state of the tool for now. And now I'll try to come back to my slides.

[06:25.400 --> 06:33.200] Okay, so how does it work? For people who work in computer vision, I'm probably going [06:33.200 --> 06:40.160] to say some things that are quite basic, but I'll try to make it clear for people who [06:40.160 --> 06:47.440] do not do computer vision. So it is not based on colors at all. It uses the grayscale [06:47.440 --> 06:58.360] version of the images. And it tries to detect points of interest on a picture. And then it uses these [06:58.360 --> 07:10.680] local key points as vectors. And then those vectors are indexed in a database that is [07:10.680 --> 07:21.120] able to perform some very quick similarity search.

[13:10.680 --> 13:37.480] of the tool. As I say, there is that problem of parts of images that create clusters that [13:37.480 --> 13:44.880] are bigger than they should be. So our plan is to be able to detect images that are actually [13:44.880 --> 13:50.880] those links between two clusters: so to be able to detect that this image actually [13:50.880 --> 14:00.720] contains two images, and to be able to deal with parts of images. And also what we would [14:00.720 --> 14:08.320] like to do is to show images in their context, to be able to show the tweets that contain [14:08.320 --> 14:15.440] those images, or Instagram posts, et cetera. Or at least to show additional metadata to [14:15.440 --> 14:24.200] the users. And we would also like to show you the graph of image similarities, because [14:24.200 --> 14:35.720] without it the clusters that result from that graph are not really interpretable. And to improve [14:35.720 --> 14:49.560] our tool, we need your use cases, because for now we have those two or three databases. But [14:49.560 --> 14:57.680] we would be very glad to do some partnerships with other researchers to improve the tool.

[14:57.680 --> 15:04.960] Thank you very much for your attention. If you want to look at the slides, we have the [15:04.960 --> 15:13.880] references to all the images used and to the papers of the algorithms used by PIMI. I'm [15:13.880 --> 15:27.560] open for questions. We had a bit of trouble with the sound stream, but it's back on now. [15:27.560 --> 15:41.560] So yeah, you should repeat the questions. Okay, I'll do. Yes.

So thank you very much [15:41.560 --> 16:08.120] for that. We'll try to find similarities. Oh, sorry, I have to repeat the question. [16:08.120 --> 16:22.960] So the question was, if I understand well, how to reproduce that use case, not on images, [16:22.960 --> 16:35.880] but on other types of documents that would have, I guess, some features, 3D counterparts for instance. [16:35.880 --> 16:44.920] And I'd say, well, as long as you can, like, represent your data in the shape of vectors, [16:44.920 --> 16:51.840] then you're ready to use Faiss to, like, do some search for nearest neighbors in [16:51.840 --> 16:58.320] your database. And then you can go for the whole pipeline, create some graphs, find communities [16:58.320 --> 17:08.480] in the graph, and go for it. But I'm not sure PIMI is your tool; but, well, the architecture [17:08.480 --> 17:18.080] of PIMI could be, of course, a model.
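As a rough sketch of that pipeline (grayscale images, local key points turned into vectors, and a database doing quick nearest-neighbor search), the fragment below computes local descriptors and indexes them with Faiss. The talk does not name the exact detector or index type, so SIFT via OpenCV, the flat L2 index and the placeholder file names are assumptions; this is not PIMI's actual implementation.

```python
# Minimal sketch, not PIMI's actual code: local key point descriptors
# computed on grayscale images, indexed for fast nearest-neighbor search.
import cv2
import faiss
import numpy as np

# Placeholder file names, for illustration only.
paths = ["image_a.jpg", "image_b.jpg", "image_c.jpg"]

sift = cv2.SIFT_create()
descriptors = []   # one (n_keypoints, 128) array per image
image_ids = []     # which image each descriptor comes from

for i, path in enumerate(paths):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # colors are not used
    _, desc = sift.detectAndCompute(gray, None)
    if desc is not None:
        descriptors.append(desc.astype(np.float32))
        image_ids.extend([i] * len(desc))

all_desc = np.vstack(descriptors)
index = faiss.IndexFlatL2(all_desc.shape[1])  # exact L2 search over the 128-d vectors
index.add(all_desc)

# For every descriptor of the first image, find its nearest neighbors in the
# whole collection; matches pointing to other images suggest partial copies.
distances, neighbors = index.search(descriptors[0], 5)
matched = sorted({paths[image_ids[j]] for row in neighbors for j in row})
print("images sharing key points with", paths[0], ":", matched)
```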
And then you can go for the whole pipeline, create some graphs, find communities [16:58.320 --> 17:08.480] in the graph, and go for it. But I'm not sure Pini is your tool, but, but, well, the architecture [17:08.480 --> 17:18.080] of Pini could be, of course, a model. Yes. Is there any project current or you're completely [17:18.080 --> 17:26.240] ongoing that Media Labs has used before, or is it still largely in development? It is [17:26.240 --> 17:35.280] largely in the development. Sorry, I repeat the question. So are there some projects at [17:35.280 --> 17:50.200] the Media Lab that are currently using Pini? And the response is no. Yes. Sorry, can you [17:50.200 --> 18:06.480] consider any other ways that can be considered? Yes. Have you considered other ways of presenting [18:06.480 --> 18:13.040] picture similarity or using picture similarity, or the types of image similarity, if I'm [18:13.040 --> 18:27.800] here in the Sunwell? Well, I'd say that that was what I was saying in my second slide. There [18:27.800 --> 18:37.560] are other types of image similarity, for example, semantical similarity. And, well, maybe in [18:37.560 --> 18:48.400] a few months, if we have like a robust architecture, we could maybe include some other types of [18:48.400 --> 19:00.360] vectorization of images. But for now, well, there are already tools that do that. Like, [19:00.360 --> 19:10.680] there is something called Clip Server that helps you find similar images from clip vectors [19:10.680 --> 19:26.080] that are like semantical vectors. So you could use that tool. It's great. Yes. Yes. [19:26.080 --> 19:53.960] So the question is, is the tool really able to distinguish the thing that is of interest [19:53.960 --> 20:04.120] to us, the fact that we are talking about a dog? So the tool is only able to find partial [20:04.120 --> 20:11.280] copies in an image. So the tool would probably be able to say that all those images contain [20:11.280 --> 20:19.680] the same parts of face of a dog. So it would probably be able to group all those images [20:19.680 --> 20:26.760] together. The problem is that if there are other images in the database that contain [20:26.760 --> 20:33.200] the rest of the images, then they would probably also be grouped in the same cluster. So that's [20:33.200 --> 20:43.080] why what we are currently doing about parts of images would let us improve the cluster [20:43.080 --> 20:53.880] so that it's purified from the rest of the images. And we could have a cluster of the [20:53.880 --> 21:07.200] face of that specific dog and then a cluster of that taco in the second cluster. Yes. [21:07.200 --> 21:13.200] What kind of clusterization do you use on the graph? Well, for now, we have the best [21:13.200 --> 21:23.440] result with, excuse me, what kind of clusterization do you use on the graph? For now, we have [21:23.440 --> 21:31.600] our best result using pure connected components. So actually, the specification we do on the [21:31.600 --> 21:39.360] graph to reduce the number of links between images is enough to have separated connected [21:39.360 --> 21:46.240] components in the graph. And so we take each connected component and it's our cluster. [21:46.240 --> 21:55.200] What we would like to do is to try to mix with some Luvain community detection, but actually [21:55.200 --> 22:21.120] for now, it's not the thing that works best. Yes. [22:21.120 --> 22:32.200] I'm not sure I understand the question. Can you try to rephrase it? Okay. 
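As a minimal sketch of that answer, the fragment below builds a toy similarity graph in which nodes are images and edge weights count matched key points, prunes weak links as a stand-in for the sparsification step, and then reads the clusters off the connected components, with Louvain communities for comparison. The graph values, the threshold and the use of networkx are assumptions for illustration, not PIMI's actual implementation.

```python
# Illustrative graph clustering: nodes are images, edge weights count matched
# key points. All values, and the pruning threshold, are invented.
import networkx as nx
from networkx.algorithms.community import louvain_communities

graph = nx.Graph()
graph.add_weighted_edges_from([
    ("img_1", "img_2", 42),   # many shared key points: likely copies
    ("img_2", "img_3", 35),
    ("img_3", "img_4", 2),    # weak link accidentally bridging two groups
    ("img_4", "img_5", 50),   # a separate group of copies
])

# Crude sparsification: drop links below an arbitrary weight threshold,
# so that the connected components separate into coherent groups.
MIN_WEIGHT = 5
graph.remove_edges_from([
    (u, v) for u, v, w in graph.edges(data="weight") if w < MIN_WEIGHT
])

# Each connected component of the pruned graph is one cluster.
print("clusters:", [sorted(c) for c in nx.connected_components(graph)])

# For comparison: Louvain community detection on the same graph.
print("Louvain:", [sorted(c) for c in louvain_communities(graph, weight="weight", seed=0)])
```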
Yes. [22:21.120 --> 22:32.200] I'm not sure I understand the question. Can you try to rephrase it? Okay. What things [22:32.200 --> 22:47.080] are you looking at to improve the model? Well, there are many things we are looking [22:47.080 --> 22:59.160] at. For now, mainly, we look at techniques to do a better graph sparsification in order [22:59.160 --> 23:13.640] to find more coherent clusters. We are not so much working on the local descriptors part [23:13.640 --> 23:43.520] of the tool for now.

Have you considered using the direct links [23:43.520 --> 23:55.640] to the Twitter images or social media images online? Did I repeat everything? Well, yes. [23:55.640 --> 24:02.760] We would like people to be able to see images in their context because, actually, they won't [24:02.760 --> 24:08.880] understand what's happening if they just have images. They need to see, okay, why was this [24:08.880 --> 24:19.640] image published? Who answered, et cetera. This would probably mean that we need to add [24:19.640 --> 24:26.040] at least the links to the posts, or maybe some kind of visualization of them.

[24:26.040 --> 24:42.440] We have a bit of time here. Any more questions? We can take one or two. If not, we can switch [24:42.440 --> 24:43.440] quietly on to the next session. [24:43.440 --> 25:00.420] Thank you.