I have my notes here, and I'm going to read from them because my English is not that good. So in some parts I'm going to just read the slides, and in others I'm going to try to improvise. First of all, I want to thank you for being here, especially the Graphology and Sigma developers: thank you for your work and thank you for being here. I really appreciate it. I'm using that library. At one point I also used ForceAtlas2, and Mathieu Jacomy is also here. So it's really amazing for me to have this chance to talk to people who share the same interests. In my own country it's difficult to give this kind of presentation, because I have to make a long introduction, as I'm going to do right now, about the social phenomenon I'm studying: freight graffiti.
This is one of the visualizations we can achieve and, as you can see, it's a real hot mess. There's a lot going on, and we're trying to get to this, which is a kind of synthesis, or maybe an abstraction too. In this visualization, at the end, we'll be able to link the users to the symbolic forms, the meaning they are using to build a community. So I invite you to keep that in mind. We're going from this to this, but first we have to gather all that information and make it happen in this computational visualization.
Something is happening in these train yards. So we have two different things here, and we have to break down this really long title. I know that social scientists are always in verbose mode, we just keep talking and talking, and I see how you are really into synthesis and being straightforward. So what I'm going to do here is talk about two different things.
One is a computational process with a really fancy name, node reduction through artificial intelligence inferences, but it's just a filter. We're just cleaning up the mess. The other one is a social phenomenon. It happens in the physical world, the real world if you want to call it that, but we're trying to take that data and work with it in our own framework. These guys write their names on freight trains, and these trains travel all over the North America region. Then other people track them, take pictures and post them on Instagram. This was happening before social media. So this is a community, a community of practice. If somebody here knows Jenkins, you will know what I'm talking about: participatory culture. It's kind of the same thing. It's the same phenomenon happening in two different places, in the physical world and in the digital world, and that's what we call an onlife phenomenon. So the case study for this presentation, as I told you, is freight train graffiti in the North America region.
Now, hypertextual conversations. I don't know if that makes sense to anybody here. One guy this morning gave a presentation about Cosma, that's the software he was presenting, and he talked about the person who invented this idea of linking documents, and that's hypertext. What about hashtags? Hashtags may be a kind of hypertext too, because they function as a gathering point. People join in those places through their own publishing, their own posts, once they tag them with a hashtag. This network is about users, Instagram users, who post and mark their posts with hashtags. This can form clusters, or cliques, and that's what we are trying to look for: these small groups that share something in common, that share meaning. All these posts are meaningful to them.
I think something like that is happening here too. There's this big group of people that gets together at FOSDEM, and then we have these little cliques happening in each room, and then anybody can move from place to place and make these kinds of networks, if we try to see it that way.
The other part I would like to talk to you about is this filter. The filter happens at two different levels: one in Python, using some other libraries too, and the second one through Graphology and Sigma, which I think I misspelled, using JavaScript.
So that was the introduction. The point here is that I'm going to share with you how each step uses different open source libraries or software, and that's one way to acknowledge all the developers here: all the effort you are putting in is helping people like me, who are not really developers, trying to build a dialogue between social science and computational science with the tools that I can manage to use.
There's a word there that is really important. It will run through the whole slide show: data. We have been hearing about that concept a lot, and I feel kind of sad, because the effort I could see in the talks before was about standard data. Big platforms make tools to produce standard data, and this example is a whole other thing. It's really different because it's a really custom data set. It's a really niche social phenomenon, so there are no tools to study this object of study. So we have to make them with anything we can. Data is the key, and it's the link between execution devices, between disciplines, between programming languages, theoretical frameworks, development libraries and social phenomena. That will help us achieve interoperability between all of these different dimensions. And I think, and I hope you do too, that this will only be possible through open source and data. Data is the key here.
So the journey starts. I'm going to try to be really fast so I can hear some of your comments.
I will tell you this is a master's degree thesis, so each step was much longer. If you think this is verbose, the thesis is something else. The first link I want to show you is between conceptual frameworks, theories. We have Thompson, a guy from England who is trying to find categories to detect meaning, to detect symbolic forms. And we also have the graffiti de firma from Figueroa, a Spaniard, who takes up this "I exist, I am" from Descartes (I don't know the right pronunciation, maybe it's French) to see how graffiti writers broadcast themselves to the world. So we're making this link, right? Because data will be the key here. To make this link from a theoretical perspective to something we can manage, we need to look for these terms and turn them, in some way, into data.
So we have at least these three categories, the things we are looking for. We are looking for geographies, so we are looking for cities, and we are making a city dictionary. We are looking for communities, that is, symbolic, shared terms, so this dictionary is about the words that graffiti writers use to tag their own posts, and that freight workers use too, so we can merge them and make this freight train dictionary. And last but not least, we have entities, so we are looking for graffiti writers' names.
We are going to scrape, to mine, these hashtag conversations, these hypertextual conversations, and for the network we have this simple structure: users post some publication and add some tags. But we are not only using one user's posts, we are using a lot of them.
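As a rough illustration of the structure just described (users publish posts, and posts carry hashtags), here is a minimal Python sketch. The sample posts, the field names `user`, `id` and `hashtags`, and the `build_network` helper are all assumptions made for illustration; they are not the schema or code used in the thesis.

```python
# Hypothetical sample data; the field names are assumptions, not the real schema.
posts = [
    {"id": "p1", "user": "user_a", "hashtags": ["graffitibombing", "freighttrain"]},
    {"id": "p2", "user": "user_b", "hashtags": ["graffitibombing"]},
]


def build_network(posts):
    """Return (nodes, edges) for a user -> post -> hashtag network."""
    nodes = {}     # node id -> node type
    edges = set()  # (source id, target id) pairs
    for post in posts:
        user_id = f"user:{post['user']}"
        post_id = f"post:{post['id']}"
        nodes[user_id] = "user"
        nodes[post_id] = "post"
        edges.add((user_id, post_id))
        for tag in post["hashtags"]:
            tag_id = f"hashtag:{tag.lower()}"
            nodes[tag_id] = "hashtag"
            edges.add((post_id, tag_id))
    return nodes, edges


nodes, edges = build_network(posts)
print(len(nodes), len(edges))  # 6 5
```

Because both posts share the hashtag `graffitibombing`, the two users end up connected through it, which is the gathering-point effect the talk attributes to hashtags.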
So we have this seed node. The seed node is the first hashtag scraped, and this Instagram data mining bot, really original name, will download a finite number of posts and then add the new hashtags that are found in these publications. That gives us this primitive kind of network. This is a small one: the seed node was graffiti bombing, and we used a mining depth of only zero, so it only mines, in this case, the 30 posts that use this specific hashtag. But as you can see, graffiti bombing has 30 posts, and these posts are also using different hashtags. So that is how this network is built.
For the mining, I'm using an unofficial Instagram API for Python. I don't know about the privacy side and I know it's tricky, so I won't talk about it. But I use Docker, so I can run it on a Raspberry. We try to mimic human behavior, so this mining will last for maybe one week for each conversation, and if the conversation is really large, it will last longer. That's why we are using this low-consumption computer. After we scrape this, we put it in the SQL database. So we are going from the publications to the SQL database. To put it really fast: we went from reality to Instagram, and from Instagram to our own dataset.
But now we are looking for the terms from the dictionaries we made before. In this case, this is a writer from my city, Afex. He painted that train in Mexico, and now somebody else sent him the photo from Utah. So he puts his name, the place it was found, some other things, some slang from the community, and also his crew, his group. If we put this text into spaCy, it gives us only one token, and that won't help. So we have to split the hashtag into smaller words. The answer was really cool: it was already on Stack Overflow. So thank you to that guy. I cited it in the paper. It's there because we have to acknowledge other people's work.
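The hashtag splitting step can be sketched as follows. This is not the Stack Overflow answer mentioned in the talk, which is not reproduced here; it is a simple dynamic-programming segmentation swapped in for illustration. The function name `split_hashtag`, the constant `UNKNOWN_CHAR_COST` and the tiny vocabulary are hypothetical. Characters that no dictionary word covers are kept together as one token, so an out-of-vocabulary string such as a writer's name survives the split.

```python
UNKNOWN_CHAR_COST = 10  # assumed penalty per character not covered by a known word


def split_hashtag(tag, vocabulary, max_word_len=20):
    """Split a hashtag into dictionary words, keeping unknown runs as one token."""
    text = tag.lstrip("#").lower()
    n = len(text)
    # best[i] = (cost, start of last piece, last piece is a known word) for text[:i]
    best = [(0, 0, True)] + [None] * n
    for i in range(1, n + 1):
        # Fallback: treat the single character text[i - 1] as unknown.
        candidates = [(best[i - 1][0] + UNKNOWN_CHAR_COST, i - 1, False)]
        for j in range(max(0, i - max_word_len), i):
            if text[j:i] in vocabulary:
                candidates.append((best[j][0] + 1, j, True))
        best[i] = min(candidates)

    # Walk back from the end to recover the pieces in order.
    pieces = []
    i = n
    while i > 0:
        _, j, known = best[i]
        pieces.append((text[j:i], known))
        i = j
    pieces.reverse()

    # Merge consecutive unknown characters into a single token.
    tokens = []
    previous_unknown = False
    for piece, known in pieces:
        if not known and previous_unknown:
            tokens[-1] += piece
        else:
            tokens.append(piece)
        previous_unknown = not known
    return tokens


# Toy vocabulary; the talk describes a much larger Spanish and English word list.
vocabulary = {"freight", "train", "graffiti", "utah", "rain", "in"}
print(split_hashtag("#freighttraingraffiti", vocabulary))  # ['freight', 'train', 'graffiti']
print(split_hashtag("#afexutah", vocabulary))              # ['afex', 'utah']
```

Minimizing the number of pieces is one simple way to prefer `train` over `t` + `rain`; a frequency-based cost would be a reasonable alternative when the vocabulary is large.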
So I had to build this really big dictionary of all the words we know in Spanish and English to split the hashtags. After that, we look in these dictionaries. If a word is in any dictionary, it is marked as such. If it's not, we treat it as a writer name or a crew name. But to make sure this is real, we look for "graffiti" in any part of the caption, to be sure that the strange string is actually a graffiti writer's name. It's simple, but it works. Here we have those words, how those hashtags were marked by this software. We have throw up calls, bubble style. This is interesting, but it will be more interesting when we put everything together. We do it with spaCy in Docker on a Raspberry.
I'm going to be really fast now. These are two image detection models, one from Google and one that I made. And that's cool because it's the same technical process, the same image, but they are seeing two different things, right? Because the models see what we want them to see from the training. That's really straightforward, but it's really important, because the Google model won't be useful for me at all. So this is the result of using this model. It will also be more interesting when we put everything together. We are using Jupyter with Google Colab because it's free, so we can make an SQL query, and then it downloads the images and applies the model. The beauty of relational databases is that you can access this content from different sides. You already know this.
But the point here is that we are going from the database to a JSON network, and we get something like this, right? In the middle, in yellow, we have the inference nodes, and in purple the images that were detected. And the point here is to see how users gather around the same symbolic forms. That can be some kind of graffiti, some specific slang word, some kind of city.
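Going back to the dictionary lookup described a moment ago, the marking rule can be sketched in a few lines of Python. The dictionary contents, the category labels and the function name `classify_token` are assumptions for illustration; only the rule itself (dictionary match first, otherwise a candidate writer or crew name if the caption mentions graffiti) comes from the talk.

```python
# Toy dictionaries; the real ones described in the talk are far larger.
GEOGRAPHY = {"utah", "tijuana", "mexico"}
COMMUNITY = {"freight", "train", "throwup", "wildstyle"}
COMMON_WORDS = {"love", "red", "no"}


def classify_token(token, caption):
    """Label one token split out of a hashtag, using the post caption as context."""
    word = token.lower()
    if word in GEOGRAPHY:
        return "geography"
    if word in COMMUNITY:
        return "community"
    if word in COMMON_WORDS:
        return "common_word"
    # Out-of-vocabulary string: treat it as a writer or crew name only if
    # the caption mentions graffiti somewhere.
    if "graffiti" in caption.lower():
        return "entity"
    return "unclassified"


caption = "#afex #utah #freightgraffiti"
print(classify_token("afex", caption))  # entity
print(classify_token("utah", caption))  # geography
```

The same out-of-vocabulary test is what keeps a known word such as "freight" from being mistaken for a writer's name.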
At this point we can see how some group in Tijuana will maybe use the same style. I think that's important. But this is a hot mess again. It's thousands of nodes of different types, so we have to clean this, looking for significance, or meaning, or symbolic forms. We know that man is an animal suspended in webs of significance he himself has spun, but we can clean that. We're trying to clean that. So the node reduction is the shortest path to the meaning. We're going from that to a really clean network. At least I think so.
So the algorithm, what's happening here, is that for each user node we make an array. Then I have another array with the whole network. And we look for a shortest path between them, which is an algorithm available in Graphology. If it matches three steps, it means that the user has some symbolic node detected, and then it links them. If not, it deletes the node. So the network structure changes to this. It's really different from the one before.
I don't know if we have some time. If you want to see how this works... and if you have some comments too, because I would really love to hear whether you have anything that I can change or add. I don't feel like there are questions you can ask. I think it would be better if you just told me what you think about it.
So in this case, the network starts with 4,000, almost 5,000 nodes. And at the end, it's really small. Let's see. I don't even remember which one is the biggest. Sorry. We went from 5,000 to only 500. I put in this example because there's a principal phenomenon called divergence: when we mine the whole hypertextual conversation, it goes to a lot of places that we are not interested in.
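The node reduction filter described above can be sketched like this. The talk's filter runs in JavaScript with Graphology's shortest path; the sketch below swaps in plain Python with a breadth-first search so it stays self-contained. The node type names, the helper names, and the reading of "three steps" as "at most three hops" are assumptions, not details confirmed by the talk.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path_length(adjacency, source, target):
    """Return the number of hops from source to target, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def reduce_network(edges, node_types, max_steps=3):
    """Link each user directly to nearby symbolic nodes; drop everything else."""
    adjacency = build_adjacency(edges)
    users = [n for n, t in node_types.items() if t == "user"]
    symbols = [n for n, t in node_types.items() if t == "symbol"]
    reduced = {}
    for user in users:
        linked = set()
        for symbol in symbols:
            dist = shortest_path_length(adjacency, user, symbol)
            if dist is not None and dist <= max_steps:
                linked.add(symbol)
        if linked:  # users with no symbolic node in range are deleted
            reduced[user] = linked
    return reduced


# Hypothetical toy network: user -> post -> hashtag -> inferred symbolic node.
node_types = {
    "user:a": "user", "post:1": "post", "tag:wildstyle": "hashtag",
    "symbol:style": "symbol",
    "user:b": "user", "post:2": "post", "tag:love": "hashtag",
}
edges = [
    ("user:a", "post:1"), ("post:1", "tag:wildstyle"),
    ("tag:wildstyle", "symbol:style"),
    ("user:b", "post:2"), ("post:2", "tag:love"),
]
reduced = reduce_network(edges, node_types)
print(reduced)  # {'user:a': {'symbol:style'}}
```

In this toy case the user whose only hashtag is `love` has no symbolic node within reach and is removed, which mirrors how the filter discards the parts of the conversation that drift away from the topic.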
For example, if somebody used red as a hashtag, if somebody used love, if somebody else used no, it moves the conversation we are trying to mine to different places that we are not interested in. In this case, Macro is a really well-known writer. Let's see if we can find him. Well, this one is also a really well-known writer. In this specific example, we can see how the object detection model tagged this photo as wildstyle, as a throw-up, while the Google model tags it as a wheel and a train. But we can also see how this hashtag was tagged as a graffiti writer. So that gives us an idea... well, not an idea, evidence that this is a graffiti writer's name, and we can see his intervention. We can see which user made this post. When we access the user, the user's neighbor network, we can also be sure that he's using these specific graffiti styles.
Okay. So I'm going to finish with this. In this case, when we apply the filter, it's way better, it's way cleaner, and we can start to see... I think I'm just talking nonsense now. Do you want to add something? That's my question-and-answer time. Thank you.
How did you get inspired to do this as a master's thesis and continue doing research? I mean, checking the throw-ups and the graffiti on trains. What was the practical aspect that motivated you?
Okay. So the question was, what was the practical aspect of tracking the graffiti, and what motivates me personally to do it? My undergraduate work also had graffiti as a central topic, and I thought it would be easier to do something that continued this personal initiative. But in the end, I've been late for two years now. I should have delivered this last year because... But it was really interesting to learn these new things and put them together to make some scholarly work.
What made the difference? What do you see? Freight train, for instance. How do you know that freight train is the background and not the name of the artist?
Okay. That's a really...
That's a good question. I... You can repeat the question. Okay. How do we manage to tell that "freight" is not a graffiti writer's name? There's a big dictionary. It's built with all the known words. So it distinguishes between known words and words that are outside that vocabulary.
Yes, Alison?
I have one question then, if no one else has one. Have you had any insights from your graph that really excited you?
Yes. So the question is whether the insights from the network really excited me. Yeah, I think so, because it was a kind of serendipity, you know? When I started to see these small nodes linked together through the terms, it popped out how some terms are linked to some graffiti styles too. So everything's connected, and I think the way to get to this is tailoring data to our own needs.
Right. Yeah. Okay, folks. Can we have a big round of applause? Thank you.