I would like to introduce our last two speakers of the day, last but not least: we've got Stephen and Mike.

Hi there. I'm Steve Jacobs. I'm the director of Open@RIT, and I've been teaching students, faculty, and staff about open source since 2009, when we started making educational games for OLPC. This is my first FOSSY, so thank you very much.

And hi, I'm Mike Nolan. I'm the associate director of Open@RIT. I don't have a tenure as long as Steve's, but I have been doing open source for around a decade now across many different sectors and industries, from humanitarian aid to, now, academia. Today we want to talk to you about using science to build open source communities. If you've been sitting in the dev room all day, you'll probably have noticed some themes around community building and different methods, and you'll probably see some of those themes in ours. But what I'm particularly interested in conveying with this talk is how we use evidence-based practices to figure out what's actually necessary for building communities in various types of open work projects.

Just as a quick note: while Steve and I are up here, we have a long and storied history of hiring team members and students, very talented people, to work on some of the things we're going to talk about today. Two of those students are Django and Daechon, who are fantastic designers. I think both of them might be looking for design work in open source, so if you're interested, please reach out; I'd be happy to connect you with them. They're both amazing and talented. So before we get into the nitty-gritty details, Steve, do you want to talk a little bit about who we are and what we do?

Sure. What is now Open@RIT has its roots in the education program we started in 2009, when we wanted students to make educational games for One Laptop per Child. That went from a seminar, to a standard course, to multiple courses and a paid co-op internship program, and led to the only academic minor in free and open source software and free culture on the planet. Our alums include folks like Justin Flory, who is running the Fedora community these days, and Remy DeCausemaker, who heads the first open source programs office in the federal government, at the Centers for Medicare and Medicaid Services. We built things like Fedora Badges, and we laid the groundwork for the UNICEF Venture Fund's roadmap and milestone program for their fellowship people. So we've been around doing a lot of things, and we've learned a lot from a lot of people, including, shout out, Elizabeth Barron up there in the back, the CHAOSS community manager, God bless her, everyone. And several years ago, when universities started talking about having OSPOs, we decided to spin one up, the second one in the U.S. anyway, to be an open source programs office. Except we don't call ourselves that. We call ourselves an open programs office, and we talk about the fact that we support all academic open work. We don't want to send away the designers, the artists, the people who don't feel like they do programming, or who do research or formal academia instead. We want everybody to do open stuff, and we're there to support them in doing it.
God bless the Alfred P. Sloan Foundation, who gave us some funding. They came to us and said: the stuff you used to do externally for UNICEF, we want you to do for your own faculty members. And that's what we're going to be talking about a lot today. Did I miss anything?

If you want, you can talk about the pillars of the services.

Okay. So: formal academic education for students. We run a list for faculty, students, and staff to learn more about open. We do policy work internally for the university; we just released a position paper on the reasons we feel federal institutions should fund peer review for open science, since open science is being pushed by most of our governments to try to fix the pieces of science that are broken or inaccessible, and peer review is one of those. We do work and research into digital infrastructure for community building. I was one of the first Ford Foundation digital infrastructure fellows many years ago, and what we looked at was community issues within PyPI, the first types of things Dawn was talking about in her comments about knowing what your community is about and what you're doing. You can look up the fully open qualitative research we did, which means that, uncharacteristically, the people we interviewed signed away their privacy rights and allowed us to use their names and their interview transcripts. If you look up "conceptual mismatches", you'll find access to all that data. But to go to Dawn's point, one of the things PyPI ran into was roadblocking themselves: the culture of the project made the people who were hired to do outreach, community management, onboarding, and governance end up feeling guilty because they weren't pushing code. So we do that kind of infrastructure and policy work, community building work, and we do this fellowship work, which we pioneered, for us anyway, working with UNICEF, extended to our own faculty with the Sloan Foundation's support, and now do for anyone who would like to work with us.

And this fellowship work is going to be the main subject of the services we want to describe today, because I think it's quite interesting for community building. This is the approach we landed on. To give a quick overview of what we mean when we say community building or fellowship: the people we're trying to provide services to are maintainers of open work. Many people here might be maintainers of open software; others might be academics creating datasets, or people publishing journals, and other kinds of open work. And they come to us because they're interested in building a community around a piece of work they've built, or growing an existing community, making it more inclusive, bringing in a new type of potential contributor or user, or maybe they're just super burnt out and saying: this is an entirely unsustainable way of maintaining this piece of work, and I want to figure out what's needed to make maintaining it feasible. What we have slowly built over the last, I don't know, ten years of working with different clients is a process where we provide a team of developers, designers, and community managers to tackle that specific problem.
Not necessarily making direct contributions to the project, but tackling the problem of figuring out what the community issues are and what resources are needed to overcome them. So what does this team actually do? We act as something like a project accelerator, which is a pretty common term; you see startup accelerators do this, where you have a directed resource specifically for figuring out what your market is and what you're going to do. We do this with projects. A project maintainer will come to us and we'll help them launch a project as open, or convert it to being open, or we'll help them grow it or maintain it so they don't burn out. And as I said earlier, we do this with all kinds of projects; we have this link to the open work definition all over our slides. We don't just work with open software maintainers, but with open scientists, open data repositories, OERs, and many other things as well.

And in smaller cases, things that often happen because we're based in a university: we've had faculty come to us for help, saying, "I use this project's stuff all the time. I have all these teaching materials, and they need teaching materials. How do I contribute back? I don't understand what to do." So we've helped faculty do that, and as a result we've also done an analysis of that project's contributor pipeline and said, you know, if you smooth things out here and here, you'd probably get more people putting stuff in. So it works both ways; you don't have to be a maintainer.

Before we get into the services, I think it's good to understand the background: the existing resources and knowledge bases we pull on when we think about how to build communities. A big one is the Mozilla Open Leadership Training Series, which, I can't speak for these communities, but from my perspective has really influenced programs by Code for Science and Society, the Turing Way, Open Life Sciences, and many other community building services offered by other institutions. That training series was an amazing thing, created probably a decade ago at this point, that really set out the foundations of how you think about contributors. Dawn was talking about the contributor ladder, which I think is a good analogue to the contributor personas and pathways that Mozilla talks about. We also pulled concepts from Nadia Eghbal's book Working in Public: taxonomizing communities and understanding that different types of communities face different issues. We also use design thinking in our process of ideating, creating solutions, and iteratively testing them. And all of this came together in large part when we were working with the UNICEF Venture Fund, consulting with the various projects they were funding to build open source communities. Some of that knowledge has been stored in UNICEF's own open source inventory; it's taken a different shape at this point, but I encourage you to check it out if you're interested in the approach they're using right now.

And this also backfills the undergraduate education we do. The minor's courses are open to students all across campus, not just computing students.
As everyone has said before, contribution is docs, it's onboarding, it's graphics, it's websites; kids build logos for the projects they want to contribute to as part of the course. And roughly 40% of the class is learning how to do analysis based on the type of graphs Dawn was showing, both for themselves, since they have to make contributions by the end of the semester: does this look like something you want to contribute to? Are they responsive? Are they supportive? What do the flows look like? Does the point when your projects are due at the end of the semester line up with a time when they're not really taking contributions? So they learn to apply all that stuff.

When we start working with a project, in particular a first engagement with a project, we try to divide it into three stages of objectives that we work through with our client. The idea is to slowly learn more and more about the project and what's going on, and through that figure out the actual, specific things that need to be created. First and foremost, we have to understand what the project is and what its goals are: why did you create this, why is openness purportedly important and useful to the project, and what do you hope to get out of it? Then we try to figure out who the potential stakeholders are and how they would get involved. And from that we work backwards and ask: what are the actual roadblocks people encounter when they try to get involved? What is missing?

To give a little additional detail on each of these stages, here are a few of the questions we often ask our clients. What are you actually trying to do? What's the point of this project and resource you're creating? Why is open source the thing that's important to it? Is it about getting people from areas you traditionally don't work with? Is it about inclusivity? Do you have a community or people currently contributing, and in what ways are they contributing? Maybe you have some software contributors but no contributors of other kinds. Or maybe you have a lot of non-technical people involved in a software project but not many technical people; that's super common in things like humanitarian software. And what sort of community are you trying to create?

After that, once we understand the objectives of the project, we can start figuring out the different archetypes of stakeholders. Mozilla and the Turing Way talk about contributor personas: these archetypal contributors. In one project we might have a persona focused on researchers, and another type of contributor coming from, say, the private sector. These people each have their own incentives, their own reasons for getting involved, and their own ways of engaging.
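To make personas concrete, here is a minimal sketch, in Python purely for illustration, of a persona treated as a living document whose claims point back to interview evidence. The class shape, field names, and example values are all hypothetical, not a schema Open@RIT prescribes.

```python
from dataclasses import dataclass, field

# A contributor persona as a living document: every claim should
# eventually trace back to evidence gathered from real interviews.
@dataclass
class Persona:
    name: str                 # archetype label, e.g. "industry engineer"
    motivations: list[str]    # why they would engage at all
    blockers: list[str]       # what currently stops them
    pathway: list[str]        # idealized steps from discovery toward leadership
    evidence: list[str] = field(default_factory=list)  # interview excerpts

researcher = Persona(
    name="academic researcher",
    motivations=["extend the dataset", "co-author follow-up papers"],
    blockers=["no guide for contributing new data"],
    pathway=["reads the paper", "finds the repo",
             "submits data", "reviews others' submissions"],
)
print(researcher)
```

The point of the `evidence` field is the difference between a marketing persona and a research artifact: an empty list is a flag that the persona is still an untested assumption.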
Then we begin theorizing the ideal way for each of them to get involved, and that creates the pathway, or something akin to the contributor ladder. And this stuff gets applied no matter what type of project it is. When we ran 25 different academic projects through this over two years, we had everybody from computational astrophysicists, to grapevine DNA researchers, to Deaf educators working on early childhood language learning and teaching international sign languages, to my favorite acronym of any project we've ever worked on, the Victorian Autobiographical Information Network, or VAIN, who wanted to share data on Victorian autobiography that was much broader than what they could put in their book. All of this works no matter what corner of the universe people are coming from.

Collecting all this information on the contributor types and how they get involved is meant to create an idealized version of what you hope your community will be: who are the people you actually want involved, and what is the end goal? From that end goal, we begin documenting the shortcomings: where are the things that might be missing or preventing people from getting involved? This can be pretty simple in the end. It can be project documentation that's missing, a lack of marketing materials, no outreach in the specific geographic or online areas where people would find you, or a lack of governance that prevents people from moving beyond a first-time, drive-by contribution toward something more akin to a leadership role. And it's this very specific process that we really try to stick to regardless of the type of project, because by going through it we're gathering the evidence of what's actually needed and what the materials are.

It took us some time to get here, because there's a real tendency to say: just give me a README template that will make my community grow, just give me the best practices. And when you hear people who do community building, yes, there are best practices, make sure you have a license and a code of conduct, these are good. But the realities of what it takes to build a community are very specific to your community. So what we've worked really hard on is a specific process for finding out what that is, so you can know for sure.

To talk about the scientific methodology we're using: we think about our engagements in two ways. The first is developing solutions, which is roughly what I've gone through here. And the way we do this, beyond just asking the maintainer these questions, is by conducting qualitative semi-structured interview studies with the various types of stakeholders. Yeah, thank you, big words; I just got my master's, so what can I say? We conduct these qualitative semi-structured interview studies with people to generate evidence on what sort of interventions are needed.
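As one minimal sketch of what "generating evidence" can look like downstream, assume each transcript has already been hand-coded with theme tags; the interviewees and theme names below are invented for illustration, not a real codebook. Tallying how many independent interviews raise the same theme is one simple way to rank candidate interventions:

```python
from collections import Counter

# Each interview is reduced to the list of theme codes it was tagged
# with during qualitative coding. These codes and interviewees are
# made-up examples.
coded_interviews = {
    "interviewee_01": ["setup-failure", "docs-gap"],
    "interviewee_02": ["setup-failure", "unclear-governance"],
    "interviewee_03": ["docs-gap", "setup-failure"],
}

theme_counts = Counter(
    code for codes in coded_interviews.values() for code in codes
)

# Themes recurring across independent interviews are the strongest
# candidates for an intervention (docs, onboarding, governance, ...).
for theme, count in theme_counts.most_common():
    print(f"{theme}: raised in {count} interview(s)")
```

If "setup-failure" tops the list, that is the documented case for spending effort on installation docs rather than, say, a new logo.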
So we're heavy touch in this case. After coming up with all these personas and pathways, we talk to people in the community who we feel represent each type. We talk to people who were really successful in getting involved, and we try to talk to people who were unsuccessful in getting involved. Through these interviews, we're generating legitimate evidence of what has worked and what hasn't, and from that we can begin deriving potential solutions to these problems. If you know that people keep saying, "I couldn't get the project up and running on my own computer, so I ditched it and tried something else," then you know for sure that this is something worth putting work into.

Then, once we develop these solutions, maybe we write some documentation, you deploy it, and you want to be able to see whether it was effective, whether things have changed, and whether new problems have popped up. That's the next stage, where we prefer a more mixed-methods approach involving more quantitative data. Dawn did a really great job talking about some of the ways you can use the various tools from CHAOSS to evaluate against the goals you developed in the first stage, as well as continuing the qualitative work: confirming the conclusions you're drawing from the data with community members, following up and showing them, hey, it looks like the number of contributors has gone up here, maybe the bus-factor risk has gone down, and checking whether new problems are coming up.

It's important to note that the interview work, this qualitative work we do in our community building, is very heavy touch, and it's kind of expensive. You have to have people going out and doing half-hour interviews with people over and over, collecting all the data, doing the transcription, and then doing the coding. It takes a lot of work. But we find it's really important, because this work is often overshadowed and often not thought about. And Steve, maybe this would be a good opportunity for you to talk about your research on conceptual mismatches and what you were actually finding out.
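As a concrete illustration of the quantitative checks described above, here is a minimal sketch of one such follow-up, assuming it runs inside a local clone of a project's git repository. It computes a naive commit-count approximation of bus factor; a real evaluation would use dedicated CHAOSS tooling rather than this toy:

```python
import subprocess
from collections import Counter

# Count commits per author name from the local git history.
authors = subprocess.run(
    ["git", "log", "--format=%an"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

commits_per_author = Counter(authors)
total = sum(commits_per_author.values())

# Naive bus factor: smallest set of authors covering half of all commits.
covered, bus_factor = 0, 0
for _, n in commits_per_author.most_common():
    covered += n
    bus_factor += 1
    if covered >= total / 2:
        break

print(f"{len(commits_per_author)} distinct authors, "
      f"naive bus factor: {bus_factor}")
```

Re-running a check like this before and after an intervention gives the kind of before-and-after evidence the second stage is about.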
So, yeah, the conceptual mismatches effort. PyPI, the Python Package Index, had originally come to life as many open source projects do: somebody had an itch, they had to scratch it, they built it. And it wasn't created for tens of thousands of people using it on a daily basis, so it needed to be re-scoped, re-architected, refactored, all those things. And they hit a wall, even after they got funding to do that work. They hit a wall because of different perceptions of what being a community member meant, what being a contributor meant, what the goals of the project were, all those things.

So we did multiple rounds of interviews, using a process developed by Dr. Mel Chua, which she first used in her PhD thesis: a set of round-robin interviews where we got a number of maintainers and a number of active community members to respond not only to a series of questions, but also to each other's answers. We did three cycles and distilled all that down. And because they were willing, because they were open source people, yay, we could publish it all. When you do a qualitative interview as a social scientist, it's generally a dead given that you never reveal who people are, because often they might get fired if they trash their boss as part of the process. You are never, never supposed to out people or tie them to their work. And you have to go through an IRB, an institutional review board, where you prove not only that you are not electroshocking your students, but that you are not revealing data. We had to work with our IRB to create a process by which those folks not only signed off on the IRB paperwork, but gave up their copyrights to the transcripts of their own interviews. We did let them review them first. And all that paperwork, all those forms, all that process is in the same repository as the report. So if you search on Ford and critical digital infrastructure, or you search on conceptual mismatches, you will find a one-pager first, with a link to the repository; you can go into the repository, pull all that stuff out, and reuse it. If that kind of social science work is of interest to you and you want to convince your own powers that be that no one will go to jail or be sued if everybody signs off on this stuff, it's there for you.

And as I highlighted briefly, this happens: you have a core set of maintainers who are focused on how many lines of code go out a day, and did we hit refactoring this piece of the project by this time. There was so much pressure there that people felt bad about doing their onboarding or their documentation or their other types of jobs, and that stuff fell behind, even though it was funded as part of this big refactor. So culture, and misperceptions about what our goals are, who we are, and what we want to accomplish, actually got in the way of doing what people wanted to do.

So I'm going to breeze through this just so we have a little bit of time for Q&A at the end, but I thought it might help to give a down-to-earth example of a simple project we worked with on a limited basis. We worked with an RIT professor, Professor Rastogi. She is a professor of computer science, and she was developing datasets and investigating different ways you can potentially cause, or ideally prevent, self-driving vehicles getting into crashes due to harmful information in their datasets. Obviously there are a lot of potential stakeholders who would be interested in this data and how it can be distributed, and a lot of other open source simulation systems that could use it. So we worked with her to develop specific personas, to understand who these different types of stakeholders were: there are stakeholders in the private sector developing AV systems, and there were other researchers who were interested in her work and might be able to continue generating further data.
And then there were also these potential partner projects, or integrations with simulation systems. From there we began trying to understand the goals of the project, which was a very early stage but potentially high impact project that could be integrated into all these different systems. And we found it was largely an issue of discoverability, and of understanding the open source aspect of it. Through many of our interviews we realized there just wasn't a funnel from finding the research to understanding that this is a project that can continually live and evolve and take contributions from other researchers. So we needed to find a way to create a narrative around it. An obvious, very easy first step was developing a simple landing-page website that describes the project and the research, states the main contribution asks, and showcases the different pathways for getting involved, which oftentimes is just linking to a well-documented GitHub repository and examples of work. So we said, okay, we're going to make a website; we prototyped some copy, got some review on it, did the whole usual process of making a website, deployed it, and it was somewhat successful. I'm not going to go beyond this point because it was a fairly limited engagement, but when applying this process, particularly to a small project, it can be quite a simple engagement.

Coming back to the point of this engagement and of doing all this work to find these things out, there are a few things. As a community becomes larger, developing that community becomes more complex: you have more stakeholders involved, more potential conflicts between those stakeholders, and more resources are needed to coordinate all that. And community development skills are not often found in this peer-to-peer production of open work. On our team we had many designers and social scientists who do this work and understand how to gather data from interviews and create this evidence; Dr. Rastogi was super talented, but not necessarily in these fields. And finally, if you want to justify allocating additional resources to growing your community, you need evidence to do it. In particular, if you want to get funding for anything, it's very, very difficult, and we're finding this now particularly in the private sector, to just say: well, we might go away if you don't give us money; we might go away if you do give us money, but it's less likely. I've spent the last four years of my life writing a lot of grants, and I can promise you that, at least in those four years of experience, that has never worked once, although I've tried it a few times. So: community development can get complex, particularly as you migrate out of a very simple governance structure into something that has many different tiers and many different types of people involved. That takes work, and that work requires new skills, or maybe not new skills, but different types of skill sets.
And that skill set is important for doing this work. Then finally, and this has certainly been one of the biggest points that has opened the eyes of a lot of the people we've worked with: if you want to really get additional resources for your project, particularly if you're a burnt-out maintainer, or you feel like your project is plateauing but you want to grow it and you know it can, you just don't have what you need to do it, then getting the evidence to show what exactly is needed is one of the best tools at your disposal for finding ways to potentially fund it, or just to give yourself the mind space to begin organizing around it. And then you also get the data to be able to do that.

And, yeah, we have three minutes and 45 seconds left. So thank you, and questions. Ask away. Any questions, folks? I'll bring you the mic. It's okay.

Across all of the projects that you've worked on, you do these personas. And I have to tell you, that is definitely a trigger word for me, having worked in a number of companies where they would do personas and the people who ended up doing the personas knew absolutely nothing about the people they were describing.

We do them right, so you're not triggered.

My question is: have you looked, across all of these projects where people have done the personas and then later succeeded in the project, at how far off the personas were at the beginning? Did they evolve over time? Did they get closer? Can we get better at doing personas?

Great question. There are a lot of bad personas out there. I think part of it comes from a design perspective: personas are meant to be living documents that are tested out. They aren't just meant to sit there. It's like corporate values, a thing I hate, because they're these arbitrary words like "we're here for goodness" and whatever. Personas can sometimes be that: "we have our customer persona, and they use our thing because they want to give us money because we're valuable." That's useless. With personas, what we're trying to produce is almost documentation of our studies. We're trying to figure out: who are the types of contributors that are useful to our community? Why do they want to contribute? What does that process look like? We may make assumptions, or collect invalid data, about who that person is and why they want to contribute. And that's why our process has these two stages of developing solutions and then testing them; this is where the evolution of a persona comes in. Say we think our personas are just having issues discovering us: they're really interested, they have the time to contribute, so we run a marketing campaign. From that campaign we may find they actually have another problem, and then we can evolve that persona. So it's really meant to be documentation of that data.

Sorry, let me weigh in on the answer too. I mean, I started as, and still am, a video game design professor; I've gone horribly wrong in so many ways.
But what we tell our students straight up, first thing, when they're about ready to make a game, is: you're not your player. You think you're your player, but you're not your player. So you need to figure out who your player really is; just because they like the same games as you doesn't mean you're necessarily them.

Any other questions? All right. Well, thank you very much for our final talk of the day.

Thank you so much. And thank you all for a great first FOSSY. Thanks, folks. Thanks for sticking around.