you you you you you different search parameters like a bit of genomic data and this is our architecture so I already mentioned lens this is what the researcher sees in their browser this is the front end and then it has its own back end which we actually call spot and in some projects the old spot which is made in Java is still running but we are about to replace it with a new spot which is made in Rust then there's these are beam proxies they're also made with Rust focus is made with Rust Blaze is a store and it is made with closure and then we have those operations which are shell scripts mostly so what is happening here a researcher says I need to find samples with type plasma that come from donors with diagnosis C61 for example and where the age at diagnosis is between 1450 for example and then that request goes to spot where it's packed into a certain beam task beam is a task broker which actually solves problems of strict network environments we face in hospitals in Germany because of the data protection concepts so on the sites which are hospitals by banks there are beam proxies which ask beam do you have a task for me and when they do beam sends the task and focus this component here gets the task then focus unpacks the task decides for which endpoint it is a task blaze is only one of the possible stores and also we can query other applications so it's not only for sorry for database types we also have another application which is exporter called exporter and one more which is called reporter so those can also query blaze in their own ways blaze is actually a fire server fire is a standard of exchange of information in e-health and healthcare in general and medicine and focus then runs the query against blaze or against some other store it gets sorry I keep clicking it gets the results return results to a beam proxy which returns it to beam which returns it to lens backing which is spot and in the end the browser gets the result and this component here Laplace this is used for obfuscation obfuscation of data is done on sites so unobfuscated data never leaves the sites we decided it was the best to put it there and we have multiple projects that actually run our bridge heads these set of applications on sites we call them bridge heads you can look later in our bridge head repository which installs all those components so we have a lot of projects those are some of the projects that actually run bridge heads this is map of Germany which with bridge heads in Germany but besides German Biobank node we also have the European version of it which has biobanks in other European countries that's bbmri eric then German cancer consortium I already mentioned and cancer core Europe which intends to facilitate a translation of clinical research into new drugs and then because children usually have different types of cancers and cancers differently affect children we have a separate project which is intended to facilitate the invention of drugs for pediatric cancers and also applying applying existing drugs which are for adults but also for those genetic markers for which no drugs exist it is intended to facilitate personalized medicine this is another project we have this is for cancer images so MRI CT pet cat it is intended to actually enable AI analysis of images and then I mentioned beam beam is a distributed task broker which enables communication with biobanks which are behind the proxies and have very exotic configurations it handles the encryption beam proxies on each side encrypt all the traffic and decrypted and it also handles certificates and it only allows outbound connections which means it is only possible that beam proxies connect to beam and then we have focus which is a query dispatcher in which the obfuscation happens so first I need to mention CQL that's what we use it is clinical quality language I know that there's another CQL which means something else so parts of CQL come from front end and currently we are working with certain query replacements to prevent CQL injections but soon we should have a translation of abstract syntax tree from lens from front end into CQL completely done in focus I'm working on it and also abstract syntax tree gets translated or rather simplified for you came the project for a medical imaging I mentioned before as I said it uses the sampler Laplace library yeah these QR codes you can scan them and you can get to the GitHub repository I hope it is large enough and also if you want if you want to get to the beam repository this is the QR code and the problem with aggregated data is still that with a search narrow enough it could be deduced in which store in which database or in which Biobank samples or data about a certain patient are stored so we need to offer a similar level of privacy to the patients who are supposed to consent they are more likely to consent to having their samples and their data available if they know that their level of privacy is the same if they are in a Biobank and if they are not in a Biobank because we obfuscate the data enough we add a small number and we round it up I'm gonna mention why K anonymity means that for each set of parameters there would be at least K patients for whom they the search would return results but that's still not enough because we can we have some rare diagnosis we can narrow the age range enough so that we could have searches return only one patient and that's why we had to do this we use a Laplace distribution with certain parameters we take a random value from the distribution we add it to every count in all those counts in all those stratifiers we get for example for each diagnosis for each sample type and this shows how depending on the values we can lower the privacy but we can make the data more usable so here we would get more higher values with B which is 0.1 and here we get more lower values but values that are closer to the true state of the database are actually more usable privacy budget is something that everybody has to decide for themselves but sensitivity depends on what is being obfuscated it is the number of those resources per patient so if it's diagnosis then it's the number of diagnosis per patient if it's a samples then it's the average number of samples per patient so we are working with 10 and 3 and 4 patients of course it is one patient per patient this is the library and it is a rust crate and we also made it Java library for our friends in Erlangen who use it in their Java projects it is highly configurable but I have included parameters that might be needed in medical informatics so of course epsilon and delta I mentioned before but also what to do with values under 10 we round them to 10 some might want to round them down to 0 or they can be obfuscated in the usual way also for zeros we have chosen not to obfuscate them that is because after the search there comes another process the researchers select the biobanks they want to negotiate with and then use the tool which is called negotiator which was made by our friends in Czechia and in the negotiator they describe the research they intend to use and in the biobank the head of the biobank or whoever is tasked with it but in any case real humans decide who is going to get those samples because samples are very valuable and once they used up you don't have them anymore and it could be last sample for a combination of diagnosis and certain sample type certain genetic markers especially so we didn't want for those biobanks that really have zero values to okay that's it we didn't want them to border people in biobanks so all our code is open source you can scan this and you are going to then get to our organization on github you can look at our other also software and if you want to join us live in beautiful Heidelberg help cancer research then scan this this is a job posting just please don't be don't the fact that German languages mention prevent you from applying because my German is still not good enough and it is not a requirement really you will be asked to learn German but the company pays for it so thank you you