[00:00.000 --> 00:17.000]  Next presenter. Who is Theo? Yes. So, talking about CryptPad. Yes, the floor is yours.
[00:17.000 --> 00:24.000]  Thank you. Yeah, whom do you trust? I think this question is really a serious question,
[00:24.000 --> 00:33.000]  especially for privacy and collaboration in today's Internet. Well, yeah, let's directly start
[00:33.000 --> 00:42.000]  with this question or with what collaboration is. Collaborative editing is that multiple people
[00:42.000 --> 00:49.000]  work on the same document at the same time and they want their changes to be transmitted
[00:49.000 --> 00:56.000]  in near real time. So here in this example you see that one person writes there, the update is
[00:56.000 --> 01:03.000]  propagated to the server and the server further forwards the message to all other users. Here
[01:03.000 --> 01:11.000]  you see in this generic example that the server can see all messages. The server has a local
[01:11.000 --> 01:20.000]  copy of the document and updates it as soon as it gets a message from a user. And already here we
[01:20.000 --> 01:27.000]  should say, hmm, okay, so whom do we need to trust? We obviously need to trust the
[01:27.000 --> 01:37.000]  server in this example because the server can see the documents. So this leads me to the second
[01:37.000 --> 01:43.000]  part, to the privacy that we want. And here I give you some informal definition and we say that no
[01:43.000 --> 01:52.000]  untrusted entity can infer personal information, document content, or who the collaborators are.
[01:52.000 --> 01:59.000]  So for an untrusted party, the document should look like this, just like snippets. And this
[01:59.000 --> 02:06.000]  untrusted party should not infer any information. Here the key point is that it's an untrusted entity.
[02:06.000 --> 02:12.000]  Because this does not hold for everyone. For example, the collaborators, they should be able to read the
[02:12.000 --> 02:24.000]  document. So the question is whom we trust? And I'll start with the solution that's probably the most
[02:24.000 --> 02:32.000]  used today. And yeah, why not trust Google and Co.? There may be many reasons. I just want to give
[02:32.000 --> 02:39.000]  you one example. And it's the case of Disha Ravi. Here, Naomi Klein, a famous environmental
[02:39.000 --> 02:47.000]  activist, writes that India targets climate activists with the help of Big Tech. Tech giants like
[02:47.000 --> 02:53.000]  Google and Facebook appear to be aiding and abetting a vicious government campaign against
[02:53.000 --> 03:05.000]  Indian climate activists. So what happened here, there was, cannot go too far to the side, was
[03:05.000 --> 03:12.000]  climate activist Disha Ravi who founded the Indian, was co-founder of the Indian chapter of
[03:12.000 --> 03:19.000]  Fridays for Future. And they worked on a Google Doc where they discussed how to help the Indian
[03:19.000 --> 03:25.000]  farmers' protests. And there was stuff like use this tweet or you can write a letter to your
[03:25.000 --> 03:33.000]  government. This document was leaked publicly on Twitter, I think. And then the Indian government
[03:33.000 --> 03:44.000]  thought this was a conspiracy and wanted to track down who actually wrote this document.
[03:44.000 --> 03:52.000]  So they asked Google and Google helped them and said it's this and this and this person. And then
[03:52.000 --> 04:03.000]  Disha Ravi was arrested for a few days. Later on, she was free again and there was
[04:03.000 --> 04:09.000]  no sentence against her in the end. But nevertheless this shows that we cannot really trust
[04:09.000 --> 04:16.000]  Google to host sensitive documents. So what can we do against this? Or what is an alternative
[04:16.000 --> 04:24.000]  solution? And I think one of the most obvious answers, especially at a conference like here, is to
[04:24.000 --> 04:30.000]  say that we need to control the software. We need to have the server and the client open source.
[04:30.000 --> 04:39.000]  Because if this is the case, then we can host the software on our own instance, on our own server. And
[04:39.000 --> 04:50.000]  we can decide to whom we want to give the data. And yeah. So this would be a first approach. So we
[04:50.000 --> 04:58.000]  could say, yeah, it's free software, we are safe. And this is exactly a quotation here from
[04:58.000 --> 05:05.000]  Jitsi Meet. And they say that the possibility to run your own instance completely removes the need to
[05:05.000 --> 05:13.000]  trust a third party provider and therefore eliminates the need for end-to-end encryption. So they say
[05:13.000 --> 05:20.000]  exactly this, you can run it on your own. You don't need any other precautions. No, this is fine
[05:20.000 --> 05:28.000]  because it's open source. Jitsi Meet is a video conferencing platform you may be familiar with. So this
[05:28.000 --> 05:35.000]  is a bit different. I will come to it later. And also interesting is that they
[05:35.000 --> 05:42.000]  removed the statement from their website only a bit after I started to prepare my talk. So this is
[05:42.000 --> 05:52.000]  from December 2022. But are we really safe? And to answer these questions or some more questions, can
[05:52.000 --> 06:00.000]  really everybody run their own instance? I mean, yes, probably most of you have the technical
[06:00.000 --> 06:06.000]  capabilities. But do other people have this capability? Do they have the infrastructure? Do they
[06:06.000 --> 06:14.000]  have the money to run this? No, probably not. And the second question is, do you really want to
[06:14.000 --> 06:20.000]  trust a system administrator to see all your documents? So imagine you're in a company and you
[06:20.000 --> 06:27.000]  are working in a collaborative system and you have the salary sheets online. Do you want the system
[06:27.000 --> 06:35.000]  administrator to read that? No, probably not. Even if you trust it in the first place.
[06:35.000 --> 06:41.000]  And then, and this is where the difference is to video conferencing, is that documents are not
[06:41.000 --> 06:48.000]  ephemeral. So a video stream you can safely delete after the conference has ended. But a document
[06:48.000 --> 06:55.000]  must be stored on the server because you want to access it later. And this means you do not only
[06:55.000 --> 07:03.000]  need to protect your documents currently, but also in the long term. So that if the server is
[07:03.000 --> 07:11.000]  under attack or an attacker gets access to it, they should not have access to the documents.
[07:11.000 --> 07:19.000]  Okay, so if you see this, then you probably think we need end-to-end encryption. And end-to-end
[07:19.000 --> 07:26.000]  encryption is in principle that you have one party, let's say Alice, and Alice encrypts a
[07:26.000 --> 07:37.000]  document, sends it to Bob and Bob decrypts it. And in the middle, the data is not readable.
[07:37.000 --> 07:47.000]  So this is the encrypted ciphertext. And you see here, it's exactly the snippets we want. So this
[07:47.000 --> 07:54.000]  technically looks good, and we could say, okay, we apply this, and we can say it's end-to-end encrypted,
[07:54.000 --> 08:00.000]  we are safe. And here's a statement from Google, and they say that with Google Workspace client-side
[08:00.000 --> 08:08.000]  encryption, content encryption is handled in the client's browser before any data is transmitted.
[08:08.000 --> 08:14.000]  So here first note that client-side encryption is not the same as end-to-end encryption. It's
[08:14.000 --> 08:22.000]  different, especially in the question of who holds the key. With client-side encryption, it's not you
[08:22.000 --> 08:30.000]  as a user who holds the key, but the keys are stored on a third-party server. So there comes
[08:30.000 --> 08:38.000]  again this question of trust, if you trust this third-party server. Okay, so we could say it's
[08:38.000 --> 08:46.000]  end-to-end encrypted, we are safe. Well, really? First, there is the metadata. And metadata is
[08:46.000 --> 08:54.000]  all about who connects to the server, at which time, from which IP address, who collaborates, which
[08:54.000 --> 09:01.000]  people are accessing the document at the same time. And all this metadata is still there,
[09:01.000 --> 09:10.000]  even if the content is encrypted. So, yeah, still a problem. And second, we have Kerckhoffs's
[09:10.000 --> 09:18.000]  principle from cryptography, which says that a cryptosystem should be secure, even if everything
[09:18.000 --> 09:25.000]  about the system except the key is public knowledge. So you should be able to release all the
[09:25.000 --> 09:32.000]  code, and all information except the key, and it should still be secure. And for me, it's really
[09:32.000 --> 09:42.000]  an argument for open source. And, yeah, that's why I think it's an argument for open source. So we see that
[09:42.000 --> 09:50.000]  we need both of them. And here, I want to present you CryptPad. CryptPad is an online collaborative
[09:50.000 --> 09:58.000]  editing tool. There are multiple parts of it. There is a whiteboard. There is code and Markdown.
[09:58.000 --> 10:08.000]  There are slides, like these ones, and documents. It's open source software:
[10:08.000 --> 10:17.000]  the client code is open source, as well as the server code. So you can host your own instance.
[10:17.000 --> 10:24.000]  And there are about 200 maintained instances. We at the CryptPad team, we host a flagship
[10:24.000 --> 10:33.000]  instance, which has about 200,000 registered users. And how does CryptPad encrypt? So in
[10:33.000 --> 10:39.000]  CryptPad, we have this end-to-end encryption. We have that an update is propagated in
[10:39.000 --> 10:47.000]  encrypted form, and the server only has an encrypted state of the document. So the
[10:47.000 --> 10:55.000]  server cannot infer the actual content of the document. And how do we share the keys? In the
[10:55.000 --> 11:02.000]  most basic way, we share the keys over the fragment identifier of the URLs. That means we put the
[11:02.000 --> 11:12.000]  keys after the hash sign (#) of the URL. Like this, you can easily share a document. What do we
[11:12.000 --> 11:19.000]  trust? As I said, you still have the metadata. So you still need some trust. In
[11:19.000 --> 11:27.000]  CryptPad, you have to trust that the server is not an active attacker. That means that you
[11:27.000 --> 11:32.000]  trust that the server acts according to the protocol. It runs the correct
[11:32.000 --> 11:40.000]  code, and it does not deliver any malicious things. Or it does not repeat stuff, and so
[11:40.000 --> 11:48.000]  on. It does not reorder stuff like this. And why do we have this trust requirement? There are
[11:48.000 --> 11:54.000]  two reasons. The first one is a practical one. We have a web application where you get the
[11:54.000 --> 12:01.000]  client source code from the server. So if the server delivered bogus client code,
[12:01.000 --> 12:08.000]  well, then every security guarantee is lost. And then there is the second one, which is
[12:08.000 --> 12:14.000]  more theoretical. Namely that the server can always delete files. Even if they are encrypted,
[12:14.000 --> 12:22.000]  the server can delete them without problem. Okay. So you see that you need some trust. But
[12:22.000 --> 12:28.000]  the cool thing about CryptPad is that there is other stuff which you don't have to trust.
[12:28.000 --> 12:35.000]  Namely, the server could be an honest but curious attacker. That means that even if the
[12:35.000 --> 12:42.000]  server watches you, you're still safe. You don't have to trust the server that it does
[12:42.000 --> 12:51.000]  not watch you. No, it's explicitly allowed. And why do we have that? Well, the server
[12:51.000 --> 12:59.000]  may become corrupt. Even if you trust it to be not actively malicious, it
[12:59.000 --> 13:07.000]  may still become corrupt at some point in time. And this was especially the case
[13:07.000 --> 13:14.000]  last summer, where there was a CryptPad instance hosted by Germany's Pirate Party. And on
[13:14.000 --> 13:21.000]  this instance, some sensitive documents about the G7 summit were leaked. And then the police
[13:21.000 --> 13:27.000]  asked the Pirate Party to hand over their data. Otherwise, they would seize the server. So
[13:27.000 --> 13:33.000]  the police got access to this data. But the police could not read anything. They
[13:33.000 --> 13:39.000]  could not read the documents, because everything was end-to-end encrypted. So
[13:39.000 --> 13:45.000]  this shows that even if you trust it in the first place, we still cannot be sure that
[13:45.000 --> 13:52.000]  it's trustworthy forever. And this setting, yeah, and as I said, this setting is exactly
[13:52.000 --> 14:00.000]  covered by this honest-but-curious attacker, which we allow. There's also another point
[14:00.000 --> 14:07.000]  of view to look at this. We could also say that we protect the server from its users. So
[14:07.000 --> 14:16.000]  for example, the server administrator of the CryptPad instance of Germany's Pirate Party was not
[14:16.000 --> 14:22.000]  complicit. How could they know what documents were published on their server, because it's
[14:22.000 --> 14:30.000]  encrypted? So this shows that such encryption is also nice for you as a system
[14:30.000 --> 14:36.000]  administrator because it allows you to offer the service without taking too much risk
[14:36.000 --> 14:47.000]  yourself. So, yeah, as a take-home message, I really want to say that we need both. We need
[14:47.000 --> 14:54.000]  open source and end-to-end encryption for good trust assumptions. And with this, I'm at
[14:54.000 --> 15:01.000]  the end of my presentation. Shout out to my team. David is there somewhere. Wolfgang
[15:01.000 --> 15:09.000]  is here. And Ludo is also there. I am Theo. And CryptPad is developed at XWiki in France
[15:09.000 --> 15:16.000]  by this small team. We have a stand here in the K building. Yeah, come by. Drop by. We
[15:16.000 --> 15:18.000]  have stickers. Thank you.
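
As a minimal sketch of the key sharing described in the talk (keys placed after the hash sign of the URL), the following Python snippet illustrates why the fragment identifier never reaches the server: a browser only puts the path and query into the HTTP request line. The host `pad.example.org`, the path, and the helper names `make_share_url`, `request_target`, and `key_from_url` are assumptions for illustration only, not CryptPad's actual URL layout or code.

```python
import base64
import secrets
from urllib.parse import urlsplit


def make_share_url(base_url: str, key: bytes) -> str:
    """Append a URL-safe encoding of the key as the fragment identifier.

    Assumes base_url has no fragment of its own (hypothetical helper).
    """
    encoded = base64.urlsafe_b64encode(key).decode("ascii").rstrip("=")
    return f"{base_url}#{encoded}"


def request_target(url: str) -> str:
    """Return the part of the URL that ends up in the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # parts.fragment is deliberately left out


def key_from_url(url: str) -> bytes:
    """Recover the key from the fragment, on the client side only."""
    fragment = urlsplit(url).fragment
    padding = "=" * (-len(fragment) % 4)  # restore the stripped base64 padding
    return base64.urlsafe_b64decode(fragment + padding)


# Hypothetical document URL; the key is random and exists only on the clients.
key = secrets.token_bytes(32)
url = make_share_url("https://pad.example.org/pad/demo", key)
```

Anyone who receives `url` can recover the key with `key_from_url(url)`, while the server only ever sees `request_target(url)`. This is also why, as discussed in the questions, a browser history containing the full URL is enough to decrypt a document once the server data is available.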
[15:18.000 --> 15:45.000]  So are there any questions?
[15:45.000 --> 15:50.000]  Would a peer-to-peer version be possible to reduce risks with the server and would it
[15:50.000 --> 15:57.000]  help? Yeah, it would be possible, theoretically. The main point why we have a server is that
[15:57.000 --> 16:04.000]  we want the documents to be accessible all the time. So in a peer-to-peer setting,
[16:04.000 --> 16:11.000]  you would firstly have the requirement that at least one party must always be online. And we don't
[16:11.000 --> 16:15.000]  want that. Yeah.
[16:15.000 --> 16:18.000]  So another question.
[16:18.000 --> 16:24.000]  Thank you very much. You say we need open source and E2E for good trust assumptions. I would
[16:24.000 --> 16:32.000]  suggest you might need a slightly stronger statement. The code on the server has to be
[16:32.000 --> 16:40.000]  open source as well. So potentially you need something like the Affero GPL. And in terms
[16:40.000 --> 16:46.000]  of the E2E, you need post-quantum resistance. Yeah.
[16:46.000 --> 16:54.000]  If you have those two things, then maybe you have good trust assumptions. Yeah, good point.
[16:54.000 --> 17:00.000]  CryptPad is licensed under the AGPL. So this point is easily answerable. And on the second
[17:00.000 --> 17:05.000]  point, we are working on it; we are looking at how to make CryptPad post-quantum
[17:05.000 --> 17:15.000]  resistant. Thank you very much. You mentioned that it's problematic to have the metadata
[17:15.000 --> 17:21.000]  still. So what is CryptPad doing against that or how can we make sure that the server is
[17:21.000 --> 17:29.000]  not collecting metadata? Yeah, two answers to this. One is that there's always some
[17:29.000 --> 17:34.000]  metadata which will be there. For example, the IP address or the browser user agent. This one
[17:34.000 --> 17:39.000]  we have to live with. And this is also the reason why it's important that you can host it
[17:39.000 --> 17:47.000]  on your own instance. And then the second part is that CryptPad collects as little information
[17:47.000 --> 17:54.000]  about you as possible. So for example, we don't have a list of users, of user names.
[17:54.000 --> 18:01.000]  There's not even a list of hashed user names. So we just hash the user name and
[18:01.000 --> 18:08.000]  the password locally on the client side and generate from this all the keys. So this
[18:08.000 --> 18:18.000]  is just an illustration of how we try to have as little metadata as possible.
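
The client-side derivation mentioned in this answer (hashing user name and password locally and generating the keys from the result) could be sketched roughly as follows. This is an illustration under stated assumptions, not CryptPad's implementation: the talk does not name the key derivation function or its parameters, so PBKDF2-HMAC-SHA256 with the user name as salt, the iteration count, and the split into `encryption_key` and `signing_seed` are all hypothetical choices.

```python
import hashlib


def derive_keys(username: str, password: str) -> tuple[bytes, bytes]:
    """Derive two 32-byte secrets from the credentials, entirely on the client.

    Assumption: PBKDF2-HMAC-SHA256 salted with the user name. The talk only
    says that user name and password are hashed locally; the real KDF and
    its parameters are not stated there.
    """
    material = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        username.encode("utf-8"),  # user name doubles as salt (assumption)
        100_000,                   # illustrative iteration count
        dklen=64,
    )
    # First half and second half serve as two independent secrets.
    return material[:32], material[32:]


# Hypothetical credentials; nothing here is sent to or stored on a server.
encryption_key, signing_seed = derive_keys("alice", "correct horse battery staple")
```

Because the derivation is deterministic, the same credentials always yield the same keys on any device, so the server needs neither a list of user names nor a list of password hashes.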
[18:18.000 --> 18:22.000]  Good afternoon. In the first question, you answered that you don't use peer-to-peer
[18:22.000 --> 18:26.000]  because you want the server to be online all the time. Does the server have to be unique
[18:26.000 --> 18:31.000]  or can you have multiple servers just in case one gets in the hands of the police or gets
[18:31.000 --> 18:38.000]  out for some reason? Are you speaking of federation? Yes. Okay. Currently there
[18:38.000 --> 18:51.000]  is no possibility for federation. No, sadly not. Yeah, you mentioned the case where a
[18:51.000 --> 18:57.000]  server was raided by police. So they have the server. Is it not enough then to have that
[18:57.000 --> 19:02.000]  server and also have somebody's browser history with the key in the URL and then that conversation
[19:02.000 --> 19:13.000]  is open? So if I got your question correctly, it was if an attacker
[19:13.000 --> 19:21.000]  has access to the server and to the URL, then they have full knowledge. Yeah, that's the
[19:21.000 --> 19:28.000]  case because the full URL includes the part after the hash, which is not sent
[19:28.000 --> 19:34.000]  to the server. If the attacker has this, yeah, then they have access to the server and
[19:34.000 --> 19:44.000]  the key to the document and can decrypt it. Yeah. So yes. How does adding and
[19:44.000 --> 19:51.000]  removing collaborators work? Do you, like, re-encrypt the file with different keys every
[19:51.000 --> 19:58.000]  time or how do you handle that? So we have, we only send updates. So it's not the entire
[19:58.000 --> 20:06.000]  file every time and it's symmetrically encrypted. And there are two ways you can
[20:06.000 --> 20:12.000]  access a document. In read-only mode, you have the keys for decryption, and to prove
[20:12.000 --> 20:20.000]  that you're able to update the document, you need to sign it with a signing key. But the
[20:20.000 --> 20:27.000]  keys are static for a document. But if a user gets removed from read access, they would
[20:27.000 --> 20:32.000]  still be able to read the file after it's being modified, wouldn't they?
[20:32.000 --> 20:48.000]  Yes, exactly. Yeah. Yeah, they'll still be able to read it. There are access lists
[20:48.000 --> 20:53.000]  which we have which can defend against this scenario. But yeah, that's also something
[20:53.000 --> 21:01.000]  we're working on. And maybe if I can just mention something which goes more into
[21:01.000 --> 21:07.000]  these detailed questions. We just published a white paper. You can go to our website on
[21:07.000 --> 21:24.000]  cryptpad.org and check it out. So if there are no other questions.