Thank you all for joining us in a great...
It's not working? Why is it not working?
It is? Okay, good.
So thank you all for joining us. Here to talk about Nextcloud's approach to AI is Jos Poortvliet.
Okay. Does this work well enough? You can hear it in the back and all that.
Okay. I see thumbs up. That's wonderful.
Yeah. Well, I'm Jos Poortvliet,
director of communications and co-founder at Nextcloud.
So, yeah, the thing we do at Nextcloud is collaboration.
That's of course what this room is all about.
Now, there's this AI thing coming.
And so I'm hoping to try and make this conversation a little bit interactive.
I mean, there are other people here from XWiki and other projects who are working on,
well, collaboration tools as well, open source collaboration tools.
And, you know, this AI thing, I mean, there have been AI-ish tools being used for a long time,
but a lot of them are also still quite new.
So I'm kind of hoping that we can also have a bit of a conversation about it.
Because, well, there are pros and cons. I mean, we'll get to all that stuff.
Of course, the big thing here is, yeah, that we have, like, these big companies, right?
They all want our data, and AI is for them another thing to use that data for.
So, yeah, I mean, AI, I don't know how deep I want to go into what it is,
because I think all of us know it a little bit.
But we don't want to live in a world where there are five companies, you know,
who run all our data, and that's kind of a little bit the case right now.
I think that if Trump in his next presidency tells Microsoft to shut down their service in Europe,
then basically you cannot get a new passport, you cannot, yeah, nobody can work at government here, for example, right?
This is, I think, a bit of an issue.
And, I mean, Nextcloud is one of the projects that's working on solving that issue,
essentially trying to give, well, companies, individuals, but also hopefully government,
back the control over their data.
We've built a collaboration platform.
I'm wondering, how many of you are not familiar with Nextcloud?
Yeah, okay, it's like six people.
Google it.
I will then not go into that, sorry, or DuckDuckGo it, that would be better, obviously.
So, as a company, we build an alternative to Microsoft 365, in very quick, simple terms.
And by alternative, we mean that as a government or a company, we think it's important that you have a choice.
It's totally fine if you're happy that your data is at an American company and that U.S. spy agencies have access to it.
If you're good with that, if that's not a threat to your business, that's fine.
For government, I think it's by definition a threat, but that's their choice.
But we think there should be, like, a choice.
There should be an alternative.
And an alternative is only an alternative if it does what the other product does, obviously,
in a safe way, and is capable enough to be used by a serious company.
So, that's what we're building and we have built.
That's why German and French and most European governments are in places already using Nextcloud,
be it cities, be it at a state level or federal level.
So, as a company, we care a lot. Nextcloud for us is a mission, it's a goal.
It's like our way to try and make the world a tiny little bit better.
And we want to work in an open, collaborative way.
Therefore, we're very happy, of course, that it's used by thousands of governments and universities, et cetera, et cetera.
And, of course, we're building this completely in the open.
And again, that will be relevant because, of course, I think the future for AI had better be open,
otherwise we are, well, just as screwed as we are with collaboration platforms, honestly.
And we have a wonderful community working with us and all this stuff, which is awesome.
Also, as a company, we try to be open and transparent, not depending on venture capital, et cetera, but be self-owned.
Anyhow, AI.
So, AI is like, we've already introduced a ton of AI things over the years, like little things,
and I will show some of them, but of course, with the latest LLMs and stuff, it's getting really complicated.
I mean, there are tons of problems with it.
It has a lot of potential, right?
AI can help us make repetitive tasks easier, quicker, et cetera, but at the same time, Big Tech is basically loving it.
They have all the data to be able to build the AIs.
It costs tens or hundreds of millions right now to really train the proper LLMs, so they really have a bit of a monopoly here.
And, yeah, the rest of us will have to just accept that they're using all our data to do it.
And a lot of companies are already realizing this is a problem for them, right?
Take Citigroup and Goldman Sachs.
They are actually not allowing their employees to use tools like ChatGPT.
I mean, if you're BMW and you're working on a new car and you're using an AI to generate some ideas or summarize some proposals,
and you discover six months later that Tesla, while designing their car,
suddenly got some of your ideas coming into their AI planning, then there's a bit of an issue here.
And, of course, this kind of stuff is happening.
Companies like Twitter and Zoom, a while ago, changed their terms of service to allow for training on user data.
And, yeah, this is really an issue for business as well as, well, obviously, all our society.
And then I'm not even talking about data biases in these models, carbon footprint.
I mean, I think most of you are aware of all the issues with AI.
So, honestly, I don't think the question is to AI or not, because there are too many benefits.
The opportunities are really big, I think.
I've been trying to make a bit of a list of that, but I was just changing it while standing here in line outside.
So this is definitely not complete.
So I'm just going to put it all on the screen and ask what's missing.
I mean, I think there are some basics, you know, text to speech, speech to text,
recognizing faces on photos and recognizing objects on it, et cetera.
This is already there; Nextcloud has been shipping these kinds of things for three, four years already.
It's just one model that you download and it does this stuff, and translation is another one.
It's fairly, I think it's, I mean, it's not simple.
It's technically complicated stuff, but it works.
And there are not huge risks.
You don't need to send your data to Google anymore if you want text to speech,
or if you want image recognition and being able to search for a dog
and find all the pictures of your favorite pet.
So this is already there, and it's not terribly complicated to use for a person.
But of course, you now have all these new language models.
I think there's really a big benefit, like dealing with information overload.
You have tons of emails coming in.
You have, I don't know, papers to read, et cetera.
And these LLMs, they, I know they create a lot of fake content and hallucinate stuff.
But the thing they're pretty reliable at is summarizing.
And this is really quite important.
I don't know how many emails you get, but I get a ton,
and I would love to be able to summarize it or help select, you know, useful emails, et cetera.
And this stuff is really possible, or, I don't know, meeting notes.
And, yeah.
So this is, I think, where these models can be super helpful.
And you have, like, text generation, of course, they can help out with this.
You have also image analysis of various things.
There have been some demos from Microsoft and Google already about a year ago,
where they basically were showing that you have, like, a spreadsheet,
and you select something in it, and then you type a question about it,
and then it makes a graph that answers the question.
This kind of stuff is also pretty magical.
And there are tons of people in offices all over the world
that would benefit a lot from having this stuff.
So I think, yeah, the benefits are really there.
Another thing is, like, automation.
Just talking about it with a colleague.
This is, like, also a next step, you know, if you can say to the LLM,
like, hey, make an appointment with another person,
and then it tries chat, and if that doesn't work, it tries email.
These kinds of things would be really helpful, I think, in day-to-day work.
So, yeah, I don't know.
If there are other ideas or things that are missing,
I'd love to hear it, actually, and make my list a little bit more complete,
but we'll get to that, I think.
So I just wanted to show a couple of examples,
like we have this feature now, the thread summary,
that makes a summary of your emails.
Another example is, like, in Nextcloud Text.
You can just select some text and say, hey, summarize it, create a headline.
It's all quite simple to use.
And image generation, of course, I mean, this is a horrible image,
but, you know, you can make things that look good.
And then you have the data analysis, and you have automation,
all these other features we have ideas on.
I'll share some things a bit later on.
So I think we need to do AI in our collaboration platforms,
like XWiki, you guys need to have a plan.
I know ONLYOFFICE, but they just integrated ChatGPT.
I think we need a little bit more than just that,
because, well, we're losing the on-prem capabilities, right?
It's not competitive if you're just integrating ChatGPT,
because then the data is sent to the U.S. anyway.
So, yeah, that's not really a good solution.
So the question is, how can we get this without the problems?
And I think I'm in a room with open source people,
so I think the answer for most of you is obvious,
and it is to me at least: just transparency and being open, yeah?
And this is kind of the thing that we've been working on at Nextcloud.
We kind of made some rules for ourselves,
so we have been doing AI-ish things,
but when the whole text stuff from ChatGPT came out,
actually that was at the FOSDEM two years ago,
we talked to people and each other,
and we have some fairly smart people on board,
also from the research community,
and we tried to come up with, like, how can we handle this?
Because we need to add more AI features,
like we don't want to be left behind,
and we need to be an alternative, as I said earlier,
and you can only be an alternative if you offer similar features,
otherwise who's going to use your product?
But then how can you do that in an okayish way?
So the idea we came up with was to at least create transparency,
and of course, choice, I'll get to that next,
but first the transparency.
So we came up with the idea of creating a rating
that has basically red, orange, yellow, green,
and we would rate each of the integrations of AI features
in Nextcloud with this rating.
So first, is it open source?
Is the model available, and is the training data available?
And so if a model has all three, it's green,
if it has two of them, it's yellow,
if it has one of them, it's orange,
if it has none of them, it's red.
So a ChatGPT integration: red.
Completely on-prem model that is trained
and has the training data available,
for example, for speech to text, that can be green,
and you have everything in between, of course.
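
The rating rule described here can be sketched in a few lines of Python. This is an illustration only, not Nextcloud's implementation: the `AIIntegration` record and its field names are hypothetical, and the order of the two middle colors is an assumption. The sketch uses yellow for two criteria met and orange for one, which matches the yellow rating given to the music genre model later in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIIntegration:
    """Hypothetical record describing one AI integration."""
    name: str
    code_is_open_source: bool
    model_is_available: bool
    training_data_is_available: bool


# Index = number of criteria met (0 to 3). The yellow/orange order is
# an assumption of this sketch, see the note above.
RATING_BY_CRITERIA_MET = ("red", "orange", "yellow", "green")


def ethical_rating(integration: AIIntegration) -> str:
    """Return the color for the number of openness criteria met."""
    criteria_met = sum((
        integration.code_is_open_source,
        integration.model_is_available,
        integration.training_data_is_available,
    ))
    return RATING_BY_CRITERIA_MET[criteria_met]


hosted_chatbot = AIIntegration("hosted proprietary chatbot", False, False, False)
local_speech_to_text = AIIntegration("local speech to text", True, True, True)
music_genre = AIIntegration("music genre recognition", True, True, False)

for item in (hosted_chatbot, local_speech_to_text, music_genre):
    print(f"{item.name}: {ethical_rating(item)}")
```

With these three example records, the hosted chatbot comes out red, the fully open local model green, and the model with closed training data yellow.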
And the second thing is choice.
So for us, it's really important that you, well, can choose, right?
I mean, there are, again, legitimate uses
for something like ChatGPT, and I mean,
they're throwing so many billions at this problem
that you can hardly argue that open source
can really keep up to the latest stuff they're doing,
and sometimes you just need it, fine.
So in our user interface,
we have these choices that you can have, like Opus,
that's a translation model,
so this would be a fully green one,
and then, well, the one we all know, ChatGPT.
So we try to make sure that for the various features
you can choose between these different models,
on-prem, et cetera.
So for us, of course, most of the work we put into
on-prem and open source locally running AI features,
because, well, that fits with our values as a company
and, well, with our ethical AI rating,
but the others are available.
So at the moment, I made a list,
but I'm sure there are many more, you can use
models like these in Nextcloud
for the various features.
I'll show, well, actually, I'm showing examples right now.
So this is just a bunch of the features we have.
There is more, but suspicious login detection
is something we developed like a really, really long time ago.
It's basically a neural network
that gets trained on your login data.
It just runs completely local every time you log in.
If you work nine to five from the Berlin office, let's say,
and suddenly somebody at 3 a.m. logs in to your account from China,
maybe there's something wrong, the model will detect that
and give you a warning.
Very simple, and we have had this, I don't know,
since 2020, so quite a while.
And it's green, right?
It runs fully local.
There's nothing special about it, no data sent anywhere.
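
To make the idea concrete, here is a deliberately simplified sketch. It is not a neural network and not Nextcloud's code: it swaps in a plain frequency count of past login hours and countries, and flags a login when either is rare for that user. The `LoginProfile` class, the `min_share` threshold, and the example data are all hypothetical. Like the feature described, it only uses the user's own login history and sends nothing anywhere.

```python
from collections import Counter


class LoginProfile:
    """Toy per-user profile of usual login hours and countries.

    A frequency count standing in for the trained model described in
    the talk; an assumption made for illustration only.
    """

    def __init__(self, min_share: float = 0.05) -> None:
        self.min_share = min_share  # below this share, a value counts as unusual
        self.hours = Counter()
        self.countries = Counter()
        self.total = 0

    def record(self, hour: int, country: str) -> None:
        """Learn from one legitimate login."""
        self.hours[hour] += 1
        self.countries[country] += 1
        self.total += 1

    def is_suspicious(self, hour: int, country: str) -> bool:
        """Flag a login whose hour or country is rare in the history."""
        if self.total == 0:
            return False  # nothing learned yet, so nothing to compare against
        hour_share = self.hours[hour] / self.total
        country_share = self.countries[country] / self.total
        return hour_share < self.min_share or country_share < self.min_share


# Twenty working days of logins between 9:00 and 16:59 from Germany.
profile = LoginProfile()
for _day in range(20):
    for hour in range(9, 17):
        profile.record(hour, "DE")

print(profile.is_suspicious(10, "DE"))  # usual hour, usual country
print(profile.is_suspicious(3, "CN"))   # 3 a.m. from an unseen country
```

A real model would weigh more signals and learn the threshold instead of hard-coding it; the point here is only that the check can run locally on data the server already has.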
We basically do a very similar thing with our mail app,
where we basically train a neural network on subjects,
sender, and email recipients, et cetera.
And it creates a smart inbox trying to put important emails on top
and the rest not, and again, no data sent anywhere,
because it just runs on premise.
I already mentioned face recognition and stuff we did in 2022,
I think, so this is all.
But the problem with this is already,
you need to download a multi-gigabyte file,
which has, like, all the values needed for the neural network
to recognize stuff.
So we already had to re-architect a lot of the way Nextcloud works
just to be able to download this big blob
without creating all kinds of complexities for the users.
And obviously, this problem gets bigger and bigger
when you get to modern AIs.
We even have music genre recognition using machine learning.
It's yellow because it's trained on all the music on Spotify,
which means the training data is actually copyrighted
and therefore not open.
And, yeah, we had, like, a pre-trained model to do call transcripts.
We introduced that last year.
That is nice. You have a call and the recording then gets run through
speech to text so that you get the text of the recording.
Again, this model runs fully local, so that's cool.
And text to speech, the other way around.
Background blur, just a JavaScript thing that we load
in the browser, very simple. Translations:
first, we made it with DeepL, which is not cool.
So then we made one using the Opus corpus.
You saw it earlier, and that is running fully local,
so that's much better.
So these are still mostly basic features, I think, today,
and yet already pretty complicated.
You need to keep an eye out where the data is being sent to,
like with translation.
But, of course, the big thing is LLMs,
like the text operations.
What we've been doing is to create, basically, the Nextcloud Assistant.
It uses the large language models, but open source on-prem ones
that you can host yourself.
It's like this little thing on the top.
When you click on it, you get a dialogue.
You can give a free prompt, or you can give a text to summarize
and some other things.
And it just runs this through one of the models that is supported by Nextcloud.
And, again, you can put ChatGPT here as model,
or as back-end, but you can also run your own LLM
and connect that to Nextcloud, and then it can do all this stuff on-premise.
So it's fairly simple.
When it's running, then it'll get the results,
and after a while you get the output of it.
You can copy it into a document, et cetera.
And, again, if you take a local model that is trained on public data,
then it can be a fully green solution.
So that's really cool.
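
The "pick your back-end per feature" idea can be sketched as a small registry. Everything here is hypothetical and only illustrates the design: `AssistantSketch`, the back-end names, and the two stand-in functions are made up for this example and are not Nextcloud's API. The local back-end is a trivial first-sentence "summarizer" so that the sketch runs without any model.

```python
from typing import Callable, Dict

# A back-end is any callable that turns input text into output text.
Backend = Callable[[str], str]


def local_first_sentence(text: str) -> str:
    """Stand-in for a self-hosted model: keeps only the first sentence."""
    return text.split(".")[0].strip() + "."


def hosted_service(text: str) -> str:
    """Stand-in for an external service; deliberately not implemented."""
    raise RuntimeError("no external service is configured in this sketch")


class AssistantSketch:
    """Routes each task type to whichever back-end the admin selected."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._choice: Dict[str, str] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def choose(self, task: str, backend_name: str) -> None:
        if backend_name not in self._backends:
            raise ValueError(f"unknown backend: {backend_name}")
        self._choice[task] = backend_name

    def run(self, task: str, text: str) -> str:
        backend = self._backends[self._choice[task]]
        return backend(text)


assistant = AssistantSketch()
assistant.register("local", local_first_sentence)
assistant.register("hosted", hosted_service)
assistant.choose("summarize", "local")  # text stays on your own server

summary = assistant.run("summarize", "Nextcloud runs on your server. It is open source.")
print(summary)
```

Because the calling code only knows the task name, switching "summarize" from the hosted service to a local model is a configuration change rather than a code change.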
In places like Nextcloud Text, I already showed
that you can select some text and then run this.
Mail, I already showed this as well.
In Talk, our video calling and chat solution,
you can translate a message, select it, and then choose translate,
insert images and other stuff.
We even made a little bot.
This isn't the smartest bot.
It's a very small model, but hey, it's fast.
And you can ask it questions.
Honestly, I wouldn't say it says such smart things.
It's fairly shitty, I've noticed.
But still, it works on your own server.
That's kind of nice.
So a lot is possible.
One of the newer things we're working on is more of these services,
because there are now companies like Amazon.
They are running LLMs as a service.
And other companies are doing this also purely in Europe,
like you have Aleph Alpha, and I think Mistral or something.
In France, there's also a company that is building local AI.
So we're trying to support these, so that you...
You know, not everybody can run these AIs,
like you need a lot of heavy GPUs.
It's a lot of compute.
So you can use it as a service, so that at least it stays in Europe
or at a company that you trust.
Though I wouldn't recommend Amazon, per se, perhaps.
For this, we also made it possible that you can put in some limits,
otherwise users get a little creative and start to basically cost you a lot of money.
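
One common way to implement such a limit is a sliding window per user. The sketch below is an assumption about how this could look, not how Nextcloud does it: the `UsageLimiter` class and its parameters are hypothetical. The `now` argument exists only so the example can be run with fixed timestamps.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class UsageLimiter:
    """Allow at most `max_requests` per user within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # user -> timestamps of accepted requests

    def allow(self, user: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if the user is under the limit."""
        current = time.monotonic() if now is None else now
        history = self._history[user]
        # Forget requests that have fallen out of the window.
        while history and current - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False
        history.append(current)
        return True


limiter = UsageLimiter(max_requests=2, window_seconds=60)
decisions = [
    limiter.allow("alice", now=0),   # first request
    limiter.allow("alice", now=1),   # second request
    limiter.allow("alice", now=2),   # third request inside the same minute
    limiter.allow("bob", now=2),     # a different user has their own budget
    limiter.allow("alice", now=61),  # the earlier requests have expired
]
print(decisions)  # [True, True, False, True, True]
```

For a paid service, the same structure could count tokens or cost instead of requests.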
And we worked on the interaction with this.
I'll skip through this.
A thing we're working on now is also to make all of this even smarter.
A newer thing is the ability to take your documents that you have into account.
So Context Chat is a feature of the Assistant that basically has access to your documents,
your emails, everything you have that gets indexed.
Let me see.
It's indexed into a vector database.
So this runs as a separate service next to Nextcloud.
And then when you ask a question from the Assistant,
it can actually answer using your documents, your company documentation, your emails, etc.
So you can really do stuff like, you know,
can you give me an idea of how we organize events, and rather than answering in general,
it can look at your documentation and then tell you, like,
oh, you know, at your company, you organize events this way.
Or you can say, hey, can you give me a summary of the different requests
that a colleague has emailed to me last week,
and hopefully it'll give you all the to-dos that you got from that colleague in the last week.
So this, yeah, has the context basically of what you are doing as a user at hand.
It's, I think, really kind of, yeah, an important step forward to make this useful,
because otherwise you're just getting the generic info that's in the LLM.
As I said, they hallucinate stuff all the time.
They're much better at, like, taking information and summarizing it,
and that's, of course, what this does.
I think it's much more reliable in that way, you know, vacation process, et cetera, et cetera.
So that's a couple of things we've been doing lately on this, like Context Chat.
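
The flow described here (index documents into a vector store, retrieve the closest ones for a question, hand them to the LLM) can be sketched as follows. This is a toy under stated assumptions, not the Context Chat implementation: the "embedding" is a plain word count instead of a learned model, and `Chunk`, `VectorIndex`, `build_prompt`, and the file names are hypothetical. Each chunk keeps its source, so the sources used for an answer can be listed next to it.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    source: str  # where the text came from, kept so answers can cite it
    text: str


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorIndex:
    """In-memory stand-in for the separate vector database service."""

    def __init__(self) -> None:
        self._entries = []  # list of (vector, chunk) pairs

    def add(self, chunk: Chunk) -> None:
        self._entries.append((embed(chunk.text), chunk))

    def search(self, query: str, top_k: int = 2) -> List[Chunk]:
        query_vector = embed(query)
        ranked = sorted(self._entries, key=lambda entry: cosine(query_vector, entry[0]), reverse=True)
        return [chunk for _vector, chunk in ranked[:top_k]]


def build_prompt(question: str, chunks: List[Chunk]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"[{chunk.source}] {chunk.text}" for chunk in chunks)
    return (
        "Answer using only the context below and cite the sources in brackets.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


index = VectorIndex()
index.add(Chunk("events-handbook.md", "We organize events by booking the venue first and then inviting speakers."))
index.add(Chunk("hr-policy.md", "Vacation requests go to your team lead two weeks ahead."))
index.add(Chunk("recipes.txt", "Pasta needs salted water."))

question = "How do we organize events?"
hits = index.search(question, top_k=1)
prompt = build_prompt(question, hits)
print(hits[0].source)
print(prompt)
```

The model only sees the retrieved text, which is why this setup leans on summarizing supplied information rather than on whatever generic knowledge is stored in the LLM.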
So that's our approach to AI.
I would really like to hear thoughts on that, and, like,
I don't know what other projects are planning with this.
One of these will be giving a talk after mine.
But, I don't know, any feedback, questions, thoughts, fears, and anxieties?
Okay, so is this working?
Can anybody confirm in the back?
Great, thank you very much.
So any thoughts, questions? Let's start here.
So your screenshot showed that...
Yes, the screenshot showed that I need to double check the information that the assistant gave me.
I noticed that it doesn't give me the references to the emails that it was quoting from.
Is there a possibility to get that?
Currently not, but thinking of the way this works, I mean, I have one developer here.
They can interject, but I think that should actually be quite doable,
because the way it works is it looks in this vector database
and gives that information to the LLM to then summarize and give you the answer.
And, well, in the vector database, I guess it knows where it came from,
and therefore can then say what information was used to summarize that answer.
So I would think this is possible, but I don't... I don't know.
Yeah, I see a thumbs up. Excellent, okay.
Any other questions, ideas?
Okay.
Or are we going to do this?
Yes, for me, I am an AI skeptic.
I have some examples.
So it's good when the user is at the end
and can correct what is said by the AI.
So an example for translation: I have the word in Dutch, "academici".
It's not "universitaire", the translation in French.
It's "personne issue des milieux académiques" (a person from academic circles).
And the translator or the assistant doesn't give this response.
Yeah, but it's...
So you need control, let's say, by the user, and also by citizens in general,
for when the user has no power over the system.
You can't check a human translator either, though, unless you know the language,
and at that point you didn't need them in the first place.
So, yeah, you have to use this stuff in a skeptical way, but then... yeah.
Yes, and the other thing is about consuming energy.
So there was a broadcast on RTBF about the energy consumption of ChatGPT.
It was hard.
Yeah, so the amount of energy that these models use is big.
That's, by the way, one of the reasons I think they should be open source,
is that the researchers who do stuff companies aren't interested in
can try to optimize them and make them run with less energy.
Yeah.
Hi. I think another good use case for all these...
If we combine these features, that would mean that we could have a super accessible environment,
because if someone is blind or nearly blind,
people could use all this text to speech. If someone has autism, ADHD, whatever,
you could try to find a shorter version, an easier to understand version of a text or whatever,
and combining this would help, I think.
That is awesome. I'm going to add that to my slides right now,
but I am completely making the laptop slow now.
That's a really good point. Accessibility is a really important benefit.
Actually, hint to the developer, bring it up in the team.
Maybe we can already work on that.
Yeah, any more?
Yeah, just a question on the RAG approach that you were describing before.
Do you have any figures that you can share on the extent to which you tried the retrieval of the vectors?
Sorry, I did not hear the question.
So when you were describing the RAG, the retrieval-augmented...
The green, the colors, yes.
No, the RAG. When you're retrieving the vectors from the vector DB.
Right.
So can you give us some figures on the extent to which you tried that?
Talk to somebody, not me, in one of these, who knows the technical part there,
and I'm not even sure we have somebody right here at the moment with that. Sorry.
Okay.
And we're out of time. I'm afraid. So this is probably it then.
Somebody wants their microphone back.
Alright, thank you all.
Thank you very much, Jos.