[00:00.000 --> 00:10.800]  Thanks very much, thank you, so welcome everyone and I'm glad that you're here on Saturday
[00:10.800 --> 00:16.800]  early morning in this first session, so I'd like to make it as easy as possible, thanks
[00:16.800 --> 00:24.320]  for the organizers, Jerez and Yamur, for inviting me to talk today about stream processing.
[00:24.320 --> 00:28.960]  The fact is I don't know your background, so I'm not sure exactly how much experience
[00:28.960 --> 00:34.960]  you'll have with stream processing, so if you see some concepts are easy, just get everyone
[00:34.960 --> 00:36.960]  up and into this concept.
[00:36.960 --> 00:42.960]  So I'll be talking today about stream processing on adaptive and there's a lot, so that's what
[00:42.960 --> 00:44.960]  the main focus will be.
[00:44.960 --> 00:52.960]  Obviously this title as it is could be a startup company, so you would expect to have some ideas
[00:52.960 --> 00:59.960]  today where you can use some of these ideas in your work or in your experience or in your
[00:59.960 --> 01:05.960]  case of study whatever you want or whether you are a Java developer or data scientist
[01:05.960 --> 01:08.960]  or MLO, so it doesn't matter.
[01:08.960 --> 01:13.960]  So there is something for everyone here today, so that's the main focus for this session.
[01:13.960 --> 01:19.960]  So, anyone recognize these guys on the screen here?
[01:19.960 --> 01:26.960]  Right, so that's where I came from, I'm based in Liverpool in the UK and on the right side
[01:26.960 --> 01:35.960]  is the Liverpool football club, which is basically one of the top football teams in the UK, so
[01:35.960 --> 01:40.960]  I wanted just to highlight this screen here, just to tell you that stream processing is
[01:40.960 --> 01:45.960]  not in specific domain, it could be in any domain.
[01:45.960 --> 01:54.960]  And if you look at it, do you know how long it takes, for example, for an eye to blink?
[01:54.960 --> 01:57.960]  Come again?
[01:57.960 --> 02:01.960]  Yeah, so it takes over half a second, so that's pretty fast.
[02:01.960 --> 02:08.960]  So if you think about maybe minutes or hours, probably this is not the right discussion
[02:08.960 --> 02:09.960]  room for you.
[02:09.960 --> 02:14.960]  We're talking about some milliseconds today, so whether it's for example using it in finance,
[02:14.960 --> 02:19.960]  whether you use it in IoT devices, smart devices, whether you use it in sports, hospitals or
[02:19.960 --> 02:23.960]  machine learning or what we're trying to do today for stream processing.
[02:23.960 --> 02:25.960]  So that's the main idea.
[02:25.960 --> 02:30.960]  And obviously, if you're working with real-time stream processing, you focus on the real-time
[02:30.960 --> 02:31.960]  data, right?
[02:31.960 --> 02:39.960]  And I've seen it so many times where platforms and tools focus on how much data you can process
[02:39.960 --> 02:45.960]  and you see these benchmarks everywhere on the internet and this is pretty cool, I think,
[02:45.960 --> 02:51.960]  but the key source and the secret source for this is to use something in combination between
[02:51.960 --> 02:55.960]  real-time data and historical data.
[02:55.960 --> 02:59.960]  So the main reason for this is to look at context.
[02:59.960 --> 03:05.960]  So without knowing what's going on, you probably don't benefit much from the real-time data
[03:05.960 --> 03:06.960]  you're processing.
[03:06.960 --> 03:11.960]  So what you want is always to go back and check what's going on with the context of these
[03:11.960 --> 03:12.960]  data.
[03:12.960 --> 03:18.960]  There is a problem in this secret source, obviously, because what you're looking at is kind of
[03:18.960 --> 03:24.960]  like two different data types and you want to make sure that you process it at the same
[03:24.960 --> 03:27.960]  speed or very close to the same speed.
[03:27.960 --> 03:32.960]  Obviously, it becomes really a problem when you try to scale it.
[03:32.960 --> 03:39.960]  So if you have, I don't know, maybe a few cases of data that you want to process, probably
[03:39.960 --> 03:44.960]  it's not too much trouble for you, but when you start to scale it up, it becomes really
[03:44.960 --> 03:47.960]  a problem to understand how you want to scale it.
[03:47.960 --> 03:53.960]  So do you scale your data or do you scale your compute or do you scale both at what speed?
[03:53.960 --> 04:00.960]  So we will discuss all these concepts today and if I ask you now how much data you process,
[04:00.960 --> 04:06.960]  obviously, because in this room I would assume over a million transactions per second or
[04:06.960 --> 04:11.960]  a few millions or, I don't know, some of you might be processing millions of transactions
[04:11.960 --> 04:12.960]  per second.
[04:12.960 --> 04:13.960]  So that's pretty good.
[04:13.960 --> 04:20.960]  And what we want today is to focus this domain into a very specific area.
[04:20.960 --> 04:25.960]  And this area essentially what we're trying to do today is to analyze traces.
[04:25.960 --> 04:31.960]  So it doesn't matter if it's like writing system traces or platform traces or it's like programming
[04:31.960 --> 04:32.960]  language traces.
[04:32.960 --> 04:38.960]  What we want is to make sure that we have environment and within this environment you
[04:38.960 --> 04:44.960]  can scale your loads, basically, scale your processing and at the same time we'll provide
[04:44.960 --> 04:46.960]  some kind of analytics, right?
[04:46.960 --> 04:52.960]  So again, if you look at how much data you're trying to process, the number by itself doesn't
[04:52.960 --> 04:54.960]  give you much what's going on here.
[04:54.960 --> 04:59.960]  So what you want is to find this specific information you're looking for.
[04:59.960 --> 05:05.960]  Kind of like looking at, you know, finding the needles or finding the hidden areas within
[05:05.960 --> 05:06.960]  your data.
[05:06.960 --> 05:13.960]  So if you look at, you know, how much loads you process per day or per week and you'll
[05:13.960 --> 05:18.960]  store it somewhere on, you know, crystal hard drive or you store it in Mori or you store
[05:18.960 --> 05:19.960]  it in the cloud.
[05:19.960 --> 05:23.960]  So what you want is to, you know, make sense of it.
[05:23.960 --> 05:30.960]  And some companies do this process manually, which means they run software and they go
[05:30.960 --> 05:37.960]  through their loads and this is kind of a patch service and they try to understand what's
[05:37.960 --> 05:39.960]  going on within the load.
[05:39.960 --> 05:45.960]  So obviously this is a problem when you want to scale it and with the scaling you have different
[05:45.960 --> 05:51.960]  loads stored in different places and you want to make sure basically to have a platform
[05:51.960 --> 05:58.960]  where in this platform we kind of like looking at some kind of results.
[05:58.960 --> 06:04.960]  So for the sake of this discussion today, we'll focus on two different solutions.
[06:04.960 --> 06:10.960]  So one of them is trying to provide some kind of alerts and the other is to provide some
[06:10.960 --> 06:13.960]  kind of trends within your data.
[06:13.960 --> 06:16.960]  Obviously I work for a company called Hazardcast.
[06:16.960 --> 06:23.960]  So Hazardcast as a platform, I love you to do so but obviously you might have heard of
[06:23.960 --> 06:27.960]  some companies or, you know, they do some kind of stream processing.
[06:27.960 --> 06:34.960]  So this is kind of like, you know, overview what's going on with this domain at this time.
[06:34.960 --> 06:40.960]  Obviously you can split it depending on if you're looking for open source solution or,
[06:40.960 --> 06:45.960]  I don't know, hardware solution or, you know, some kind of management service.
[06:45.960 --> 06:49.960]  And what you look at is kind of which domain you work so are you looking to capture your
[06:49.960 --> 06:54.960]  data or some kind of, you know, streaming your data or you want to do some kind of
[06:54.960 --> 07:00.960]  transformation on your data or do some kind of electrical machine learning.
[07:00.960 --> 07:07.960]  So you can see that you split it into 12 squares and within these like tools and
[07:07.960 --> 07:10.960]  platforms are, you know, spread it over.
[07:10.960 --> 07:17.960]  Some tools not exist on this screen for whatever reason but obviously this might give
[07:17.960 --> 07:22.960]  you some ideas but it's hard to decide which tool you want to go for.
[07:22.960 --> 07:28.960]  Simply because I think the distribution is not clear here so it tells you basically
[07:28.960 --> 07:34.960]  which tool is open source for example and where in process you can use it but it doesn't
[07:34.960 --> 07:38.960]  give you full picture on, you know, how to do it in practical terms.
[07:38.960 --> 07:43.960]  And so this is where it might be easier to understand what we're talking about.
[07:43.960 --> 07:50.960]  So if you remember from my slide where I discussed the historical data and the new data.
[07:50.960 --> 07:55.960]  So today we're kind of like, you know, trying to split everything into two categories.
[07:55.960 --> 07:59.960]  So on one side you get like stream processing engines.
[07:59.960 --> 08:04.960]  So these engines are pretty fast in, you know, streaming events.
[08:04.960 --> 08:10.960]  And on this far right side you have some kind of fast data stores which are, you know,
[08:10.960 --> 08:15.960]  are pretty fast in handling data at speed.
[08:15.960 --> 08:22.960]  So again the solution for lead time stream processing is kind of a combination and you
[08:22.960 --> 08:28.960]  want to process data in this moment and at the same time you want to actually also
[08:28.960 --> 08:35.960]  access data storage somewhere. So that's where Hazardcast fits into this area here.
[08:35.960 --> 08:40.960]  So the platform itself obviously for those who don't know, by the way we have one of
[08:40.960 --> 08:43.960]  the masterminds of Hazardcast sitting in this room.
[08:43.960 --> 08:45.960]  So this is the platform.
[08:45.960 --> 08:47.960]  So it's open source platform.
[08:47.960 --> 08:52.960]  It doesn't matter where your source is coming from, whether it's Apache Cloud or Apache
[08:52.960 --> 08:58.960]  IoT devices, for example, I don't know, some kind of device applications or even like within
[08:58.960 --> 09:03.960]  Hazardcast or even you can write your own connector and you feed it into the platform.
[09:03.960 --> 09:07.960]  So platform historically used to be two different components.
[09:07.960 --> 09:11.960]  So the IOMTG and the Jet Engine.
[09:11.960 --> 09:16.960]  And essentially now it's all back in one, one jar file.
[09:16.960 --> 09:21.960]  As you see here, it allows you to load your data from hard disks into memory.
[09:21.960 --> 09:29.960]  So you have access to historical data and pretty much like instantaneously and this
[09:29.960 --> 09:35.960]  will, well, you know, you can provide context, what's going on with your data.
[09:35.960 --> 09:38.960]  At the same time, you can actually do stream processing.
[09:38.960 --> 09:40.960]  So that's what Jet Engine is.
[09:40.960 --> 09:45.960]  So from here you can do some kind, I don't know, maybe like data transformation or do
[09:45.960 --> 09:50.960]  some kind of stream processing as we will do today or even like defined machine learning
[09:50.960 --> 09:51.960]  if you want to.
[09:51.960 --> 09:53.960]  You can connect it to some clients.
[09:53.960 --> 09:58.960]  So these are some clients here, so written into various languages.
[09:58.960 --> 10:03.960]  If you're from data science background, which means your programming languages in general
[10:03.960 --> 10:09.960]  are not preferable for you, so you might be considering using SQL to do what I'm planning
[10:09.960 --> 10:10.960]  today.
[10:10.960 --> 10:12.960]  So this is another option you can do.
[10:12.960 --> 10:19.960]  And once you process this data where you load it into memory for historical data and
[10:19.960 --> 10:23.960]  at the same time you have some kind of data coming in.
[10:23.960 --> 10:28.960]  For example, and you do the combination or even you do transformation, you can proceed
[10:28.960 --> 10:30.960]  to do some kind of visualization.
[10:30.960 --> 10:37.960]  So the good thing about Hesicas in general where it comes to scaling is it's partition
[10:37.960 --> 10:38.960]  aware.
[10:38.960 --> 10:44.960]  So which means basically your compute, your Jet Engine or your process essentially can
[10:44.960 --> 10:47.960]  be or can detect where your data is stored.
[10:47.960 --> 10:53.960]  So this is like, you know, we're trying to have as low latency as possible when it comes
[10:53.960 --> 10:55.960]  to processing this data.
[10:55.960 --> 11:00.960]  So this is very important to understand because latency is your enemy when it comes to stream
[11:00.960 --> 11:01.960]  processing.
[11:01.960 --> 11:07.960]  So what you want is kind of like having a platform where you avoid network folks.
[11:07.960 --> 11:11.960]  For example, you avoid IO to your hard disk.
[11:11.960 --> 11:17.960]  You will try to also avoid every time or, sorry, context switching between threads.
[11:17.960 --> 11:21.960]  So you want to avoid all of these, but at the same time you want your process to be as
[11:21.960 --> 11:23.960]  close as possible to your data.
[11:23.960 --> 11:28.960]  You can avoid some kind of, you know, machine learning on this.
[11:28.960 --> 11:32.960]  And the scaling itself could be done in various ways.
[11:32.960 --> 11:39.960]  So the main thing to take away from here is there's no master-worker relationship.
[11:39.960 --> 11:42.960]  So all nodes basically are peers.
[11:42.960 --> 11:44.960]  And we've done this study.
[11:44.960 --> 11:50.960]  It's a bit dated, but it's kind of like one million transactions per second on 45 nodes.
[11:50.960 --> 11:55.960]  So what we're trying to do now is to add one zero into this number here.
[11:55.960 --> 12:00.960]  And even though it's pretty impressive, what is nice about it is the linear scaling, which
[12:00.960 --> 12:05.960]  means more data you can add, you know, more nodes into it.
[12:05.960 --> 12:09.960]  So that's the historical bit of this talk.
[12:09.960 --> 12:11.960]  So let's just move to the technical part.
[12:11.960 --> 12:16.960]  So for this demo, what I wanted is kind of like, you know, show you some ideas, right?
[12:16.960 --> 12:20.960]  So you should be able to take these ideas and apply it, you know, anywhere.
[12:20.960 --> 12:25.960]  Obviously the solution as itself could be like, you know, project by itself.
[12:25.960 --> 12:27.960]  So feel free to edit and change it.
[12:27.960 --> 12:32.960]  All source code is available on GitHub and the documentation as well.
[12:32.960 --> 12:34.960]  So you can go through it.
[12:34.960 --> 12:40.960]  So the main idea when it comes to analyzing or, you know, making sense out of your traces
[12:40.960 --> 12:48.960]  and logs is to store it somewhere close to, you know, your compute, first of all,
[12:48.960 --> 12:50.960]  and shouldn't be stored locally, right?
[12:50.960 --> 12:52.960]  So you want to store it first of all.
[12:52.960 --> 12:54.960]  So the first thing is to store it on the cloud.
[12:54.960 --> 13:00.960]  So for this demo, what I'm doing is I'm storing everything onto the cloud.
[13:00.960 --> 13:04.960]  There is a solution called Hazelgast-Virginian, which is kind of like service.
[13:04.960 --> 13:07.960]  So you don't need to download GR, run your project.
[13:07.960 --> 13:09.960]  You can simply plug in and play.
[13:09.960 --> 13:11.960]  So you can create an account.
[13:11.960 --> 13:13.960]  You'll run everything I'm discussing today.
[13:13.960 --> 13:15.960]  So you create an instance of Hazelgast.
[13:15.960 --> 13:20.960]  And from there, you can pretty much proceed to what I'm planning to do.
[13:20.960 --> 13:26.960]  So the first option we were talking about is kind of like storing everything into the cloud.
[13:26.960 --> 13:31.960]  So we're going to import the data. Obviously, we need some kind of trace message,
[13:31.960 --> 13:33.960]  which makes sense.
[13:33.960 --> 13:39.960]  So this trace message could be, you know, changed based on how you want to approach it, right?
[13:39.960 --> 13:41.960]  So for example, if you're working with machine learning,
[13:41.960 --> 13:47.960]  you probably look for some kind of, I don't know, classification solution for your, you know,
[13:47.960 --> 13:50.960]  for your tests, or you could be looking for NLE.
[13:50.960 --> 13:55.960]  If you don't want to work with machine learning, you probably want to look for some kind of trends.
[13:55.960 --> 13:58.960]  So you look for processing your data.
[13:58.960 --> 14:00.960]  It doesn't matter if you're using machine learning.
[14:00.960 --> 14:04.960]  In this case, you want to have some kind of data stored somewhere.
[14:04.960 --> 14:09.960]  So it could be in JSON format, or it could be like bar charts, strings.
[14:09.960 --> 14:13.960]  So it depends again how much the speed is important to you.
[14:13.960 --> 14:18.960]  So the option, first option is to go through the alerts.
[14:18.960 --> 14:24.960]  So in alerts, what we're trying to do here is to take everything and store it in the cloud.
[14:24.960 --> 14:28.960]  So obviously we don't store it in the cloud on our disks.
[14:28.960 --> 14:31.960]  What we try to do is to store it in memory.
[14:31.960 --> 14:36.960]  My preference in this case is to use some kind of map structure.
[14:36.960 --> 14:44.960]  So map structure allows you to essentially random access and rebalance between various nodes within your cluster.
[14:44.960 --> 14:51.960]  And at the same time, you want to have some key value, so in order to know where this is coming from.
[14:51.960 --> 14:58.960]  So in this case, it could be like ID address, for example, so this is where, and support number, so as key.
[14:58.960 --> 15:02.960]  And the value could be anything that makes sense to you.
[15:02.960 --> 15:07.960]  So in this case, for example, you can track level of this error, sorry, of this loop,
[15:07.960 --> 15:12.960]  and message, for example, if you want to do some kind of NLP processing on it,
[15:12.960 --> 15:16.960]  and some kind of, you know, process or thread name on this.
[15:16.960 --> 15:24.960]  Obviously once you have your key and value, what you can do is proceed and store it into memory.
[15:24.960 --> 15:27.960]  So this is where you get this set to hazard cast.
[15:27.960 --> 15:30.960]  And what we're trying to do is create the IMAP.
[15:30.960 --> 15:39.960]  And once you have the IMAP, it means you should be able to store it, you know, access it and do same processing as I will show you.
[15:39.960 --> 15:44.960]  So first message is to store it in the cloud, store it in memory.
[15:44.960 --> 15:48.960]  In this case, I'm using hazard cast gradient, and I'm using IMAP.
[15:48.960 --> 15:52.960]  And second stage is to do the same processing, right?
[15:52.960 --> 15:54.960]  So there are a couple of options here for you.
[15:54.960 --> 15:56.960]  So first option is to use SQL.
[15:56.960 --> 16:01.960]  So SQL is built within hazard cast, which means, or on top of hazard cast,
[16:01.960 --> 16:05.960]  which means you should be able to query your data, so if you provide some kind of specific messages
[16:05.960 --> 16:11.960]  that you're looking for, obviously depends on your input, you can do some kind of SQL.
[16:11.960 --> 16:15.960]  So whether it's an inner joy, for example, or sales and so on.
[16:15.960 --> 16:18.960]  Or the other option is to do some kind of prediction.
[16:18.960 --> 16:23.960]  So you're getting some logs, you don't know exactly what's going on to happen next,
[16:23.960 --> 16:28.960]  and you try to predict to provide some kind of, you know, alerts or trends.
[16:28.960 --> 16:30.960]  So we need to build the trends.
[16:30.960 --> 16:34.960]  So in order to do this, what I did is kind of like use the same key,
[16:34.960 --> 16:40.960]  but for my value, what I'm using is some kind of log score.
[16:40.960 --> 16:43.960]  So log score is not important.
[16:43.960 --> 16:49.960]  What I'm saying here is I want to give value for every single message,
[16:49.960 --> 16:52.960]  or every single log message.
[16:52.960 --> 16:59.960]  So this could be, for example, how important this specific message is for you,
[16:59.960 --> 17:05.960]  or it could be, for example, how serious or how dangerous the message is.
[17:05.960 --> 17:11.960]  So as levels in logs, you can define scores, so instead of having four levels, for example,
[17:11.960 --> 17:15.960]  you can spread it, I don't know, from one to 100.
[17:15.960 --> 17:17.960]  So this should give you some kind of predictions.
[17:17.960 --> 17:21.960]  Why? Because if you have, for example, 10 messages, 10 local messages,
[17:21.960 --> 17:29.960]  or, for example, warning, you don't know exactly if the event matrix will be warning or not.
[17:29.960 --> 17:34.960]  If you want to predict it, obviously it doesn't give you how much will be warning or not.
[17:34.960 --> 17:41.960]  Whereas if you use some kind of numerical value, you can get as close as possible to this.
[17:41.960 --> 17:47.960]  So we get key from there, we get score from here,
[17:47.960 --> 17:52.960]  and what we do next is to do some kind of predictions on the logs.
[17:52.960 --> 17:56.960]  So in this process, what we have is exactly my key and the value,
[17:56.960 --> 18:02.960]  which is like the score on each log message, and I import it into Hezakas.
[18:02.960 --> 18:08.960]  So Hezakas allows you to basically input and output from two different maps,
[18:08.960 --> 18:13.960]  and do stream processing, so we'll build the train based on previous logs,
[18:13.960 --> 18:19.960]  based on previous log scores, and we'll use the prediction on top of it
[18:19.960 --> 18:21.960]  to provide some kind of alert.
[18:21.960 --> 18:24.960]  So zero means don't alert, one means alert.
[18:24.960 --> 18:31.960]  And as you can see here, the actual workflow, kind of like you build, you read it from math,
[18:31.960 --> 18:35.960]  then you define trend map, so which is like normal map,
[18:35.960 --> 18:40.960]  and from there you can use it to predict what's going to happen next.
[18:40.960 --> 18:44.960]  Obviously you do some kind of visualization, so how does it look like?
[18:44.960 --> 18:47.960]  So this is kind of the prediction part of it.
[18:47.960 --> 18:53.960]  So we take the logs map and we build trend map out of it.
[18:53.960 --> 18:59.960]  So the trend map would start reading messages and the scores and build train for you.
[18:59.960 --> 19:03.960]  And from this trend I can use some kind of machine learning,
[19:03.960 --> 19:05.960]  it doesn't have to be machine learning.
[19:05.960 --> 19:09.960]  In this case it's linear regression, but it could be anything to be honest.
[19:09.960 --> 19:16.960]  And we check the values and we try to use some kind of prediction based on the previous values
[19:16.960 --> 19:19.960]  to decide if you want to send an alert or not.
[19:19.960 --> 19:23.960]  And obviously this is kind of like describing the exact thing,
[19:23.960 --> 19:30.960]  so when there is a one on your values, it's alert, send alert when it's zero, don't.
[19:30.960 --> 19:34.960]  And here there are three ways to do it.
[19:34.960 --> 19:37.960]  So this is where the same processing comes into place.
[19:37.960 --> 19:43.960]  So you could simply use SQL to read from the map and do some query if you want.
[19:43.960 --> 19:47.960]  Obviously this is batch, which means it's not real time same processing,
[19:47.960 --> 19:52.960]  or you can even create a pipeline or create a process.
[19:52.960 --> 19:56.960]  And from this process you can read the logs and do some same thing.
[19:56.960 --> 20:00.960]  So you can use either SQL or Java to do it.
[20:00.960 --> 20:06.960]  But first two options are batch, which means you can process the data in real time,
[20:06.960 --> 20:08.960]  you want to do changes in real time.
[20:08.960 --> 20:11.960]  So the third option is the journal map.
[20:11.960 --> 20:14.960]  So journal map will track all changes, so it is continuous.
[20:14.960 --> 20:19.960]  So you have the logs stored and you have logs coming into Kafka topic
[20:19.960 --> 20:23.960]  and you can basically store both into journal map.
[20:23.960 --> 20:30.960]  So we're on 5.2 version within Hazegas, 5.3 will have the SQL features on top of it,
[20:30.960 --> 20:36.960]  which allows you, for example, data scientists to just do the queries and change the data.
[20:36.960 --> 20:40.960]  And obviously it's ring covered, so this is very important to understand,
[20:40.960 --> 20:46.960]  so you can start processing your data from start or from the end.
[20:46.960 --> 20:51.960]  And what you want is kind of like, you know, using this kind of alerts to it.
[20:51.960 --> 20:53.960]  So the first part is to read it.
[20:53.960 --> 20:57.960]  So this is the actual map we built in the first option.
[20:57.960 --> 21:01.960]  And from here you can define the key, for example, and the value.
[21:01.960 --> 21:05.960]  And you start, for example, to do some kind of filtering.
[21:05.960 --> 21:14.960]  So this happens in real time on continuous and the map itself will allow you to basically track changes.
[21:14.960 --> 21:18.960]  So to give you some takeaways and best practices,
[21:18.960 --> 21:22.960]  so we just try to summarize everything we discussed today.
[21:22.960 --> 21:27.960]  Obviously there are more to discuss, but this should give you something to go out and try.
[21:27.960 --> 21:32.960]  So first of all, you need to store your logs into some kind of data platform.
[21:32.960 --> 21:36.960]  So in this case, I'm using Hazegas, but obviously you don't have to use Hazegas.
[21:36.960 --> 21:42.960]  The idea here is to do some kind of compute on your circle data to provide some context,
[21:42.960 --> 21:44.960]  as well as real time data.
[21:44.960 --> 21:47.960]  And from there, you need to store it on the cloud.
[21:47.960 --> 21:51.960]  So you need to store it somewhere where you can access logs from multiple places.
[21:51.960 --> 21:54.960]  Obviously it has to be stored in memory.
[21:54.960 --> 21:57.960]  And from there, you need to choose the format.
[21:57.960 --> 22:01.960]  If you, for example, looking to provide some, you know, I don't know,
[22:01.960 --> 22:04.960]  some predictions you probably need to use JSON format.
[22:04.960 --> 22:08.960]  Or for example, if you want just to do something that you can't sing and it's faster,
[22:08.960 --> 22:12.960]  if you want to speed the unit, it is some kind of map structure.
[22:12.960 --> 22:19.960]  Obviously when you store it in memory, because this will allow some kind of random access.
[22:19.960 --> 22:24.960]  And also you need to consider how you empty your map.
[22:24.960 --> 22:27.960]  So because you are limited on size, obviously.
[22:27.960 --> 22:30.960]  And finally, you need to consider security.
[22:30.960 --> 22:35.960]  So whatever you send to the cloud, you need to make sure that you don't include some, you know,
[22:35.960 --> 22:38.960]  personal entities or whatever. So if you're interested in this topic,
[22:38.960 --> 22:41.960]  we're running a conference next month.
[22:41.960 --> 22:43.960]  So feel free to scan this code.
[22:43.960 --> 22:46.960]  We provide training for this all free, obviously.
[22:46.960 --> 22:48.960]  And everything I mentioned today is open source,
[22:48.960 --> 22:51.960]  so you should be able to do everything I mentioned today.
[22:51.960 --> 22:54.960]  I'll be steering around here if you want to have a chat
[22:54.960 --> 22:56.960]  or if you want to discuss it a little bit more.
[22:56.960 --> 22:59.960]  Obviously within half an hour, there's not much to give,
[22:59.960 --> 23:02.960]  but hopefully you've got some ideas from this talk.
[23:02.960 --> 23:05.960]  And hopefully it will be also useful for you.
[23:05.960 --> 23:08.960]  So with that being said, thanks very much for listening.
[23:08.960 --> 23:10.960]  I'll open for questions.
[23:10.960 --> 23:33.960]  Thank you.