[00:00.000 --> 00:09.040]  The next lightning talk is Victor, and this should be a really fun talk, I think, about
[00:09.040 --> 00:10.040]  MLOps.
[00:10.040 --> 00:11.040]  Yes.
[00:11.040 --> 00:12.040]  So, hello.
[00:12.040 --> 00:16.600]  This is probably going to be the least serious talk you have seen today, so I'm sorry about
[00:16.600 --> 00:17.600]  that.
[00:17.600 --> 00:19.280]  We're going to be automating weight loss with AI.
[00:19.280 --> 00:23.760]  It's a stupid project I made in a weekend, or like in a few weekends, but I want to talk
[00:23.760 --> 00:24.760]  about it.
[00:24.760 --> 00:25.760]  So, who am I?
[00:25.760 --> 00:26.760]  Lightning talk version.
[00:26.760 --> 00:27.760]  I'm Victor Sonck.
[00:27.760 --> 00:28.760]  I work at ClearML.
[00:28.760 --> 00:29.760]  Hi.
[00:29.760 --> 00:30.760]  Let's make something.
[00:30.760 --> 00:33.480]  So, that's the reason why I'm here.
[00:33.480 --> 00:37.760]  The problem statement was I'm working at home, and I'm not working out enough, like probably
[00:37.760 --> 00:44.120]  a lot of us are, and so the problem solution is why not lock my PC every hour and force
[00:44.120 --> 00:47.960]  myself to do push-ups, and then it automatically opens again.
[00:47.960 --> 00:48.960]  That was the main idea.
[00:48.960 --> 00:54.840]  I want to do this with AI, obviously, because over-engineering, and I'm a machine learning engineer,
[00:54.840 --> 00:58.480]  so if I am a hammer, everything looks like a nail.
[00:58.480 --> 01:02.360]  This is going to be the diagram, so top left is an OAK-1.
[01:02.360 --> 01:03.360]  It's an AI camera.
[01:03.360 --> 01:05.480]  I'll talk about it in a second.
[01:05.480 --> 01:09.600]  That will run one model, and then, because one AI model is not enough, I have two, so
[01:09.600 --> 01:13.620]  there is a second model that runs on the Raspberry Pi that will lock my PC.
[01:13.620 --> 01:18.400]  This is what it looks like, so you get like a notification, workout time, lazy bum, and
[01:18.400 --> 01:20.800]  you have to do push-ups.
[01:20.800 --> 01:24.080]  It's a Raspberry Pi running in the corner of my room.
[01:24.080 --> 01:26.800]  You can follow along with the diagram at the top right.
[01:26.800 --> 01:29.920]  So I'm going to do pose estimation with the OAK-1.
[01:29.920 --> 01:30.920]  Now what is the OAK-1?
[01:30.920 --> 01:37.040]  The OAK-1 is a 150 bucks open-source hardware AI camera, which is really cool.
[01:37.040 --> 01:38.720]  I highly recommend it.
[01:38.720 --> 01:43.800]  They run the Intel Myriad X, so if you look at the speeds there, if you have the OAK-1,
[01:43.800 --> 01:49.040]  because it does the AI on the chip itself, on the camera itself, it can
[01:49.040 --> 01:53.160]  get a lot higher FPS on the Raspberry Pi, because it doesn't have to go to the Raspberry
[01:53.160 --> 01:55.240]  Pi to do anything.
[01:55.240 --> 02:00.040]  Even when compared to another AI accelerator connected to the Raspberry Pi.
[02:00.040 --> 02:04.240]  It also has excellent documentation, which is a unicorn these days, but yeah, it really
[02:04.240 --> 02:06.440]  is a nice library.
[02:06.440 --> 02:11.200]  So they have a bunch of cool examples that you can try, like there's DeepLab with segmentation,
[02:11.200 --> 02:13.160]  there's other stuff.
[02:13.160 --> 02:17.880]  But luckily for me, I didn't have to do any work, because they also had pose estimation.
[02:17.880 --> 02:21.880]  So thanks to geaxgx for implementing this.
[02:21.880 --> 02:23.120]  This is an awesome repository.
[02:23.120 --> 02:25.960]  It's still being maintained, if I remember correctly.
[02:25.960 --> 02:27.360]  So definitely check that out.
[02:27.360 --> 02:28.360]  That's really cool.
[02:28.360 --> 02:31.400]  Now, what does this repository do?
[02:31.400 --> 02:36.240]  It basically gives me pose estimation, so it films me on that AI camera.
[02:36.240 --> 02:39.880]  I have one with me, by the way, so after the lightning talks, I can actually give a demo
[02:39.880 --> 02:40.880]  lightning talk.
[02:40.880 --> 02:42.880]  I can't do it right now.
[02:42.880 --> 02:47.560]  Basically it draws like a skeleton on top of you in like seven, eight frames a second
[02:47.560 --> 02:52.040]  on the Raspberry Pi, which is awesome, and then it even positions them in 3D.
[02:52.040 --> 02:53.040]  So that's nice.
[02:53.040 --> 02:54.040]  This is stage one.
[02:54.040 --> 02:56.080]  We want to go to a pushup detector.
[02:56.080 --> 02:57.240]  This is stage two.
[02:57.240 --> 03:00.480]  So we now basically have a skeleton.
[03:00.480 --> 03:05.320]  If we just throw away the pixels, these are the only points that we actually care about.
[03:05.320 --> 03:08.400]  And then now it just basically becomes a tabular problem.
[03:08.400 --> 03:12.720]  So the second part of the machine learning, or like the simple machine learning, is going
[03:12.720 --> 03:13.880]  to be really simple.
[03:13.880 --> 03:18.400]  Now we just have a few points, and we want to classify them.
[03:18.400 --> 03:22.080]  For this second model, though, this is not pre-trained, so I actually have to label
[03:22.080 --> 03:26.040]  a few images. It's not a very complex model, but you have to do something.
[03:26.040 --> 03:31.440]  So what we want to do is say, okay, this is a pushup, this is a pushdown,
[03:31.440 --> 03:35.520]  and then we can do some additional logic to actually count them while they're happening.
[03:35.520 --> 03:36.520]  Right.
[03:36.520 --> 03:42.360]  But then the question becomes, how do I take a picture when I'm actually doing my pushups?
[03:42.360 --> 03:45.560]  Because there is a camera there. Do I need a button? But then it might overfit
[03:45.560 --> 03:48.160]  on me pressing the button or something else.
[03:48.160 --> 03:53.360]  So if you're a machine learning engineer, the answer is throw more AI at it.
[03:53.360 --> 04:00.840]  So basically overengineering using an unnecessary amount of AI, I set up a microphone while
[04:00.840 --> 04:02.800]  I was pushing up and pushing down.
[04:02.800 --> 04:07.520]  There's a really cool open source package for Python that can do voice recognition.
[04:07.520 --> 04:12.760]  It does send it to the proprietary API of Google, but at least the code is there.
[04:12.760 --> 04:16.680]  And then you can just basically shout "label me", and if the word "label" is actually
[04:16.680 --> 04:20.720]  found inside of what you said, it will take a picture.
[04:20.720 --> 04:22.600]  So that's that.
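The keyword check described above can be sketched roughly as follows. This is a minimal illustration, not the speaker's code: `listen` and `take_picture` are hypothetical hooks standing in for the speech-recognition call and the camera capture, neither of which is shown in the talk.

```python
from typing import Callable, Optional


def should_capture(transcript: Optional[str], keyword: str = "label") -> bool:
    """Return True when the keyword occurs anywhere in the recognized speech."""
    if not transcript:
        # Nothing was recognized (silence or a failed recognition).
        return False
    return keyword.lower() in transcript.lower()


def labeling_step(
    listen: Callable[[], Optional[str]],
    take_picture: Callable[[], None],
) -> bool:
    """Run one listen/capture cycle and report whether a picture was taken.

    Both arguments are hypothetical hooks: `listen` would wrap the
    speech-recognition package, `take_picture` the camera capture.
    """
    if should_capture(listen()):
        take_picture()
        return True
    return False
```

Passing the recognizer and the camera in as plain callables keeps the trigger logic easy to try out without a microphone attached.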
[04:22.600 --> 04:26.600]  Now we have a third AI model that's really useful.
[04:26.600 --> 04:28.720]  And then I did a lot with ClearML.
[04:28.720 --> 04:30.760]  So this is actually the MLOps part.
[04:30.760 --> 04:32.080]  I now have two models.
[04:32.080 --> 04:33.440]  I want to be able to train them.
[04:33.440 --> 04:35.120]  I want to be able to maintain them.
[04:35.120 --> 04:37.520]  So what I did is this: this is the labeling tool.
[04:37.520 --> 04:40.800]  So right, top left, top left for you.
[04:40.800 --> 04:43.560]  Right, top left is the OAK-1 that will take a picture.
[04:43.560 --> 04:49.360]  When I shout "take a picture", it will send it to ClearML, which is actually an open source
[04:49.360 --> 04:52.120]  MLOps tool, one that I work for.
[04:52.120 --> 04:54.400]  And they have data versioning, for example.
[04:54.400 --> 04:58.880]  So every single time I run the labeling tool, it will create a new version of my data set,
[04:58.880 --> 05:00.520]  which is very useful.
[05:00.520 --> 05:03.160]  And then I can use this new version of the data set.
[05:03.160 --> 05:04.160]  I can pull it in.
[05:04.160 --> 05:07.680]  I can use the experiment manager of ClearML to keep track of all my code.
[05:07.680 --> 05:12.840]  Every single time I run or I train, I will get all of my output, all of my plots.
[05:12.840 --> 05:14.720]  And then you can actually build this into pipelines.
[05:14.720 --> 05:16.680]  You can run this automatically on remote machines.
[05:16.680 --> 05:21.080]  So I over-engineered the crap out of it, but I can't really tell everything in a lightning
[05:21.080 --> 05:22.240]  talk.
[05:22.240 --> 05:27.760]  The main idea is you have a lot of different tools in ClearML that can help you with that
[05:27.760 --> 05:29.600]  and automate a lot of that stuff.
[05:29.600 --> 05:31.680]  Now training my own model.
[05:31.680 --> 05:33.960]  So now we have all of those points.
[05:33.960 --> 05:38.640]  We have, for each of those sets of points, whether it's a push up or a push down.
[05:38.640 --> 05:40.360]  Where do you go from here?
[05:40.360 --> 05:43.040]  Training my own model, it's this.
[05:43.040 --> 05:44.240]  Like it's super simple.
[05:44.240 --> 05:46.400]  It's three lines of code these days.
[05:46.400 --> 05:47.640]  So this is just sklearn.
[05:47.640 --> 05:48.640]  It's an SVM.
[05:48.640 --> 05:49.640]  It's a simple classifier.
[05:49.640 --> 05:50.640]  It takes points in.
[05:50.640 --> 05:51.640]  Gives you one point out.
[05:51.640 --> 05:52.880]  Push up, push down.
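A minimal sketch of such a classifier with scikit-learn's `SVC`, assuming each labeled frame has been flattened into a fixed-length feature vector. The two-feature toy data, the string labels and the linear kernel are all assumptions made for illustration; the talk only says it is an sklearn SVM.

```python
from sklearn.svm import SVC

# Toy stand-in for flattened skeleton keypoints (made-up values):
# each row is one labeled frame, reduced here to two hypothetical features.
X = [
    [0.90, 0.80], [1.00, 0.90], [0.95, 0.85],  # frames labeled "push_up"
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.15],  # frames labeled "push_down"
]
y = ["push_up"] * 3 + ["push_down"] * 3

# The "three lines": create the model, fit it, predict with it.
clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X, y)


def classify(points):
    """Classify one frame's feature vector as "push_up" or "push_down"."""
    return clf.predict([points])[0]
```

With only two classes, every frame gets one of the two labels, including frames where nobody is doing a push-up at all.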
[05:52.880 --> 05:53.880]  It's not ideal.
[05:53.880 --> 05:56.680]  I should do a no class, but I was lazy.
[05:56.680 --> 05:59.320]  No class basically meaning it's nothing, none of the two.
[05:59.320 --> 06:06.400]  So now, when I walk up to it, it will maybe register a push up, which is not ideal.
[06:06.400 --> 06:13.440]  So in order to combat that, I made a very simple, even simpler piece of code that basically
[06:13.440 --> 06:14.440]  primes it.
[06:14.440 --> 06:20.520]  So here on the left, you can see one is basically push down, two is push up, and so you can
[06:20.520 --> 06:21.520]  see it happen.
[06:21.520 --> 06:24.800]  I think, yeah, you can basically see it happen there in the beginning.
[06:24.800 --> 06:29.520]  But when I run to my place to start to do the push ups, here you can see that there's
[06:29.520 --> 06:33.200]  like a bit of jittering going on because it doesn't know the zero class.
[06:33.200 --> 06:38.320]  So in order to catch that, you can basically say: if
[06:38.320 --> 06:44.960]  the length is 10, so if you've at least been doing detection
[06:44.960 --> 06:49.400]  for some time, then you can check if the last 10 of them were push up.
[06:49.400 --> 06:53.520]  So I'm basically ready in my position, and only then prime it.
[06:53.520 --> 06:56.560]  And then once it's primed, you can start counting.
[06:56.560 --> 07:00.080]  So that's just a very simple, stupid way of doing it.
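The priming step above could look roughly like this. The class ids (1 for push down, 2 for push up) and the window of 10 come from the talk; the class name `PushUpCounter` and the counting rule (one repetition is a push down followed by a push up) are assumptions, since the talk does not spell out how counting is done.

```python
from collections import deque

PUSH_DOWN, PUSH_UP = 1, 2  # class ids as described in the talk
WINDOW = 10                # consecutive push-up frames needed to prime


class PushUpCounter:
    """Ignore predictions until primed, then count down-to-up transitions."""

    def __init__(self, window: int = WINDOW):
        self.history = deque(maxlen=window)  # keeps only the last `window` frames
        self.primed = False
        self.went_down = False
        self.count = 0

    def update(self, prediction: int) -> int:
        """Feed one per-frame prediction and return the current count."""
        self.history.append(prediction)
        if not self.primed:
            # Prime only once the buffer is full and every entry is push up,
            # i.e. the person is holding the start position.
            full = len(self.history) == self.history.maxlen
            if full and all(p == PUSH_UP for p in self.history):
                self.primed = True
            return self.count
        # Counting rule (an assumption): a rep is a down followed by an up.
        if prediction == PUSH_DOWN:
            self.went_down = True
        elif prediction == PUSH_UP and self.went_down:
            self.count += 1
            self.went_down = False
        return self.count
```

Jittery predictions while walking into position never fill the window with ten push-up frames in a row, so they cannot prime the counter.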
[07:00.080 --> 07:01.080]  Two minutes left.
[07:01.080 --> 07:02.080]  Excellent.
[07:02.080 --> 07:07.320]  So actually, that's it, but I have two minutes left, so I'm going to do one more thing.
[07:07.320 --> 07:08.480]  Locking the computer.
[07:08.480 --> 07:14.240]  So I use Linux, which allows you to do everything, which is awesome.
[07:14.240 --> 07:19.080]  So locking the computer was easy, but unlocking was hard, as it probably should be.
[07:19.080 --> 07:21.280]  You have to put in a password.
[07:21.280 --> 07:25.200]  So there was no real way to get a custom password.
[07:25.200 --> 07:29.920]  I tried thinking of like maybe I should like scramble my password and then fill in that
[07:29.920 --> 07:32.160]  scrambled password, but never do that.
[07:32.160 --> 07:36.520]  You will be locked out if your code is buggy and it happened.
[07:36.520 --> 07:40.000]  So no way to get a custom password, and there is one big problem.
[07:40.000 --> 07:44.840]  I know my password, so if I can't change it and I lock my computer and I really don't
[07:44.840 --> 07:48.000]  want to do push ups, I can just fill it in and be done with it.
[07:48.000 --> 07:54.480]  So the best and simplest solution I could come up with is just to use xdotool and then spam
[07:54.480 --> 07:55.820]  backspace.
[07:55.820 --> 08:01.280]  So xdotool actually allows you to type automatically while your computer is locked.
[08:01.280 --> 08:04.960]  So you can just spam backspace, which doesn't let you fill it in, because it's like backspace
[08:04.960 --> 08:10.600]  20 times a second, and then when you do the push ups, it just fills in your password.
[08:10.600 --> 08:11.600]  And that's it.
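A rough sketch of the backspace trick, driving `xdotool` from Python. The function names, the injected `run` and `sleep` parameters and the `workout_done` callback are hypothetical; only the use of xdotool, the backspace spam at about 20 times a second and the typing of the password afterwards come from the talk.

```python
import subprocess
import time
from typing import Callable, List


def backspace_command() -> List[str]:
    """xdotool invocation that presses BackSpace once."""
    return ["xdotool", "key", "BackSpace"]


def unlock_commands(password: str) -> List[List[str]]:
    """xdotool invocations that type the password and press Return."""
    return [["xdotool", "type", password], ["xdotool", "key", "Return"]]


def block_until_done(
    workout_done: Callable[[], bool],
    run: Callable = subprocess.run,
    interval: float = 0.05,  # 0.05 s between presses is about 20 per second
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Spam BackSpace until `workout_done()` is True; return presses sent."""
    presses = 0
    while not workout_done():
        run(backspace_command(), check=False)
        presses += 1
        sleep(interval)
    return presses
```

Once `block_until_done` returns, running each command from `unlock_commands` in order would fill in the password. Where that password is stored is not covered in the talk and is left open here.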
[08:11.600 --> 08:17.600]  So yeah, a lot of over-engineering, and I hope you found it interesting and learned something.
[08:17.600 --> 08:26.760]  So thank you so much for listening.
[08:26.760 --> 08:28.440]  One last note before any questions.
[08:28.440 --> 08:48.700]  There is a YouTube video about it on the channel MLMaker.