[00:00.000 --> 00:09.040] The next lightning talk is Victor, and this should be a really fun talk, I think, about [00:09.040 --> 00:10.040] MLOs. [00:10.040 --> 00:11.040] Yes. [00:11.040 --> 00:12.040] So, hello. [00:12.040 --> 00:16.600] This is probably going to be the least serious talk you have seen today, so I'm sorry about [00:16.600 --> 00:17.600] that. [00:17.600 --> 00:19.280] We're going to be automating weight loss with AI. [00:19.280 --> 00:23.760] It's a stupid project I made in a weekend, or like in a few weekends, but I want to talk [00:23.760 --> 00:24.760] about it. [00:24.760 --> 00:25.760] So, who am I? [00:25.760 --> 00:26.760] Lightning talk version. [00:26.760 --> 00:27.760] I'm Victor Zonk. [00:27.760 --> 00:28.760] I work at ClearML. [00:28.760 --> 00:29.760] Hi. [00:29.760 --> 00:30.760] Let's make something. [00:30.760 --> 00:33.480] So, that's the reason why I'm here. [00:33.480 --> 00:37.760] The problem statement was I'm working at home, and I'm not working out enough, like probably [00:37.760 --> 00:44.120] a lot of us are, and so the problem solution is why not lock my PC every hour and force [00:44.120 --> 00:47.960] myself to do push-ups, and then it automatically opens again. [00:47.960 --> 00:48.960] That was the main idea. [00:48.960 --> 00:54.840] I want to do this with AI, obviously, because over-engineering, and I'm a machinery engineer, [00:54.840 --> 00:58.480] so if I am a hammer, everything looks like a nail. [00:58.480 --> 01:02.360] This is going to be the diagram, so left top is an oak one. [01:02.360 --> 01:03.360] It's an AI camera. [01:03.360 --> 01:05.480] I'll talk about it in a second. [01:05.480 --> 01:09.600] That will run one model, and then, because one AI model is not enough, I have two, so [01:09.600 --> 01:13.620] there is a second model that runs on the Raspberry Pi that will lock my PC. [01:13.620 --> 01:18.400] This is what it looks like, so you get like a notification, workout time, lazy bum, and [01:18.400 --> 01:20.800] you have to do push-ups. [01:20.800 --> 01:24.080] It's in Raspberry Pi running in the corner of my room. [01:24.080 --> 01:26.800] You can follow along with the diagram at the right top. [01:26.800 --> 01:29.920] So I'm going to do post-estimation with the oak one. [01:29.920 --> 01:30.920] Now what is the oak one? [01:30.920 --> 01:37.040] The oak one is a 150 bucks open-source hardware AI camera, which is really cool. [01:37.040 --> 01:38.720] I highly recommend it. [01:38.720 --> 01:43.800] They run the Intel Mirrored X, so if you look at the speeds there, if you have the oak one, [01:43.800 --> 01:49.040] because it does the AI, like the AI, on the chip itself, on the camera itself, it can [01:49.040 --> 01:53.160] get a lot higher FPS on the Raspberry Pi, because it doesn't have to go to the Raspberry [01:53.160 --> 01:55.240] Pi to do anything. [01:55.240 --> 02:00.040] Even when compared to another AI accelerator connected to the Raspberry Pi. [02:00.040 --> 02:04.240] It also has excellent documentation, which is a unicorn these days, but yeah, it really [02:04.240 --> 02:06.440] is a nice library. [02:06.440 --> 02:11.200] So they have a bunch of cool examples that you can try, like there's D-plop with segmentation, [02:11.200 --> 02:13.160] there's other stuff. [02:13.160 --> 02:17.880] But luckily for me, I didn't have to do any work, because they also had post-estimation. [02:17.880 --> 02:21.880] So thanks to GXGX for implementing this. [02:21.880 --> 02:23.120] This is an awesome repository. [02:23.120 --> 02:25.960] It's still being maintained, if I remember correctly. [02:25.960 --> 02:27.360] So definitely check that out. [02:27.360 --> 02:28.360] That's really cool. [02:28.360 --> 02:31.400] Now, what does this repository do? [02:31.400 --> 02:36.240] It basically gives me post-estimation, so it films me on that AI camera. [02:36.240 --> 02:39.880] I have one with me, by the way, so after the lightning talks, I can actually give a demo [02:39.880 --> 02:40.880] lightning talk. [02:40.880 --> 02:42.880] I can't do it right now. [02:42.880 --> 02:47.560] Basically it draws like a skeleton on top of you in like seven, eight frames a second [02:47.560 --> 02:52.040] on the Raspberry Pi, which is awesome, and then it even positions them in 3D. [02:52.040 --> 02:53.040] So that's nice. [02:53.040 --> 02:54.040] This is stage one. [02:54.040 --> 02:56.080] We want to go to a pushup detector. [02:56.080 --> 02:57.240] This is stage two. [02:57.240 --> 03:00.480] So we now basically have a skeleton. [03:00.480 --> 03:05.320] If we just throw away the pixels, these are the only points that we actually care about. [03:05.320 --> 03:08.400] And then now it just basically becomes a tabular problem. [03:08.400 --> 03:12.720] So the second part of the machine learning, or like the simple machine learning, is going [03:12.720 --> 03:13.880] to be really simple. [03:13.880 --> 03:18.400] Now we just have a few points, and we want to classify them. [03:18.400 --> 03:22.080] For this second model, though, this is not pre-trained, so I actually have to label. [03:22.080 --> 03:26.040] A few images, it's not a very complex model, but you have to do something. [03:26.040 --> 03:31.440] So what do we want to do is we want to say, okay, this is a pushup, this is a pushdown, [03:31.440 --> 03:35.520] and then we can do some additional logic to actually count them while they're happening. [03:35.520 --> 03:36.520] Right. [03:36.520 --> 03:42.360] But then the question becomes, how do I take a picture when I'm actually doing my pushups? [03:42.360 --> 03:45.560] Because like there is a camera there, do I need a button, but then it might overfit [03:45.560 --> 03:48.160] on me pressing the button or like something else. [03:48.160 --> 03:53.360] So if you're a machine learning engineer, the answer is throw more AI at it. [03:53.360 --> 04:00.840] So basically overengineering using an unnecessary amount of AI, I set up a microphone while [04:00.840 --> 04:02.800] I was pushing up and pushing down. [04:02.800 --> 04:07.520] There's a really cool open source package for Python that can do voice recognition. [04:07.520 --> 04:12.760] It does send it to the proprietary API of Google, but at least the code is there. [04:12.760 --> 04:16.680] And then you can just basically shout label me, and if label, the word label is actually [04:16.680 --> 04:20.720] found inside of what you said, it will take a picture. [04:20.720 --> 04:22.600] So that's that. [04:22.600 --> 04:26.600] Now we have a third AI model that's really useful. [04:26.600 --> 04:28.720] And then I did a lot with ClearML. [04:28.720 --> 04:30.760] So this is actually the MLops part. [04:30.760 --> 04:32.080] I now have two models. [04:32.080 --> 04:33.440] I want to be able to train them. [04:33.440 --> 04:35.120] I want to be able to maintain them. [04:35.120 --> 04:37.520] So what did I do is this is the labeling tool. [04:37.520 --> 04:40.800] So right, left top, left top for you. [04:40.800 --> 04:43.560] Right, left top is the oak one that will take a picture. [04:43.560 --> 04:49.360] When I shout, take a picture, it will send it to ClearML, which is actually an open source [04:49.360 --> 04:52.120] MLops tool, one that I work for. [04:52.120 --> 04:54.400] And they have data versioning, for example. [04:54.400 --> 04:58.880] So every single time I run the labeling tool, it will create a new version of my data set, [04:58.880 --> 05:00.520] which is very useful. [05:00.520 --> 05:03.160] And then I can use this new version of the data set. [05:03.160 --> 05:04.160] I can pull it in. [05:04.160 --> 05:07.680] I can use the experiment manager of ClearML to keep track of all my code. [05:07.680 --> 05:12.840] Every single time I run or I train, I will get all of my output, all of my plots. [05:12.840 --> 05:14.720] And then you can actually build this into pipelines. [05:14.720 --> 05:16.680] You can run this automatically on remote machines. [05:16.680 --> 05:21.080] So I over-engineered the crap out of it, but I can't really tell everything in Lightning [05:21.080 --> 05:22.240] Talk. [05:22.240 --> 05:27.760] The main idea is you have a lot of different tools in ClearML that can help you with that [05:27.760 --> 05:29.600] and automate a lot of that stuff. [05:29.600 --> 05:31.680] Now training my own model. [05:31.680 --> 05:33.960] So now we have all of those points. [05:33.960 --> 05:38.640] We have four each of those sets of points we have if it's a push up or a push down. [05:38.640 --> 05:40.360] Where do you go from here? [05:40.360 --> 05:43.040] Training my own model, it's this. [05:43.040 --> 05:44.240] Like it's super simple. [05:44.240 --> 05:46.400] It's three lines of code these days. [05:46.400 --> 05:47.640] So this is just sklearn. [05:47.640 --> 05:48.640] It's an SVM. [05:48.640 --> 05:49.640] It's a simple classifier. [05:49.640 --> 05:50.640] It takes points in. [05:50.640 --> 05:51.640] Give you one point out. [05:51.640 --> 05:52.880] Push up, push down. [05:52.880 --> 05:53.880] It's not ideal. [05:53.880 --> 05:56.680] I should do a no class, but I was lazy. [05:56.680 --> 05:59.320] No class basically meaning it's nothing, none of the two. [05:59.320 --> 06:06.400] So now it will say when I walk to it, it will maybe register a push up, which is not ideal. [06:06.400 --> 06:13.440] So in order to combat that, I made a very simple, even simpler piece of code that basically [06:13.440 --> 06:14.440] primes it. [06:14.440 --> 06:20.520] So here on the left, you can see one is basically push down, two is push up, and so you can [06:20.520 --> 06:21.520] see it happen. [06:21.520 --> 06:24.800] I think, yeah, you can basically see it happen there in the beginning. [06:24.800 --> 06:29.520] But when I run to my place to start to do the push ups, here you can see that there's [06:29.520 --> 06:33.200] like a bit of jittering going on because it doesn't know the zero class. [06:33.200 --> 06:38.320] So in order to catch that, what you can say is, okay, if the, you can basically say if [06:38.320 --> 06:44.960] the length is, I don't, yeah, if the length is 10, so if you're at least been doing detection [06:44.960 --> 06:49.400] for some time, then you can check if the last 10 of them were push up. [06:49.400 --> 06:53.520] So I'm basically ready in my position, only then prime it. [06:53.520 --> 06:56.560] And then once it's primed, you can start counting. [06:56.560 --> 07:00.080] So that's just a very simple, stupid way of doing it. [07:00.080 --> 07:01.080] Two minutes left. [07:01.080 --> 07:02.080] Excellent. [07:02.080 --> 07:07.320] So actually, that's it, but I have two minutes left, so I'm going to do one more thing. [07:07.320 --> 07:08.480] Locking the computer. [07:08.480 --> 07:14.240] So I use Linux, which allows you to do everything, which is awesome. [07:14.240 --> 07:19.080] So locking the computer was easy, but unlocking was hard, as it probably should be. [07:19.080 --> 07:21.280] You have to put in a password. [07:21.280 --> 07:25.200] So there was no real way to get a custom password. [07:25.200 --> 07:29.920] I tried thinking of like maybe I should like scramble my password and then fill in that [07:29.920 --> 07:32.160] scramble password, but never do that. [07:32.160 --> 07:36.520] You will be locked out if your code is buggy and it happened. [07:36.520 --> 07:40.000] So no way to get a custom password, and there is one big problem. [07:40.000 --> 07:44.840] I know my password, so if I can't change it and I lock my computer and I really don't [07:44.840 --> 07:48.000] want to do push ups, I can just fill it in and be done with it. [07:48.000 --> 07:54.480] So the best and simple solution I can come up with is just to use Xdo tool and then spam [07:54.480 --> 07:55.820] backspace. [07:55.820 --> 08:01.280] So Xdo tool actually allows you to type automatically while your computer is locked. [08:01.280 --> 08:04.960] So you can just spam backspace, not allow you to fill it in because it's like backspace [08:04.960 --> 08:10.600] 20 times a second, and then when you do the push ups, it just fills in your password. [08:10.600 --> 08:11.600] And that's it. [08:11.600 --> 08:17.600] So yeah, a lot of over engineering, and I hope you find it interesting and you learned something. [08:17.600 --> 08:26.760] So thank you so much for listening. [08:26.760 --> 08:28.440] One last note before any questions. [08:28.440 --> 08:48.700] There is a YouTube video about it on the channel MLMaker.