[00:00.000 --> 00:17.800] I used to work at Payara, so maybe some of you know me from my six years at the Payara company. [00:17.800 --> 00:26.640] Now I work at OmniFish, where we, with our co-founders and employees, support [00:26.640 --> 00:34.800] GlassFish server, so back to the roots, kind of. But this time I'd like to talk about Java, [00:34.800 --> 00:45.200] plain Java and Jakarta EE, and how it all fits together when we combine that with AWS. [00:45.200 --> 00:56.240] So first, before I talk about AWS, let's ask: why do we want to have Java fast, or [00:56.240 --> 01:06.960] why do we want to have Java start fast? I think everybody wants that, but why? Because it's [01:06.960 --> 01:14.880] cool, or because we need it? So there were times when we really didn't need that. When [01:14.880 --> 01:20.680] we had the application servers, it was a pain that it took a while to start, but in production [01:20.680 --> 01:26.880] it was already running, so there was no real business need for that, only to make developers [01:26.880 --> 01:33.920] happy and more productive when developing code. But now we have several use cases where [01:33.920 --> 01:42.000] it's really needed, because the more time it takes for a Java program to start, the more [01:42.000 --> 01:48.120] money it costs, and it's not user-friendly. And one example, a perfect example of this, is [01:48.120 --> 01:56.480] AWS Lambda. So now, what is AWS Lambda? It's basically a service to which you can deploy [01:56.480 --> 02:04.080] your code, and this service runs your code only when it's needed, and it also charges [02:04.080 --> 02:12.000] you, because we need to pay for the cloud environment. But if we run the code in Lambda, [02:12.000 --> 02:18.120] we are charged only for the time when the code is running. 
And that's pretty nice, especially [02:18.120 --> 02:26.400] if we have code that usually just sits there and responds to users just once in a while, [02:26.400 --> 02:31.560] or only during certain periods of time, especially during the day or in the morning when there [02:31.560 --> 02:41.240] is some business activity. So how does AWS Lambda do that? It basically creates an environment [02:41.240 --> 02:48.480] and deploys our code when it needs to be executed. And for that, if the code is not already deployed, [02:48.480 --> 02:55.920] it needs to create the runtime and initialize our so-called function, because this is what [02:55.920 --> 03:00.880] our code is called. It's called a function because it's basically just called by the [03:00.880 --> 03:08.040] runtime, it gives some result, and then it's thrown away. In reality, it's not always thrown [03:08.040 --> 03:16.160] away, because AWS Lambda tries to cache our code so that it doesn't have to re-initialize [03:16.160 --> 03:24.880] it every time when it's run more frequently. So sometimes it stays there, and then AWS [03:24.880 --> 03:29.960] Lambda can skip the initialization phase. This is called a warm start, because the code [03:29.960 --> 03:37.080] is already prepared to serve things. But if this doesn't happen, and the code is not available, [03:37.080 --> 03:42.080] it has to initialize everything. And this is usually referred to as a cold start, just [03:42.080 --> 03:49.040] a start from scratch. So the whole lifecycle of AWS Lambda is as on the slide. You can [03:49.040 --> 03:55.320] see there's the init phase. This happens only when the code or the function is not initialized, [03:55.320 --> 04:04.600] so in case of a cold start. Then there is a phase which happens even for warm starts: [04:04.600 --> 04:10.840] the invoke phase, which actually is the only productive phase of these three. [04:10.840 --> 04:17.240] It actually does some work. 
The first phase only initializes, gets some things ready before [04:17.240 --> 04:23.200] the application can process requests. Then the invoke phase does the job. And then when [04:23.200 --> 04:30.120] the AWS Lambda service decides that it no longer needs our application running, because [04:30.120 --> 04:37.920] it's not doing anything right now, and AWS wants to use the resources in some other [04:37.920 --> 04:45.400] way, it will tear everything down. So it will shut down the environment. And then we're [04:45.400 --> 04:54.480] back at square zero. And the next invocation needs to go through the initialization [04:54.480 --> 05:05.840] phase. So let's now go back to the roots with a plain Java application. And let's [05:05.840 --> 05:14.280] think about how fast we can get with Java on AWS Lambda. Can we start Java really fast? [05:14.280 --> 05:22.920] I tried to start a very simple Java program on my local machine. And if you do that too [05:22.920 --> 05:29.160] on your computers, you will see that Java really starts fast. In my case, it was 50 [05:29.160 --> 05:40.280] milliseconds, 0.05 seconds. So a very small fraction of a second, in which the JVM started, printed [05:40.280 --> 05:47.200] something on output and finished. So we see that on a local computer, plain Java [05:47.200 --> 05:57.040] doesn't take very long to start. If we compare that with what's going on in AWS Lambda, [05:57.040 --> 06:02.440] because AWS Lambda needs to initialize the environment and only then can it run the Java [06:02.440 --> 06:08.000] function, it takes a bit longer in reality. But let's compare it to other languages. [06:08.000 --> 06:12.760] I haven't done this myself. This was done by someone else who is more experienced with AWS [06:12.760 --> 06:19.760] Lambda than me and compared performance in a more sophisticated way than just running [06:19.760 --> 06:27.600] on the computer or just several measurements. 
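The local measurement described above (a trivial program that starts, prints something and exits) can be reproduced with a sketch like the following. The class name `StartupProbe` is made up for illustration, and the speaker's actual test program is not shown in the talk. It prints the JVM's own uptime at the moment `main()` runs, which gives a rough idea of how long the JVM needed to get to user code.

```java
import java.lang.management.ManagementFactory;

// Minimal program for eyeballing JVM startup time: prints a message plus the
// JVM's self-reported uptime at the moment main() runs.
class StartupProbe {

    // Milliseconds since the JVM started, as reported by the runtime MXBean.
    static long uptimeMillis() {
        return ManagementFactory.getRuntimeMXBean().getUptime();
    }

    public static void main(String[] args) {
        System.out.println("Hello from plain Java");
        System.out.println("JVM uptime at main(): " + uptimeMillis() + " ms");
    }
}
```

The uptime does not include process teardown, so to time the whole run it is simpler to compile once with `javac StartupProbe.java` and then wrap `java StartupProbe` in the shell's `time` command. Results vary by machine and JVM version.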
He did a lot of measurements across [06:27.600 --> 06:35.360] various languages and various runtimes provided by AWS Lambda. And he found [06:35.360 --> 06:41.400] out that Java basically is on the same level as JavaScript, Python and a lot of other languages, [06:41.400 --> 06:45.880] that there's not much difference. There's a small difference in that at that time C# [06:45.880 --> 06:53.960] was a bit slower. But as AWS continually improves AWS Lambda, even these numbers would [06:53.960 --> 07:00.240] probably be better now. And C# and Docker will be maybe more even with the rest, [07:00.240 --> 07:06.360] because the technology running AWS Lambda is continuously improving. But this is just [07:06.360 --> 07:14.760] to compare and show that Java itself, or even the implementation of the Java AWS function or [07:14.760 --> 07:23.920] the environment, isn't worse than other languages. So now what is the problem actually? Why do a [07:23.920 --> 07:32.040] lot of people perceive that Java starts very slowly? The problem, as I see it, is that [07:32.040 --> 07:38.720] many people don't think about Java in this simple way, that it's a simple application. [07:38.720 --> 07:45.200] A lot of people think about Java as a language that runs enterprise applications. And with [07:45.200 --> 07:51.280] enterprise applications, we're used to using frameworks that do a lot of work for us. We [07:51.280 --> 07:56.440] run the applications on application servers, which are slow to start. [07:56.440 --> 08:03.400] And this is what we think about when we say Java or when we talk about Java. [08:03.400 --> 08:11.880] So now we're coming to this: if we can run applications that [08:11.880 --> 08:20.560] are similar to what we were used to before, but if we can start them fast, we could solve [08:20.560 --> 08:28.000] the problem with Java cold starts, at least as we use Java now. 
So the question I have now [08:28.000 --> 08:34.800] is about Jakarta EE or some other frameworks like Spring Boot or something like that. Can that [08:34.800 --> 08:44.720] be as fast as plain Java? Can we run that in AWS Lambda to get good performance and [08:44.720 --> 08:53.080] fast startup? And the answer is that there are such frameworks and solutions. There [08:53.080 --> 08:58.800] are several of them. I don't have much time to talk about all of them. So I picked one that [08:58.800 --> 09:05.560] I personally like. And it's called Piranha Cloud Framework. And this one is based [09:05.560 --> 09:15.640] entirely on Jakarta EE APIs. Previously, it was called Java EE. So it's a very well-known [09:15.640 --> 09:22.200] API that a lot of people already know, that a lot of tools out there already use. So it's interoperable [09:22.200 --> 09:29.640] with existing codebases. But the thing with Piranha Cloud is that the implementation, actually [09:29.640 --> 09:38.760] the engine of the framework, is new, very flexible, and allows our application to start [09:38.760 --> 09:46.920] very fast. Piranha Cloud is based on a lot of existing components. A lot of them come [09:46.920 --> 09:54.200] from the GlassFish server, which actually sort of proves that the server is not a problem [09:54.200 --> 10:00.040] and Jakarta EE is not a problem. The components are there, they are quite fast. But [10:00.040 --> 10:05.680] how they are assembled in traditional Jakarta EE servers, Java EE application servers, [10:05.680 --> 10:10.080] that is the problem. Because an application server usually has a lot of other things that [10:10.080 --> 10:18.880] we don't need in Lambda, like monitoring, a lot of vendor features, an administration [10:18.880 --> 10:29.480] console and a lot of other things. So here is an example. It's basically nothing [10:29.480 --> 10:38.160] else than a servlet. But this is an application already using some Jakarta EE APIs. 
And this [10:38.160 --> 10:45.760] application, this servlet, you can run on any Jakarta EE server. You can run it on Tomcat, [10:45.760 --> 10:51.080] you can run it on GlassFish, you can run it on anything that supports servlets. So the [10:51.080 --> 10:57.200] only difference if we run it with Piranha Cloud is that it starts fast and it uses Piranha's [10:57.200 --> 11:05.920] own servlet container, which was designed from scratch. And it's very flexible and fast. [11:05.920 --> 11:14.240] What is also nice about Piranha's servlet container is that it can be embedded very easily. [11:14.240 --> 11:21.680] And that's the crucial point. When we want to use Jakarta EE in Lambda, we need to basically [11:21.680 --> 11:27.560] shave off everything that we don't need. And in AWS Lambda, we don't even need an HTTP [11:27.560 --> 11:34.800] listener. Because AWS Lambda basically only wants a method from us that will be called and [11:34.800 --> 11:40.040] returns some response. And then AWS Lambda is responsible for mapping the HTTP request [11:40.040 --> 11:46.400] to an object that it passes to our method. And then the returned object is mapped [11:46.400 --> 11:52.400] to an HTTP response. And not only HTTP requests and responses: Lambda can handle basically any type [11:52.400 --> 11:59.640] of JSON messages, JSON events. So the only thing that our application needs to do [11:59.640 --> 12:12.200] is to parse some input object and return some output event. And with Piranha, we can create [12:12.200 --> 12:21.760] an engine and map our servlet onto it and just invoke it as an object. On this object, [12:21.760 --> 12:30.680] the request-response cycle is invoked by a service method, which accepts [12:30.680 --> 12:36.160] a request object and returns the response object. And this is exactly how we can use [12:36.160 --> 12:43.560] it in AWS Lambda. 
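The idea of calling an embedded engine through a service method, with a thin mapping layer between the Lambda event and the engine's request and response objects, can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `SketchRequest`, `SketchResponse`, `SketchEngine` and `LambdaAdapterSketch` are hypothetical stand-ins, not Piranha or AWS classes, and the event keys (`httpMethod`, `path`, `statusCode`, `body`) are modelled on API Gateway proxy events rather than taken from the talk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the embedded engine's request object.
final class SketchRequest {
    final String method;
    final String path;

    SketchRequest(String method, String path) {
        this.method = method;
        this.path = path;
    }
}

// Hypothetical stand-in for the embedded engine's response object.
final class SketchResponse {
    int status = 200;
    final StringBuilder body = new StringBuilder();
}

// Hypothetical stand-in for an embedded servlet engine: one service() call
// per request, no HTTP listener involved.
interface SketchEngine {
    void service(SketchRequest request, SketchResponse response);
}

// The additional layer: maps an incoming event to an engine request and the
// engine response back to an outgoing event.
class LambdaAdapterSketch {

    private final SketchEngine engine;

    LambdaAdapterSketch(SketchEngine engine) {
        this.engine = engine;
    }

    // The kind of method a function runtime would call: event in, result out.
    Map<String, Object> handleRequest(Map<String, Object> event) {
        // 1. Map the incoming event to the engine's request object.
        String method = String.valueOf(event.getOrDefault("httpMethod", "GET"));
        String path = String.valueOf(event.getOrDefault("path", "/"));
        SketchRequest request = new SketchRequest(method, path);

        // 2. Run the request-response cycle in-process.
        SketchResponse response = new SketchResponse();
        engine.service(request, response);

        // 3. Map the engine's response back to an event-shaped result.
        Map<String, Object> result = new HashMap<>();
        result.put("statusCode", response.status);
        result.put("body", response.body.toString());
        return result;
    }

    public static void main(String[] args) {
        // A lambda expression plays the role of the servlet here.
        SketchEngine hello = (req, res) -> res.body.append("Hello from ").append(req.path);
        LambdaAdapterSketch adapter = new LambdaAdapterSketch(hello);

        Map<String, Object> event = new HashMap<>();
        event.put("httpMethod", "GET");
        event.put("path", "/hello");
        System.out.println(adapter.handleRequest(event));
    }
}
```

In a real function the stand-in types would be replaced by the engine's own request and response classes and by the event types of the chosen trigger. The point of the sketch is only that no network listener sits between the event and the servlet code.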
We just need to add one additional layer to map the AWS request object [12:43.560 --> 12:53.280] to the Piranha request object and back. Let's run this simple servlet with Piranha Cloud, which [12:53.280 --> 12:59.760] is comparable to the plain Java program we were running before. If you remember, with plain Java on [12:59.760 --> 13:06.280] my computer, I had startup times, or actually, it was not only the startup time, but the time until the [13:06.280 --> 13:14.640] program printed some message and finished, of around 50 milliseconds. With [13:14.640 --> 13:23.040] Piranha, it takes a bit longer. But this already includes the first request. So it's [13:23.040 --> 13:28.600] very similar to the plain Java application. It's not only that the engine starts, but [13:28.600 --> 13:36.640] it actually serves the request and responds with a text message through the HTTP stack. And with [13:36.640 --> 13:50.400] that, it still takes a comparable time, around 130 milliseconds. Now we can compare how it [13:50.400 --> 13:57.440] works in AWS Lambda. For AWS Lambda, I have a picture, but I hope I will be able to [13:57.440 --> 14:03.960] show you in a minute. As I said before, it takes a bit longer when we start the [14:03.960 --> 14:10.320] function for the first time. It doesn't really matter if we run Java or any other [14:10.320 --> 14:19.680] runtime. AWS Lambda first needs to create some environment to execute our code in. And [14:19.680 --> 14:27.200] it takes a little bit of time. But together with creating this environment and running [14:27.200 --> 14:34.280] our code, our example Piranha function, it takes under one second to serve the request. [14:34.280 --> 14:40.920] Even if nothing was ready before, even the first time we try to run the function, [14:40.920 --> 14:47.920] it still serves the response in under one second. If we try it again and again, then the [14:47.920 --> 14:56.200] response times are much faster. This is on the right side here. 
It's under two milliseconds, [14:56.200 --> 15:02.400] because this is only the code that needs to serve the request. Everything was initialized. [15:02.400 --> 15:09.200] The environment was initialized. The Piranha engine was initialized. It's cached in a static variable. [15:09.200 --> 15:15.240] So it's part of the process that is already live. AWS basically just executes a method [15:15.240 --> 15:22.520] on the Piranha engine that goes through the servlet and creates the servlet response. And [15:22.520 --> 15:28.520] that's it. That's why it takes only two milliseconds. This is only the time required to serve the [15:28.520 --> 15:41.800] actual response. So I'll try. I think I have a link here. How it works. [15:41.800 --> 16:01.200] Okay. So this is the actual AWS console where I already deployed the application, the function. [16:01.200 --> 16:08.440] And the AWS console has a nice feature, a test button. With that, we can directly [16:08.440 --> 16:14.760] invoke the Lambda. Normally, we would have to create an API gateway and map it to the Lambda [16:14.760 --> 16:22.720] so that we can access the Lambda via HTTP from outside. AWS can also generate some URL that [16:22.720 --> 16:30.160] we can use to invoke the Lambda. But this directly executes the Lambda without actually [16:30.160 --> 16:38.320] invoking an HTTP request. So with this, yeah, there is some example event, but the application [16:38.320 --> 16:46.200] doesn't read anything from the request. It just responds with some hello world message. [16:46.200 --> 16:54.800] And if we execute it, you see it takes a bit of time. And this is what I had in my slide. [16:54.800 --> 17:04.960] Here it's even shorter, 850 milliseconds. But if we try it again, it's already pre-warmed [17:04.960 --> 17:16.880] because AWS caches... where is it here? It caches the environment. And now it's just two milliseconds. [17:16.880 --> 17:28.560] So now the question is when the cold starts happen. They do happen. 
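The effect of caching the initialized engine in a static variable can be illustrated with a small plain-Java sketch. Everything here is hypothetical: `WarmStartSketch` and its nested `Engine` are stand-ins, and the 100 ms sleep merely simulates expensive initialization. It is not how Piranha is implemented. The first call pays for initialization (the cold path), and later calls in the same process reuse the instance (the warm path).

```java
// Sketch of why warm invocations are cheap: expensive setup happens once per
// process, in a static field, and later calls reuse the same instance.
class WarmStartSketch {

    // Counts how often the expensive setup ran (for demonstration only).
    static int initCount = 0;

    // Hypothetical stand-in for a fully initialized engine.
    static final class Engine {
        Engine() {
            initCount++;
            try {
                Thread.sleep(100); // simulate slow initialization
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        String service(String path) {
            return "Hello from " + path;
        }
    }

    // Holder idiom: ENGINE is created when Holder is first used, and exactly
    // once per process, because class initialization runs only once.
    private static final class Holder {
        static final Engine ENGINE = new Engine();
    }

    // Stand-in for the method the function runtime would call per request.
    static String handleRequest(String path) {
        return Holder.ENGINE.service(path);
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 3; i++) {
            long start = System.nanoTime();
            String body = handleRequest("/hello");
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("call " + i + ": " + body + " in " + micros + " us");
        }
        System.out.println("engine initialized " + initCount + " time(s)");
    }
}
```

The holder class is used here only so that the demo's first call visibly includes the initialization cost. A static field directly on the handler class would be initialized when the handler class is loaded and would serve the same caching purpose.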
I don't have any experience [17:28.560 --> 17:33.560] with how much of an impact they have. I heard that it's not much of an impact because they happen [17:33.560 --> 17:41.320] normally once in a while. So the response, once in a while, takes one more second on [17:41.320 --> 17:48.880] top of request processing. But if it takes five seconds, which can happen with a normal [17:48.880 --> 17:55.680] Spring Boot application or traditional frameworks or even application servers, I don't know, [17:55.680 --> 18:01.080] some application servers can be embedded, and then you can run them in AWS Lambda. [18:01.080 --> 18:07.600] But some of them are really hard to map to a method call. So you would have to install [18:07.600 --> 18:11.520] the application server. And it may not even be possible to run such application [18:11.520 --> 18:18.560] servers in Lambda. But if you did, it would take 10, 20 seconds with some application [18:18.560 --> 18:26.120] servers. And that's really a difference. You pay for the execution time, but you also [18:26.120 --> 18:34.040] expose users to waiting for a couple of seconds, if it's a user-facing Lambda. [18:34.040 --> 18:38.440] If it's not, you maybe don't care so much. If it's some batch job that [18:38.440 --> 18:51.520] takes two, three minutes to finish, then a couple of seconds don't really matter. [18:51.520 --> 18:59.520] So here's a slide about Piranha Cloud. In short, Piranha Cloud is basically, as I said, [18:59.520 --> 19:07.280] based on a new servlet container designed from scratch, and a lot of components built [19:07.280 --> 19:14.880] on top of it. The servlet container, being a servlet implementation, can run any servlet [19:14.880 --> 19:21.600] out there. And a lot of Jakarta technologies are created as servlets. So for example, Jersey [19:21.600 --> 19:30.480] as a servlet can be deployed on Piranha. 
And that's quite an easy way to get REST endpoints [19:30.480 --> 19:36.920] or a REST library on Piranha: to deploy Jersey as a servlet. And then we have everything [19:36.920 --> 19:43.920] that Jersey provides. We can embed Piranha as I did in my demo, but we can also build [19:43.920 --> 19:48.760] a WAR application and run the WAR application with Piranha on the command line. This is using [19:48.760 --> 19:55.760] the Piranha distributions, which already contain the packages, the [19:55.760 --> 20:02.880] functionality of Piranha, that are mostly used. And the last thing, it's plain Java. [20:02.880 --> 20:07.960] There's no real magic. There's no generated code. Everything is just clean code written [20:07.960 --> 20:13.600] by clever people, I think. At least judging by the code, when I looked at the code, it [20:13.600 --> 20:21.760] looks like the people were very clever. So with Piranha Cloud, we were able to achieve [20:21.760 --> 20:27.480] quite fast startup times, but it still takes some milliseconds, 100, 200. It depends [20:27.480 --> 20:34.760] on how complex our application is. It may end up at up to two seconds if we add all the [20:34.760 --> 20:43.320] Jakarta functionality that Piranha Cloud provides. If we want to reduce that even further, [20:43.320 --> 20:49.240] we have some general Java options to do that. We can first increase the CPU and RAM on the [20:49.240 --> 20:59.760] Lambda, which we can always do with any language. But we can also use a faster JVM. On the last [20:59.760 --> 21:07.200] slide, I have a table where I compared running the same application with Java 11 and Java [21:07.200 --> 21:14.720] 17. If you look at the numbers, Java 17 is most of the time a bit faster. So just [21:14.720 --> 21:22.920] by deciding which Java version we use, we can get a bit better startup time. [21:22.920 --> 21:29.680] Then the last option here is basically a combination. 
I did some experiments on which options work [21:29.680 --> 21:35.240] well with regard to reducing the startup time. And in the end, not many [21:35.240 --> 21:41.760] things matter. But what matters is class data sharing, which basically caches class information. [21:41.760 --> 21:49.120] So it doesn't have to be loaded and processed in the beginning. It's already pre-computed [21:49.120 --> 21:57.440] before the cold start. And tinkering with the compiler: we can disable the second-level just-in-time compiler [21:57.440 --> 22:02.080] if we want to really focus on startup time. [22:02.080 --> 22:09.520] And then there are other, more magical options, but they can even reduce performance, or reduce [22:09.520 --> 22:15.120] startup time almost to zero: either compiling the code with GraalVM to a native [22:15.120 --> 22:22.880] binary, which runs the application almost instantly, or we can use CRaC, which is a [22:22.880 --> 22:34.960] Coordinated Restore at Checkpoint mechanism. The next talk will be about it also. And yeah, [22:34.960 --> 22:42.440] what is also nice is that AWS Lambda integrated that basically in one of their Java runtimes. [22:42.440 --> 22:50.680] And it's called SnapStart. So you can get it for free, but only with Java 11. But hopefully [22:50.680 --> 22:56.120] Java 17 support will be coming soon. And this works in a way that your application [22:56.120 --> 23:01.640] basically stores, or you at build time can store, a checkpoint of your application [23:01.640 --> 23:07.360] with all the memory and all the information. It basically hibernates; you can hibernate [23:07.360 --> 23:13.120] your application. And then it's started again and again, and cold start and warm start in [23:13.120 --> 23:20.520] that case basically don't make a difference because they start from the same point. [23:20.520 --> 23:25.720] That's all from me. If you have any questions, let me know. Thank you for watching.
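For the two JVM tuning options mentioned in the talk (class data sharing and disabling the second-level JIT compiler), the talk does not name concrete flags. A common way to express them on HotSpot is `-XX:SharedArchiveFile=<file>` for an application class data sharing archive and `-XX:TieredStopAtLevel=1` to stop at the first-tier compiler; treating these as the options the speaker meant is an assumption. The hypothetical helper below, `StartupFlagsCheck`, only reports whether such flags reached the running JVM, which can be a useful sanity check at cold start.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Reports whether startup-oriented JVM flags are present among the JVM's
// input arguments. The flag names are standard HotSpot options; whether they
// are the ones used in the talk is an assumption.
class StartupFlagsCheck {

    // True if JIT compilation is limited to the first tier (C1 only).
    static boolean c1Only(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:TieredStopAtLevel=1");
    }

    // True if an explicit class data sharing archive file was configured.
    static boolean hasCdsArchive(List<String> jvmArgs) {
        return jvmArgs.stream().anyMatch(a -> a.startsWith("-XX:SharedArchiveFile="));
    }

    public static void main(String[] args) {
        // Arguments the JVM itself was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("C1 only: " + c1Only(jvmArgs));
        System.out.println("CDS archive configured: " + hasCdsArchive(jvmArgs));
    }
}
```

Running it as `java -XX:TieredStopAtLevel=1 StartupFlagsCheck` should report the first flag as present. In a managed environment where the `java` command line is not under your control, such flags are typically passed through the `JAVA_TOOL_OPTIONS` environment variable.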