[00:00.000 --> 00:17.800]  I used to work at Payara, so maybe some of you know me from my six years at the Payara company.
[00:17.800 --> 00:26.640]  Now I work at OmniFish, where we, with our co-founders and employees, support the
[00:26.640 --> 00:34.800]  GlassFish server, so back to the roots, kind of. But this time I'd like to talk about Java,
[00:34.800 --> 00:45.200]  plain Java and Jakarta EE, and how it all fits together when we combine that with AWS.
[00:45.200 --> 00:56.240]  So first, before I talk about AWS, let's ask: why do we want Java to be fast, or
[00:56.240 --> 01:06.960]  why do we want Java to start fast? I think everybody wants that, but why? Because it's
[01:06.960 --> 01:14.880]  cool, or because we need it? There were times when we really didn't need that. When
[01:14.880 --> 01:20.680]  we had the application servers, it was a pain that they took a while to start, but in production
[01:20.680 --> 01:26.880]  they were already running, so there was no real business need for that, only to make developers
[01:26.880 --> 01:33.920]  happy and more productive when developing code. But now we have several use cases where
[01:33.920 --> 01:42.000]  it's really needed, because the more time it takes for a Java program to start, the more
[01:42.000 --> 01:48.120]  money it costs, and it's not user-friendly. And one example, a perfect example of this, is
[01:48.120 --> 01:56.480]  AWS Lambda. So now, what is AWS Lambda? It's basically a service to which you can deploy
[01:56.480 --> 02:04.080]  your code, and this service runs your code only when it's needed, and it also charges
[02:04.080 --> 02:12.000]  you, because we need to pay for the cloud environment. But if we run the code in Lambda,
[02:12.000 --> 02:18.120]  we are charged only for the time when the code is running. And that's pretty nice, especially
[02:18.120 --> 02:26.400]  if we have code that usually just sits there and responds to users just once in a while,
[02:26.400 --> 02:31.560]  or only during certain periods of time, especially during the day or in the morning when there
[02:31.560 --> 02:41.240]  is some business activity. So how does AWS Lambda do that? It basically creates an environment
[02:41.240 --> 02:48.480]  and deploys our code when it needs to be executed. And for that, if the code is not already deployed,
[02:48.480 --> 02:55.920]  it needs to create the runtime and initialize our so-called function, because this is how
[02:55.920 --> 03:00.880]  our code is called. It's called a function because it's basically just called by the
[03:00.880 --> 03:08.040]  runtime, it gives some result, and then it's thrown away. In reality, it's not always thrown
[03:08.040 --> 03:16.160]  away because AWS Lambda tries to cache our code so that it doesn't have to re-initialize
[03:16.160 --> 03:24.880]  it every time when it's run more frequently. So sometimes it stays there, and then AWS
[03:24.880 --> 03:29.960]  Lambda can skip the initialization phase. This is called a warm start, because the code
[03:29.960 --> 03:37.080]  is already prepared to serve things. But if this doesn't happen, and the code is not available,
[03:37.080 --> 03:42.080]  it has to initialize everything. And this is usually referred to as a cold start, just a
[03:42.080 --> 03:49.040]  start from scratch. So the whole lifecycle of AWS Lambda is as on the slide. You can
[03:49.040 --> 03:55.320]  see there's an init phase. This runs only when the code or the function is not yet initialized,
[03:55.320 --> 04:04.600]  so in case of a cold start. Then there is a phase which happens even for warm starts:
[04:04.600 --> 04:10.840]  this is the invoke phase, which actually is the only productive phase of these three.
[04:10.840 --> 04:17.240]  It actually does the work. The first phase only initializes, it gets things ready before
[04:17.240 --> 04:23.200]  the application can process requests. Then the invoke phase does the job. And then when
[04:23.200 --> 04:30.120]  the AWS Lambda service decides that it no longer needs our application running, because
[04:30.120 --> 04:37.920]  it's not doing anything right now, and AWS wants to use the resources in some other
[04:37.920 --> 04:45.400]  way, it will tear everything down. So it will shut down the environment. And then we're
[04:45.400 --> 04:54.480]  back at square zero. And the next invocation needs to go through the initialization
[04:54.480 --> 05:05.840]  phase. So let's now go back to the roots with a plain Java application. And let's see, or let's
[05:05.840 --> 05:14.280]  think about how fast we can get with Java on AWS Lambda. Can we start Java really fast?
[05:14.280 --> 05:22.920]  I tried to start a very simple Java program on my local machine. And if you do that too
[05:22.920 --> 05:29.160]  on your computers, you will see that Java really starts fast. In my case, it was 50
[05:29.160 --> 05:40.280]  milliseconds, 0.05 seconds. So a very small fraction of a second, in which the JVM started, printed
[05:40.280 --> 05:47.200]  something on output and finished. So we see that on a local computer, plain Java
[05:47.200 --> 05:57.040]  doesn't take very long to start. If we compare that with exactly what's going on in AWS Lambda,
[05:57.040 --> 06:02.440]  because AWS Lambda needs to initialize the environment and only then it can run the Java
[06:02.440 --> 06:08.000]  function, it takes a bit longer in reality. But when we compare it to other languages,
[06:08.000 --> 06:12.760]  I haven't done this myself. This was done by someone else who is more experienced with AWS
[06:12.760 --> 06:19.760]  Lambda than me and compared performance in a more sophisticated way than just running
[06:19.760 --> 06:27.600]  on the computer or just several measurements. He did a lot of measurements across
[06:27.600 --> 06:35.360]  various languages and various runtimes provided by AWS Lambda. And he found
[06:35.360 --> 06:41.400]  out that Java basically is on the same level as JavaScript, Python and a lot of other languages,
[06:41.400 --> 06:45.880]  that there's not much difference. There's a small difference in that at that time C#
[06:45.880 --> 06:53.960]  was a bit slower. But as AWS continually improves AWS Lambda, even these numbers would
[06:53.960 --> 07:00.240]  probably be better now. And C# and Docker will maybe be more even with the rest,
[07:00.240 --> 07:06.360]  because the technology running AWS Lambda is continuously improving. But this is just
[07:06.360 --> 07:14.760]  to compare and show that Java itself, or even the implementation of a Java AWS function or
[07:14.760 --> 07:23.920]  the environment, isn't worse than other languages. So now, what is the problem actually? Why do a
[07:23.920 --> 07:32.040]  lot of people perceive that Java starts very slowly? The problem, as I see it, is that
[07:32.040 --> 07:38.720]  many people don't think about Java in this simple way that it's a simple application.
[07:38.720 --> 07:45.200]  A lot of people think about Java as a language that runs enterprise applications. And with
[07:45.200 --> 07:51.280]  enterprise applications, we're used to using frameworks that do a lot of work for us. We
[07:51.280 --> 07:56.440]  run the applications on application servers, which are slow to start.
[07:56.440 --> 08:03.400]  And this is what we think about when we say Java or when we talk about Java.
[08:03.400 --> 08:11.880]  So now we're coming to this: if we can run applications that
[08:11.880 --> 08:20.560]  are similar to what we were used to before, but can start them fast, we could solve
[08:20.560 --> 08:28.000]  the problem with Java cold starts, at least as we use Java now. So the question I have now
[08:28.000 --> 08:34.800]  is: Jakarta EE or some other frameworks like Spring Boot or something like that, can that
[08:34.800 --> 08:44.720]  be as fast as plain Java? Can we run that in AWS Lambda to get good performance and
[08:44.720 --> 08:53.080]  fast startup? And the answer is there are such frameworks and solutions to that. There
[08:53.080 --> 08:58.800]  are several ones. I don't have much time to talk about all of them. So I picked one that
[08:58.800 --> 09:05.560]  I personally like. And it's called Piranha Cloud Framework. And this one is based
[09:05.560 --> 09:15.640]  entirely on Jakarta EE APIs. Previously, it was called Java EE. So it's a very well-known
[09:15.640 --> 09:22.200]  API that a lot of people already know, a lot of tools out there already use. So it's interoperable
[09:22.200 --> 09:29.640]  with existing codebases. But the thing with Piranha Cloud is that the implementation, actually
[09:29.640 --> 09:38.760]  the engine of the framework, is new, very flexible, and allows our application to start
[09:38.760 --> 09:46.920]  very fast. Piranha Cloud is based on a lot of existing components. A lot of them come
[09:46.920 --> 09:54.200]  from the GlassFish server, which actually sort of proves that the server is not a problem
[09:54.200 --> 10:00.040]  or Jakarta EE is not a problem. The components are there, they are quite fast. But
[10:00.040 --> 10:05.680]  how they are assembled in traditional Jakarta EE servers, Java EE application servers,
[10:05.680 --> 10:10.080]  that is the problem. Because an application server usually has a lot of other things that
[10:10.080 --> 10:18.880]  we don't need in Lambda, like monitoring, a lot of vendor features, an administration
[10:18.880 --> 10:29.480]  console and a lot of other things. So here is an example, it's basically nothing
[10:29.480 --> 10:38.160]  else than a servlet. But this is an application already using some Jakarta EE APIs. And this
[10:38.160 --> 10:45.760]  application, this servlet, you can run on any Jakarta EE server. You can run it on Tomcat,
[10:45.760 --> 10:51.080]  you can run it on GlassFish, you can run it on anything that supports servlets. So the
[10:51.080 --> 10:57.200]  only difference if we run it with Piranha Cloud is that it starts fast and it uses Piranha's
[10:57.200 --> 11:05.920]  own servlet container, which was designed from scratch. And it's very flexible and fast.
[11:05.920 --> 11:14.240]  What is also nice about Piranha's servlet container is that it can be embedded very easily.
[11:14.240 --> 11:21.680]  And that's the crucial point. When we want to use Jakarta EE in Lambda, we need to basically
[11:21.680 --> 11:27.560]  shave off everything that we don't need. And in AWS Lambda, we don't even need an HTTP
[11:27.560 --> 11:34.800]  listener. Because AWS Lambda basically only wants a method from us that will be called and
[11:34.800 --> 11:40.040]  returns some response. And then AWS Lambda is responsible for mapping the HTTP request
[11:40.040 --> 11:46.400]  to an object that it passes to our method. And then the returned object should be mapped
[11:46.400 --> 11:52.400]  to an HTTP response. And not only HTTP requests and responses, but Lambda can handle any type
[11:52.400 --> 11:59.640]  of basically JSON messages, JSON events. So the only thing that our application needs
[11:59.640 --> 12:12.200]  is to parse some input object and return some output event. And with Piranha, we can create
[12:12.200 --> 12:21.760]  an engine and map our servlet onto it and just call it with some object. The
[12:21.760 --> 12:30.680]  request-response cycle is invoked by a service method, which accepts
[12:30.680 --> 12:36.160]  a request object and returns the response object. And this is exactly how we can use
[12:36.160 --> 12:43.560]  it in AWS Lambda. We just need to add one additional layer to map the AWS request object
[12:43.560 --> 12:53.280]  to the Piranha request object and back. Now let's run Piranha Cloud with this simple servlet, which
[12:53.280 --> 12:59.760]  is comparable to the plain Java program we were running before. If you remember, with plain Java on
[12:59.760 --> 13:06.280]  my computer, I had startup times. Actually, it was not only startup times, but until the
[13:06.280 --> 13:14.640]  program ended and printed some message and finished, it was around 50 milliseconds. With
[13:14.640 --> 13:23.040]  Piranha, it takes a bit longer. But this already includes the first request. So it's
[13:23.040 --> 13:28.600]  very similar to the plain Java application. It's not only that the engine starts, but
[13:28.600 --> 13:36.640]  it actually serves the request and responds with a text message through the HTTP stack. And with
[13:36.640 --> 13:50.400]  that, it still takes a comparable time, around 130 milliseconds. Now we can compare how it
[13:50.400 --> 13:57.440]  works in AWS Lambda. And in AWS Lambda, I have a picture, but I hope I will be able to
[13:57.440 --> 14:03.960]  show you in a minute. As I said before, it takes a bit longer when we start the
[14:03.960 --> 14:10.320]  function for the first time. It doesn't really matter whether we run Java or any other
[14:10.320 --> 14:19.680]  runtime: AWS Lambda first needs to create some environment to execute our code in. And
[14:19.680 --> 14:27.200]  it takes a little bit of time. But together with creating this environment and running
[14:27.200 --> 14:34.280]  our code, our example Piranha function, it takes under one second to serve the request.
[14:34.280 --> 14:40.920]  Even if nothing was ready before, even the first time we try to run the function,
[14:40.920 --> 14:47.920]  it still serves the response in under one second. If we try it again and again, then the
[14:47.920 --> 14:56.200]  response times are much faster. This is on the right side here. It's under two milliseconds.
[14:56.200 --> 15:02.400]  Because this is only the code that needs to serve the request. Everything was initialized.
[15:02.400 --> 15:09.200]  Environment was initialized. The Piranha engine was initialized. It's cached in a static variable.
[15:09.200 --> 15:15.240]  So it's part of the process that is already live. AWS just executes a method basically
[15:15.240 --> 15:22.520]  on the Piranha engine that goes through the servlet and creates the server response. And
[15:22.520 --> 15:28.520]  that's it. That's why it takes only two milliseconds. This is only the time required to serve the
[15:28.520 --> 15:41.800]  actual response. So I'll try. I think I have a link here. How it works.
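A minimal sketch of the idea just described, in plain Java: the engine is built once and kept in a static field, and the handler maps an incoming event to a request object and maps the response back. `SketchRequest`, `SketchResponse`, `SketchEngine` and `SketchHandler` are hypothetical stand-ins for illustration, not the Piranha or AWS Lambda APIs, and the event keys `httpMethod` and `path` are assumed.

```java
import java.util.Map;

// Hypothetical stand-ins for illustration; not the Piranha or AWS Lambda APIs.
final class SketchRequest {
    final String method;
    final String path;

    SketchRequest(String method, String path) {
        this.method = method;
        this.path = path;
    }
}

final class SketchResponse {
    final int status;
    final String body;

    SketchResponse(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

/** Stand-in for an embedded servlet engine that is costly to create. */
final class SketchEngine {
    static int created = 0; // how many engines this JVM has built

    SketchEngine() {
        created++; // a real engine would set up its servlets here
    }

    SketchResponse service(SketchRequest request) {
        return new SketchResponse(200, "Hello from " + request.path);
    }
}

/** One method, event in, result out: the shape of call described above. */
final class SketchHandler {
    // Built once when the class loads (cold start), reused by warm invocations.
    private static final SketchEngine ENGINE = new SketchEngine();

    Map<String, Object> handleRequest(Map<String, Object> event) {
        // Map the incoming event to the engine's request object ...
        SketchRequest request = new SketchRequest(
                String.valueOf(event.getOrDefault("httpMethod", "GET")),
                String.valueOf(event.getOrDefault("path", "/")));
        SketchResponse response = ENGINE.service(request);
        // ... and map the engine's response back to a plain result object.
        return Map.of("statusCode", response.status, "body", response.body);
    }
}
```

However many times `handleRequest` is called in the same JVM, `SketchEngine.created` stays at 1, which is why only the cold start pays for initialization.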
[15:41.800 --> 16:01.200]  Okay. So this is the actual AWS console where I already deployed the application, the function.
[16:01.200 --> 16:08.440]  And the AWS console has a nice feature, a Test button. With that, we can directly
[16:08.440 --> 16:14.760]  invoke the Lambda. Normally, we would have to create an API gateway and map it to Lambda
[16:14.760 --> 16:22.720]  so that we can access Lambda via HTTP from outside. AWS can also generate some URL that
[16:22.720 --> 16:30.160]  we can use to invoke the Lambda. But this directly executes the Lambda without actually
[16:30.160 --> 16:38.320]  invoking an HTTP request. So with this, yeah, there are some examples, but the application
[16:38.320 --> 16:46.200]  doesn't read anything from the request. It just responds with some hello world message.
[16:46.200 --> 16:54.800]  And if we execute it, you see it takes a bit of time. And this is what I had in my slide.
[16:54.800 --> 17:04.960]  Here it's even shorter, 850 milliseconds. But if we try it again, it's already pre-warmed
[17:04.960 --> 17:16.880]  because AWS caches. Where is it here? Caches the environment. And now it's just two milliseconds.
[17:16.880 --> 17:28.560]  So now the question is when cold starts happen. They do happen. I don't have any experience
[17:28.560 --> 17:33.560]  of how much of an impact they have. I heard that it's not much of an impact because they happen
[17:33.560 --> 17:41.320]  normally once in a while. So a response, once in a while, takes one more second on
[17:41.320 --> 17:48.880]  top of request processing. But if it takes five seconds, which can happen with a normal
[17:48.880 --> 17:55.680]  Spring Boot application or traditional frameworks or even application servers, I don't know,
[17:55.680 --> 18:01.080]  sometimes some application servers can be embedded, then you can run them in AWS Lambda.
[18:01.080 --> 18:07.600]  But some of them really are hard to basically map to the method call. So you have to install
[18:07.600 --> 18:11.520]  an application server. And with that, it's not even possible to run application
[18:11.520 --> 18:18.560]  servers in Lambda. But if you did, it would take 10, 20 seconds with some application
[18:18.560 --> 18:26.120]  servers. And that really makes a difference. You pay for the execution time, but you also
[18:26.120 --> 18:34.040]  expose users to waiting for a couple of seconds, if it's a user-facing Lambda.
[18:34.040 --> 18:38.440]  If it's not, you maybe don't care so much. If it's some batch job that
[18:38.440 --> 18:51.520]  takes two, three minutes to finish, then a couple of seconds don't really matter.
[18:51.520 --> 18:59.520]  So here's a slide about Piranha Cloud. In short, Piranha Cloud is basically, as I said,
[18:59.520 --> 19:07.280]  based on a new servlet container designed from scratch, and a lot of components built
[19:07.280 --> 19:14.880]  on top of it. The servlet container being servlet implementation can run any servlet
[19:14.880 --> 19:21.600]  out there. And a lot of Jakarta technologies are created as servlets. So for example, Jersey
[19:21.600 --> 19:30.480]  as a servlet can be deployed on Piranha. And that's quite an easy way to get REST endpoints
[19:30.480 --> 19:36.920]  or a REST library on Piranha: deploy Jersey as a servlet. And then we have everything
[19:36.920 --> 19:43.920]  that Jersey provides. We can embed Piranha as I did in my demo, but we can also build
[19:43.920 --> 19:48.760]  a WAR application and run the WAR application with Piranha on the command line. This is using
[19:48.760 --> 19:55.760]  Jakarta distributions, which already contain the packages, the
[19:55.760 --> 20:02.880]  functionality of Piranha that is mostly used. And the last thing, it's plain Java.
[20:02.880 --> 20:07.960]  There's no real magic. There's no generated code. Everything is just clean code written
[20:07.960 --> 20:13.600]  by clever people, I think. At least judging by the code, when I looked at the code, it
[20:13.600 --> 20:21.760]  looks like the people were very clever. So with Piranha Cloud, we were able to achieve
[20:21.760 --> 20:27.480]  quite fast startup times, but it still takes some milliseconds, 100, 200. It depends
[20:27.480 --> 20:34.760]  on how complex our application is. It may go up to two seconds even, if we add all the
[20:34.760 --> 20:43.320]  Jakarta functionality that Piranha Cloud provides. If we want to reduce that even further,
[20:43.320 --> 20:49.240]  we have some general Java options to do that. We can first increase the CPU and RAM on the
[20:49.240 --> 20:59.760]  Lambda, which we can always do with any language. But we can also use a faster JVM. On the last
[20:59.760 --> 21:07.200]  slide, I have a table where I compared running the same application with Java 11 and Java
[21:07.200 --> 21:14.720]  17. If you look at the numbers, Java 17 is most of the time a bit faster. So just
[21:14.720 --> 21:22.920]  by deciding which Java version we use, we can get a bit better startup time.
[21:22.920 --> 21:29.680]  Then the last option here is basically a combination. I did some experiments on which options work
[21:29.680 --> 21:35.240]  well with regard to reducing the startup time. And in the end, not many
[21:35.240 --> 21:41.760]  things matter. But what matters is class data sharing, which basically caches class information.
[21:41.760 --> 21:49.120]  So it doesn't have to be loaded and processed in the beginning. It's already pre-computed
[21:49.120 --> 21:57.440]  before cold start. And tinkering with the compiler: we can disable the second-level just-in-time compiler
[21:57.440 --> 22:02.080]  if we want to really focus on startup time.
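The two JVM options mentioned here can be sketched as follows, assuming a HotSpot JVM. The flags in the comments are standard HotSpot options, but the talk does not say which exact flags were used, so treat them as one plausible choice; `app.jar`, `app.jsa`, `Main` and the `StartupFlags` class are hypothetical names. The small program only reports whether the running JVM was started with these flags.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Example launch commands (app.jar, app.jsa and Main are hypothetical names):
//   java -XX:ArchiveClassesAtExit=app.jsa -cp app.jar Main   (JDK 13+: record a CDS archive)
//   java -XX:SharedArchiveFile=app.jsa -XX:TieredStopAtLevel=1 -cp app.jar Main
final class StartupFlags {

    /** True if the flags stop tiered compilation at the first-level (C1) compiler. */
    static boolean stopsAtFirstLevelCompiler(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:TieredStopAtLevel=1");
    }

    /** True if the flags point the JVM at a class data sharing archive. */
    static boolean usesSharedArchive(List<String> jvmArgs) {
        return jvmArgs.stream().anyMatch(arg -> arg.startsWith("-XX:SharedArchiveFile="));
    }

    public static void main(String[] args) {
        // The arguments this JVM was actually started with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("C1 only: " + stopsAtFirstLevelCompiler(jvmArgs));
        System.out.println("CDS archive: " + usesSharedArchive(jvmArgs));
    }
}
```

Stopping at the first compiler level trades peak throughput for quicker warm-up, so it suits short-lived functions more than long-running services.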
[22:02.080 --> 22:09.520]  And then there are other more magical options, but they can even reduce performance, or reduce
[22:09.520 --> 22:15.120]  startup time almost to zero: either compiling the code with GraalVM to a native
[22:15.120 --> 22:22.880]  binary which starts the application almost instantly. Or we can use CRaC, which is a
[22:22.880 --> 22:34.960]  Coordinated Restore at Checkpoint mechanism. The next talk will be about it also. And yeah,
[22:34.960 --> 22:42.440]  what is also nice is that AWS Lambda integrated that basically in one of their Java runtimes.
[22:42.440 --> 22:50.680]  And it's called SnapStart. So you can get it for free, but only with Java 11. But hopefully
[22:50.680 --> 22:56.120]  Java 17 support will be coming soon. And this works in a way that your application
[22:56.120 --> 23:01.640]  basically stores, or you at the build time can store a checkpoint of your application
[23:01.640 --> 23:07.360]  with all the memory, all the information. It basically hibernates; you can hibernate
[23:07.360 --> 23:13.120]  your application. And then it's started again and again, and cold start and warm start in
[23:13.120 --> 23:20.520]  that case basically don't make a difference because they start from the same point.
[23:20.520 --> 23:25.720]  That's all from me. If you have any questions, let me know. Thank you for watching.