So, hello everyone. My name is Patricia. I'm very excited and happy to be here. Just quick disclaimer about me. I am Chromium contributor and today I'm going to present about keyboard interactions and how I tried to improve them. So quickly about myself, I am from Vilnius, Lithuania. I moved to the Netherlands to study computer science and during my studies I really got into open source, specifically through the Google Summer Code program. In 2022 I worked on the definition of the INP metric and I continue my work diving deeper into metrics, interactions and specifically into keyboard interactions that I will explain you today. So about 2022 I worked on the Perfetto tool which is a wonderful tool for developers but I won't get into details here because Alexander in few moments will explain everything you need to know. But how I use it to this day, I trace websites with DevTools Timeline check mark and it gives me all necessary information about interactions and specifically event timings. And we know when we have event timings we can get anything about INP metric. So what is INP metric that was already mentioned? It is a very I think popular metric today but it's simply an interaction to next pain metric that assesses responsiveness by measuring key press, tap and click interactions. So for example when you press a key on a virtual or physical keyboard or tap on a touch screen or click a mouse to open any menu on a website everything is measured for developers to see how fast their website responds to users input. And this definition is actually I think wonderful, very innovative and Google since Google announced it very recently that it's going to replace first input delay this March 12th which is very exciting for that performance group. However after looking better into the metric we found out that it's not entirely perfect although it's very wonderful metric but specifically key press interactions aren't working as we would like them to work. So my goal today is to explain how we improve key press interactions, what is firstly a key press interaction and how we measure them and then we will dive deeper into a bit more complex concepts of non-standard interactions such as emojis and how we measure them for the INP. So I guess this lines brings you to kindergarten especially considering the FOSDEM context of very heavy tech topics but to understand what is a key press interaction we really need to look into a simple button because key is just a button but in a more complex context. So when you press any button in this world button goes down and when you release it buttons button goes up. It's just that simple. So within the key press we have very similar behavior that contains the two fundamental events key down and key up. In this example we have one interaction as in the input we see character A and the entire interaction starts with the first event called key down meaning that user press down a key. We immediately generate interaction ID for that saying okay we start the interaction. After key down there is key press event which is dispatched if and only if there is a character value and we see that in this case we do have character value and it means that key value was mapped to a specific key. So for example if you would press anything that wouldn't produce a character value you wouldn't see a key press. Then we have some events about DOM so before input which means that the DOM is about to be updated input which is the immediately dispatched when DOM is updated and lastly we finish the interaction with the key up to which we assign the very same interaction ID as it was generated on the key down. And although this definitely makes sense and most importantly key down and key up are the most significant events in the interaction that you perhaps have already seen. So this sequence gives us the entire definition of keyboard interaction within the INP. So it contains of three time spans as in the click and click and tap interactions but in this case we have input delay, processing time and presentation delay. So input delay is the time when is the duration when user presses down a key and event handlers are executed. Processing time is the time it takes for the code to be executed in the key down and key up event handlers and finally we have presentation delay which is the entire duration when the event handlers are stopped being executed and we finally see something on our frame. And this definition definitely does make sense. However it had some problems. After better investigation we found out that it can be a bit more than confusing. Firstly having key press events interaction ID equals zero makes developers wonder if key press is related to keyboard interaction at all. And to make things worse it turned out that key press can be as large as key up and key down together. Just a second. So with this problem we updated keyboard interaction, the definition of the keyboard interaction and we have that the new update is very similar. It all contains those three time spans of input delay, processing time and presentation delay. However for the processing time we included key press such that it would be between those three candidates of key down, key press and key up at the end. And we really hope that this will remove some confusion for developers as we assign interaction ID to the key press event. And finally we do believe that including key press is a step towards polished, improved and more accurate IMP metric especially within the keyboard interactions. Well with the simple key press is everything is quite well defined. We know where the start is and we know where we finish the interactions. However more interesting things happen with non-standard keyboard interactions because we cannot be sure that our users will always use just standard keyboard interactions. I even came across this post from Instagram on Google's Instagram that has everything to express one idea from emojis to just basic symbols. And to understand how fast the website responds to such input we really need to dive deeper into input method editors. So what is, does anyone know what is input method editor? That's great. So actually I think most of you might have used in some sort of way. It's a software component that enables users to input text that cannot be easily represented on a standard keyboard. So it typically happens due to the very large number of characters in users' written language. And it's very common in East Asia regions for example Korean language, Chinese language and Japanese language. Although I would love to speak Korean, Japanese and Chinese unfortunately I cannot. So today we will look into a bit more standardized example of simple emoji which has very similar structure. So we already can see that we need to process way more events for one emoji than for simple key press. And however everything actually depends how many interactions were made while producing that emoji. In this case we see that users started by typing in H and then selected the emoji as you can see on your left screen example. Since the complexity is way higher of such interactions IDs we only assign interaction ID to input events. But thinking in general the differences between pressing down a key within emoji context and non-emoji context we find out that we still have those very important two fundamental events key down and key up. So with our updates we assigned interaction ID on key down and key up. We still start our interactions with the key down. We generate new interaction ID on key down then assign the same interaction ID to input event and we finish that key press interaction in the emoji context with the key up event. When users just select the emoji without typing we just simply assign interaction ID to the input event. And that gives us better understanding of how non-standard interactions behave. However the algorithm really requires some creativity and some better understanding. But coming up with the solution for something that does not behave in the same way was quite a challenge and the solution might not be most perfect because it heavily relies on the order of events dispatched. And for example we see here that when we hold three keys at the very same time and release them at the very same time we have three key ups at the very end with the exact same interaction IDs. And this shouldn't happen in general. They should have different interaction IDs. But although it might not be the perfect solution but looking into input method editors is a very important way to address web responsiveness within East Asia regions where people actually do use a lot of graphemes in Chinese, Korean and Japanese languages. And who knows maybe just all our emojis are just introduction to a bit more complex interactions. Maybe one day we will see 3D interactions and emojis will be just simple ones. So for this project I'm really grateful for my mentors and I call them heroes. They really helped me through the entire process of understanding interactions, defining INP and understanding how to improve web responsiveness for developers. And thank you a lot for listening. If you have any questions let me know and if you're interested you can read the blog or just ask me anything you want. Thank you. And do you have? Yeah. Yeah basically, so do the interactions go from top to bottom? So it's actually it really depends on the way we see on the websites. And there's like websites that shows all the events dispatched and it starts from top to bottom and maybe it's not the most intuitive way for you to read from top, I mean from bottom to top, right? But is the usual order that what you would get when you try to look into the events dispatched during keyboard interactions? So yeah, I mean yeah, this was from bottom to top, absolutely. Okay, thank you.