[00:00.000 --> 00:12.960] Hi everybody, welcome to my presentation about Kaitai Struct.
[00:12.960 --> 00:17.520] I am Petr Pučil and I have a question.
[00:17.520 --> 00:21.840] How many of you have any experience with Kaitai?
[00:21.840 --> 00:27.800] Okay, there are a few of you.
[00:27.800 --> 00:32.600] What is Kaitai Struct?
[00:32.600 --> 00:41.040] It's a tool for dealing with binary formats, especially parsing.
[00:41.040 --> 00:49.240] It is based on a declarative language, Kaitai Struct YAML, that can be used to specify arbitrary
[00:49.240 --> 00:51.480] binary formats.
[00:51.480 --> 01:03.680] It works as a parser generator and it currently supports 11 target programming languages.
[01:03.680 --> 01:12.440] Parsing means to convert the binary data you see above to structured data, an object
[01:12.440 --> 01:18.440] tree, so that you can work with it later.
[01:18.440 --> 01:25.240] Today I will also introduce a new functionality, which is serialization.
[01:25.240 --> 01:38.920] I've been working on it for the last six months and it currently works in Java.
[01:38.920 --> 01:45.320] Serialization means, I didn't mention that, basically the inverse process.
[01:45.320 --> 01:51.920] You want to create a binary file from an object tree.
[01:51.920 --> 01:56.800] Now something about the history.
[01:56.800 --> 02:07.240] So the author of Kaitai Struct is Mikhail Yakshin and the project started in 2014.
[02:07.240 --> 02:16.520] In 2016, Mikhail decided to release the project as open source and at that time the only supported
[02:16.520 --> 02:20.600] languages were Java and Ruby.
[02:20.600 --> 02:29.480] In 2017, Mikhail presented Kaitai Struct at FOSDEM and by then it already supported eight
[02:29.480 --> 02:36.720] languages and had over 400 stars on GitHub.
[02:36.720 --> 02:41.000] Mikhail also wanted to come today but unfortunately he couldn't.
[02:41.000 --> 02:46.800] But if there is some chat or something, I think he should be there so you can ask him
[02:46.800 --> 02:49.600] some questions or whatever.
[02:49.600 --> 02:51.200] And how is it today?
[02:51.200 --> 03:01.680] So we have 11 target languages and over 3,000 stars on GitHub and Kaitai is used in more
[03:01.680 --> 03:08.400] than 500 GitHub projects.
[03:08.400 --> 03:14.320] So let me share how I discovered Kaitai Struct.
[03:14.320 --> 03:23.240] This was in 2019 and I was playing electronic keyboard with a band and I wanted to create
[03:23.240 --> 03:32.440] a MIDI editor so that I could record the songs on the keyboard and edit them on the computer.
[03:32.440 --> 03:41.080] And I wanted the user to be able to upload a sound bank in the SoundFont 2 binary format
[03:41.080 --> 03:46.360] so that they could control how the song would sound.
[03:46.360 --> 03:54.120] And I wanted a web-based MIDI editor so I searched for a JavaScript parsing library
[03:54.120 --> 04:01.360] for the .sf2 format but I couldn't find one that would work for me.
[04:01.360 --> 04:09.600] So I started writing my own parser but it was really hard and a lot of debugging had
[04:09.600 --> 04:13.320] to be done and it was just not fun.
[04:13.320 --> 04:24.720] And when I finished I came across Kaitai Struct and I found that the two months of work I spent
[04:24.720 --> 04:29.160] on this could be done in just one day with Kaitai.
[04:29.160 --> 04:38.480] So Kaitai impressed me with its concept, simplicity and versatility and I started contributing
[04:38.480 --> 04:40.120] a lot.
[04:40.120 --> 04:47.720] And Kaitai also helped me in my personal development because until then I'd only programmed in
[04:47.720 --> 04:54.440] JavaScript, PHP and a little bit of Python and within a few months I was able to work
[04:54.440 --> 04:59.120] in 14 programming languages that were used in Kaitai.
[04:59.120 --> 05:10.520] And in 2020 I accepted an offer from Mikhail to become an administrator of the project.
[05:10.520 --> 05:17.760] So in my story I showed what options there are to get a parser.
[05:17.760 --> 05:24.800] So the most convenient way that you are probably familiar with is to use a dedicated format
[05:24.800 --> 05:28.520] library in the given language.
[05:28.520 --> 05:36.240] So it will probably have a user-friendly API and can be optimized for the format.
[05:36.240 --> 05:44.280] But sometimes it may be of poor quality and incomplete and it may be difficult to debug
[05:44.280 --> 05:46.560] and fix it.
[05:46.560 --> 05:55.120] And also for the most common formats like JPEG, ELF or ZIP, you can find even several
[05:55.120 --> 06:03.080] libraries and you can choose, but for less common formats, some obscure ones, there will
[06:03.080 --> 06:06.440] simply be no library in your language.
[06:06.440 --> 06:14.680] So we need to look into other options and another option is to simply write your own
[06:14.680 --> 06:15.680] parser.
[06:15.680 --> 06:26.600] But in my experience this is the worst option because it takes a lot of time and you need
[06:26.600 --> 06:34.840] to do a lot of debugging using some debug prints and dumps and it's just not fun.
[06:34.840 --> 06:40.640] But it's what most people do, often because they just don't know any better.
[06:40.640 --> 06:44.080] So that's why I'm here today.
[06:44.080 --> 06:52.480] And well, the problem is that if you have already written a parser for your format in
[06:52.480 --> 06:58.640] Python, for example, and then after some time you are asked to create a Java parser for the
[06:58.640 --> 07:06.760] same format, you basically need to start again.
[07:06.760 --> 07:13.920] So a bit better way is to use a parser combinator, which means that you are essentially still
[07:13.920 --> 07:21.040] writing your own parser, but you are using some building blocks from a library.
[07:21.040 --> 07:28.880] And a parser combinator typically allows you to declaratively define some substructures,
[07:28.880 --> 07:38.880] but still in the code, and, like in C, you can define structs for the fixed-size pieces of the format
[07:38.880 --> 07:46.000] and then you can directly interpret some block of bytes with that struct.
[07:46.000 --> 07:57.320] And there are many parser combinators, perhaps dozens in popular languages, but as with the
[07:57.320 --> 08:08.160] two previous options, you still have the disadvantage that the parser you get this way is still
[08:08.160 --> 08:11.240] bound to the particular language.
[08:11.240 --> 08:15.160] And it may be even bound to an application.
[08:15.160 --> 08:23.120] For example, if it was developed for a graphical editor, so it may be difficult to separate
[08:23.120 --> 08:30.120] just the parser from that application to use it somewhere else.
[08:30.120 --> 08:39.520] And the fourth option is to use a parser generator, which means that you are not writing the parsing
[08:39.520 --> 08:46.880] code directly in the programming language, but instead you describe
[08:46.880 --> 08:54.520] the format structure in a domain-specific language, and this description can then be
[08:54.520 --> 08:58.960] automatically translated into a parser.
[08:58.960 --> 09:08.400] So, Kaitai Struct falls into this category, and the Kaitai Struct language is designed
[09:08.400 --> 09:18.200] so that it's independent of both the application and the programming language.
[09:18.200 --> 09:22.600] Here I'll show you how to work with Kaitai.
[09:22.600 --> 09:26.280] The first stage is compilation.
[09:26.280 --> 09:36.080] So you take this Kaitai Struct specification of the format and in this case, this is a format
[09:36.080 --> 09:45.840] that has one byte because this u1 type means unsigned integer of one byte.
[09:45.840 --> 09:53.560] And you take this Kaitai Struct specification and you compile it using the Kaitai Struct compiler,
[09:53.560 --> 09:55.560] which is a command-line tool.
[09:55.560 --> 10:06.360] And as output, you get the source code of the parser, in this case in Python.
[10:06.360 --> 10:11.200] The main stage is parsing.
[10:11.200 --> 10:18.800] You take the parser generated in the first step.
[10:18.800 --> 10:28.320] You give the input binary file to the parser as input and
[10:28.320 --> 10:35.040] you get parsed data as output, so an object tree.
[10:35.040 --> 10:43.480] And in the case of Kaitai Struct, the generated parser works with the runtime library, so you
[10:43.480 --> 10:50.600] need to include it in your application as well.
[10:50.600 --> 10:51.680] Why use Kaitai?
[10:51.680 --> 10:56.360] What are the advantages?
[10:56.360 --> 11:06.680] So as I already mentioned, the advantage is that you write the KSY specification once
[11:06.680 --> 11:09.320] and you can use it everywhere.
[11:09.320 --> 11:16.320] It standardizes the way we describe binary formats and there are already many formats
[11:16.320 --> 11:25.640] described in the Kaitai format gallery and any described format can be visualized automatically
[11:25.640 --> 11:33.040] in a GraphViz diagram, and the Kaitai Struct language is simple, as you will see.
[11:33.040 --> 11:41.360] There are also several visualization and dumping tools available in Kaitai Struct.
[11:41.360 --> 11:53.360] So the write once, use everywhere feature means that you get parsers in 11 programming languages
[11:53.360 --> 11:57.920] for free from a single KSY specification.
[11:57.920 --> 12:07.440] So in this case, I've had the compiler generate Java, Python and Ruby parsers from a simple
[12:07.440 --> 12:14.120] KSY specification you see on the left.
[12:14.120 --> 12:20.760] When you look for specifications of binary formats, you will find that each one looks
[12:20.760 --> 12:32.640] different and there is no single standard for how to document formats, and Kaitai is used
[12:32.640 --> 12:41.200] or intended primarily for creating parsers, but some people write a KSY specification just
[12:41.200 --> 12:48.440] to document a format in an easy-to-understand way because you don't even have to be a programmer
[12:48.440 --> 13:00.320] to understand a KSY specification and it's often easier than reading these long PDF documents.
[13:00.320 --> 13:08.080] And the Kaitai project includes an extensive gallery of described formats.
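To make the compilation stage above concrete, here is a minimal sketch of what a one-byte format specification like the one described could look like; the file name hello.ksy and the ids are illustrative assumptions, not the actual slide contents.

```yaml
# hello.ksy - a hypothetical format consisting of a single one-byte value
meta:
  id: hello
seq:
  - id: some_byte
    type: u1     # u1 = unsigned integer, one byte
```

Generating a Python parser from it would then look something like `kaitai-struct-compiler -t python hello.ksy` (the exact command name can differ depending on how the compiler is installed), which produces a hello.py file containing the generated parser class.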
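And as a hedged illustration of the parsing stage and of the point that the generated parser works together with the runtime library: in Java, the generated class (here assumed to be called Hello, from the hypothetical spec above) could be used roughly like this; the class, file and getter names are assumptions for illustration.

```java
import io.kaitai.struct.ByteBufferKaitaiStream;

public class ParseExample {
    public static void main(String[] args) throws Exception {
        // The runtime library provides the KaitaiStream implementations;
        // here one is opened over the input binary file.
        ByteBufferKaitaiStream io = new ByteBufferKaitaiStream("input.bin");

        // Hello is the class generated by the compiler from the hypothetical
        // hello.ksy spec above; parsing happens while the object is constructed.
        Hello data = new Hello(io);

        // The parsed attributes are then available as getters on the object tree.
        System.out.println(data.someByte());
    }
}
```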
[13:08.080 --> 13:28.480] At the moment, there are 181 formats described by 76 contributors and there are also several
[13:28.480 --> 13:36.720] hundred more format specifications in various Kaitai projects.
[13:36.720 --> 13:46.400] And so the Kaitai format gallery contains formats of various kinds, for example, as you see,
[13:46.400 --> 13:55.160] archive files, executables, file systems, game data files, multimedia files
[13:55.160 --> 14:04.560] and network protocols, you can go to this page and I took it from there.
[14:04.560 --> 14:10.680] And this suggests the wide applicability of Kaitai.
[14:10.680 --> 14:19.320] And it offers an idea to create an international database of formats where various obscure
[14:19.320 --> 14:26.760] and historical formats would be documented in a uniform way for future preservation.
[14:26.760 --> 14:36.320] And this would guarantee that we could basically read the binary files we write now
[14:36.320 --> 14:43.800] in like 100 or 200 years from now.
[14:43.800 --> 14:49.960] The fact that the Kaitai Struct language is declarative makes it possible to automatically
[14:49.960 --> 15:01.600] visualize the described format in a GraphViz diagram.
[15:01.600 --> 15:05.200] The Kaitai Struct language is simple but powerful.
[15:05.200 --> 15:10.520] You can describe pretty much any binary format with it.
[15:10.520 --> 15:18.840] And a KSY specification starts with the meta section and this sets the little-endian
[15:18.840 --> 15:22.760] byte order as the default.
[15:22.760 --> 15:28.680] The seq section is a sequence of attributes.
[15:28.680 --> 15:34.160] The attribute name is in the id key.
[15:34.160 --> 15:45.000] The type u4 means that in this case num_files will be an unsigned four-byte integer.
[15:45.000 --> 15:50.280] You can define your own types in the types section.
[15:50.280 --> 15:53.000] A field can also be repeated.
[15:53.000 --> 16:04.000] So in this case the files attribute will be a list or an array of the type file.
[16:04.000 --> 16:13.320] In the instances section you can define attributes that start at an arbitrary byte position.
[16:13.320 --> 16:20.200] You can also use a powerful expression language in many places.
[16:20.200 --> 16:27.800] And another built-in type is str, a character string in a certain encoding.
[16:27.800 --> 16:38.040] And if you omit the type and only specify the size, the result is a byte array.
[16:38.040 --> 16:44.680] There are several visualization and dumping tools available for inspecting files.
[16:44.680 --> 16:53.560] And this can be useful, for example, for finding errors, forensic analysis, or debugging.
[16:53.560 --> 17:04.880] And the visualizers allow us to view the structured data parsed from the input file based on a
[17:04.880 --> 17:09.800] Kaitai Struct specification, so something like this.
[17:09.800 --> 17:19.080] And you can use the console visualizer, or also the command-line tool ksdump is available,
[17:19.080 --> 17:24.720] which can give you the same structured data as you can see in JSON format.
[17:24.720 --> 17:30.640] And this can be useful for automation.
[17:30.640 --> 17:35.600] But the most popular visualization tool is the Web IDE.
[17:35.600 --> 17:39.880] You can check it out at this URL.
[17:39.880 --> 17:48.160] And at the top right is a hex dump of the input binary file.
[17:48.160 --> 17:57.280] So in this case I selected this .png file in the file tree on the left.
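Returning to the KSY elements walked through a moment ago (the meta section with the default endianness, the seq section, user-defined types, repetition, instances, strings and raw byte arrays), they could be combined in a small, hypothetical archive-like spec such as the following; all ids here are made up for illustration and do not describe a real format.

```yaml
meta:
  id: example_archive
  endian: le               # little-endian byte order as the default
seq:
  - id: magic
    size: 4                # no type, only size: parsed as a raw byte array
  - id: num_files
    type: u4               # unsigned four-byte integer
  - id: files
    type: file             # user-defined type, see `types` below
    repeat: expr
    repeat-expr: num_files # `files` becomes an array of `file` objects
types:
  file:
    seq:
      - id: name
        type: str
        size: 16
        encoding: UTF-8    # character string in a given encoding
      - id: ofs_body
        type: u4
      - id: len_body
        type: u4
    instances:
      body:                # instance: starts at an arbitrary byte position
        pos: ofs_body
        size: len_body
```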
[17:57.280 --> 18:06.480] And at the top left is the Kaitai Struct specification editor, so a KSY spec editor.
[18:06.480 --> 18:13.800] And according to the Kaitai Struct specification, the input file is parsed and the result is
[18:13.800 --> 18:20.000] the structured data that you see in the object tree at the bottom left.
[18:20.000 --> 18:26.120] And when you edit the Kaitai Struct specification, the input file is automatically parsed again
[18:26.120 --> 18:33.520] and the object tree is updated.
[18:33.520 --> 18:42.200] Serialization is a new feature in Kaitai Struct and it's being developed thanks to the financial
[18:42.200 --> 18:47.800] support of the NLnet Foundation.
[18:47.800 --> 18:56.080] While parsing allows you to read binary data into an object, serialization is all about
[18:56.080 --> 18:58.080] the inverse process.
[18:58.080 --> 19:04.240] So we want to write an object to binary data.
[19:04.240 --> 19:16.800] And currently in Kaitai Struct, the serialization support for Java is fully working and
[19:16.800 --> 19:23.160] C# and Python are in development.
[19:23.160 --> 19:27.240] There are basically two use cases of serialization.
[19:27.240 --> 19:34.960] You can edit an existing file or you can create a new file from scratch.
[19:34.960 --> 19:43.240] And the support for serialization greatly extends the use of already written format specifications
[19:43.240 --> 19:49.160] because now you can use them not only for parsing but also for serialization.
[19:49.160 --> 19:57.480] And this has many uses, for example, you can convert one format into another or it can
[19:57.480 --> 20:06.780] be used for fuzzing or video game modding and so on.
[20:06.780 --> 20:12.960] The serialization process in Kaitai Struct can be divided into four phases.
[20:12.960 --> 20:20.720] First you need to create a KS object and then you fill it with data.
[20:20.720 --> 20:27.880] So you set its individual fields or attributes using setters.
[20:27.880 --> 20:33.560] Then you should call the _check method to check the consistency of the data
[20:33.560 --> 20:37.400] with the format constraints.
[20:37.400 --> 20:50.720] Finally, we can call _write and pass the stream to write to.
[20:50.720 --> 20:58.920] And you can actually check out more details of how to use serialization in Java on this
[20:58.920 --> 21:01.280] page.
[21:01.280 --> 21:12.320] Currently, the serialization support in Kaitai Struct is designed for the general case so
[21:12.320 --> 21:16.760] that it works for every conceivable format specification.
[21:16.760 --> 21:25.240] While a simple solution would work for perhaps most specifications, well, the solution that
[21:25.240 --> 21:29.600] works for all of them was chosen.
[21:29.600 --> 21:33.720] Even at the cost of delegating some tasks to the user.
[21:33.720 --> 21:40.960] In the future, I would like to automate these tasks that need to be done manually at the
[21:40.960 --> 21:47.560] moment so that it's more convenient for the user.
[21:47.560 --> 21:56.840] The basic idea is that the user sets everything, including lengths, offsets, magic signatures,
[21:56.840 --> 22:00.960] and Kaitai Struct checks for consistency.
[22:00.960 --> 22:06.000] Also, only fixed-length streams are considered.
[22:06.000 --> 22:11.800] So once you create a stream, you cannot resize it.
[22:11.800 --> 22:20.640] Finally, I would like to talk about the plans for the future.
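Before the plans for the future, here is a minimal sketch of the four serialization phases just described, in Java, since that is the target where serialization is fully working. The class Hello and its setter are assumed to be generated in serialization mode from the hypothetical hello.ksy spec shown earlier; _check and _write follow the description above, while the exact constructor shape and the way of obtaining the resulting bytes are assumptions that may differ from the official documentation.

```java
import io.kaitai.struct.ByteBufferKaitaiStream;

public class SerializeExample {
    public static void main(String[] args) throws Exception {
        // Phase 1: create the KS object.
        Hello data = new Hello();

        // Phase 2: fill it with data using the generated setters.
        data.setSomeByte(42);

        // Phase 3: check the consistency of the data with the format constraints.
        data._check();

        // Phase 4: write the object to a stream. Only fixed-length streams are
        // supported, so the size has to be chosen up front and cannot change.
        ByteBufferKaitaiStream io = new ByteBufferKaitaiStream(1);
        data._write(io);

        // Assumption: the writable stream exposes its contents as a byte array;
        // the exact accessor may differ between runtime versions.
        byte[] result = io.toByteArray();
        System.out.println(result.length + " byte(s) written");
    }
}
```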
[22:20.640 --> 22:30.800] Serialization for C# and Python is in development and they should be ready in two months.
[22:30.800 --> 22:37.640] There is also interest in adding Rust, C and Julia as target languages.
[22:37.640 --> 22:45.240] And I would also like to see Wireshark dissectors as a target because the concept of Kaitai
[22:45.240 --> 22:48.440] is not limited to programming languages.
[22:48.440 --> 22:56.200] A target can be anything, for example, we already have a target for Construct, which
[22:56.200 --> 23:04.640] is a Python library for parsing and serialization of binary data.
[23:04.640 --> 23:05.640] Thanks for listening.
[23:05.640 --> 23:33.200] Now it's time for your questions.
[23:33.200 --> 23:47.840] Yes, there is a doc key, which you can use on attributes and types in many places,
[23:47.840 --> 24:00.640] and you can write some documentation of the specific element, and in some languages, well,
[24:00.640 --> 24:14.880] it doesn't work like 100% of the time, but the idea is that this documentation should
[24:14.880 --> 24:27.000] translate to the generated parser as doc blocks and then the IDEs and tools for development
[24:27.000 --> 24:32.280] should usually show this documentation in autocompletion.
[24:32.280 --> 24:43.280] Do you support endianness when generating source code, depending on the target machine?
[24:43.280 --> 24:56.080] Yes, there is a feature, calculated endianness it is called, and you can switch the endianness or
[24:56.080 --> 25:05.720] the default endianness based on the value of an arbitrary expression basically, so this can
[25:05.720 --> 25:06.720] ...
[25:06.720 --> 25:12.200] But do you support host endianness and target endianness?
[25:12.200 --> 25:23.560] Well, not really, but it's not that much of a limitation because you can, for example, use
[25:23.560 --> 25:31.400] parameters, for example, to pass it from your application basically because I don't
[25:31.400 --> 25:40.360] know if I can... I don't know if it's a good idea, but another feature of Kaitai Struct
[25:40.360 --> 25:48.880] is that you can define that types can have parameters and even the top-level type... Yeah,
[25:48.880 --> 25:57.840] I should probably at least... Never mind, yeah, and you can define parameters and you
[25:57.840 --> 26:09.000] can easily pass a parameter from your application that will somehow change the behavior of the
[26:09.000 --> 26:13.200] specification, yeah, so it's possible.
[26:13.200 --> 26:24.400] With KSY, you seem to aim to define specifications for certain languages or formats, but for
[26:24.400 --> 26:30.400] languages and formats that already have a specification, how can you ensure that these two specs are
[26:30.400 --> 26:38.400] actually the same and that you're not parsing it differently than other parsers do?
[26:38.400 --> 26:44.800] I don't... Well, you mean that there is already an implementation of some...
[26:44.800 --> 26:53.800] For example, someone's parsing ZIP files out there, how do you guarantee that Kaitai Struct
[26:53.800 --> 26:59.400] will parse ZIP files the same way?
[26:59.400 --> 27:13.000] You don't, basically, but from this point of view, it's just another implementation...
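As a hypothetical KSY fragment illustrating the three features touched on in the answers above, namely the doc key, calculated (switchable) endianness and type parameters, consider the following sketch; the ids and case values are made up for illustration.

```yaml
meta:
  id: params_and_endian_demo
  endian:
    # Calculated endianness: the default byte order is selected by an
    # expression, here based on a byte-order mark read at the start of the file.
    switch-on: bom
    cases:
      0x4949: le
      0x4d4d: be
params:
  # Parameters can be passed in from the application (or from another type)
  # and used in expressions inside the spec.
  - id: header_len
    type: u4
seq:
  - id: bom
    type: u2be
    doc: Byte order mark that decides the endianness of the rest of the file.
  - id: header
    size: header_len
    doc: Raw header bytes; the length is supplied by the caller as a parameter.
```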
Well, [27:13.000 --> 27:20.320] if you compare it to other parsers, for example, so there is, for example, a ZIP parser in
[27:20.320 --> 27:27.240] every language, yeah, so a ZIP parser library, and this Kaitai Struct specification, it's just
[27:27.240 --> 27:34.800] another implementation, so, well, it needs to be developed carefully so that it works
[27:34.800 --> 27:35.800] well or... Yeah.
[27:35.800 --> 27:44.800] I guess you would need a way to translate from a written specification to the Kaitai Struct one
[27:44.800 --> 27:51.280] or the other way around to validate that what you wrote as the spec actually corresponds
[27:51.280 --> 27:57.040] to the actual specification, for example, if a specification already exists in a machine-
[27:57.040 --> 28:04.200] written form, which is readable, I mean, then we should have a tool to convert from
[28:04.200 --> 28:08.360] one to the other, so that would ensure that the parsing is correct.
[28:08.360 --> 28:14.400] But it doesn't help, because the implementation is done by humans, it's impossible, it's
[28:14.400 --> 28:15.400] impossible.
[28:15.400 --> 28:16.400] It's just an introduction.
[28:16.400 --> 28:17.400] You have to run all those things.
[28:17.400 --> 28:18.400] Why?
[28:18.400 --> 28:45.920] I'm wondering if it would be possible to add some functionality to that, not only parsing
[28:45.920 --> 28:52.480] but some very common functionality, do you think you can add that in the KSY form?
[28:52.480 --> 28:57.760] Common functionalities, so... Like, for example, there's a binary format and there's
[28:57.760 --> 29:04.320] very common functionality everybody uses on that, let's say, like, I don't know, cutting
[29:04.320 --> 29:11.960] a part of it or calculating some, I don't know, value, magic value or hash value,
[29:11.960 --> 29:19.040] could you add some extra functionality other than parsing in there?
[29:19.040 --> 29:27.280] Well, so the question was whether we can add some common functionality in
[29:27.280 --> 29:37.120] addition to the format specification, and the answer is that, well, you can do this
[29:37.120 --> 29:46.000] to a certain extent, because there are, I didn't mention them or talk about them, but
[29:46.000 --> 29:57.960] there are value instances, and this is like a calculated attribute,
[29:57.960 --> 30:03.520] so you can write an arbitrary expression in it, and this can calculate, for example,
[30:03.520 --> 30:14.520] some value. Like, I wrote a BMP specification, or I extended it, and I used this, for example,
[30:14.520 --> 30:23.000] to, well, in the BMP format, there are like color masks in different places, depending
[30:23.000 --> 30:33.480] on the header version, and I used a value instance to get it, so, depending on
[30:33.480 --> 30:44.000] the version, so either get it from here or here, or if it's a fixed, I don't
[30:44.000 --> 30:50.720] know if it's a fixed color palette, or what is it called, so, yeah, we can do this to
[30:50.720 --> 31:00.800] a certain extent, but some common functionality, like, I don't know, well, if it would
[31:00.800 --> 31:09.320] require, like, a programming language or something like that, so this would be infeasible, basically,
[31:09.320 --> 31:22.200] because then we would have to have some programming language,
[31:22.200 --> 31:31.240] some language that translates to all targets, which is basically impossible, I
[31:31.240 --> 31:32.240] think, yeah.
[31:32.240 --> 31:52.760] There is some different type of tooling, like, let's say, you know, testing. Could you use
[31:52.760 --> 31:57.760] this tool set to write comprehensive diff tools
[31:58.320 --> 32:01.080] that explain the differences between two binaries
[32:01.080 --> 32:06.080] and that can leverage the existing descriptions
[32:07.880 --> 32:11.880] to explain the differences it came to find?
[32:13.240 --> 32:18.240] Yes, so I think you can compute some diff.
[32:18.240 --> 32:23.240] Basically, how I would do it... I showed the ksdump tool here.
[32:24.920 --> 32:29.920] So I think you could generate the JSON dumps of the two files
[32:31.840 --> 32:36.840] and compare them, but when I did this,
[32:37.240 --> 32:40.240] it was usually very, very massive,
[32:40.240 --> 32:49.240] but you can probably improve that somehow, I don't know.
[32:49.240 --> 33:02.240] But it's, yeah, okay, so thanks and...
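As a sketch of the value instances mentioned in the answer above (not the actual BMP spec, just an illustration of the idea): a value instance is a calculated attribute defined by an arbitrary expression, and it can, for example, pick the right field depending on a version number.

```yaml
meta:
  id: value_instance_demo
  endian: le
seq:
  - id: version
    type: u2
  - id: mask_v1
    type: u4
  - id: mask_v2
    type: u4
instances:
  color_mask:
    # Calculated attribute: chooses between two fields based on the version,
    # loosely similar to the BMP color mask example from the answer.
    value: 'version >= 2 ? mask_v2 : mask_v1'
```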
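And a rough sketch of the diff idea from the last answer, assuming the visualizer package that provides ksdump is installed; the output-format flag and argument order shown here are assumptions and may differ between versions.

```sh
# Dump both binaries into a structured textual form using the same spec,
# then compare the dumps with an ordinary diff tool.
ksdump -f json old_file.bin my_format.ksy > old.json
ksdump -f json new_file.bin my_format.ksy > new.json
diff -u old.json new.json
```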