Hi everyone. How many Django users in here? Raise your hands. Keep your hands up if you are dealing with Django projects with a lot of migrations, with time and continuous integration minutes. Okay, let's talk it for you. Perfect. You are in the right room. Now, I am Denny. I am on your right side of the photo. I work again in JavaScript, Python, Vue.js, Django, everything. It's pain-me-stuff. So let's start with Django migrations. Our way to propagate changes from your models to a database schema and keeping track of them. Let's quickly recap migration commands. So you can use make migrations, migrate, show migration, and SQL migrate. The first one, make migrations, create new migrations based on your model chains. You can use different parameters in there. For example, an empty migration you can customize. You can give a migration a specific name, and you can restrict the creation of a migration to a specific application. The model, for example, if you want to recreate Twitter, we know the reason for that, is this one. You can create a class for a model, and then creating the migration with the command will create a new file in your project in the migration folder with this content. So initial equal true if it's the first migration in your project. A list of dependencies if you are using something like, for example, authentication, or if you are on the second migration in the project, the first dependency is the first migration, and a list of operations performed during the migration. Then you can migrate your migration, of course, using this command specifying an application or not, or a migration name. So if you want to move to a specific point in the history of your migration, you can specify this. So as a new project, you can migrate everything using managepi migrates, and everything is at the last version of your database schema. Then if you want to roll back every migration in a project, you can migrate to the zero migration, and everything is rolled back. You can move to the second migration in your project with this, and without specifying a migration number, you can migrate everything to the latest version. Now how this works under the hood, you have in your database a Django migration table with a content like this, so the application name, the name of the migration, and the date time when the migration has been applied to your database, so everything is on your database. There is a better way to show this, so using show migration, you can have a view of your list of migration in your database, in your schema, with a tick if the migration has been already applied in your database. And then with SQL Migrate, you can print your SQL statement for a specific migration. So with our example, we can display the SQL code for this. So let's take a look at this. A transaction will be opened, every command will be applied on your database, and then the transaction will be committed if no errors. Now if you need to make further changes in your model, you can apply those changes and then create another migration. The migration will depend on the first one, and then the code will be another transaction, the SQL command, and commit. And again, and again, you can apply migration on your database in production using this. What if you need to do further changes, then for example, an every tweet likes and a lot of other stuff, then you can make change in your models, create a single migration, because of course I like to be well organized and structured, so every single change for me means a single migration. Then you end up having a lot of migration like this one. But even worse, if you need to create, for example, a shop app for a customer, then you need to create a model, and then during the lifetime of your application, you need to do a lot of changes to your model structures. Okay, we won't list this, but we had to do a lot of changes, for example, adding tables, switching data from a table to another, to a main table to a detailed table, and a lot of other stuff, changing data during your workflow. So changes can be a lot of pain, a lot of stuff, and when migrations become a lot, then your performance during tests could decrease a lot, because during the deploy is perfect, you can move forward and backward with simplicity, but in tests it's not that simple, because you need to wait for every migration to apply before running tests. And if you are paying for your testing time on GitHub workflows or other platforms, then that could be painful. As a disclaimer, the timing for this talk may change from laptop to laptop, so keep this in mind, but on my old laptop, this is brand new, so it's faster, hopefully, on my old laptop, it was the timing. So running tests on 20 apps like Shop, I just copy pasted them 20 times in the example repository. Test took just a single second, less than a second to run, and that was perfect, so there's no need to do this talk. Well, not exactly, because creating the test database took 20 seconds. So one second of tests for this project, and 20 seconds for database creation. And that was not optimal, because we were on the verge between the team license and the enterprise license for the timing of workflow runs, so between 3,000 minutes monthly, and that wasn't optimal, we wanted to remain in the team license, because it was cheap, and then we wanted to optimize that time. The first possible workaround is to use KIPDB, running tests, and this parameter preserve the test database between runs, and that's perfect, because the first run applies the migrations, and then the database will be kept on your cache somewhere, on your Oculus, for example. If the database, of course, does not exist, it will be first created and migrated, and during other changes in other prequests, for example, migration will also be applied, so everything is okay, hopefully. So this approach saves 20 seconds for us after the first test run. The problem was configuring your CI CD, because a solution could be using cache or artifacts in GitHub workflow, but this takes time to create and store artifacts in GitHub, or, for example, using an external test database from inside the GitHub workflow, but that wasn't optimal, and a friend of mine, or mistaken, suggested me this package, Django migration CI, that allows you to simply configure an external test database, so you can consider this and save 20 seconds if you have an external database. Another possible workaround, one line workaround, is to use in your settings migrate equal false, so if you are using this, migration won't run during the test, and this is similar to set none as a value in migration modules, but for every apps in your project, so it's better this way, so single line change, and this has a lot of pros and cons, pros, of course, single line change, and it doesn't run migration during tests. The problem is it's like make migrations plus migrate before every test run, so this will add in our example repository five seconds of time, so that was the opposite of what I wanted to obtain. So diving into the Django documentation, I discovered this great, great comment, squash migration, and this squash an existing set of migrations into a single one, you can specify your migration name, and optionally start migration name, it will squash every migration into a single one. This was pretty good, I tried this one on the shop application, and I decided to squash every migration into a single one. It was good, not perfect for us, but it was good. The problem is that we needed to add manual porting, because for example we used a lot of functions, manual function, from a migration to another, from a version to another, and that weren't migrated or automatically squashed, so we had to copy paste the function code into the squash migration and make some adjustments. And if we inspect the squash migration file, we can see there is on the top of the class definition a list of things, a list of tuples in the replaces variable. So the first item is shop, the application name, and the second one is the migration name, for every one of the 26 migration. And the recommended process is first squash, keep the old files, commit and release to production, to staging the demo until production, then wait until all systems are upgraded with the new release, then you can remove the old migration files, commit and do a second release. Then last but not least, you need to transition your squash migration to a normal migration, delete all migration files, all old migration files that has been replaced, and update all migration that depends on the deleted ones with the new squash migration, and after everything you can remove the replaces attribute in the squash migration, and everything is fine. Then if you want to clean up your database, you can prove references, so in your database there won't be references to old migrations. Let's test performances after squashing, after spending a week on my work project doing that, and oh no changes, so I lost a week doing that without results, and don't tell my chief. So what's the point? Well the point of squash migration is to move back from having several hundred migrations, five to just a few, for example if you create a branch, a separate branch where you are working you alone, you can squash migrations and propose just a single migration file in your request. I know, I know you wanted to speed up tests, so let's do it. Are you ready? It's not that easy, but first you need to recreate migrations. So let's annotate migrations for a single specific application with show migrations, and then copy paste all the names of your migration files, and then you need to manually create a replaces, you remember this one from a moment ago, you need to recreate the replaces list with application name and migration file name, and store it somewhere in your computer, then move your migrations in a temporary directory, so out of the way, and make sure that show migrations doesn't show stuff. Now it's time to recreate migrations using your application name and a name, a specific name, for example init squash, so you remember that this is the squash migration, and that will create the first migration at your last model version. Then open your migration file, copy paste the replaces array list, you created a moment ago inside your class, then you can restore your old migration files in the original directories, make sure for missing or overwritten files, and then remove the temporary directory. Now with show migrations you need to check that everything is there, so all in this case 26 migrations are there, and the first one, the squash migration is there but has not been applied, then apply your squash migrations and check again with squash show migrations that everything has been squashed and you have just a single migration, and then you can go back to your post squash task, so commit and release to production, upgrade those systems, of course staging demo production, everything else, update on migrations that depends on the deleted migrations, remove the replace attribute, and if you want to bring references to the little migration, and everything is perfect, right? Well, not exactly, if you have migrations providing initial data, you need to create a new migration for that, because recreating migration from scratch, it doesn't create that insertion in your modules, or even better, you can use fixtures, and in the doc you can see how to use fixtures in both database migration and also in testing, and that's perfect, and then you need to be aware of circular dependencies, because if your project is big and grows during the time, you could have circular dependencies from a project to another and backward, and this problem requires you to remove all foreign key, causing the circular dependency, create the first migration, restore the foreign keys, and create a second migration, and this way you will hopefully solve this. Now, let's try to test performances after all of this, after another week spent on the project trying to tell your chief that, oh, I'm working on something useful, I promise you, and yeah, of course, after recreating everything from scratch, our database creation task took five seconds instead of 20, that was perfect. Yeah, it was perfect, but does this apply to everyone? It depends, because if you have really big, big projects and you are paying by the minute your CI CD workflows, and you are on the verge of having to pay $3-4 per user per month, to 20 something dollars per user per month, then maybe you want to stay on the little cheaper branch of this, so that could be a solution, but if you just want to make order in your migration file, then just use squash migration without everything else, or if you want to speed up tests on your localhost, you just need to use KipDB, and everything is fine, without having to spend, in my case, two weeks working on this, just to save maybe a couple of seconds on your project, so it depends on your use case, and we are done, so if you want to see the example repository, it's there with three different branches, if you want to compare them on your local machine, and I uploaded the slides on the FOSDEN website, so they are there if you want to take a look at them. Thank you very much. Okay, we have time for quite a few questions, I see one up there. Given your salary, and these two weeks of work you've done, how many years of enterprise lessons did you avoid? That's a nice question, hopefully my chief didn't ask me that, but I think we could have paid maybe a year, I don't know, one year of this, but yeah, it was fun to play with this, and for me at least spending two weeks trying new stuff, or trying to discover hidden stuff from the jungle. More questions? Good question. Yeah, thanks for the great talk, I was wondering if you looked into using like seed data betas for CI, so that... Sorry. Yeah, you don't hear it? No, I didn't hear you, sorry. If you looked into seed data betas for CI, so that you run your migrations locally, and then dump the database, and then use that database during CI to start off with a pre-migrated database. No, I didn't think about that, it's a good idea, so you just upload your database dump, and then on your... Yeah, so you just set up your CI script to use that database when it initializes. That could be a good idea, I need to try that, thank you. So you restore the database and just applies your last migrations without having to apply everything. Yeah, exactly. Yeah, that's a good idea, thank you. Thank you. I was also wondering if you're using Postgres for example, you can disable fsync that will just keep database in memory, so that probably be a solution for big time. So locally we kept the database in memory, the problem was on our CI CD, so we created a service in the workflow files, and that was creating a database from scratch. So it was just a configuration you can just add on your Postgres site on the CI... We had to consider the time for storing and restoring the database on that configuration from the cache. So it was a little bit of time for that, but yeah, that was an option I tried to... More questions? So very cool talk. I like your method. I basically came up with the same method about five years ago for this approach. Do you think there's an opportunity to create a tool to automate some of this process? Well, that's a good question. Maybe implementing that in the squash migration in some way, I don't know. We could, we can try to do it just to save other two weeks of salary from other people. Okay, I think we're done with questions, so we're going to have another five minute break and then continue with the next talk. Thank you.