[00:00.000 --> 00:10.000] And welcome to Sam for his talk on Music Recommendations in Python. [00:10.000 --> 00:11.000] Welcome. [00:11.000 --> 00:12.000] Thank you. [00:12.000 --> 00:13.000] Can you hear me? [00:13.000 --> 00:14.000] Yes. [00:14.000 --> 00:15.000] Good. [00:15.000 --> 00:16.000] Okay. [00:16.000 --> 00:28.000] So I'm a system software developer, actually, and this is a hobby that I like to make music [00:28.000 --> 00:30.000] playlists and play around with Python. [00:30.000 --> 00:33.000] I'm also a musician and a music fan. [00:33.000 --> 00:35.000] And I also used to be a teacher. [00:35.000 --> 00:36.000] I think that's relevant. [00:36.000 --> 00:38.000] That's going to come into play later. [00:38.000 --> 00:45.000] Thanks to my player code think for sponsoring the travel and allowing me to be here. [00:45.000 --> 00:48.000] So as a music fan, I used to make a lot of playlists. [00:48.000 --> 00:49.000] I still do. [00:49.000 --> 00:53.000] And I'm quite old, so when I first started making playlists, they look like this. [00:53.000 --> 00:55.000] And very convenient to share. [00:55.000 --> 00:59.000] Just give someone else the piece of plastic and they have a machine that plays it. [00:59.000 --> 01:04.000] But quite difficult to make because you have to, you remember, you have to line up all the songs, [01:04.000 --> 01:07.000] just write, press record, press play. [01:07.000 --> 01:10.000] The 2000s came and we all moved to digital music. [01:10.000 --> 01:13.000] If you were cool, you had like a win-amp skin. [01:13.000 --> 01:16.000] If you were really cool, you had XMMS. [01:16.000 --> 01:20.000] And these playlists, much easier to make, you just drag and drop. [01:20.000 --> 01:24.000] But they were more difficult to share because nobody else had the same MP3s as you. [01:24.000 --> 01:28.000] So you couldn't give the playlist to your friend anymore quite so easily. [01:28.000 --> 01:33.000] But if you ask someone now to make a playlist, probably they're going to think of this. [01:33.000 --> 01:37.000] They're going to make you a playlist on Spotify or YouTube and send you a link. [01:37.000 --> 01:40.000] And that's even better, right? [01:40.000 --> 01:41.000] It's super easy to make. [01:41.000 --> 01:42.000] You drag and drop. [01:42.000 --> 01:43.000] It's easy to share. [01:43.000 --> 01:44.000] You send someone the link. [01:44.000 --> 01:48.000] And it even recommends you songs to put on the list. [01:48.000 --> 01:50.000] So what's not to like? [01:50.000 --> 01:52.000] Honestly, I don't actually want that song on the list. [01:52.000 --> 01:57.000] So the recommendations aren't always helpful. [01:57.000 --> 01:59.000] Spotify is fine. [01:59.000 --> 02:00.000] You can use it. [02:00.000 --> 02:01.000] It has a great team of researchers. [02:01.000 --> 02:03.000] There are some negative things about the company. [02:03.000 --> 02:05.000] I mean, it's a private company. [02:05.000 --> 02:09.000] The duty theater investors is to minimize the amount that they pay out to musicians [02:09.000 --> 02:12.000] and pay that to investors instead. [02:12.000 --> 02:16.000] And they've been steadily doing that and reducing the rates they pay to musicians. [02:16.000 --> 02:18.000] And they kind of focus on passive listening, right? [02:18.000 --> 02:23.000] So you put on an album, it finishes, but they put on more songs for you. [02:23.000 --> 02:27.000] People actually now adapt to their music to fit the Spotify algorithm. [02:27.000 --> 02:30.000] So the first 10 or 20 seconds are very important. [02:30.000 --> 02:33.000] So songs don't have long intros anymore. [02:33.000 --> 02:36.000] That's been done to please the Spotify algorithm. [02:36.000 --> 02:39.000] So I started to think, what would the opposite look like? [02:39.000 --> 02:43.000] And I came up with, it would have to be something DIY, [02:43.000 --> 02:46.000] something that doesn't have a profit motive behind it. [02:46.000 --> 02:52.000] It would focus on having local music and going to artist websites, [02:52.000 --> 02:56.000] buying music from Bandcamp, from paying them on Petrion. [02:56.000 --> 02:59.000] It would also involve working with open data. [02:59.000 --> 03:04.000] So when I say open data, I don't necessarily mean public data that everyone can see, [03:04.000 --> 03:08.000] but data where it's hosted by you or by an entity you trust. [03:08.000 --> 03:11.000] And you can choose if it's open or private. [03:11.000 --> 03:15.000] You can download it, export it, et cetera. [03:15.000 --> 03:18.000] So I have no idea really what I'm doing, [03:18.000 --> 03:23.000] but back in 2016 I started experimenting with some ideas. [03:23.000 --> 03:26.000] And I was inspired by one or two other projects. [03:26.000 --> 03:30.000] So has anyone heard of Dynamic Land? [03:30.000 --> 03:32.000] That's a shame. [03:32.000 --> 03:34.000] So note that down if you have a notebook. [03:34.000 --> 03:37.000] It's something very interesting to research about. [03:37.000 --> 03:39.000] Out of scope for this talk. [03:39.000 --> 03:42.000] It's a project where the room, the whole room is a computer. [03:42.000 --> 03:46.000] Each of these pieces of paper has a program on it or some data, [03:46.000 --> 03:50.000] and you can interact with them by moving them around physically. [03:50.000 --> 03:52.000] Now, I can't create that myself, [03:52.000 --> 03:56.000] but I like the idea of having a program that fits on a sheet of A4 paper. [03:56.000 --> 03:59.000] You know, the philosophy is if your program doesn't fit on the paper, [03:59.000 --> 04:02.000] then it's too big and it needs to become smaller. [04:02.000 --> 04:04.000] And I like that as a philosophy. [04:04.000 --> 04:07.000] I feel like the playlist generators that I want to write [04:07.000 --> 04:11.000] should also fit on a piece of A4 paper or on a slide deck. [04:11.000 --> 04:15.000] And it should be a process that people can participate in. [04:15.000 --> 04:18.000] Okay, another thing that really inspired me was Git. [04:18.000 --> 04:21.000] That might seem counterintuitive, [04:21.000 --> 04:26.000] but Git, Linus Torvalds recently said he's better known actually for Git than for Linux, [04:26.000 --> 04:31.000] despite having basically created Git in a month. [04:31.000 --> 04:33.000] And quite an achievement, right? [04:33.000 --> 04:36.000] So there were a few key ideas. [04:36.000 --> 04:39.000] Git's data model is really well defined. [04:39.000 --> 04:42.000] It's simple. You have refs and commits. [04:42.000 --> 04:44.000] You work with those directly. [04:44.000 --> 04:48.000] And then your commits are made of trees and your trees have blobs. [04:48.000 --> 04:52.000] And you work with this directly. Like, you get your hands dirty. [04:52.000 --> 04:54.000] Git is also a multi-core binary, [04:54.000 --> 04:58.000] which has a really nice advantage that you can write one part of it in Perl [04:58.000 --> 05:02.000] and then another part of it in TCL and then another part of it in C. [05:02.000 --> 05:04.000] So you don't have to keep rewriting. [05:04.000 --> 05:08.000] You can have different people working on small components. [05:08.000 --> 05:12.000] And I had this idea of having the user interface commands, [05:12.000 --> 05:15.000] they call the porcelain, and the innards, like the plumbing. [05:15.000 --> 05:17.000] But it's all available, right? [05:17.000 --> 05:19.000] So if you have Git on your laptop, [05:19.000 --> 05:23.000] you can build a commit using the lowest level commands that you want. [05:23.000 --> 05:26.000] And that's a huge advantage in getting people involved. [05:26.000 --> 05:28.000] Git is a real DIY project. [05:28.000 --> 05:31.000] It's not some shiny thing that just magically works. [05:31.000 --> 05:33.000] You push a button and have a nice day. [05:33.000 --> 05:35.000] It's something that you really have to get involved with. [05:35.000 --> 05:37.000] It'll break. You have to learn how it works. [05:37.000 --> 05:40.000] And that's the secret to its success, I think. [05:40.000 --> 05:44.000] And of course, Git, the interface to Git is the command line, right? [05:44.000 --> 05:46.000] So you can build a website around it in Ruby. [05:46.000 --> 05:49.000] You can build a website around it in Python. [05:49.000 --> 05:51.000] You can build extensions. [05:51.000 --> 05:52.000] Very inspiring. [05:52.000 --> 05:58.000] I set out to build a similar tool, but for playlists. [06:01.000 --> 06:05.000] And the first thing I thought about was the data model. [06:05.000 --> 06:08.000] And I realized that actually everything is a playlist. [06:08.000 --> 06:11.000] You know, a music collection is just a playlist [06:11.000 --> 06:14.000] where the order doesn't really matter. [06:14.000 --> 06:17.000] Metadata can be stored as metadata in the playlist. [06:17.000 --> 06:20.000] So everything is a playlist. [06:20.000 --> 06:22.000] I wanted to write a multi-core binary. [06:22.000 --> 06:24.000] This is called CPE. [06:24.000 --> 06:26.000] The tool I wrote is called Calliope, by the way. [06:26.000 --> 06:29.000] I'm not really here to show off about the tool, actually. [06:29.000 --> 06:31.000] You can look at it, and it's fun, [06:31.000 --> 06:34.000] but the ideas are the thing I'm more excited about. [06:34.000 --> 06:36.000] I'd like people to re-implement this in other languages [06:36.000 --> 06:39.000] and go forth with the ideas and do stuff I never thought of, [06:39.000 --> 06:42.000] or contribute to the project itself. [06:42.000 --> 06:44.000] So it has a multi-core binary. [06:44.000 --> 06:46.000] Currently everything's written in Python. [06:46.000 --> 06:52.000] That could change if somebody decides to write a new tool in Haskell or whatever. [06:52.000 --> 06:54.000] The main interface is the command line. [06:54.000 --> 07:00.000] So you can create a recommendation pipeline as a shell pipeline. [07:00.000 --> 07:04.000] Or you can do stuff in Python directly for greater control. [07:04.000 --> 07:06.000] And it's optimized for ease of maintenance, right? [07:06.000 --> 07:07.000] Because I'm lazy. [07:07.000 --> 07:09.000] I have one hour a weekend to spend on this, [07:09.000 --> 07:14.000] so it has to be easy to maintain. [07:14.000 --> 07:18.000] Okay, so the data model, as simple as possible. [07:18.000 --> 07:19.000] Here's a playlist item. [07:19.000 --> 07:22.000] It's a Python dictionary, which we can represent as JSON, [07:22.000 --> 07:26.000] and it has key value pairs. [07:26.000 --> 07:30.000] And then a playlist is a list of playlist items. [07:30.000 --> 07:32.000] One quite key decision is that, [07:32.000 --> 07:36.000] notice I haven't represented this as a JSON list. [07:36.000 --> 07:38.000] It's a JSON lines document. [07:38.000 --> 07:42.000] So that's JSON objects separated by a new line. [07:42.000 --> 07:43.000] And this is really cool, [07:43.000 --> 07:46.000] because you can process it with shell pipeline tools. [07:46.000 --> 07:47.000] You can process it with JSON tools, [07:47.000 --> 07:53.000] but you can also process it with line-based processing tools. [07:53.000 --> 07:55.000] Think if we had a JSON list, [07:55.000 --> 07:57.000] and this playlist was 10,000 items long, [07:57.000 --> 07:58.000] then we stream it, [07:58.000 --> 08:00.000] and you have to wait for the closing parenthesis [08:00.000 --> 08:02.000] before the next process can read it. [08:02.000 --> 08:05.000] But this way, the processes can read a line at a time, [08:05.000 --> 08:08.000] and you can have an infinite-length playlist [08:08.000 --> 08:11.000] and start processing the beginning of it [08:11.000 --> 08:14.000] before it's even, before it's finished. [08:14.000 --> 08:18.000] Okay, so that's the data model. [08:18.000 --> 08:20.000] Those key value pairs, creator and title, [08:20.000 --> 08:21.000] those on arbitrary, [08:21.000 --> 08:24.000] those come from an existing playlist format called SPF, [08:24.000 --> 08:29.000] which has been around since 2006 and is almost perfect. [08:29.000 --> 08:31.000] Like, they got the design almost perfect. [08:31.000 --> 08:32.000] One of the flaws was choosing XML, [08:32.000 --> 08:35.000] which was a good idea in 2006. [08:35.000 --> 08:39.000] And the other tweak I made was representing it as JSON lines, [08:39.000 --> 08:45.000] but the data model is effectively the same as SPF. [08:45.000 --> 08:48.000] So we can already do some fun stuff with this playlist, right? [08:48.000 --> 08:52.000] Let me quickly show you what you can do. [08:52.000 --> 08:54.000] Here's a playlist. [08:54.000 --> 08:59.000] These songs aren't real, obviously. [08:59.000 --> 09:01.000] We can shuffle it. [09:01.000 --> 09:03.000] I have to give it a file name, [09:03.000 --> 09:04.000] and the file name is standard in. [09:04.000 --> 09:07.000] Okay, so now it's shuffled. [09:07.000 --> 09:11.000] I can export it to a different playlist format. [09:11.000 --> 09:14.000] So now I've converted it into an actual SPF playlist, [09:14.000 --> 09:18.000] so you can put it into rhythm box. [09:18.000 --> 09:20.000] But we don't even need to use calliope tools, right? [09:20.000 --> 09:26.000] I could use head to get the first item. [09:26.000 --> 09:30.000] I could shuffle it using sort. [09:30.000 --> 09:33.000] Okay. [09:33.000 --> 09:35.000] And I can use data-oriented tools as well. [09:35.000 --> 09:36.000] So this is actually new shell, [09:36.000 --> 09:38.000] which is a data-oriented shell. [09:38.000 --> 09:43.000] So I can also load it into new shell, [09:43.000 --> 09:45.000] and now I have JSON, [09:45.000 --> 09:49.000] and now I can sort it by the artist's name or by the title. [09:49.000 --> 09:51.000] So just by defining a data format, [09:51.000 --> 09:53.000] you get all this stuff for free. [09:53.000 --> 09:55.000] Like, I didn't even have to write any code yet, [09:55.000 --> 09:59.000] and we can already shuffle a playlist. [09:59.000 --> 10:02.000] So what's next? [10:02.000 --> 10:05.000] Well, these aren't even real songs, right? [10:05.000 --> 10:06.000] You can't play them. [10:06.000 --> 10:08.000] There's no content. [10:08.000 --> 10:10.000] So the next step is get some content [10:10.000 --> 10:14.000] so we can actually listen to the playlist. [10:14.000 --> 10:16.000] The developers of the SPF format have thought of this, [10:16.000 --> 10:21.000] and they designed SPF with a portable design [10:21.000 --> 10:23.000] where when you go to play the music, [10:23.000 --> 10:25.000] you resolve it at that moment. [10:25.000 --> 10:29.000] So you search based on the metadata, like creator and title, [10:29.000 --> 10:33.000] and then you find a URL where you can actually play it. [10:33.000 --> 10:36.000] So I implemented that, [10:36.000 --> 10:38.000] and I can demo that as well. [10:38.000 --> 10:40.000] Okay, so here's three. [10:40.000 --> 10:42.000] These are real songs now, [10:42.000 --> 10:50.000] and if I pipe it to the Spotify sub-command, [10:50.000 --> 10:53.000] they get resolved to actual tracks on Spotify. [10:53.000 --> 10:55.000] So over here somewhere is a URL, [10:55.000 --> 10:59.000] and you can click it and listen to the track. [10:59.000 --> 11:01.000] This is all done using the Spotify API, [11:01.000 --> 11:03.000] so you need a Spotify API key to do that. [11:03.000 --> 11:06.000] You can get it for free, but it's a little bit of an effort. [11:06.000 --> 11:10.000] And it works by searching based on creator, title, [11:10.000 --> 11:12.000] and ranking the results. [11:12.000 --> 11:15.000] Or I can resolve it to tracks on my local machine. [11:15.000 --> 11:17.000] So I'm a GNOME developer, [11:17.000 --> 11:20.000] so I have the tracker search engine installed, [11:20.000 --> 11:23.000] and tracker can match against my local music collection [11:23.000 --> 11:26.000] and return the URL. [11:26.000 --> 11:29.000] Let me make that pretty. [11:29.000 --> 11:33.000] Okay, so it's resolved to URLs on my local machine. [11:33.000 --> 11:36.000] This one, I seem to have deleted the Madonna album, [11:36.000 --> 11:38.000] but the other two are here. [11:38.000 --> 11:41.000] And then you see here I exported it as an M3U playlist as well, [11:41.000 --> 11:44.000] now that we have URLs. [11:44.000 --> 11:49.000] So this is the basics of how you can make playlists in Python, right? [11:49.000 --> 11:52.000] What's next? [11:52.000 --> 11:54.000] So I promised music recommendations, right, [11:54.000 --> 11:57.000] and we haven't actually done any recommendations yet. [11:57.000 --> 12:02.000] So the next thing I'm going to talk about is a program I made [12:02.000 --> 12:05.000] that generates me a playlist every day. [12:05.000 --> 12:08.000] And that's as far as I've got with this, [12:08.000 --> 12:11.000] because actually I quite like the playlists it generates, [12:11.000 --> 12:14.000] so I haven't needed to make any other recommenders yet. [12:14.000 --> 12:16.000] I'm still happy with this one. [12:16.000 --> 12:18.000] Soon I shall look at some more. [12:18.000 --> 12:20.000] But a recommendation algorithm is basically this. [12:20.000 --> 12:23.000] You have a very big playlist on the left, [12:23.000 --> 12:25.000] which is all the possible music you could listen to, [12:25.000 --> 12:28.000] and then some sort of algorithm happens, [12:28.000 --> 12:30.000] and on the right you have a shorter playlist, [12:30.000 --> 12:35.000] which is hopefully better, and that's the one you listen to. [12:35.000 --> 12:40.000] So the algorithm I came up with, I called it the Special Mix, [12:40.000 --> 12:45.000] and its goal is to create a one-hour playlist of music that I already know, [12:45.000 --> 12:47.000] and there's three ingredients for that. [12:47.000 --> 12:49.000] All of these are Python libraries. [12:49.000 --> 12:51.000] One is PyListenBrains, [12:51.000 --> 12:54.000] which is an interface to the ListenBrains database. [12:54.000 --> 12:56.000] One is the Beats Music Organiser, [12:56.000 --> 12:59.000] which is a great tool for maintaining a local music collection. [12:59.000 --> 13:02.000] And one is the Python SimpleAI module, [13:02.000 --> 13:05.000] which gives you really basic AI algorithms [13:05.000 --> 13:08.000] that let you do constraint solving. [13:08.000 --> 13:10.000] So I'll go through those one at a time. [13:10.000 --> 13:12.000] I'll go have a little drink first. [13:16.000 --> 13:20.000] So if you want to do music recommendations, [13:20.000 --> 13:23.000] it's a good idea to save the history of what you listen to. [13:23.000 --> 13:25.000] Spotify already does that, [13:25.000 --> 13:29.000] although they make it a little difficult for you to then get at the data. [13:29.000 --> 13:32.000] Lastfm does that, and ListenBrains, [13:32.000 --> 13:35.000] which I recommend that solution because it's open. [13:35.000 --> 13:37.000] It's an open source platform. [13:37.000 --> 13:39.000] It's open data. [13:39.000 --> 13:42.000] So you can get a browser extension, [13:42.000 --> 13:44.000] or phone apps and music players [13:44.000 --> 13:46.000] that will save everything you listen to [13:46.000 --> 13:48.000] into the ListenBrains database, [13:48.000 --> 13:51.000] and then ListenBrains gives you charts and graphs [13:51.000 --> 13:53.000] to show what a great taste you have. [13:53.000 --> 13:57.000] And Python ListenBrains and the Kaliot ListenBrains command [13:57.000 --> 13:59.000] let you access the data. [14:01.000 --> 14:03.000] So... [14:05.000 --> 14:08.000] I would run the ListenBrains history command, [14:08.000 --> 14:10.000] put my username, [14:10.000 --> 14:12.000] and fetch all the listens. [14:12.000 --> 14:14.000] This does something kind of dumb. [14:14.000 --> 14:17.000] It just syncs all of the listens into a local SQLI database. [14:17.000 --> 14:19.000] And then I've dumped the first one here [14:19.000 --> 14:21.000] to show the kind of metadata you get. [14:21.000 --> 14:24.000] So you get a timestamp, you get an ID for the track, [14:24.000 --> 14:27.000] and then you get the creator and the title and the album. [14:27.000 --> 14:29.000] And in this case, the URL of where I listened to it, [14:29.000 --> 14:31.000] because it came from the web scrubbler. [14:31.000 --> 14:33.000] So that's useful. [14:33.000 --> 14:36.000] And then because it saved in a local SQL database, [14:36.000 --> 14:39.000] we can do things like calculate a histogram [14:39.000 --> 14:42.000] of which years I actually listened to music. [14:42.000 --> 14:45.000] We can select tracks based on, [14:45.000 --> 14:48.000] okay, it was first listened to in 2019, [14:48.000 --> 14:51.000] or it was first listened to in 2020. [14:51.000 --> 14:53.000] So that's what I did. [14:53.000 --> 14:55.000] And now I have a playlist, right? [14:55.000 --> 14:57.000] So the first thing the special mix does [14:57.000 --> 15:01.000] is it chooses one year out of this histogram, [15:01.000 --> 15:04.000] so a year where I did actually listen to music, [15:04.000 --> 15:07.000] and then it selects all the tracks [15:07.000 --> 15:09.000] that I listened to [15:09.000 --> 15:11.000] where the first listen is in that year. [15:11.000 --> 15:13.000] So songs I discovered in 2019 [15:13.000 --> 15:16.000] or discovered in 2021, for example. [15:16.000 --> 15:19.000] And now we have a playlist, but it's very long. [15:19.000 --> 15:23.000] So the next step is... [15:23.000 --> 15:26.000] The next step would be to select tracks from it. [15:26.000 --> 15:29.000] However, we want to know a bit more about the songs first. [15:29.000 --> 15:31.000] So actually the next step here [15:31.000 --> 15:35.000] is to resolve the tracks to local content. [15:35.000 --> 15:37.000] This is where I keep my music collection. [15:37.000 --> 15:39.000] It's a hi-tech home server. [15:39.000 --> 15:42.000] And I manage it with a Python program called Beats. [15:42.000 --> 15:45.000] This is a command line tool that lets you [15:46.000 --> 15:48.000] take music that you've got from Bandcamp or wherever [15:48.000 --> 15:50.000] and import it into a database [15:50.000 --> 15:52.000] and it fixes the tags using MusicBrains. [15:52.000 --> 15:54.000] So it's always correct. [15:54.000 --> 15:56.000] It has nice apostrophe characters. [15:56.000 --> 16:00.000] You can use any content resolver in theory. [16:00.000 --> 16:02.000] You can generate playlists against Spotify [16:02.000 --> 16:04.000] and upload them to Spotify, [16:04.000 --> 16:06.000] but that's not my goal here. [16:06.000 --> 16:10.000] So using Beats, I can resolve the tracks in the playlist [16:10.000 --> 16:14.000] to actual mp3 files on my music server. [16:14.000 --> 16:16.000] That's actually a lie. [16:16.000 --> 16:18.000] I haven't implemented that yet and I use Tracker. [16:18.000 --> 16:20.000] But because Beats is written in Python, [16:20.000 --> 16:22.000] we'll pretend that I use Beats. [16:22.000 --> 16:25.000] Either way, now I have a playlist [16:25.000 --> 16:28.000] where the track location is set to a file [16:28.000 --> 16:31.000] and we also know the duration of every song. [16:31.000 --> 16:34.000] So we have a bit of more metadata. [16:34.000 --> 16:38.000] And now we can select [16:38.000 --> 16:41.000] which tracks go in the playlist. [16:41.000 --> 16:43.000] Okay, so here's the fun part. [16:43.000 --> 16:45.000] All the parts are fun, [16:45.000 --> 16:47.000] but maybe this is the most fun part. [16:49.000 --> 16:54.000] The approach I took was to try and do constraint solving. [16:54.000 --> 16:57.000] Now, this is a pretty old area of AI, right? [16:57.000 --> 16:59.000] People have been looking at constraint solving [16:59.000 --> 17:01.000] for years and years, so the fashion at the moment [17:01.000 --> 17:03.000] is to solve everything with machine learning [17:03.000 --> 17:05.000] and lots and lots of GPUs. [17:05.000 --> 17:07.000] And that works. It produces nice results, [17:07.000 --> 17:09.000] but it's hard to reproduce in an hour [17:09.000 --> 17:12.000] on the weekend on your old laptop, [17:12.000 --> 17:15.000] whereas the constraint solving approach is pretty simple. [17:15.000 --> 17:19.000] You can run it on, you know, a Raspberry Pi with no issues. [17:21.000 --> 17:24.000] This was inspired by a paper that I found, [17:24.000 --> 17:26.000] which, again, is from 2008. [17:26.000 --> 17:28.000] It's nothing too futuristic. [17:28.000 --> 17:32.000] And in this paper, they define a constraint model. [17:32.000 --> 17:35.000] So this looks kind of academic if you're not a mathematician. [17:35.000 --> 17:40.000] But these are ways of making constraints on a playlist, [17:40.000 --> 17:42.000] such as I want the playlist for whatever reason [17:42.000 --> 17:45.000] to be 20% or more Stevie Wonder songs. [17:45.000 --> 17:50.000] And they can express that as a function. [17:50.000 --> 17:54.000] And the key is that you can apply this function to a playlist. [17:54.000 --> 18:00.000] So let's say I have a playlist and it has 100 Stevie Wonder songs. [18:01.000 --> 18:05.000] This constraint function would return a score of one [18:05.000 --> 18:08.000] for that playlist because every song is Stevie Wonder, [18:08.000 --> 18:10.000] so the constraint is completely satisfied. [18:10.000 --> 18:12.000] Now let's say I have another playlist, [18:12.000 --> 18:14.000] which is entirely Death Metal. [18:14.000 --> 18:17.000] This constraint function would return zero [18:17.000 --> 18:20.000] because none of the tracks are by Stevie Wonder. [18:20.000 --> 18:23.000] And if we had a playlist that was 10% Stevie Wonder songs, [18:23.000 --> 18:27.000] then we would assume this function would return 0.5 [18:27.000 --> 18:31.000] because the playlist kind of half satisfies the constraint. [18:31.000 --> 18:34.000] So the first step in constraint solving like this [18:34.000 --> 18:39.000] is to define a constraint function that can score any playlist. [18:39.000 --> 18:42.000] And then we use a local search algorithm [18:42.000 --> 18:46.000] to find a playlist that best matches the constraints. [18:46.000 --> 18:49.000] So local search is a kind of try it, [18:49.000 --> 18:52.000] try, try, try, try again until you get bored [18:52.000 --> 18:55.000] and then pick the best solution that you found. [18:55.000 --> 18:58.000] You set a fixed number of iterations like 10,000 [18:58.000 --> 19:00.000] and you kind of go, OK, well, this works, [19:00.000 --> 19:02.000] this one's a bit better, this one's worse, [19:02.000 --> 19:05.000] and choose the best one that you found. [19:05.000 --> 19:09.000] So I'm going to do a quick worked example of this [19:09.000 --> 19:11.000] with two constraints. [19:11.000 --> 19:13.000] And the constraints I'm going to put are [19:13.000 --> 19:15.000] the songs must be, [19:15.000 --> 19:18.000] each song must be two to four minutes long [19:18.000 --> 19:21.000] and the playlist as a whole must be 10 minutes. [19:21.000 --> 19:29.000] And here's a demo of solving that constraint problem. [19:29.000 --> 19:31.000] As promised, it fits on a sheet of A4 paper. [19:31.000 --> 19:33.000] This is the whole program. [19:33.000 --> 19:36.000] So here I've defined two constraints. [19:36.000 --> 19:39.000] One of them is an item duration constraint [19:39.000 --> 19:41.000] saying that the duration of each item [19:41.000 --> 19:44.000] should be between two and four minutes. [19:44.000 --> 19:46.000] And here's a playlist duration constraint [19:46.000 --> 19:48.000] saying that the overall duration should be [19:48.000 --> 19:50.000] between 10 and 10 minutes. [19:50.000 --> 19:52.000] I know exactly what I want. [19:52.000 --> 19:55.000] And then here is the input, [19:55.000 --> 19:58.000] which is a playlist made up of four fake songs. [19:58.000 --> 20:03.000] I haven't used real songs here because it's too complicated. [20:03.000 --> 20:05.000] Notice they have vastly different lengths, [20:05.000 --> 20:07.000] so it's quite hard to solve this problem. [20:07.000 --> 20:10.000] There's not an obvious solution. [20:10.000 --> 20:13.000] And we're going to use the Kaliope Select module, [20:13.000 --> 20:16.000] which internally uses simple AI. [20:16.000 --> 20:21.000] So the only thing that's required in this playlist, [20:21.000 --> 20:23.000] the only required piece of metadata is an ID. [20:23.000 --> 20:26.000] In this case, I've put emojis because they're pretty, [20:26.000 --> 20:28.000] but normally you'd have an integer or something. [20:28.000 --> 20:31.000] And these constraints will look at the duration field. [20:31.000 --> 20:33.000] So the duration field is also required [20:33.000 --> 20:36.000] because otherwise we can't calculate the score, right? [20:36.000 --> 20:39.000] Because we need to know the duration of each item. [20:39.000 --> 20:42.000] So if I've got time, how are we doing for time? [20:42.000 --> 20:45.000] Okay, I think I have time to show you [20:45.000 --> 20:49.000] a live demo of solving this constraint problem. [20:49.000 --> 20:52.000] And the good news is, [20:52.000 --> 20:54.000] Simple AI, the Simple AI module, [20:54.000 --> 20:58.000] has a web viewer that can give us a kind of graphical view [20:58.000 --> 21:00.000] of what's going on here. [21:00.000 --> 21:04.000] So with luck, yeah, with luck this will load, [21:04.000 --> 21:07.000] and it's going to step through [21:07.000 --> 21:09.000] each step of the local search algorithm [21:09.000 --> 21:11.000] to find the best playlist. [21:11.000 --> 21:16.000] So in the beginning we have an empty playlist [21:16.000 --> 21:18.000] and the score is zero because it doesn't satisfy [21:18.000 --> 21:21.000] either of the constraints. [21:21.000 --> 21:24.000] We take a step. [21:24.000 --> 21:27.000] Each step it can choose to do different actions [21:27.000 --> 21:29.000] that will change the playlist. [21:29.000 --> 21:33.000] So here it's chosen actions of adding a song [21:33.000 --> 21:35.000] because that's all it can do. [21:35.000 --> 21:37.000] So we can add this song which was quite long [21:37.000 --> 21:40.000] and the score is now 0.4 because one of the songs is too long. [21:40.000 --> 21:43.000] Or if we have the really short song, the score is lower. [21:43.000 --> 21:48.000] Or if we add this seven-minute ambient tune, [21:48.000 --> 21:52.000] then, ah, it's very difficult to get the scale right. [21:52.000 --> 21:56.000] There we go. [21:56.000 --> 21:58.000] So this one has the highest score, [21:58.000 --> 22:00.000] no, this one has the highest score actually. [22:00.000 --> 22:03.000] So probably it's going to choose this one. [22:03.000 --> 22:05.000] Yeah, so the next step we have a playlist [22:05.000 --> 22:07.000] which is just this song. [22:07.000 --> 22:08.000] What song was that? [22:08.000 --> 22:10.000] Amazing tune. [22:10.000 --> 22:12.000] So we have amazing tune in the list [22:12.000 --> 22:14.000] and now the score is 0.6. [22:14.000 --> 22:17.000] Let's take the next step. [22:17.000 --> 22:20.000] Okay, so now one of the possible actions [22:20.000 --> 22:22.000] is removing the song again, [22:22.000 --> 22:25.000] but we won't do that because the score is back down to zero. [22:25.000 --> 22:27.000] Or we can add one of these other songs [22:27.000 --> 22:30.000] and this one seems to be the playlist [22:30.000 --> 22:34.000] that best matches the constraint. [22:34.000 --> 22:37.000] Okay, so now we've got a playlist of two items. [22:38.000 --> 22:40.000] Now we can take some more actions. [22:40.000 --> 22:44.000] We can add another song or remove either of those songs. [22:44.000 --> 22:47.000] And we've added another song. [22:47.000 --> 22:50.000] Probably at this point it's going to say, [22:50.000 --> 22:53.000] okay, so it can't find any actions that increase the score, [22:53.000 --> 22:55.000] so the algorithm has said, right, it's done. [22:55.000 --> 22:57.000] That's the best playlist you're going to get. [22:57.000 --> 23:00.000] And that is how you can create a playlist [23:00.000 --> 23:03.000] in a page full of Python. [23:03.000 --> 23:06.000] So pretty short on time. [23:07.000 --> 23:09.000] We can export this playlist. [23:09.000 --> 23:12.000] And I have a jelly-fin music player set up [23:12.000 --> 23:15.000] and that's how I listen to it. [23:15.000 --> 23:18.000] That's a recap of what we've just seen. [23:18.000 --> 23:20.000] So what's next? I don't know, really. [23:20.000 --> 23:22.000] We have a couple of minutes for questions. [23:22.000 --> 23:25.000] Maybe you can answer the question of what's next. [23:26.000 --> 23:29.000] APPLAUSE [23:41.000 --> 23:43.000] Thank you, Sam. [23:43.000 --> 23:46.000] We have two minutes for questions. [23:46.000 --> 23:49.000] I will repeat the questions then. [23:49.000 --> 23:54.000] Yeah, so if I want to actually use this project [23:54.000 --> 23:56.000] like, for example, Francis, [23:56.000 --> 24:00.000] how much time would it take me to set up the project [24:00.000 --> 24:05.000] and find constraints that actually match and replace music [24:05.000 --> 24:08.000] given that I'm fairly familiar with Python [24:08.000 --> 24:10.000] or setting up some projects? [24:10.000 --> 24:13.000] So how much time would it take to set up the whole project [24:13.000 --> 24:15.000] and find the constraints and so on? [24:15.000 --> 24:17.000] Yeah, interesting question, actually. [24:17.000 --> 24:19.000] So, I mean, setting up the project is fairly easy. [24:19.000 --> 24:21.000] You pip install. [24:22.000 --> 24:24.000] After that... [24:24.000 --> 24:26.000] I don't know, it wouldn't be quick. [24:26.000 --> 24:28.000] I'll tell you that. [24:28.000 --> 24:31.000] You would have to enjoy getting your hands dirty a bit at this stage. [24:31.000 --> 24:35.000] My general goal is to create a folder of example recommenders. [24:35.000 --> 24:37.000] So hopefully in future you'd be able to... [24:37.000 --> 24:40.000] And you can actually run the examples as modules as well. [24:40.000 --> 24:43.000] So hopefully in the future you'd be able to kind of run [24:43.000 --> 24:46.000] an existing example and get started fairly quickly [24:46.000 --> 24:48.000] and just tweak a few values. [24:52.000 --> 24:55.000] One more over there. [24:55.000 --> 24:57.000] Or one over here. [24:57.000 --> 25:01.000] So depending on the number of times you have used [25:01.000 --> 25:03.000] this recommendation system, [25:03.000 --> 25:06.000] how often has it repeated the same set of music, [25:06.000 --> 25:10.000] same set of songs or very similar tasting songs? [25:10.000 --> 25:12.000] Yeah, how long has it repeated? [25:12.000 --> 25:15.000] It's never come up with two playlists the same, actually. [25:15.000 --> 25:17.000] What it does sometimes do, though, is it'll take an album [25:17.000 --> 25:20.000] and kind of give me four or five songs of the same album [25:20.000 --> 25:22.000] in one playlist. [25:22.000 --> 25:24.000] So maybe I need to tweak the constraints a bit there. [25:24.000 --> 25:27.000] But there's infinite possibilities, really. [25:27.000 --> 25:30.000] Yeah, I haven't got bored of it so far. [25:30.000 --> 25:33.000] If your input is very short, then it will get repetitive. [25:33.000 --> 25:35.000] One more question? [25:41.000 --> 25:43.000] Thanks for the very interesting talk. [25:43.000 --> 25:45.000] Just a quick question. [25:45.000 --> 25:47.000] How easy would it be to implement? [25:47.000 --> 25:52.000] So say if you wanted to search for different performances [25:52.000 --> 25:55.000] or different interpretations of a particular piece of music, [25:55.000 --> 25:58.000] so if you had classical music, some symphony [25:58.000 --> 26:00.000] with a bunch of different recordings, [26:00.000 --> 26:02.000] how easy would it be to implement that in the current? [26:02.000 --> 26:05.000] How familiar are you with music brains? [26:05.000 --> 26:06.000] Not at all. [26:06.000 --> 26:08.000] So the tool you would use would be music brains. [26:08.000 --> 26:10.000] Yeah, talk to this guy. [26:10.000 --> 26:12.000] He'll bring you up to speed. [26:12.000 --> 26:14.000] Thank you again, Sam. [26:14.000 --> 26:16.000] Thank you.