[00:00.000 --> 00:12.920] Hello. So PGSTATV is, from the name, I think you can understand that it's probably something [00:12.920 --> 00:20.880] to do with Postgres. And it is. It is a new Postgres extension and utility pair. So it [00:20.880 --> 00:29.800] comes with its own tool that you use outside of Postgres. It's minimalist. We'll get into [00:29.800 --> 00:37.080] that in a moment. It only does the thing it's supposed to do and it doesn't touch anything [00:37.080 --> 00:43.480] else in the system that it's not supposed to. And the purpose of PGSTATV is time series [00:43.480 --> 00:50.280] analysis and visualization. That's the vis part of Postgres internal statistics. That's [00:50.280 --> 01:02.000] the stat part. So Postgres internally keeps its own statistics. They are cumulative and [01:02.000 --> 01:08.200] dynamic statistics, right? So you get, like, number of buffers written is a cumulative [01:08.200 --> 01:15.560] statistic that keeps going up. You also have dynamic statistics like PGSTAT activity that [01:15.560 --> 01:21.480] tells you what's happening inside your Postgres server at that moment in time. So if you take [01:21.480 --> 01:28.280] snapshots of these statistics internally from within Postgres and you perform time series [01:28.280 --> 01:35.880] analysis on them, you can gain insights into how your server is behaving. So this utility [01:35.880 --> 01:43.920] that comes with PGSTATV's extension can produce visualizations for selected time ranges on [01:43.920 --> 01:51.320] the stored snapshots that are inside the database. So you can, for example, take snapshots of [01:51.320 --> 01:58.880] your server every 15 minutes during the course of the day and then analyze it over 24 hours [01:58.880 --> 02:05.040] to see what your peak times were and what was happening inside the server at that time. [02:05.040 --> 02:13.400] I wouldn't recommend taking snapshots more frequently than a minute. And it's easy to [02:13.400 --> 02:21.200] see why. You have too many snapshots. It's harder to see the bigger picture, maybe. So [02:21.200 --> 02:28.360] the reason for all of this is you want to track your performance over time and potentially [02:28.360 --> 02:33.080] you can perform troubleshooting on why your server is not behaving the way you expect [02:33.080 --> 02:45.480] it to and additional tuning. So minimalist, this is a tiny package that is based on the [02:45.480 --> 02:55.880] KISS and UNIX philosophies. So keep it simple and sweet, right? And the UNIX philosophy [02:55.880 --> 03:01.200] is that it comes with a tool that you can run as a normal Postgres command line tool [03:01.200 --> 03:09.000] like P SQL with the same parameters and everything else. And it allows you very simply to create [03:09.000 --> 03:16.280] snapshots of the statistics and visualize them. So it's modular. We'll get into the modules [03:16.280 --> 03:23.160] in a minute. It's minimal. It's the least amount of code I could write to make this [03:23.160 --> 03:31.160] thing work. And it's unobtrusive. So you can take snapshots without affecting any other [03:31.160 --> 03:36.760] activity running on your system. And I think that's very important for being able to monitor [03:36.760 --> 03:43.440] and analyze in production. So the components are Postgres extension, as we said, and the [03:43.440 --> 03:51.120] Python utility that retrieves the stored snapshots from the database and creates simple visualizations [03:51.120 --> 04:01.200] with them using matplotlib. The extension is written in plain SQL and PLPG SQL. So there's [04:01.200 --> 04:06.400] nothing to put in shared preload libraries. So this means that you can just type create [04:06.400 --> 04:13.800] extension and you can start using it without even restarting your server. So create extension [04:13.800 --> 04:21.320] PGstatvis is all you need to do. We're working on the packaging now to get it distributed [04:21.320 --> 04:30.200] through the PGDG repos, Postgres global development group repositories. And by extension, it will [04:30.200 --> 04:36.160] find its way into distributions hopefully soon. The way you install the utility is very [04:36.160 --> 04:41.640] simple. You just type pip install PGstatvis. If you tried that this morning, it wouldn't [04:41.640 --> 04:47.040] work, but I just uploaded the file so you can try it out. As I said, this is the last [04:47.040 --> 04:54.360] minute talk. It's very new. The code is pre-production quality. I would call it alpha code, but [04:54.360 --> 05:02.640] you can give it a try for yourself and offer any suggestions or fixes or tell me what I'm [05:02.640 --> 05:10.080] doing wrong. Now, the extension can be used by super users, but you don't have to. The [05:10.080 --> 05:14.440] only thing that the extension needs is PG monitor role privileges in order to be able [05:14.440 --> 05:22.280] to select from the internal Postgres statistics tables. And the usage is dead simple. And [05:22.280 --> 05:29.760] to take a snapshot, you just type from within a client, select PGstatvis.snapshot. Now, why [05:29.760 --> 05:36.360] is there no underscore there? It's because Postgres doesn't like us naming schemas PG [05:36.360 --> 05:40.840] underscore something. That's reserved only for core Postgres. So, extensions are not [05:40.840 --> 05:48.960] allowed to do it. So, what does the command line look like? You just pip install PGstatvis [05:48.960 --> 05:56.840] and you have the utility and the utility when you ask for help is a normal Postgres utility. [05:56.840 --> 06:02.480] You get your database selection, user name, host name part, et cetera, the same way you [06:02.480 --> 06:10.240] would connect with any Postgres client. And you've got modules like buff that shows you [06:10.240 --> 06:16.680] statistics on the background writer and buffers written to disk, cache hit ratio, checkpoint [06:16.680 --> 06:26.240] rate, connections, number of tuples, weights that it found in the server during the snapshot, [06:26.240 --> 06:31.920] wall generation and so on. And you can either run analyze which runs all of the modules [06:31.920 --> 06:38.360] at once and generates visualizations or you can run just one module if you're only interested [06:38.360 --> 06:48.720] in buffers. You can only say run buff. Most importantly, there's a capital D option that [06:48.720 --> 06:55.640] you can use to specify the date range in order to visualize only the time range you're interested [06:55.640 --> 07:03.120] in. So, like the last 24 hours only. And these are specified, of course, in ISO 8601 format. [07:03.120 --> 07:09.880] So, there's no ambiguity in how to type in dates. And it works something like this. [07:09.880 --> 07:17.040] You connect to database fof as user Postgres. You give it a date range and it just generates [07:17.040 --> 07:27.280] the snapshots and writes the visualizations as PNG to disk. And yes, it has a logo. So, [07:27.280 --> 07:33.240] it's complete. The visualizations look something like this. I apologize if the points are a [07:33.240 --> 07:39.440] bit too small for you to see. So, as we said, buffers written to disk is a line that keeps [07:39.440 --> 07:46.520] going up until stats reset. When the stats get reset, it starts from zero again. So, [07:46.520 --> 07:52.680] perhaps this is more useful. This is the buffer write rate in megabytes per second. So, you [07:52.680 --> 07:57.760] can see exactly how many buffers your Postgres server was writing to disk at any moment in [07:57.760 --> 08:02.280] time. And also, you can analyze what was happening because of the background writer. [08:02.280 --> 08:24.520] No? Yes. Thank you. So, you can see here that because this was a test I ran on my laptop [08:24.520 --> 08:32.720] with a script that was just inserting rows into the same table. The checkpoint line, [08:32.720 --> 08:38.120] which is the orange line, didn't do much because it wasn't scheduled activity taken [08:38.120 --> 08:44.400] care of by checkpoints. But you can see that back ends were doing most of the work. And [08:44.400 --> 08:49.160] also, you can see that the background writer, which is the green line, didn't get the chance [08:49.160 --> 08:55.800] to participate in all this buffer writing because from what we can see from the very [08:55.800 --> 09:04.640] low line, its limits were set to low for production. So, you can gain insights into the behavior [09:04.640 --> 09:11.560] of your Postgres server like this. Or you can look at connection versus status count. [09:11.560 --> 09:16.400] So, you can see how many connections you had coming into your server from clients, how [09:16.400 --> 09:21.520] many of them were active, how many of them were idle, how many were idle in transaction, [09:21.520 --> 09:28.840] and so on. But you can also see which users were taking up those connections. And I think [09:28.840 --> 09:33.560] that's really interesting when you have like an environment that's used by multiple applications [09:33.560 --> 09:38.440] so you can know which developers to blame when it all goes south. [09:38.440 --> 09:45.160] Weight events. As I was testing this on my laptop and was overflowing it with IO because [09:45.160 --> 09:54.400] I was inserting millions of rows into a table, I generated an IO data file read, sorry for [09:54.400 --> 10:00.320] the small letters, weight condition. And that was captured by the snapshot that was being [10:00.320 --> 10:09.080] taken every 10 seconds or so for this example. So, thank you for listening. The project is [10:09.080 --> 10:17.360] going to be live at github.com slash virus slash pgstatvis in a few moments. You can find [10:17.360 --> 10:28.840] me on master don. And what the hell? I'll do it right now. So, as we said, this is alpha [10:28.840 --> 10:36.120] quality code. Oh, I forgot to say that it doesn't do any scheduling or any maintenance [10:36.120 --> 10:43.160] or any partitioning of those internal tables where it keeps the snapshots. So you can delete [10:43.160 --> 10:53.320] them by hand. You can schedule the snapshots very easily with any tool you like, like cron [10:53.320 --> 10:59.160] or pgcron. But I didn't want to make this a dependency on the extension. So you can [10:59.160 --> 11:15.640] just configure it yourself. And I can just go to settings and make it public right now. [11:15.640 --> 11:45.560] Cool. Thank you. Any questions? No, okay. [11:45.560 --> 11:50.240] Thanks anyway. Thanks. Thanks for doing me.