こんにちは。豊泉です。
データベースはPostgreSQLを使うことが多いのですが、最近実行計画をあまり見ていないなと思ったので、PostgreSQLでいくつかクエリを試しながら実行計画を見てみようと思います。
検証環境
DockerでPostgreSQLを立ち上げます。
クエリの実行やデータの確認にはpgAdminを使用します。
// compose.yml services: postgres: image: postgres:16.3 ports: - ${DB_PORT}:${DB_PORT} volumes: - db-store:/var/lib/postgresql/data environment: POSTGRES_USER: ${DB_USER} POSTGRES_PASSWORD: ${DB_PASSWORD} POSTGRES_PORT: ${DB_PORT} POSTGRES_DB: ${DB_NAME} TZ: Asia/Tokyo restart: always pgadmin: image: dpage/pgadmin4:8.8 ports: - 8888:80 volumes: - pgadmin-data:/var/lib/pgadmin environment: PGADMIN_DEFAULT_EMAIL: root@example.com PGADMIN_DEFAULT_PASSWORD: ${DB_PASSWORD} TZ: Asia/Tokyo depends_on: - postgres restart: always volumes: db-store: pgadmin-data:
検証データ
検証データの準備にはprismaとfakerを使用しました。
// package.json { "name": "postgresql-explain", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo \"Error: no test specified\" && exit 1" }, "author": "", "license": "ISC", "dependencies": { "@faker-js/faker": "^8.4.1", "@prisma/client": "^5.15.0", "prisma": "^5.15.0", "ts-node": "^10.9.2" }, "prisma": { "seed": "ts-node prisma/seed.ts" } }
AccountテーブルとUserテーブルを作成。
// schema.prisma generator client { provider = "prisma-client-js" } datasource db { provider = "postgresql" url = env("DATABASE_URL") } enum AccountType { FREE STANDARD PREMIUM SPECIAL } model Account { id Int @id @default(autoincrement()) name String @db.VarChar(255) type AccountType @default(STANDARD) createdAt DateTime @default(now()) @map("created_at") updatedAt DateTime @default(now()) @updatedAt @map("updated_at") users User[] @@map("accounts") } enum UserRoleType { OWNER ADMIN STANDARD } model User { id Int @id @default(autoincrement()) accountId Int @map("account_id") name String @db.VarChar(255) email String @db.VarChar(255) @unique role UserRoleType @default(STANDARD) createdAt DateTime @default(now()) @map("created_at") updatedAt DateTime @default(now()) @updatedAt @map("updated_at") account Account @relation(fields: [accountId], references: [id]) @@map("users") }
データを作成。
// seed.ts import { PrismaClient } from '@prisma/client'; import { faker } from '@faker-js/faker'; const prisma = new PrismaClient(); const ACCOUNT_COUNT = 10; const USER_COUNT = 10000; enum AccountType { FREE = 'FREE', STANDARD = 'STANDARD', PREMIUM = 'PREMIUM', SPECIAL = 'SPECIAL', } enum UserRoleType { OWNER = 'OWNER', ADMIN = 'ADMIN', STANDARD = 'STANDARD', } async function main() { const accountTypes = Object.values(AccountType); const userRoleTypes = Object.values(UserRoleType); for (let i = 0; i < ACCOUNT_COUNT; i++) { const type = accountTypes[Math.floor(Math.random() * accountTypes.length)]; const account = await prisma.account.create({ data: { name: faker.company.name(), type: type, } }); const users = [...Array(USER_COUNT)].map((_, index) => { const role = userRoleTypes[Math.floor(Math.random() * userRoleTypes.length)]; return { name: faker.person.fullName(), email: `${i}${index}${faker.internet.email()}`, role: role, accountId: account.id, }; }); await prisma.user.createMany({ data: users }); } } main() .then(async () => { await prisma.$disconnect(); }) .catch(async (e) => { console.error(e); await prisma.$disconnect(); process.exit(1); });
クエリを試してみる
EXPLAIN
コマンドを使用してクエリの実行計画を表示します。- 下記のオプションもつけてみます。
ANALYZE
: 実際の実行時間やその他の統計情報を表示BUFFERS
: バッファの使用情報(メモリorディスクのI/O状況)を表示
- クエリの実行前に
VACUUM ANALYZE;
しておきます。
Accountテーブルに10, 100, 1000, 10000, 100000件のデータを用意し全件取得してみる
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.accounts;
// 結果 // 10 rows "Seq Scan on accounts (cost=0.00..1.10 rows=10 width=42) (actual time=0.012..0.014 rows=10 loops=1)" " Buffers: shared hit=1" "Planning Time: 0.271 ms" "Execution Time: 0.031 ms" // 100 rows "Seq Scan on accounts (cost=0.00..2.00 rows=100 width=43) (actual time=0.010..0.018 rows=100 loops=1)" " Buffers: shared hit=1" "Planning Time: 0.058 ms" "Execution Time: 0.038 ms" // 1000 rows "Seq Scan on accounts (cost=0.00..20.00 rows=1000 width=43) (actual time=0.010..0.104 rows=1000 loops=1)" " Buffers: shared hit=10" "Planning Time: 0.057 ms" "Execution Time: 0.173 ms" // 10000 rows "Seq Scan on accounts (cost=0.00..193.00 rows=10000 width=43) (actual time=0.017..1.218 rows=10000 loops=1)" " Buffers: shared hit=93" "Planning Time: 0.074 ms" "Execution Time: 1.820 ms" // 100000 rows "Seq Scan on accounts (cost=0.00..1921.00 rows=100000 width=43) (actual time=0.011..8.659 rows=100000 loops=1)" " Buffers: shared hit=921" "Planning Time: 2.020 ms" "Execution Time: 17.374 ms"
- 実行結果は上記のとおりでした
- 試しにvarchar(255)型のカラムを追加してみたら widthの値が少し大きくなりました。
- ANALYZEコマンドを実行すると推定行数と実際の行数が一致していました。
WHERE条件を追加してインデックスありorなしで試してみる
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.accounts WHERE type = 'STANDARD';
// 結果 // インデックスなし // 10 rows "Seq Scan on accounts (cost=0.00..1.12 rows=1 width=41) (actual time=0.017..0.017 rows=0 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 10" " Buffers: shared hit=1" "Planning Time: 0.076 ms" "Execution Time: 0.044 ms" // 100 rows "Seq Scan on accounts (cost=0.00..2.25 rows=24 width=44) (actual time=0.018..0.032 rows=24 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 76" " Buffers: shared hit=1" "Planning Time: 0.090 ms" "Execution Time: 0.054 ms" // 1000 rows "Seq Scan on accounts (cost=0.00..22.50 rows=252 width=43) (actual time=0.015..0.130 rows=252 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 748" " Buffers: shared hit=10" "Planning Time: 0.082 ms" "Execution Time: 0.161 ms" // 10000 rows "Seq Scan on accounts (cost=0.00..218.00 rows=2532 width=43) (actual time=0.012..1.162 rows=2532 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 7468" " Buffers: shared hit=93" "Planning Time: 0.076 ms" "Execution Time: 1.320 ms" // 100000 rows "Seq Scan on accounts (cost=0.00..2171.00 rows=25000 width=43) (actual time=0.016..11.027 rows=24963 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 75037" " Buffers: shared hit=921" "Planning Time: 0.080 ms" "Execution Time: 12.270 ms" // インデックスあり // 10 rows "Seq Scan on accounts (cost=0.00..1.12 rows=3 width=47) (actual time=0.018..0.021 rows=3 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 7" " Buffers: shared hit=1" "Planning Time: 0.127 ms" "Execution Time: 0.040 ms" // 100 rows "Seq Scan on accounts (cost=0.00..2.25 rows=23 width=44) (actual time=0.015..0.030 rows=23 loops=1)" " Filter: (type = 'STANDARD'::""AccountType"")" " Rows Removed by Filter: 77" " Buffers: shared hit=1" "Planning Time: 0.102 ms" "Execution Time: 0.053 ms" // 1000 rows "Bitmap Heap Scan on accounts (cost=6.06..19.13 rows=246 width=43) (actual time=0.042..0.099 rows=246 loops=1)" " Recheck Cond: (type = 'STANDARD'::""AccountType"")" " Heap Blocks: exact=10" " Buffers: shared hit=11" " -> Bitmap Index Scan on accounts_type_idx (cost=0.00..6.00 rows=246 width=0) (actual time=0.029..0.029 rows=246 loops=1)" " Index Cond: (type = 'STANDARD'::""AccountType"")" " Buffers: shared hit=1" "Planning Time: 0.123 ms" "Execution Time: 0.141 ms" // 10000 rows "Bitmap Heap Scan on accounts (cost=31.74..156.11 rows=2510 width=43) (actual time=0.200..1.066 rows=2510 loops=1)" " Recheck Cond: (type = 'STANDARD'::""AccountType"")" " Heap Blocks: exact=93" " Buffers: shared hit=96" " -> Bitmap Index Scan on accounts_type_idx (cost=0.00..31.11 rows=2510 width=0) (actual time=0.170..0.171 rows=2510 loops=1)" " Index Cond: (type = 'STANDARD'::""AccountType"")" " Buffers: shared hit=3" "Planning Time: 0.106 ms" "Execution Time: 1.544 ms" // 100000 rows "Bitmap Heap Scan on accounts (cost=280.18..1510.68 rows=24760 width=43) (actual time=1.177..6.132 rows=24963 loops=1)" " Recheck Cond: (type = 'STANDARD'::""AccountType"")" " Heap Blocks: exact=921" " Buffers: shared hit=944" " -> Bitmap Index Scan on accounts_type_idx (cost=0.00..273.99 rows=24760 width=0) (actual time=1.005..1.005 rows=24963 loops=1)" " Index Cond: (type = 'STANDARD'::""AccountType"")" " Buffers: shared hit=23" "Planning Time: 0.097 ms" "Execution Time: 7.361 ms"
- 実行結果は上記のとおりでした
- インデックスが存在する場合は1000行くらいから実行計画がBitmap Heap Scanになっていました。
- また、100000行くらいになると実行速度にも差が出てきていました。
JOIN, ORDER BY, COUNT, LIMIT/OFFSETを試してみる
JOIN
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.accounts INNER JOIN users ON accounts.id = users.account_id;
// 結果 "Hash Join (cost=1.23..2657.97 rows=100000 width=114) (actual time=0.025..33.274 rows=100000 loops=1)" " Hash Cond: (users.account_id = accounts.id)" " Buffers: shared hit=1284" " -> Seq Scan on users (cost=0.00..2283.00 rows=100000 width=72) (actual time=0.006..7.511 rows=100000 loops=1)" " Buffers: shared hit=1283" " -> Hash (cost=1.10..1.10 rows=10 width=42) (actual time=0.009..0.010 rows=10 loops=1)" " Buckets: 1024 Batches: 1 Memory Usage: 9kB" " Buffers: shared hit=1" " -> Seq Scan on accounts (cost=0.00..1.10 rows=10 width=42) (actual time=0.002..0.003 rows=10 loops=1)" " Buffers: shared hit=1" "Planning:" " Buffers: shared hit=4" "Planning Time: 0.243 ms" "Execution Time: 36.996 ms"
ORDER BY
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.users ORDER BY created_at DESC;
// 結果 "Sort (cost=14689.82..14939.82 rows=100000 width=72) (actual time=23.787..31.736 rows=100000 loops=1)" " Sort Key: created_at DESC" " Sort Method: external merge Disk: 8456kB" " Buffers: shared hit=1283, temp read=1057 written=1060" " -> Seq Scan on users (cost=0.00..2283.00 rows=100000 width=72) (actual time=0.009..9.380 rows=100000 loops=1)" " Buffers: shared hit=1283" "Planning Time: 0.324 ms" "Execution Time: 35.338 ms"
COUNT
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT count(id) FROM public.accounts;
// 結果 "Aggregate (cost=1.12..1.14 rows=1 width=8) (actual time=0.021..0.022 rows=1 loops=1)" " Buffers: shared hit=1" " -> Seq Scan on accounts (cost=0.00..1.10 rows=10 width=4) (actual time=0.011..0.013 rows=10 loops=1)" " Buffers: shared hit=1" "Planning Time: 0.079 ms" "Execution Time: 0.046 ms"
LIMIT/OFFSET
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.accounts LIMIT 10 OFFSET 10;
// 結果 "Limit (cost=1.10..1.21 rows=1 width=42) (actual time=0.015..0.016 rows=0 loops=1)" " Buffers: shared hit=1" " -> Seq Scan on accounts (cost=0.00..1.10 rows=10 width=42) (actual time=0.011..0.013 rows=10 loops=1)" " Buffers: shared hit=1" "Planning Time: 0.220 ms" "Execution Time: 0.031 ms"
クエリの実行速度を見てみる
pg_stat_statementsというエクステンションを入れることで、下記のようにしてクエリの情報を見ることができるようです。
// 実行するクエリ SELECT query, calls, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;
実行計画はJSONやYAML、XMLでも表示できるみたい
// 実行するクエリ EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) // jsonの場合 SELECT * FROM public.accounts;
// json [ { "Plan": { "Node Type": "Seq Scan", "Parallel Aware": false, "Async Capable": false, "Relation Name": "accounts", "Alias": "accounts", "Startup Cost": 0.00, "Total Cost": 2279.00, "Plan Rows": 100000, "Plan Width": 71, "Actual Startup Time": 0.079, "Actual Total Time": 36.894, "Actual Rows": 100000, "Actual Loops": 1, "Shared Hit Blocks": 1279, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0 }, "Planning": { "Shared Hit Blocks": 0, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0 }, "Planning Time": 4.952, "Triggers": [ ], "Execution Time": 43.987 } ]
// yaml - Plan: Node Type: "Seq Scan" Parallel Aware: false Async Capable: false Relation Name: "accounts" Alias: "accounts" Startup Cost: 0.00 Total Cost: 2279.00 Plan Rows: 100000 Plan Width: 71 Actual Startup Time: 0.080 Actual Total Time: 13.224 Actual Rows: 100000 Actual Loops: 1 Shared Hit Blocks: 1279 Shared Read Blocks: 0 Shared Dirtied Blocks: 0 Shared Written Blocks: 0 Local Hit Blocks: 0 Local Read Blocks: 0 Local Dirtied Blocks: 0 Local Written Blocks: 0 Temp Read Blocks: 0 Temp Written Blocks: 0 Planning: Shared Hit Blocks: 0 Shared Read Blocks: 0 Shared Dirtied Blocks: 0 Shared Written Blocks: 0 Local Hit Blocks: 0 Local Read Blocks: 0 Local Dirtied Blocks: 0 Local Written Blocks: 0 Temp Read Blocks: 0 Temp Written Blocks: 0 Planning Time: 0.411 Triggers: Execution Time: 19.508
// xml <explain xmlns="http://www.postgresql.org/2009/explain"> <Query> <Plan> <Node-Type>Seq Scan</Node-Type> <Parallel-Aware>false</Parallel-Aware> <Async-Capable>false</Async-Capable> <Relation-Name>accounts</Relation-Name> <Alias>accounts</Alias> <Startup-Cost>0.00</Startup-Cost> <Total-Cost>2279.00</Total-Cost> <Plan-Rows>100000</Plan-Rows> <Plan-Width>71</Plan-Width> <Actual-Startup-Time>0.041</Actual-Startup-Time> <Actual-Total-Time>12.188</Actual-Total-Time> <Actual-Rows>100000</Actual-Rows> <Actual-Loops>1</Actual-Loops> <Shared-Hit-Blocks>1279</Shared-Hit-Blocks> <Shared-Read-Blocks>0</Shared-Read-Blocks> <Shared-Dirtied-Blocks>0</Shared-Dirtied-Blocks> <Shared-Written-Blocks>0</Shared-Written-Blocks> <Local-Hit-Blocks>0</Local-Hit-Blocks> <Local-Read-Blocks>0</Local-Read-Blocks> <Local-Dirtied-Blocks>0</Local-Dirtied-Blocks> <Local-Written-Blocks>0</Local-Written-Blocks> <Temp-Read-Blocks>0</Temp-Read-Blocks> <Temp-Written-Blocks>0</Temp-Written-Blocks> </Plan> <Planning> <Shared-Hit-Blocks>0</Shared-Hit-Blocks> <Shared-Read-Blocks>0</Shared-Read-Blocks> <Shared-Dirtied-Blocks>0</Shared-Dirtied-Blocks> <Shared-Written-Blocks>0</Shared-Written-Blocks> <Local-Hit-Blocks>0</Local-Hit-Blocks> <Local-Read-Blocks>0</Local-Read-Blocks> <Local-Dirtied-Blocks>0</Local-Dirtied-Blocks> <Local-Written-Blocks>0</Local-Written-Blocks> <Temp-Read-Blocks>0</Temp-Read-Blocks> <Temp-Written-Blocks>0</Temp-Written-Blocks> </Planning> <Planning-Time>0.409</Planning-Time> <Triggers> </Triggers> <Execution-Time>15.334</Execution-Time> </Query> </explain>
使用したソースはこちら
最後に
普段はORMがあるおかげでSQLを書くことは少ないですが、こんな感じで他にもいろんなクエリを試してみるのは発見も多くて面白そうです。