diff --git a/apps/docs/pages/guides/ai/choosing-compute-addon.mdx b/apps/docs/pages/guides/ai/choosing-compute-addon.mdx
index fb7bf5c9fa..fef2f75860 100644
--- a/apps/docs/pages/guides/ai/choosing-compute-addon.mdx
+++ b/apps/docs/pages/guides/ai/choosing-compute-addon.mdx
@@ -21,6 +21,36 @@ The number of dimensions in your embeddings is the most important factor in choo
 
 ## HNSW
 
+### 384 Dimensions
+
+This benchmark uses the dbpedia-entities-openai-1M dataset, which contains 1,000,000 embeddings of text. Each embedding is 384 dimensions, created with the [gte-small](https://huggingface.co/Supabase/gte-small) model.
+
+| Plan   | Vectors   | m   | ef_construction | ef_search | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | --- | --------------- | --------- | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 100,000   | 16  | 64              | 60        | 580  | 0.017 sec    | 0.024 sec   | 1.2 GB (Swap) | 1 GB   |
+| Small  | 250,000   | 24  | 64              | 60        | 440  | 0.022 sec    | 0.033 sec   | 2 GB          | 2 GB   |
+| Medium | 500,000   | 24  | 64              | 80        | 350  | 0.028 sec    | 0.045 sec   | 4 GB          | 4 GB   |
+| Large  | 1,000,000 | 32  | 80              | 100       | 270  | 0.073 sec    | 0.108 sec   | 7 GB          | 8 GB   |
+| XL     | 1,000,000 | 32  | 80              | 100       | 525  | 0.038 sec    | 0.059 sec   | 9 GB          | 16 GB  |
+| 2XL    | 1,000,000 | 32  | 80              | 100       | 790  | 0.025 sec    | 0.037 sec   | 9 GB          | 32 GB  |
+| 4XL    | 1,000,000 | 32  | 80              | 100       | 1650 | 0.015 sec    | 0.018 sec   | 11 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 32  | 80              | 100       | 2690 | 0.015 sec    | 0.016 sec   | 13 GB         | 128 GB |
+| 12XL   | 1,000,000 | 32  | 80              | 100       | 3900 | 0.014 sec    | 0.016 sec   | 13 GB         | 192 GB |
+| 16XL   | 1,000,000 | 32  | 80              | 100       | 4200 | 0.014 sec    | 0.016 sec   | 20 GB         | 256 GB |
+
+Accuracy was 0.99 for benchmarks.
+
 ### 1536 Dimensions
 
 This benchmark uses the [dbpedia-entities-openai-1M](https://huggingface.co/datasets/KShivendu/dbpedia-entities-openai-1M) dataset, which contains 1,000,000 embeddings of text. And 224,482 embeddings from [Wikipedia articles](https://huggingface.co/datasets/Supabase/wikipedia-en-embeddings) for compute add-ons `large` and below. Each embedding is 1536 dimensions created with the [OpenAI Embeddings API](https://platform.openai.com/docs/guides/embeddings).
@@ -33,18 +63,18 @@ This benchmark uses the [dbpedia-entities-openai-1M](https://huggingface.co/data
 >
 
-| Plan   | Vectors   | m   | ef_construction | ef_search | QPS  | Latency Mean | Latency p95 | RAM Usage          | RAM    |
-| ------ | --------- | --- | --------------- | --------- | ---- | ------------ | ----------- | ------------------ | ------ |
-| Free   | 15,000    | 16  | 40              | 40        | 480  | 0.011 sec    | 0.016 sec   | 1 GB + 200 Mb Swap | 1 GB   |
-| Small  | 50,000    | 32  | 64              | 100       | 175  | 0.031 sec    | 0.051 sec   | 2 GB + 200 Mb Swap | 2 GB   |
-| Medium | 100,000   | 32  | 64              | 100       | 240  | 0.083 sec    | 0.126 sec   | 4 GB               | 4 GB   |
-| Large  | 224,482   | 32  | 64              | 100       | 280  | 0.017 sec    | 0.028 sec   | 8 GB               | 8 GB   |
-| XL     | 500,000   | 24  | 56              | 100       | 360  | 0.055 sec    | 0.135 sec   | 13 GB              | 16 GB  |
-| 2XL    | 1,000,000 | 24  | 56              | 250       | 560  | 0.036 sec    | 0.058 sec   | 32 GB              | 32 GB  |
-| 4XL    | 1,000,000 | 24  | 56              | 250       | 950  | 0.021 sec    | 0.033 sec   | 39 GB              | 64 GB  |
-| 8XL    | 1,000,000 | 24  | 56              | 250       | 1650 | 0.016 sec    | 0.023 sec   | 40 GB              | 128 GB |
-| 12XL   | 1,000,000 | 24  | 56              | 250       | 1900 | 0.015 sec    | 0.021 sec   | 38 GB              | 192 GB |
-| 16XL   | 1,000,000 | 24  | 56              | 250       | 2200 | 0.015 sec    | 0.020 sec   | 40 GB              | 256 GB |
+| Plan   | Vectors   | m   | ef_construction | ef_search | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | --- | --------------- | --------- | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 15,000    | 16  | 40              | 40        | 480  | 0.011 sec    | 0.016 sec   | 1.2 GB (Swap) | 1 GB   |
+| Small  | 50,000    | 32  | 64              | 100       | 175  | 0.031 sec    | 0.051 sec   | 2.2 GB (Swap) | 2 GB   |
+| Medium | 100,000   | 32  | 64              | 100       | 240  | 0.083 sec    | 0.126 sec   | 4 GB          | 4 GB   |
+| Large  | 224,482   | 32  | 64              | 100       | 280  | 0.017 sec    | 0.028 sec   | 8 GB          | 8 GB   |
+| XL     | 500,000   | 24  | 56              | 100       | 360  | 0.055 sec    | 0.135 sec   | 13 GB         | 16 GB  |
+| 2XL    | 1,000,000 | 24  | 56              | 250       | 560  | 0.036 sec    | 0.058 sec   | 32 GB         | 32 GB  |
+| 4XL    | 1,000,000 | 24  | 56              | 250       | 950  | 0.021 sec    | 0.033 sec   | 39 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 24  | 56              | 250       | 1650 | 0.016 sec    | 0.023 sec   | 40 GB         | 128 GB |
+| 12XL   | 1,000,000 | 24  | 56              | 250       | 1900 | 0.015 sec    | 0.021 sec   | 38 GB         | 192 GB |
+| 16XL   | 1,000,000 | 24  | 56              | 250       | 2200 | 0.015 sec    | 0.020 sec   | 40 GB         | 256 GB |
 
 Accuracy was 0.99 for benchmarks.
 
@@ -59,49 +89,61 @@ It is possible to upload more vectors to a single table if Memory allows it (for
+
+[Image: multi database]
+
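+The HNSW build parameters in the tables above map directly onto pgvector's index options. As a minimal sketch (assuming a hypothetical `documents` table with an `embedding vector(384)` column, and pgvector 0.5.0 or later), the Large-tier configuration would be:
+
+```sql
+-- Hypothetical schema: adjust the table, column, and operator class to match yours.
+create index on documents
+  using hnsw (embedding vector_cosine_ops)
+  with (m = 32, ef_construction = 80);
+
+-- ef_search is a query-time setting, not an index build option.
+set hnsw.ef_search = 100;
+```
+
+`m` and `ef_construction` are fixed when the index is built, while `ef_search` can be tuned per session or per transaction.
+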
+
 ## IVFFlat
 
-### 512 Dimensions
+### 384 Dimensions
 
-This benchmark uses the [GloVe Reddit comments](https://nlp.stanford.edu/projects/glove/) dataset, which contains 1,623,397 embeddings of text. Each embedding is 512 dimensions. Random vectors were generated for queries.
+This benchmark uses the dbpedia-entities-openai-1M dataset, which contains 1,000,000 embeddings of text. Each embedding is 384 dimensions, created with the [gte-small](https://huggingface.co/Supabase/gte-small) model.
 
-
+
-| Plan   | Vectors   | Lists | QPS  | Latency Mean | Latency p95 | RAM Usage          | RAM    |
-| ------ | --------- | ----- | ---- | ------------ | ----------- | ------------------ | ------ |
-| Free   | 100,000   | 100   | 250  | 0.395 sec    | 0.432 sec   | 1 GB + 300 Mb Swap | 1 GB   |
-| Small  | 250,000   | 250   | 440  | 0.223 sec    | 0.250 sec   | 2 GB + 200 Mb Swap | 2 GB   |
-| Medium | 500,000   | 500   | 425  | 0.116 sec    | 0.143 sec   | 3.7 GB             | 4 GB   |
-| Large  | 1,000,000 | 1000  | 515  | 0.096 sec    | 0.116 sec   | 7.5 GB             | 8 GB   |
-| XL     | 1,623,397 | 1275  | 465  | 0.212 sec    | 0.272 sec   | 14 GB              | 16 GB  |
-| 2XL    | 1,623,397 | 1275  | 1400 | 0.061 sec    | 0.075 sec   | 22 GB              | 32 GB  |
-| 4XL    | 1,623,397 | 1275  | 1800 | 0.027 sec    | 0.043 sec   | 20 GB              | 64 GB  |
-| 8XL    | 1,623,397 | 1275  | 2850 | 0.032 sec    | 0.049 sec   | 21 GB              | 128 GB |
-| 12XL   | 1,623,397 | 1275  | 3700 | 0.020 sec    | 0.036 sec   | 26 GB              | 192 GB |
-| 16XL   | 1,623,397 | 1275  | 3700 | 0.025 sec    | 0.042 sec   | 29 GB              | 256 GB |
+| Plan   | Vectors   | Lists | Probes | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | ----- | ------ | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 100,000   | 500   | 50     | 205  | 0.048 sec    | 0.066 sec   | 1.2 GB (Swap) | 1 GB   |
+| Small  | 250,000   | 1000  | 60     | 160  | 0.062 sec    | 0.079 sec   | 2 GB          | 2 GB   |
+| Medium | 500,000   | 2000  | 80     | 120  | 0.082 sec    | 0.104 sec   | 3.2 GB        | 4 GB   |
+| Large  | 1,000,000 | 5000  | 150    | 75   | 0.269 sec    | 0.375 sec   | 6.5 GB        | 8 GB   |
+| XL     | 1,000,000 | 5000  | 150    | 150  | 0.131 sec    | 0.178 sec   | 9 GB          | 16 GB  |
+| 2XL    | 1,000,000 | 5000  | 150    | 300  | 0.066 sec    | 0.099 sec   | 10 GB         | 32 GB  |
+| 4XL    | 1,000,000 | 5000  | 150    | 570  | 0.035 sec    | 0.046 sec   | 10 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 5000  | 150    | 1400 | 0.023 sec    | 0.028 sec   | 12 GB         | 128 GB |
+| 12XL   | 1,000,000 | 5000  | 150    | 1550 | 0.030 sec    | 0.039 sec   | 12 GB         | 192 GB |
+| 16XL   | 1,000,000 | 5000  | 150    | 1800 | 0.030 sec    | 0.039 sec   | 16 GB         | 256 GB |
 
-
+
-| Plan   | Vectors   | Lists | QPS | Latency Mean | Latency p95 | RAM Usage | RAM    |
-| ------ | --------- | ----- | --- | ------------ | ----------- | --------- | ------ |
-| Free   | 100,000   | 100   | -   | -            | -           | -         | 1 GB   |
-| Small  | 250,000   | 250   | -   | -            | -           | -         | 2 GB   |
-| Medium | 500,000   | 500   | 75  | 0.656 sec    | 0.750 sec   | 3.7 GB    | 4 GB   |
-| Large  | 1,000,000 | 1000  | 102 | 0.488 sec    | 0.580 sec   | 7.5 GB    | 8 GB   |
-| XL     | 1,000,000 | 1000  | 188 | 0.525 sec    | 0.596 sec   | 14 GB     | 16 GB  |
-| XL     | 1,623,397 | 1275  | 75  | 0.679 sec    | 0.798 sec   | 14 GB     | 16 GB  |
-| 2XL    | 1,623,397 | 1275  | 160 | 0.314 sec    | 0.384 sec   | 22 GB     | 32 GB  |
-| 4XL    | 1,623,397 | 1275  | 300 | 0.083 sec    | 0.113 sec   | 20 GB     | 64 GB  |
-| 8XL    | 1,623,397 | 1275  | 565 | 0.105 sec    | 0.141 sec   | 21 GB     | 128 GB |
-| 12XL   | 1,623,397 | 1275  | 840 | 0.093 sec    | 0.124 sec   | 26 GB     | 192 GB |
-| 16XL   | 1,623,397 | 1275  | 940 | 0.084 sec    | 0.108 sec   | 29 GB     | 256 GB |
+| Plan   | Vectors   | Lists | Probes | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | ----- | ------ | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 100,000   | 500   | 70     | 160  | 0.062 sec    | 0.079 sec   | 1.2 GB (Swap) | 1 GB   |
+| Small  | 250,000   | 1000  | 100    | 100  | 0.096 sec    | 0.113 sec   | 2 GB          | 2 GB   |
+| Medium | 500,000   | 2000  | 120    | 85   | 0.117 sec    | 0.147 sec   | 3.2 GB        | 4 GB   |
+| Large  | 1,000,000 | 5000  | 250    | 50   | 0.394 sec    | 0.521 sec   | 6.5 GB        | 8 GB   |
+| XL     | 1,000,000 | 5000  | 250    | 100  | 0.197 sec    | 0.255 sec   | 10 GB         | 16 GB  |
+| 2XL    | 1,000,000 | 5000  | 250    | 200  | 0.098 sec    | 0.140 sec   | 10 GB         | 32 GB  |
+| 4XL    | 1,000,000 | 5000  | 250    | 390  | 0.051 sec    | 0.066 sec   | 11 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 5000  | 250    | 850  | 0.036 sec    | 0.042 sec   | 12 GB         | 128 GB |
+| 12XL   | 1,000,000 | 5000  | 250    | 1000 | 0.043 sec    | 0.055 sec   | 13 GB         | 192 GB |
+| 16XL   | 1,000,000 | 5000  | 250    | 1200 | 0.043 sec    | 0.055 sec   | 16 GB         | 256 GB |
@@ -118,18 +160,18 @@ This benchmark uses the [gist-960-angular](http://corpus-texmex.irisa.fr/) datas
 >
 
-| Plan   | Vectors   | Lists | QPS  | Latency Mean | Latency p95 | RAM Usage          | RAM    |
-| ------ | --------- | ----- | ---- | ------------ | ----------- | ------------------ | ------ |
-| Free   | 30,000    | 30    | 75   | 0.065 sec    | 0.088 sec   | 1 GB + 100 Mb Swap | 1 GB   |
-| Small  | 100,000   | 100   | 78   | 0.064 sec    | 0.092 sec   | 1.8 GB             | 2 GB   |
-| Medium | 250,000   | 250   | 58   | 0.085 sec    | 0.129 sec   | 3.2 GB             | 4 GB   |
-| Large  | 500,000   | 500   | 55   | 0.088 sec    | 0.140 sec   | 5 GB               | 8 GB   |
-| XL     | 1,000,000 | 1000  | 110  | 0.046 sec    | 0.070 sec   | 14 GB              | 16 GB  |
-| 2XL    | 1,000,000 | 1000  | 235  | 0.083 sec    | 0.136 sec   | 10 GB              | 32 GB  |
-| 4XL    | 1,000,000 | 1000  | 420  | 0.071 sec    | 0.106 sec   | 11 GB              | 64 GB  |
-| 8XL    | 1,000,000 | 1000  | 815  | 0.072 sec    | 0.106 sec   | 13 GB              | 128 GB |
-| 12XL   | 1,000,000 | 1000  | 1150 | 0.052 sec    | 0.078 sec   | 15.5 GB            | 192 GB |
-| 16XL   | 1,000,000 | 1000  | 1345 | 0.072 sec    | 0.106 sec   | 17.5 GB            | 256 GB |
+| Plan   | Vectors   | Lists | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | ----- | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 30,000    | 30    | 75   | 0.065 sec    | 0.088 sec   | 1.1 GB (Swap) | 1 GB   |
+| Small  | 100,000   | 100   | 78   | 0.064 sec    | 0.092 sec   | 1.8 GB        | 2 GB   |
+| Medium | 250,000   | 250   | 58   | 0.085 sec    | 0.129 sec   | 3.2 GB        | 4 GB   |
+| Large  | 500,000   | 500   | 55   | 0.088 sec    | 0.140 sec   | 5 GB          | 8 GB   |
+| XL     | 1,000,000 | 1000  | 110  | 0.046 sec    | 0.070 sec   | 14 GB         | 16 GB  |
+| 2XL    | 1,000,000 | 1000  | 235  | 0.083 sec    | 0.136 sec   | 10 GB         | 32 GB  |
+| 4XL    | 1,000,000 | 1000  | 420  | 0.071 sec    | 0.106 sec   | 11 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 1000  | 815  | 0.072 sec    | 0.106 sec   | 13 GB         | 128 GB |
+| 12XL   | 1,000,000 | 1000  | 1150 | 0.052 sec    | 0.078 sec   | 15.5 GB       | 192 GB |
+| 16XL   | 1,000,000 | 1000  | 1345 | 0.072 sec    | 0.106 sec   | 17.5 GB       | 256 GB |
@@ -146,18 +188,18 @@ This benchmark uses the [dbpedia-entities-openai-1M](https://huggingface.co/data
 >
 
-| Plan   | Vectors   | Lists | QPS  | Latency Mean | Latency p95 | RAM Usage          | RAM    |
-| ------ | --------- | ----- | ---- | ------------ | ----------- | ------------------ | ------ |
-| Free   | 20,000    | 40    | 135  | 0.372 sec    | 0.412 sec   | 1 GB + 200 Mb Swap | 1 GB   |
-| Small  | 50,000    | 100   | 140  | 0.357 sec    | 0.398 sec   | 1.8 GB             | 2 GB   |
-| Medium | 100,000   | 200   | 130  | 0.383 sec    | 0.446 sec   | 3.7 GB             | 4 GB   |
-| Large  | 250,000   | 500   | 130  | 0.378 sec    | 0.434 sec   | 7 GB               | 8 GB   |
-| XL     | 500,000   | 1000  | 235  | 0.213 sec    | 0.271 sec   | 13.5 GB            | 16 GB  |
-| 2XL    | 1,000,000 | 2000  | 380  | 0.133 sec    | 0.236 sec   | 30 GB              | 32 GB  |
-| 4XL    | 1,000,000 | 2000  | 720  | 0.068 sec    | 0.120 sec   | 35 GB              | 64 GB  |
-| 8XL    | 1,000,000 | 2000  | 1250 | 0.039 sec    | 0.066 sec   | 38 GB              | 128 GB |
-| 12XL   | 1,000,000 | 2000  | 1600 | 0.030 sec    | 0.052 sec   | 41 GB              | 192 GB |
-| 16XL   | 1,000,000 | 2000  | 1790 | 0.029 sec    | 0.051 sec   | 45 GB              | 256 GB |
+| Plan   | Vectors   | Lists | QPS  | Latency Mean | Latency p95 | RAM Usage     | RAM    |
+| ------ | --------- | ----- | ---- | ------------ | ----------- | ------------- | ------ |
+| Free   | 20,000    | 40    | 135  | 0.372 sec    | 0.412 sec   | 1.2 GB (Swap) | 1 GB   |
+| Small  | 50,000    | 100   | 140  | 0.357 sec    | 0.398 sec   | 1.8 GB        | 2 GB   |
+| Medium | 100,000   | 200   | 130  | 0.383 sec    | 0.446 sec   | 3.7 GB        | 4 GB   |
+| Large  | 250,000   | 500   | 130  | 0.378 sec    | 0.434 sec   | 7 GB          | 8 GB   |
+| XL     | 500,000   | 1000  | 235  | 0.213 sec    | 0.271 sec   | 13.5 GB       | 16 GB  |
+| 2XL    | 1,000,000 | 2000  | 380  | 0.133 sec    | 0.236 sec   | 30 GB         | 32 GB  |
+| 4XL    | 1,000,000 | 2000  | 720  | 0.068 sec    | 0.120 sec   | 35 GB         | 64 GB  |
+| 8XL    | 1,000,000 | 2000  | 1250 | 0.039 sec    | 0.066 sec   | 38 GB         | 128 GB |
+| 12XL   | 1,000,000 | 2000  | 1600 | 0.030 sec    | 0.052 sec   | 41 GB         | 192 GB |
+| 16XL   | 1,000,000 | 2000  | 1790 | 0.029 sec    | 0.051 sec   | 45 GB         | 256 GB |
 
 For 1,000,000 vectors, 10 probes results in an accuracy of 0.91. For 500,000 vectors and below, 10 probes results in an accuracy in the range of 0.95 - 0.99. To increase accuracy, increase the number of probes.
 
@@ -209,10 +251,33 @@ There are various ways to improve your pgvector performance. Here are some tips:
 
 It's useful to execute a few thousand “warm-up” queries before going into production. This helps with RAM utilization. This can also help to determine that you've selected the right instance size for your workload.
 
-### Increase the number of lists
+### Fine-tune index parameters
 
-You can increase the Requests per Second by increasing the number of `lists`. This also has an important caveat: building the index takes longer with more lists.
+You can increase requests per second by increasing `m` and `ef_construction` (for HNSW) or `lists` (for IVFFlat). This also has an important caveat: building the index takes longer with higher values for these parameters.
+
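+Trying higher build parameters on a live table can be sketched as follows (the table and index names here are hypothetical; `concurrently` avoids blocking writes during the rebuild):
+
+```sql
+-- Build a replacement index with larger build parameters, then swap it in.
+create index concurrently documents_embedding_hnsw_v2
+  on documents
+  using hnsw (embedding vector_cosine_ops)
+  with (m = 32, ef_construction = 80);
+
+drop index concurrently documents_embedding_hnsw;
+```
+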
+[Image: dbpedia embeddings comparing hnsw queries-per-second using different build parameters]
+
+[Image: multi database]
+
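+For IVFFlat, `lists` is likewise fixed at build time, while `probes` is a query-time setting. A minimal sketch under the same hypothetical `documents` schema:
+
+```sql
+create index on documents
+  using ivfflat (embedding vector_cosine_ops)
+  with (lists = 5000);
+
+-- Higher probes improves recall at the cost of query speed.
+set ivfflat.probes = 150;
+```
+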
 Check out more tips and the complete step-by-step guide in [Going to Production for AI applications](going-to-prod).
diff --git a/apps/docs/public/img/ai/instance-type/hnsw-dims--dark.png b/apps/docs/public/img/ai/instance-type/hnsw-dims--dark.png
new file mode 100644
index 0000000000..10ec177cbb
Binary files /dev/null and b/apps/docs/public/img/ai/instance-type/hnsw-dims--dark.png differ
diff --git a/apps/docs/public/img/ai/instance-type/hnsw-dims--light.png b/apps/docs/public/img/ai/instance-type/hnsw-dims--light.png
new file mode 100644
index 0000000000..1fb11e846e
Binary files /dev/null and b/apps/docs/public/img/ai/instance-type/hnsw-dims--light.png differ
diff --git a/apps/docs/public/img/ai/instance-type/lists-for-1m--dark.png b/apps/docs/public/img/ai/instance-type/lists-for-1m--dark.png
index 713ae4f445..d9d0117fd8 100644
Binary files a/apps/docs/public/img/ai/instance-type/lists-for-1m--dark.png and b/apps/docs/public/img/ai/instance-type/lists-for-1m--dark.png differ
diff --git a/apps/docs/public/img/ai/instance-type/lists-for-1m--light.png b/apps/docs/public/img/ai/instance-type/lists-for-1m--light.png
index 2035617093..90c1dc884c 100644
Binary files a/apps/docs/public/img/ai/instance-type/lists-for-1m--light.png and b/apps/docs/public/img/ai/instance-type/lists-for-1m--light.png differ