AI application cost calculator

Estimate costs by purpose: model, embeddings, and vector store.

Answer using your documents; needs model, embeddings, and vector store.

Token counts in the Model section are per conversation.

Model (completion)LLM used for generating responses. Cost is based on input (prompt) and output (completion) token volume. Prices from llm-prices.com.

Prices from llm-prices.com (updated 2025-12-17)

Tokenization (GPT-4o Mini (OpenAI))Approximate for English text. Token counts vary by language and format. Use your provider's tokenizer for exact counts.

~1.25 tokens per word · ~0.80 words per token

RAG retrievalConfigure how many chunks are retrieved from your vector store and added to the LLM prompt. These tokens are added to input tokens per conversation.

RAG context added:

2,500 tokens/conversation

Total input per conversation: 3,500 tokens (base: 1,000 + RAG: 2,500)

EmbeddingsVector representations of text for similarity search and RAG. You pay per token embedded; re-embedding is only needed when content changes.

Word ↔ token conversionApproximate for English. 1 token is roughly 0.77 words. Enter volume in either field; the other updates automatically. Billing is per token.

~1.3 tokens per word · ~0.77 words per token

Vector storeDatabase that stores embeddings and runs similarity search. Cost is typically per GB of vectors stored per month.

EstimateSum of model, embeddings, and vector store costs for the selected period. Prices are indicative; check provider docs for current rates. Base prices are in USD; other currencies use approximate exchange rates.

Model$0.83
Embeddings$0.10
Vector store$0.96
Total (month)$1.88