A major pain point with OpenAI's embedding API is estimating what it will cost to embed a document. Pricing is based on token count, which varies with file size and content, so the cost is hard to predict upfront.

So I built a simple tool using LangChain to calculate the cost upfront. Right now the tool uses LangChain's RecursiveCharacterTextSplitter to create chunks.
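
To give a feel for the arithmetic involved, here is a minimal sketch of that kind of estimate. It is not the tool's actual code: it uses only the Python standard library, a naive fixed-size splitter stands in for RecursiveCharacterTextSplitter, and tokens are approximated with a rough "about 4 characters per token" heuristic rather than a real tokenizer. The function names and the price passed in are hypothetical; check OpenAI's pricing page for the current rate of the model you use.

```python
import math

# Assumption: rough heuristic of ~4 characters per token for English text.
# A real tokenizer would give a more accurate count.
CHARS_PER_TOKEN = 4


def split_text(text, chunk_size=1000, chunk_overlap=0):
    """Naive fixed-size character splitter (a stand-in for a real text splitter)."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def estimate_tokens(chunk):
    """Approximate the token count of one chunk from its character length."""
    return math.ceil(len(chunk) / CHARS_PER_TOKEN)


def estimate_cost(text, price_per_1k_tokens, chunk_size=1000, chunk_overlap=0):
    """Return (estimated total tokens, estimated cost) for embedding `text`."""
    chunks = split_text(text, chunk_size, chunk_overlap)
    total_tokens = sum(estimate_tokens(c) for c in chunks)
    return total_tokens, total_tokens / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    sample = "Some document text. " * 500
    # 0.0001 is a placeholder price per 1K tokens, not a quoted rate.
    tokens, cost = estimate_cost(sample, price_per_1k_tokens=0.0001)
    print(f"~{tokens} tokens, estimated cost ${cost:.6f}")
```

Note that any chunk overlap raises the estimate, because overlapping characters are embedded more than once.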

I'm considering a few enhancements and will build them if there is enough interest:

• Allow users to change the chunking logic (using various text splitters).
• Expose the tool as an API.
• Support other file formats.
