A major pain point when using OpenAI's embedding API is estimating what a document will cost to embed. The cost depends on the file's size and content, so it is hard to know upfront.
So I built a simple tool using LangChain to calculate the cost upfront. Right now the tool uses LangChain's RecursiveCharacterTextSplitter to create chunks.
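To make the idea concrete, here is a minimal sketch of how such an estimate can work. It is not the tool's actual code. It assumes a naive fixed-size splitter as a stand-in for RecursiveCharacterTextSplitter, a rough heuristic of about 4 characters per token for English text, and a `price_per_1k_tokens` value that you supply yourself from OpenAI's current pricing page. The function names (`split_into_chunks`, `rough_token_count`, `estimate_cost`) are made up for illustration.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 0) -> List[str]:
    """Naive fixed-size splitter (hypothetical stand-in for a real text splitter)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def rough_token_count(chunk: str) -> int:
    """Assumption: roughly 4 characters per token. Swap in a real tokenizer for accuracy."""
    return -(-len(chunk) // 4)  # ceiling division


def estimate_cost(
    text: str,
    price_per_1k_tokens: float,  # placeholder: look up the current price yourself
    count_tokens: Callable[[str], int] = rough_token_count,
    chunk_size: int = 1000,
    overlap: int = 0,
) -> Tuple[int, float]:
    """Return (total_tokens, estimated_cost) for embedding every chunk of `text`."""
    chunks = split_into_chunks(text, chunk_size, overlap)
    total_tokens = sum(count_tokens(chunk) for chunk in chunks)
    return total_tokens, total_tokens / 1000 * price_per_1k_tokens
```

Because overlapping chunks are each sent to the API, any overlap raises the token count, which is one reason the chunking settings matter for the estimate. Passing a tokenizer-backed function as `count_tokens` should give a much closer figure than the character heuristic.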
I'm considering a few enhancements and will build them if there is enough interest:
• Allow users to change chunking logic (using various Text Splitters).
• Expose the tool as an API.
• Support other file formats.