Description
There is a difference in the way embeddings are generated with Azure OpenAI vs. OpenAI: Azure OpenAI does not let us create bulk embeddings in a single request. I have therefore created the Azure OpenAI embeddings in the following way:
from langchain.text_splitter import CharacterTextSplitter
#from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.create_documents(contents)
print(texts)

texts_new = [doc.page_content for doc in texts]

# Print the extracted texts
list_docs = []
for text in texts_new:
    print(text)
    list_docs.append(text)
import time
import openai  # openai<1.0 SDK, configured for Azure OpenAI

Embeddings_list = []
for item in list_docs:
    print(item)
    while True:
        try:
            # Azure OpenAI: one embedding request per chunk; retry after a pause on errors/rate limits
            response = openai.Embedding.create(input=item, engine="text-embedding-ada-002")
            embeddings = response['data'][0]['embedding']
            Embeddings_list.append(embeddings)
            break
        except Exception as e:
            print(e)
            time.sleep(15)
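After this loop, Embeddings_list[i] is the vector for list_docs[i], so the two lists stay index-aligned. As a quick sanity check (my own addition, not part of the original flow):

# Each chunk should end up with exactly one embedding vector
assert len(Embeddings_list) == len(list_docs)
# text-embedding-ada-002 returns 1536-dimensional vectors
print(len(Embeddings_list[0]))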
How can I use this Embeddings_list to ingest the documents into the vector store and connect it to the QA process? There is no guided documentation on how to integrate this way of embedding into the process below:
# Create a vectorstore from documents
db = Chroma.from_documents(texts, embeddings)

# Create retriever interface
retriever = db.as_retriever()

# Create QA chain
qa = RetrievalQA.from_chain_type(llm=OpenAI(openai_api_key=openai_api_key), chain_type='stuff', retriever=retriever)
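For example, would something along the lines of the sketch below be the intended approach? This is only a rough sketch assuming the legacy openai<1.0 SDK and langchain's Embeddings base class; the class name AzureAdaEmbeddings and the deployment names are placeholders I made up.

import time
from typing import List

import openai
from langchain.embeddings.base import Embeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import AzureOpenAI


class AzureAdaEmbeddings(Embeddings):
    """One-request-per-text embeddings for Azure OpenAI, retrying on errors."""

    def __init__(self, deployment="text-embedding-ada-002"):
        self.deployment = deployment

    def _embed(self, text: str) -> List[float]:
        while True:
            try:
                response = openai.Embedding.create(input=text, engine=self.deployment)
                return response["data"][0]["embedding"]
            except Exception as e:
                print(e)
                time.sleep(15)

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Azure OpenAI: embed each chunk in its own request instead of one bulk call
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


embedding_fn = AzureAdaEmbeddings()

# Chroma calls embed_documents() internally, so the per-chunk requests happen here
db = Chroma.from_documents(texts, embedding_fn)
retriever = db.as_retriever()

# "gpt-35-turbo" is a placeholder for whatever completion deployment is configured
qa = RetrievalQA.from_chain_type(
    llm=AzureOpenAI(deployment_name="gpt-35-turbo", openai_api_key=openai_api_key),
    chain_type="stuff",
    retriever=retriever,
)
qa.run("What is this document about?")

Alternatively, since Embeddings_list and list_docs are already aligned, is the recommended way instead to push them into a chromadb collection directly (e.g. collection.add(ids=..., documents=..., embeddings=...)) and wrap that collection with langchain's Chroma? Any pointers would be appreciated.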