Merged
Changes from 8 commits
Commits
33 commits
9b199b8
Convert prepdocs to skills
pamelafox Nov 3, 2025
b971ca7
More Bicep to get funcs deployed with auth
pamelafox Nov 3, 2025
df0c17a
chore(functions): add missing prepdocslib dependencies to function re…
pamelafox Nov 4, 2025
e805ee3
build(functions): vendor dependencies into .python_packages for flex …
pamelafox Nov 4, 2025
253cb7e
chore(functions): copy backend requirements as requirements.backend.t…
pamelafox Nov 4, 2025
d66a620
chore(functions): overwrite function requirements with backend pins (…
pamelafox Nov 4, 2025
0d7e8a9
chore(functions): remove requirements backup; always overwrite with b…
pamelafox Nov 4, 2025
12d71d5
Get function apps deployed
pamelafox Nov 4, 2025
9ac595f
Updates to function auth
pamelafox Nov 5, 2025
d8dd729
latest changes to get auth working
pamelafox Nov 5, 2025
e906fb5
Fix tests
pamelafox Nov 7, 2025
f7638d4
always upload local files
mattgotteiner Nov 8, 2025
ba1a997
update to storageMetadata extraction
mattgotteiner Nov 8, 2025
57b53fd
Merge pull request #7 from mattgotteiner/matt/update-prepskills
pamelafox Nov 9, 2025
628609a
Got it working
pamelafox Nov 10, 2025
7bec324
Working more on the docs
pamelafox Nov 10, 2025
6dee74a
Merge in latest
pamelafox Nov 10, 2025
267ff51
Update
pamelafox Nov 11, 2025
8df151f
Push latest for review
pamelafox Nov 11, 2025
be98004
Consolidate docs
pamelafox Nov 11, 2025
b733d20
Clean up vectorization docs and refs
pamelafox Nov 11, 2025
1db5f14
More code cleanup
pamelafox Nov 11, 2025
6d4e490
Address Copilot feedback on tests
pamelafox Nov 11, 2025
9fcaa55
More code cleanups
pamelafox Nov 12, 2025
46bbaf7
Cleanup function test
pamelafox Nov 12, 2025
2d7b453
100% diff coverage
pamelafox Nov 12, 2025
c5116c8
Update app/functions/document_extractor/function_app.py
pamelafox Nov 12, 2025
cfa762c
Update app/backend/prepdocslib/page.py
pamelafox Nov 12, 2025
7c25851
Update app/functions/document_extractor/function_app.py
pamelafox Nov 12, 2025
e9f13f5
Address feedback and tweak docs
pamelafox Nov 12, 2025
db9dc7e
Merge branch 'prepskills' of https://github.com/pamelafox/azure-searc…
pamelafox Nov 12, 2025
b96f9c1
Apply suggestions from code review
pamelafox Nov 12, 2025
0211250
Adding diagram
pamelafox Nov 12, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -148,6 +148,8 @@ npm-debug.log*
node_modules
static/

+app/functions/*/prepdocslib/
+
data/**/*.md5

.DS_Store
6 changes: 5 additions & 1 deletion AGENTS.md
@@ -17,7 +17,11 @@ If necessary, edit this file to ensure it accurately reflects the current state
* app/backend/approaches/prompts/chat_query_rewrite.prompty: Prompt used to rewrite the query based off search history into a better search query
* app/backend/approaches/prompts/chat_query_rewrite_tools.json: Tools used by the query rewriting prompt
* app/backend/approaches/prompts/chat_answer_question.prompty: Prompt used by the Chat approach to actually answer the question based off sources
+* app/backend/prepdocslib/cloudingestionstrategy.py: Builds the Azure AI Search indexer and skillset for the cloud ingestion pipeline
+* app/backend/prepdocslib/pdfparser.py: Uses Azure Document Intelligence to emit page text plus figure placeholders
+* app/backend/prepdocslib/figureprocessor.py: Shared helper that generates figure descriptions for both local ingestion and the cloud figure-processor skill
* app/backend/app.py: The main entry point for the backend application.
+* app/functions: Azure Functions used for cloud ingestion custom skills (document extraction, figure processing, text processing). Each function bundles a synchronized copy of `prepdocslib`; run `python scripts/copy_prepdocslib.py` to refresh the local copies if you modify the library.
* app/frontend: Contains the React frontend code, built with TypeScript, built with vite.
* app/frontend/src/api: Contains the API client code for communicating with the backend.
* app/frontend/src/components: Contains the React components for the frontend.
@@ -65,7 +69,7 @@ When adding a new developer setting, update:
* app/backend/approaches/retrievethenread.py : Retrieve from overrides parameter
* app/backend/app.py: Some settings may need to be sent down in the /config route.

-## When adding tests for a new feature:
+## When adding tests for a new feature

All tests are in the `tests` folder and use the pytest framework.
There are three styles of tests:
3 changes: 2 additions & 1 deletion app/backend/app.py
@@ -586,7 +586,7 @@ async def setup_clients():
current_app.config[CONFIG_USER_BLOB_MANAGER] = user_blob_manager

# Set up ingester
-file_processors = setup_file_processors(
+file_processors, figure_processor = setup_file_processors(
azure_credential=azure_credential,
document_intelligence_service=os.getenv("AZURE_DOCUMENTINTELLIGENCE_SERVICE"),
local_pdf_parser=os.getenv("USE_LOCAL_PDF_PARSER", "").lower() == "true",
@@ -627,6 +627,7 @@ async def setup_clients():
image_embeddings=image_embeddings_service,
search_field_name_embedding=AZURE_SEARCH_FIELD_NAME_EMBEDDING,
blob_manager=user_blob_manager,
+figure_processor=figure_processor,
)
current_app.config[CONFIG_INGESTER] = ingester
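
The app/backend/app.py hunks change `setup_file_processors` so that the caller unpacks two values and forwards the second to the ingester. A minimal sketch of that call pattern follows. The `FigureProcessor` and `Ingester` classes, the `use_figures` parameter, and the optional second return value are hypothetical stand-ins for illustration; the real prepdocslib signatures are not shown in this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FigureProcessor:
    """Hypothetical stand-in for the shared figure-description helper."""

    strategy: str


@dataclass
class Ingester:
    """Hypothetical stand-in for the object stored under CONFIG_INGESTER."""

    file_processors: dict[str, str]
    figure_processor: Optional[FigureProcessor] = None


def setup_file_processors(
    use_figures: bool,
) -> tuple[dict[str, str], Optional[FigureProcessor]]:
    """Return (file_processors, figure_processor). Simplified, assumed signature."""
    file_processors = {".pdf": "pdf-parser", ".txt": "text-parser"}
    figure_processor = FigureProcessor(strategy="describe") if use_figures else None
    return file_processors, figure_processor


# Callers unpack two values instead of one, then forward the second.
file_processors, figure_processor = setup_file_processors(use_figures=True)
ingester = Ingester(file_processors=file_processors, figure_processor=figure_processor)
```

Returning a pair keeps the figure helper configured in one place, so any caller that previously bound a single return value has to be updated to unpack both.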
