Commit 5d8c86a

adding info how to use
1 parent 059c7e5 commit 5d8c86a

1 file changed

+16
-0
lines changed

archive_query_log/export/knowledge_graph.py

Lines changed: 16 additions & 0 deletions
@@ -147,3 +147,19 @@ def iter_turtle_triples(serp: SERP) -> Iterator[tuple[str, str, str]]:
 # yield(entity, "schema:title", result.title)
 # yield(entity, "schema:isBasedOn", f"https://aql.webis.de/capture/{result.capture.id}")
 
+
+
+## How to use
+
+# def iter_providers_turtle_triples(providers: list[Provider]) -> Iterator[tuple[str, str, str]]:
+#     for provider in providers:
+#         yield from iter_provider_turtle_triples(provider)
+
+# from pandas import DataFrame
+# def map_provider_batch_to_turtle(batch: DataFrame) -> DataFrame:
+#     providers: list[Provider] = load_providers_from_dataframe(batch)
+#     triples: list[tuple[str, str, str]] = list(iter_providers_turtle_triples(providers))
+#     return DataFrame(triples, columns=["subject", "predicate", "object"])
+
+# In Ray: Read from ES (in parallel, e.g., 100 worker) -> Map batches (Providers to Triples; concurrency 100) -> (Repartition, e.g., max. 1M triple per file) -> Write to Turtle files (creates X files)
+```
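The commented-out usage notes in this hunk describe a pipeline: flatten providers into triples, map each batch to subject/predicate/object rows, repartition to a maximum number of triples per file, then write Turtle. The following is a minimal, standard-library-only sketch of that flow, not the project's implementation. The `Provider` dataclass, its fields, the `example.org` URL, the predicates, and the helpers `map_provider_batch_to_rows`, `iter_partitions`, and `to_turtle_lines` are all hypothetical; the diff does not show the real `Provider` model, `iter_provider_turtle_triples`, or `load_providers_from_dataframe`.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator

Triple = tuple[str, str, str]
Row = dict[str, str]


@dataclass(frozen=True)
class Provider:
    """Hypothetical stand-in for the project's provider model."""
    id: str
    name: str


def iter_provider_turtle_triples(provider: Provider) -> Iterator[Triple]:
    # Assumed predicates and URL scheme; the real ones are not in the diff.
    # Note: names containing quotes or backslashes would need Turtle escaping.
    entity = f"<https://example.org/provider/{provider.id}>"
    yield (entity, "rdf:type", "schema:Organization")
    yield (entity, "schema:name", f'"{provider.name}"')


def iter_providers_turtle_triples(providers: Iterable[Provider]) -> Iterator[Triple]:
    # Same shape as the commented-out function: flatten per-provider triples.
    for provider in providers:
        yield from iter_provider_turtle_triples(provider)


def map_provider_batch_to_rows(batch: list[Row]) -> list[Row]:
    # Plays the role of map_provider_batch_to_turtle, using plain dicts
    # instead of a pandas DataFrame so the sketch has no dependencies.
    providers = [Provider(id=row["id"], name=row["name"]) for row in batch]
    return [
        {"subject": s, "predicate": p, "object": o}
        for s, p, o in iter_providers_turtle_triples(providers)
    ]


def iter_partitions(rows: Iterable[Row], max_rows: int) -> Iterator[list[Row]]:
    # Mirrors the "repartition, max. N triples per file" step.
    iterator = iter(rows)
    while chunk := list(islice(iterator, max_rows)):
        yield chunk


def to_turtle_lines(rows: Iterable[Row]) -> list[str]:
    # One statement per line; prefix declarations are omitted in this sketch.
    return [f'{r["subject"]} {r["predicate"]} {r["object"]} .' for r in rows]


batch = [
    {"id": "p1", "name": "Example Search"},
    {"id": "p2", "name": "Another Search"},
]
rows = map_provider_batch_to_rows(batch)
partitions = list(iter_partitions(rows, max_rows=3))
for index, partition in enumerate(partitions):
    print(f"file {index}: {len(partition)} triples")
```

In a Ray-based version, the batch function would presumably be passed to a map-batches step and the partitions written out by parallel writers; the rows returned here use the same `subject`, `predicate`, `object` column names as the commented `DataFrame` call, so wrapping them in a `DataFrame` should be straightforward.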
