|
229 | 229 | ")" |
230 | 230 | ] |
231 | 231 | }, |
232 | | - { |
233 | | - "cell_type": "markdown", |
234 | | - "metadata": {}, |
235 | | - "source": [ |
236 | | - "### Create a table (if not already exists)" |
237 | | - ] |
238 | | - }, |
239 | | - { |
240 | | - "cell_type": "code", |
241 | | - "execution_count": null, |
242 | | - "metadata": {}, |
243 | | - "outputs": [], |
244 | | - "source": [ |
245 | | - "from langchain_google_cloud_sql_pg import Column\n", |
246 | | - "\n", |
247 | | - "await engine.ainit_document_table(\n", |
248 | | - " table_name=TABLE_NAME,\n", |
249 | | - " content_column=\"product_name\",\n", |
250 | | - " metadata_columns=[\n", |
251 | | - " Column(\"id\", \"SERIAL\", nullable=False),\n", |
252 | | - " Column(\"content\", \"VARCHAR\", nullable=False),\n", |
253 | | - " Column(\"description\", \"VARCHAR\", nullable=False),\n", |
254 | | - " ],\n", |
255 | | - " metadata_json_column=\"metadata\",\n", |
256 | | - " store_metadata=True,\n", |
257 | | - ")" |
258 | | - ] |
259 | | - }, |
260 | 232 | { |
261 | 233 | "cell_type": "markdown", |
262 | 234 | "metadata": {}, |
|
286 | 258 | ] |
287 | 259 | }, |
288 | 260 | { |
289 | | - "cell_type": "code", |
290 | | - "execution_count": null, |
291 | | - "metadata": { |
292 | | - "id": "z-AZyzAQ7bsf" |
293 | | - }, |
294 | | - "outputs": [], |
| 261 | + "cell_type": "markdown", |
| 262 | + "metadata": {}, |
295 | 263 | "source": [ |
296 | | - "from langchain_google_cloud_sql_pg import PostgresLoader\n", |
297 | | - "\n", |
298 | | - "# Creating a basic PostgreSQL object\n", |
299 | | - "loader = await PostgresLoader.create(\n", |
300 | | - " engine,\n", |
301 | | - " table_name=TABLE_NAME,\n", |
302 | | - " # schema_name=SCHEMA_NAME,\n", |
303 | | - ")" |
| 264 | + "When creating a `PostgresLoader` to fetch data from Cloud SQL for PostgreSQL, you have two main options for specifying the data to load:\n", |
| 265 | + "* Using the `table_name` argument: the loader fetches all the data from the given table.\n", |
| 266 | + "* Using the `query` argument: you provide a custom SQL query to fetch the data. This gives you full control over the query, including selecting specific columns, applying filters, sorting, and joining tables.\n", |
| 267 | + "\n" |
| 268 | + ] |
| 269 | + }, |
| 270 | + { |
| 271 | + "cell_type": "markdown", |
| 272 | + "metadata": {}, |
| 273 | + "source": [ |
| 274 | + "### Load Documents using the `table_name` argument" |
304 | 275 | ] |
305 | 276 | }, |
306 | 277 | { |
|
309 | 280 | "id": "PeOMpftjc9_e" |
310 | 281 | }, |
311 | 282 | "source": [ |
312 | | - "### Load Documents via default table\n", |
| 283 | + "#### Load Documents via default table\n", |
313 | 284 | "The loader returns a list of Documents from the table using the first column as page_content and all other columns as metadata. The default table will have the first column as\n", |
314 | 285 | "page_content and the second column as metadata (JSON). Each row becomes a document. \n", |
315 | 286 | "\n", |
|
343 | 314 | "id": "kSkL9l1Hc9_e" |
344 | 315 | }, |
345 | 316 | "source": [ |
346 | | - "### Load documents via custom table/metadata or custom page content columns" |
| 317 | + "#### Load documents via custom table/metadata or custom page content columns" |
347 | 318 | ] |
348 | 319 | }, |
349 | 320 | { |
|
363 | 334 | "print(docs)" |
364 | 335 | ] |
365 | 336 | }, |
| 337 | + { |
| 338 | + "cell_type": "markdown", |
| 339 | + "metadata": {}, |
| 340 | + "source": [ |
| 341 | + "### Load Documents using a SQL query\n", |
| 342 | + "The `query` parameter lets you specify a custom SQL query, which can include filters, to load specific documents from a database." |
| 343 | + ] |
| 344 | + }, |
| 345 | + { |
| 346 | + "cell_type": "code", |
| 347 | + "execution_count": null, |
| 348 | + "metadata": {}, |
| 349 | + "outputs": [], |
| 350 | + "source": [ |
| 351 | + "table_name = \"products\"\n", |
| 352 | + "content_columns = [\"product_name\", \"description\"]\n", |
| 353 | + "metadata_columns = [\"id\", \"content\"]\n", |
| 354 | + "\n", |
| 355 | + "loader = await PostgresLoader.create(\n", |
| 356 | + " engine=engine,\n", |
| 357 | + " query=f\"SELECT * FROM {table_name};\",\n", |
| 358 | + " content_columns=content_columns,\n", |
| 359 | + " metadata_columns=metadata_columns,\n", |
| 360 | + ")\n", |
| 361 | + "\n", |
| 362 | + "docs = await loader.aload()\n", |
| 363 | + "print(docs)" |
| 364 | + ] |
| 365 | + }, |
| 366 | + { |
| 367 | + "cell_type": "markdown", |
| 368 | + "metadata": {}, |
| 369 | + "source": [ |
| 370 | + "**Note**: If the `content_columns` and `metadata_columns` are not specified, the loader will automatically treat the first returned column as the document’s `page_content` and all subsequent columns as `metadata`." |
| 371 | + ] |
| 372 | + }, |
366 | 373 | { |
367 | 374 | "cell_type": "markdown", |
368 | 375 | "metadata": { |
|
396 | 403 | "cell_type": "markdown", |
397 | 404 | "metadata": {}, |
398 | 405 | "source": [ |
399 | | - "### Create PostgresSaver\n", |
| 406 | + "## Create PostgresSaver\n", |
400 | 407 | "The `PostgresSaver` allows for saving of pre-processed documents to the table using the first column as page_content and all other columns as metadata. This table can easily be loaded via a Document Loader or updated to be a VectorStore. The default table will have the first column as page_content and the second column as metadata (JSON)." |
401 | 408 | ] |
402 | 409 | }, |
| 410 | + { |
| 411 | + "cell_type": "markdown", |
| 412 | + "metadata": {}, |
| 413 | + "source": [ |
| 414 | + "### Create a table (if it does not already exist)" |
| 415 | + ] |
| 416 | + }, |
| 417 | + { |
| 418 | + "cell_type": "code", |
| 419 | + "execution_count": null, |
| 420 | + "metadata": {}, |
| 421 | + "outputs": [], |
| 422 | + "source": [ |
| 423 | + "from langchain_google_cloud_sql_pg import Column\n", |
| 424 | + "\n", |
| 425 | + "await engine.ainit_document_table(\n", |
| 426 | + " table_name=TABLE_NAME,\n", |
| 427 | + " content_column=\"product_name\",\n", |
| 428 | + " metadata_columns=[\n", |
| 429 | + " Column(\"id\", \"SERIAL\", nullable=False),\n", |
| 430 | + " Column(\"content\", \"VARCHAR\", nullable=False),\n", |
| 431 | + " Column(\"description\", \"VARCHAR\", nullable=False),\n", |
| 432 | + " ],\n", |
| 433 | + " metadata_json_column=\"metadata\",\n", |
| 434 | + " store_metadata=True,\n", |
| 435 | + ")" |
| 436 | + ] |
| 437 | + }, |
| 438 | + { |
| 439 | + "cell_type": "markdown", |
| 440 | + "metadata": {}, |
| 441 | + "source": [ |
| 442 | + "### Create PostgresSaver" |
| 443 | + ] |
| 444 | + }, |
403 | 445 | { |
404 | 446 | "cell_type": "code", |
405 | 447 | "execution_count": null, |
|
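The note added in this diff says that when `content_columns` and `metadata_columns` are omitted, the first returned column becomes `page_content` and all subsequent columns become `metadata`. The sketch below illustrates that mapping rule in plain Python. It is not the library's implementation: `rows_to_documents` is a hypothetical helper, documents are modeled as plain dicts, and joining several content columns with a space is an assumption made for illustration.

```python
from typing import Any, Dict, List, Optional, Sequence


def rows_to_documents(
    column_names: Sequence[str],
    rows: Sequence[Sequence[Any]],
    content_columns: Optional[Sequence[str]] = None,
    metadata_columns: Optional[Sequence[str]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper (not part of langchain_google_cloud_sql_pg)."""
    # Default: the first returned column is the page content.
    content = list(content_columns) if content_columns else list(column_names[:1])
    # Default: every column that is not a content column becomes metadata.
    if metadata_columns is not None:
        metadata_cols = list(metadata_columns)
    else:
        metadata_cols = [c for c in column_names if c not in content]

    documents = []
    for row in rows:
        record = dict(zip(column_names, row))
        documents.append(
            {
                # Assumption: multiple content columns are joined with a space.
                "page_content": " ".join(str(record[c]) for c in content),
                "metadata": {c: record[c] for c in metadata_cols},
            }
        )
    return documents


columns = ["product_name", "description", "id"]
rows = [("Widget", "A small widget", 1)]

# No columns specified: first column is content, the rest is metadata.
default_docs = rows_to_documents(columns, rows)

# Explicit columns, mirroring the arguments passed to the loader in the diff.
custom_docs = rows_to_documents(
    columns,
    rows,
    content_columns=["product_name", "description"],
    metadata_columns=["id"],
)
```

With the sample row, the default call puts only `product_name` in the page content, while the explicit call combines `product_name` and `description` and keeps only `id` as metadata.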