Skip to content

[Bug]: gRPC handler is reraising Exceptions in a way that breaks low-level Python Exception tracking #2221

@bcalvert-graft

Description

@bcalvert-graft

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

Hello Milvus maintainers,

Thank you in advance for reading my bug-report.

There are three spots in grpc_handler.py (first spot, second spot and third spot) that use the following pattern to reraise an Exception

try:
    <some code>
except Exception as err:
    <do some other stuff>
    raise err from err
...

Respectfully, this raise err from err is not a great way to reraise exceptions; instead, as mentioned in the Python docs it should be

try:
    <some code>
except Exception as err:
    <do some other stuff>
    raise
...

Why is the first one not great? It attaches the err Exception as the __cause__ to itself. As I understand it, this is not the desired semantics for __cause__ and this breaks any tools that recursively unroll chained Exceptions (e.g. for processing stack traces). As a concrete example of such a tool, we actually uncovered this seeming bug in grpc_handler.py when using Milvus as part of a larger Dask computation; specifically, as part of Dask's error-handling, the library uses a utility called collect_causes,

def collect_causes(e: BaseException) -> list[BaseException]:
    causes = []
    while e.__cause__ is not None:
        causes.append(e.__cause__)
        e = e.__cause__
    return causes

When collect_causes encounters an Exception raised from one of the linked spots in grpc_handler.py it triggers an infinite loop of unrolling e.__cause__, consequently leaking memory until the process is killed.

Assuming you agree with the quick fix, I'm happy to submit a PR, but wanted to run it by you first to make sure I'm not missing something.

Cheers,
Brian

Expected Behavior

When handling Exceptions raised by the linked methods in grpc_handler.py, the Exceptions should not be attached as their own __cause__

Steps/Code To Reproduce behavior

In one process, run

In [1]: from milvus_lite.server import Server

In [2]: s = Server('test.db', '127.0.0.1:9999')

In [3]: s.start()
Out[3]: True

In [4]: s._p.wait()

In another process, run

In [1]: import pymilvus

In [2]: client = pymilvus.MilvusClient("http://localhost:9999")
   ...: collection_name = "fake_embeddings"
   ...: if client.has_collection(collection_name=collection_name):
   ...:     client.drop_collection(collection_name=collection_name)
   ...: 

In [3]: vector_dim = 512
   ...: client.create_collection(
   ...:     collection_name=collection_name,
   ...:     vector_field_name="vector",
   ...:     dimension=vector_dim,
   ...:     auto_id=True,
   ...:     enable_dynamic_field=True,
   ...:     metric_type="COSINE",
   ...: )

In [4]: import numpy as np

In [5]: vectors = np.random.normal(loc=0, scale=1, size=(vector_dim + 2,))

In [6]: try:
   ...:     client.insert(collection_name, {"vector": vectors})
   ...: except pymilvus.MilvusException as exc:
   ...:     assert exc.__cause__ is not exc, "Exception is its own cause"

The In [6]: step will throw an AssertionError

Environment details

- No hardware/software conditions of note
- Installed pymilvus with pip (version 2.4.4)
- Can reproduce the error with milvus_lite module, so I don't think server config has an impact?

Anything else?

No response

Metadata

Metadata

Labels

kind/bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions