-
Notifications
You must be signed in to change notification settings - Fork 2
Happy path fix #48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Happy path fix #48
Conversation
src/humanloop/eval_utils/run.py
Outdated
| except Exception as e: | ||
| logger.error(f"Failed to log: {e}") | ||
| error_message = str(e).replace("\n", " ") | ||
| if len(error_message) > 100: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems hacky, is it really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed 502s tend to return a full HTML, polluting the screen - so I've condensed the error by removing the newlines and trimmed the length
| # Notify the run_eval utility about one Log being created | ||
| if log_belongs_to_evaluated_file(log_args=kwargs): | ||
| evaluation_context = get_evaluation_context() | ||
| evaluation_context.logged = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this for the specific datapoint and evaluation run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - every datapoint is ran in isolation from other threads, so only one log statement is accepted per evaluated file - run - datapoint
| # Running the evaluation locally | ||
| logger.info( | ||
| f"{CYAN}\nRunning '{hl_file.name}' over the Dataset '{hl_dataset.name}' using {workers} workers{RESET} " | ||
| sys.stdout.write( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why sys vs logger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger doesn't interact with ANSI codes such as color, move cursor out of the box - will look into whether the logger can accept those
| logger.warning( | ||
| msg=f"\nYour {hl_file.type}'s `callable` failed for Datapoint: {dp.id}. \n Error: {str(e)}" | ||
| ) | ||
| error_message = str(e).replace("\n", " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic was repeated above, why is it needed?
src/humanloop/eval_utils/run.py
Outdated
| ) | ||
|
|
||
| with ThreadPoolExecutor(max_workers=workers) as executor: | ||
| futures = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you explain the move to futures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was investigating some hanging threads - reverting the change
src/humanloop/eval_utils/run.py
Outdated
| def increment(self): | ||
| """Increment the progress bar by one finished task.""" | ||
| with self._lock: | ||
| # NOTE: There is a deadlock here that needs further investigation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this deadlock cause? how important is it to resolve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can no longer reproduce it now - will remove the comment
src/humanloop/eval_utils/run.py
Outdated
| # Wait for the Evaluation to complete then print the results | ||
| complete = False | ||
|
|
||
| wrote_explainer = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weirdly named variable, i have no idea what this could be for fromthe name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
| id=local_evaluator.id, | ||
| start_time=start_time, | ||
| end_time=datetime.now(), | ||
| log_dict = log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general assignment like this may cause issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fern side issue: LogResponse is polymorphic on the backend, so it's likely not parsing the json into a pedantic object
src/humanloop/eval_utils/run.py
Outdated
| end_time=datetime.now(), | ||
| ) | ||
| error_message = str(e).replace("\n", " ") | ||
| if len(error_message) > 100: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is repeated twice above, why is it needed?
src/humanloop/flows/client.py
Outdated
| raise ApiError(status_code=_response.status_code, body=_response.text) | ||
| raise ApiError(status_code=_response.status_code, body=_response_json) | ||
|
|
||
| def update_log( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was surprised by this change in the diff.
Was this just added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a separate change from reordering the update_log path to be under/right after creating a flow log. Just part of fern autogenerating SDK; quick win came from talking w Jordan and Robin about agents DX last week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure - reverted the file to the version on main branch, works as expected - branch likely started before updating some docstrings on response models last week
src/humanloop/otel/exporter.py
Outdated
| self._shutdown = True | ||
| for thread in self._threads: | ||
| thread.join() | ||
| thread.join(timeout=5000 / 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5000/1000 - this is weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
| @@ -1,4 +1,4 @@ | |||
| import deepdiff | |||
| import deepdiff # type: ignore [import] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary? or did we lose types from the version downgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point - the downgrade is not required. reverted deepdiff to higher version, removing the need for the ignore
| func=func, | ||
| output=None, | ||
| ) | ||
| output_stringified = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this change ok? what's the purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - output should be none when error is thrown, otherwise the backend complains we provided two output properties on a flow log (this constraint was relaxed in the /otel PR that's not merged yet)
EDIT: seems the reverse did not solve the error, I likely pushed the import deepdiff line without checking the lining previously
jamesbaskerville
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally lgtm
No description provided.