First Langfuse evaluations #19
base: main
… tenacity for retrying mechanism
fcogidi
left a comment
A couple of comments:
- Will you switch to google-adk later, since Amrit and I are using that? Asking because the Langfuse integration may be different for that.
- The langfuse_upload script may be general enough to be in the aieng-eval-agents package.
Yes. I will put out one more PR on Monday with the trajectory evals, and next in line is the move to google-adk.
Good point. I'm planning to move things around in follow-up PRs as well and will keep that in mind.
Summary
Adds Langfuse integration code and an evaluation script for the report generation agent.
Clickup Ticket(s): NA
Type of Change
Changes Made
langfuse.py file
For an example of what the evaluation results look like:
https://us.cloud.langfuse.com/project/cmkwsswke005dad07gxujnipq/datasets/cmkyev4nd000nad084ds2xm30/runs/27328bba-9843-4ccb-940f-6fe1b9e3b0ea
Testing
- uv run pytest tests/
- uv run mypy <src_dir>
- uv run ruff check src_dir/
Manual testing details:
Performed manual testing by following the instructions in the README.md file.
Checklist