feat: Added custom judge support for ai configs #1073
  let { success } = response.metrics;

- const evals = this._parseEvaluationResponse(response.data);
+ const evals = this._parseEvaluationResponse(response.data, evaluationMetricKey);
I didn't call this out in Python, but we originally used evals to support multiple metric keys in a judge response. We can likely flatten this structure now that there will be only one key. It doesn't need to be done in this PR, but I think it's a push to make the breaking change sooner rather than later, so that fewer people are relying on this code. We might consider adding the eval directly to the judge response and marking evals as deprecated.
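One way the flattening suggested above could look is sketched below. This is only an illustration: `EvalScoreLike`, `JudgeResponseLike`, `metricKey`, `evaluation`, and `toJudgeResponse` are hypothetical names, not the SDK's actual types. The idea is to expose the single evaluation directly on the response while keeping the keyed `evals` map populated and marked as deprecated.

```typescript
// Hypothetical shapes for illustration only; not the SDK's real types.
interface EvalScoreLike {
  score: number;
  reasoning: string;
}

interface JudgeResponseLike {
  success: boolean;
  /** @deprecated Keyed by metric key; kept so existing callers keep working. */
  evals: Record<string, EvalScoreLike>;
  /** Assumed flattened fields for the single supported metric. */
  metricKey?: string;
  evaluation?: EvalScoreLike;
}

// Builds a response that carries the single evaluation both in the
// flattened fields and in the deprecated `evals` map.
function toJudgeResponse(
  metricKey: string,
  score: EvalScoreLike | undefined,
): JudgeResponseLike {
  if (!score) {
    // A missing evaluation is reported as a failed judge response.
    return { success: false, evals: {} };
  }
  return {
    success: true,
    evals: { [metricKey]: score },
    metricKey,
    evaluation: score,
  };
}
```

Populating both forms during a deprecation window would let callers migrate to the flattened field before `evals` is removed.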
Requirements
Related issues
Node version of launchdarkly/python-server-sdk-ai#86
Describe the solution you've provided
See launchdarkly/python-server-sdk-ai#86
Describe alternatives you've considered
Provide a clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context about the pull request here.
Note
Switches judge evaluation to a single metric key while preserving backward compatibility.
- Add evaluationMetricKey to LDAIJudgeConfig(Default) and deprecate array usage; examples updated in LDAIClient.ts
- Use evaluationMetricKey and fall back to the first valid entry in evaluationMetricKeys; include the key when converting defaults
- EvaluationSchemaBuilder now builds the response schema for one required metric key
- _getEvaluationMetricKey; require messages; parse/validate only that key; mark success: false if missing/invalid; updated warnings
- LDAIClientImpl and Judge tests updated for new key semantics and legacy fallbacks; added tests for invalid/whitespace keys and sampling
- trackJudgeResponse handling single/multiple eval metrics

Written by Cursor Bugbot for commit 288ee6d. This will update automatically on new commits.
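The key-resolution behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `JudgeConfigLike`, `isValidKey`, and `resolveEvaluationMetricKey` are hypothetical names, and treating a whitespace-only string as invalid is an assumption based on the mention of tests for invalid/whitespace keys.

```typescript
// Hypothetical config shape; only the two field names come from the PR summary.
interface JudgeConfigLike {
  evaluationMetricKey?: string;
  /** Deprecated array form, kept for backward compatibility. */
  evaluationMetricKeys?: string[];
}

// Assumption: a key is valid when it is a string with non-whitespace content.
function isValidKey(key: unknown): key is string {
  return typeof key === 'string' && key.trim().length > 0;
}

// Prefers the single key; otherwise falls back to the first valid legacy entry.
function resolveEvaluationMetricKey(config: JudgeConfigLike): string | undefined {
  if (isValidKey(config.evaluationMetricKey)) {
    return config.evaluationMetricKey;
  }
  return config.evaluationMetricKeys?.find(isValidKey);
}
```

When this returns `undefined`, a caller would have no metric key to evaluate against, which corresponds to the "missing/invalid" case in the summary.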