First draft version for vectorize usage response#1865
First draft version for vectorize usage response#1865maheshrajamani wants to merge 12 commits intomainfrom
Conversation
| * | ||
| * @return | ||
| */ | ||
| public RestResponse<CommandResult> toRestResponse(String vectorizeHeader) { |
There was a problem hiding this comment.
Added method that accepts vectorize header to be returned.
There was a problem hiding this comment.
-1 do not want to code how the API returns this header in this PR, we are working on changes for how we do vecorize for the tables etc.
so lets make this PR just about what the embedding provider is doing
| EmbeddingProvider.EmbeddingRequestType.INDEX) | ||
| .map(res -> res.embeddings()); | ||
| .map( | ||
| res -> { |
There was a problem hiding this comment.
Setting value from the aggregated vectorize usage info.
There was a problem hiding this comment.
for now just want it on the response object from calling the embedding provider
There was a problem hiding this comment.
WIll remove the CDI related code.
| int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) { | ||
| public static Response of(int batchId, List<float[]> embeddings) { | ||
| return new Response(batchId, embeddings); | ||
| return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", "")); |
There was a problem hiding this comment.
Added default VectorizeUsageInfo until all embedding provider class get refactored.
There was a problem hiding this comment.
this is a record, so it has a constructor that will not enforce that VectorizeUsageInfo is non-null.
either: validate this in the canonical constructor or turn this into a class and validate in a ctor that takes all of the params
we should remove all the of() functions from the record, records should be simple structures with properties if they need a static factory they are prob not the right usage
There was a problem hiding this comment.
Adapted the existing code to avoid compilation error in all the embedding providers. Will remove the of method for this record.
| int totalToken = 0; | ||
| String provider = ""; | ||
| String modelName = ""; | ||
| for (Response vectorizedBatch : vectorizedBatches) { |
There was a problem hiding this comment.
Aggregating the vectorize size from different micro batches.
| import java.io.InputStream; | ||
| import java.util.logging.Logger; | ||
|
|
||
| public class NetworkUsageInterceptor implements ClientRequestFilter, ClientResponseFilter { |
There was a problem hiding this comment.
Interceptor to capture the byte size.
|
|
||
| Uni<EmbeddingResponse> response = | ||
| // ✅ Create an instance of NetworkUsageInfo and pass it to request properties | ||
| Uni<jakarta.ws.rs.core.Response> response = |
There was a problem hiding this comment.
Accept non serialized output as service response
| if (resp.data() == null) { | ||
| return Response.of(batchId, Collections.emptyList()); | ||
| res -> { | ||
| EmbeddingResponse embeddingResponse = res.readEntity(EmbeddingResponse.class); |
There was a problem hiding this comment.
Serialize as EmbeddingResponse
src/main/java/io/stargate/sgv2/jsonapi/service/embedding/operation/VectorizeUsageInfo.java
Outdated
Show resolved
Hide resolved
| new ByteArrayInputStream(byteArrayOutputStream.toByteArray())); | ||
| } | ||
| LOGGER.info("Received Bytes: " + receivedBytes); | ||
| responseContext.getHeaders().add("sent-bytes", String.valueOf(sentBytes)); |
There was a problem hiding this comment.
Setting to response header to parse in the embedding provider client class.
amorton
left a comment
There was a problem hiding this comment.
reasonable start, take out the Data API returning just make it the embedding provider for now because we are changing things, and some improvments on the code
| * | ||
| * @return | ||
| */ | ||
| public RestResponse<CommandResult> toRestResponse(String vectorizeHeader) { |
There was a problem hiding this comment.
-1 do not want to code how the API returns this header in this PR, we are working on changes for how we do vecorize for the tables etc.
so lets make this PR just about what the embedding provider is doing
| String vectorize = null; | ||
| try { | ||
|
|
||
| if (!Strings.isNullOrEmpty(vectorizeUsageBean.getModel())) { |
There was a problem hiding this comment.
-1 - just the embedding provider for now
| EmbeddingProvider.EmbeddingRequestType.INDEX) | ||
| .map(res -> res.embeddings()); | ||
| .map( | ||
| res -> { |
There was a problem hiding this comment.
for now just want it on the response object from calling the embedding provider
| return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", "")); | ||
| } | ||
|
|
||
| public static Response of( |
There was a problem hiding this comment.
this of() function has the same signature as the record constructor , why is it needed ?
| int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) { | ||
| public static Response of(int batchId, List<float[]> embeddings) { | ||
| return new Response(batchId, embeddings); | ||
| return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", "")); |
There was a problem hiding this comment.
this is a record, so it has a constructor that will not enforce that VectorizeUsageInfo is non-null.
either: validate this in the canonical constructor or turn this into a class and validate in a ctor that takes all of the params
we should remove all the of() functions from the record, records should be simple structures with properties if they need a static factory they are prob not the right usage
| public class VectorizeUsageInfo { | ||
| private int requestSize; | ||
| private int responseSize; | ||
| private int totalTokens; |
There was a problem hiding this comment.
can we also have the input and output tokens ?
There was a problem hiding this comment.
Not sure if there is split like that in all the embedding api response. Will check.
There was a problem hiding this comment.
Verified Not all provide these, Example cohere response is like `
"billed_units": {
2069 | "input_tokens": 2
2070 | },
`
Huggingface doesn't return any tokens size.
| private String model; | ||
|
|
||
| public VectorizeUsageInfo( | ||
| int requestSize, int responseSize, int totalTokens, String provider, String model) { |
There was a problem hiding this comment.
do we have an enum for the provider and/or model ?
There was a problem hiding this comment.
We don't have an enum as it's configured on the provider configuration yaml.
| modelName = vectorizedBatch.vectorizeUsageInfo().getModel(); | ||
| } | ||
| return Response.of(1, result); | ||
| return Response.of( |
There was a problem hiding this comment.
see other comments, do not need a static factory on a record
| } | ||
|
|
||
| @Override | ||
| public void filter(ClientRequestContext requestContext, ClientResponseContext responseContext) |
There was a problem hiding this comment.
if we do not change the way of counting, and stick with this for an initial release, why not count both the request and the response in this function ?
src/main/java/io/stargate/sgv2/jsonapi/service/embedding/operation/NetworkUsageInterceptor.java
Outdated
Show resolved
Hide resolved
maheshrajamani
left a comment
There was a problem hiding this comment.
I will make the change as suggested and let you know.
| sentBytes += vectorizedBatch.vectorizeUsageInfo().getRequestSize(); | ||
| receivedBytes += vectorizedBatch.vectorizeUsageInfo().getResponseSize(); | ||
| totalToken += vectorizedBatch.vectorizeUsageInfo().getTotalTokens(); | ||
| provider = vectorizedBatch.vectorizeUsageInfo().getProvider(); |
There was a problem hiding this comment.
Currently it is done considering there is only one vectorize per collection/table. Not sure of the code refactoring to support table's multi vectorize.
| EmbeddingProvider.EmbeddingRequestType.INDEX) | ||
| .map(res -> res.embeddings()); | ||
| .map( | ||
| res -> { |
There was a problem hiding this comment.
WIll remove the CDI related code.
| int batchId, List<float[]> embeddings, VectorizeUsageInfo vectorizeUsageInfo) { | ||
| public static Response of(int batchId, List<float[]> embeddings) { | ||
| return new Response(batchId, embeddings); | ||
| return new Response(batchId, embeddings, new VectorizeUsageInfo(0, 0, 0, "", "")); |
There was a problem hiding this comment.
Adapted the existing code to avoid compilation error in all the embedding providers. Will remove the of method for this record.
| public class VectorizeUsageInfo { | ||
| private int requestSize; | ||
| private int responseSize; | ||
| private int totalTokens; |
There was a problem hiding this comment.
Not sure if there is split like that in all the embedding api response. Will check.
| private String model; | ||
|
|
||
| public VectorizeUsageInfo( | ||
| int requestSize, int responseSize, int totalTokens, String provider, String model) { |
There was a problem hiding this comment.
We don't have an enum as it's configured on the provider configuration yaml.
What this PR does:
Vectorize usage header
Checklist