Is there any existing guidance on how libraries (or even application code) should (or shouldn't) write spans that represent generative AI content but are not necessarily "model" calls and are not necessarily "agents"?
For example, Genkit has a generate method that abstracts away the underlying LLM call. It will run tool calls in a loop, etc. In theory, the underlying model call will be instrumented (e.g. for token counts), but that's not always the case. It might be useful for generate spans to record aggregate token counts themselves, in a way that is covered by the semantic conventions.
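
To make the aggregate idea concrete, here is a rough sketch of what a wrapper span might record. It is not Genkit's implementation. `ModelCallUsage` and `aggregate_usage_attributes` are hypothetical names made up for illustration, and reusing the `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens` attribute names on a non-model span is exactly the assumption this question is asking about.

```python
from dataclasses import dataclass

# Attribute names from the OpenTelemetry GenAI semantic conventions.
# Whether they are appropriate on a wrapper span is the open question here.
INPUT_TOKENS = "gen_ai.usage.input_tokens"
OUTPUT_TOKENS = "gen_ai.usage.output_tokens"


@dataclass
class ModelCallUsage:
    """Token usage reported by one underlying model call (hypothetical type)."""

    input_tokens: int
    output_tokens: int


def aggregate_usage_attributes(calls):
    """Sum per-call usage into attributes for the enclosing generate span."""
    return {
        INPUT_TOKENS: sum(c.input_tokens for c in calls),
        OUTPUT_TOKENS: sum(c.output_tokens for c in calls),
    }


# Example: one generate call that made two model requests around a tool call.
calls = [ModelCallUsage(120, 30), ModelCallUsage(210, 55)]
attrs = aggregate_usage_attributes(calls)
```

With the OpenTelemetry Python API, the resulting dict could be attached to the wrapper span via `span.set_attributes(attrs)`. One concern with this approach: if the underlying model spans are also instrumented, a backend that sums token attributes across all spans in a trace would count the same tokens twice, so guidance on how to distinguish aggregate values from per-call values would help.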
