Commit 869ed36

Add k8s.pod.phase and k8s.pod.status.reason metrics (#2488)
Signed-off-by: ChrsMark <[email protected]>
1 parent 4658d9b commit 869ed36

6 files changed: +257 -0 lines changed

.chloggen/add_k8s_pod_phase.yaml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# Use this changelog template to create an entry for release notes.
+#
+# If your change doesn't affect end users you should instead start
+# your pull request title with [chore] or use the "Skip Changelog" label.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
+component: k8s
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Add k8s.pod.status.phase and k8s.pod.status.reason metrics
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+# The values here must be integers.
+issues: [2075]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:

docs/non-normative/k8s-migration.md

Lines changed: 19 additions & 0 deletions
@@ -64,6 +64,7 @@ and one for disabling the old schema called `semconv.k8s.disableLegacy`. Then:
   - [K8s Filesystem metrics](#k8s-filesystem-metrics)
   - [K8s Pod Volume metrics](#k8s-pod-volume-metrics)
   - [Container Runtime](#container-runtime)
+  - [K8s Pod Status Phase and Reason](#k8s-pod-status-phase-and-reason)
 
 <!-- tocstop -->
 
@@ -436,3 +437,21 @@ The changes in their attributes are the following:
 | `container.runtime` | `container.runtime.name` |
 
 <!-- prettier-ignore-end -->
+
+### K8s Pod Status Phase and Reason
+
+The K8s Pod Status Phase and Reason metrics, implemented by the Collector's
+[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.115.0/receiver/k8sclusterreceiver/documentation.md)
+receiver, were introduced as semantic conventions in
+[#2075](https://github.com/open-telemetry/semantic-conventions/issues/2075).
+
+The changes in their metrics are the following:
+
+<!-- prettier-ignore-start -->
+
+| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New |
+|------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------|
+| `k8s.pod.status_reason` metric [1,6] | `k8s.pod.status.reason` metric [0,1] with attribute `k8s.pod.status.reason` for the different reasons |
+| `k8s.pod.phase` metric [1,5] | `k8s.pod.status.phase` metric [0,1] with attribute `k8s.pod.status.phase` for the different phases |
+
+<!-- prettier-ignore-end -->
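
As a rough illustration of the migration table above, the sketch below converts a legacy numeric `k8s.pod.phase` value into the new per-phase values, where exactly one phase carries `1`. The integer-to-phase mapping and the name `to_status_phase_points` are assumptions made for this example; the encoding the receiver actually uses should be checked against the linked k8scluster documentation.

```python
# Assumed legacy encoding of the old `k8s.pod.phase` metric (values 1 to 5).
# This mapping is illustrative only; confirm it against the receiver documentation.
LEGACY_PHASE_CODES = {
    1: "Pending",
    2: "Running",
    3: "Succeeded",
    4: "Failed",
    5: "Unknown",
}


def to_status_phase_points(legacy_value: int) -> dict[str, int]:
    """Return one `k8s.pod.status.phase` value per phase attribute value."""
    if legacy_value not in LEGACY_PHASE_CODES:
        raise ValueError(f"unexpected legacy phase value: {legacy_value}")
    current = LEGACY_PHASE_CODES[legacy_value]
    # Every phase is emitted; only the current one is non-zero.
    return {phase: int(phase == current) for phase in LEGACY_PHASE_CODES.values()}
```

For instance, a legacy value of `2` would map to `Running` set to `1` and every other phase set to `0`, under the assumed encoding.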

docs/registry/attributes/k8s.md

Lines changed: 26 additions & 0 deletions
@@ -55,6 +55,8 @@ Kubernetes resource attributes.
 | <a id="k8s-pod-annotation" href="#k8s-pod-annotation">`k8s.pod.annotation.<key>`</a> | string | The annotation placed on the Pod, the `<key>` being the annotation name, the value being the annotation value. [21] | `true`; `x64`; `` | ![Development](https://img.shields.io/badge/-development-blue) |
 | <a id="k8s-pod-label" href="#k8s-pod-label">`k8s.pod.label.<key>`</a> | string | The label placed on the Pod, the `<key>` being the label name, the value being the label value. [22] | `my-app`; `x64`; `` | ![Development](https://img.shields.io/badge/-development-blue) |
 | <a id="k8s-pod-name" href="#k8s-pod-name">`k8s.pod.name`</a> | string | The name of the Pod. | `opentelemetry-pod-autoconf` | ![Development](https://img.shields.io/badge/-development-blue) |
+| <a id="k8s-pod-status-phase" href="#k8s-pod-status-phase">`k8s.pod.status.phase`</a> | string | The phase for the pod. Corresponds to the `phase` field of the: [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core) | `Pending`; `Running` | ![Development](https://img.shields.io/badge/-development-blue) |
+| <a id="k8s-pod-status-reason" href="#k8s-pod-status-reason">`k8s.pod.status.reason`</a> | string | The reason for the pod state. Corresponds to the `reason` field of the: [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core) | `Evicted`; `NodeAffinity` | ![Development](https://img.shields.io/badge/-development-blue) |
 | <a id="k8s-pod-uid" href="#k8s-pod-uid">`k8s.pod.uid`</a> | string | The UID of the Pod. | `275ecb36-5aa8-4c2a-9c47-d8bb681b9aff` | ![Development](https://img.shields.io/badge/-development-blue) |
 | <a id="k8s-replicaset-annotation" href="#k8s-replicaset-annotation">`k8s.replicaset.annotation.<key>`</a> | string | The annotation placed on the ReplicaSet, the `<key>` being the annotation name, the value being the annotation value, even if the value is empty. [23] | `0`; `` | ![Development](https://img.shields.io/badge/-development-blue) |
 | <a id="k8s-replicaset-label" href="#k8s-replicaset-label">`k8s.replicaset.label.<key>`</a> | string | The label placed on the ReplicaSet, the `<key>` being the label name, the value being the label value, even if the value is empty. [24] | `guestbook`; `` | ![Development](https://img.shields.io/badge/-development-blue) |
@@ -311,6 +313,30 @@ When this occurs, the exact value as reported by the Kubernetes API SHOULD be us
 
 ---
 
+`k8s.pod.status.phase` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `Failed` | All containers in the pod have terminated, and at least one container has terminated in a failure (exited with a non-zero exit code or was stopped by the system). | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Pending` | The pod has been accepted by the system, but one or more of the containers has not been started. This includes time before being bound to a node, as well as time spent pulling images onto the host. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Running` | The pod has been bound to a node and all of the containers have been started. At least one container is still running or is in the process of being restarted. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Succeeded` | All containers in the pod have voluntarily terminated with a container exit code of 0, and the system is not going to restart any of these containers. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Unknown` | For some reason the state of the pod could not be obtained, typically due to an error in communicating with the host of the pod. | ![Development](https://img.shields.io/badge/-development-blue) |
+
+---
+
+`k8s.pod.status.reason` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `Evicted` | The pod is evicted. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `NodeAffinity` | The pod is in its current status because of its node affinity. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `NodeLost` | The reason on a pod when its state cannot be confirmed as kubelet is unresponsive on the node it is (was) running. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Shutdown` | The node is shut down. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `UnexpectedAdmissionError` | The pod was rejected admission to the node because of an error during admission that could not be categorized. | ![Development](https://img.shields.io/badge/-development-blue) |
+
+---
+
 `k8s.volume.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
 
 | Value | Description | Stability |

docs/system/k8s-metrics.md

Lines changed: 80 additions & 0 deletions
@@ -19,6 +19,8 @@ and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`.
 
 - [Pod metrics](#pod-metrics)
   - [Metric: `k8s.pod.uptime`](#metric-k8spoduptime)
+  - [Metric: `k8s.pod.status.phase`](#metric-k8spodstatusphase)
+  - [Metric: `k8s.pod.status.reason`](#metric-k8spodstatusreason)
   - [Metric: `k8s.pod.cpu.time`](#metric-k8spodcputime)
   - [Metric: `k8s.pod.cpu.usage`](#metric-k8spodcpuusage)
   - [Metric: `k8s.pod.memory.usage`](#metric-k8spodmemoryusage)
@@ -152,6 +154,84 @@ The actual accuracy would depend on the instrumentation and operating system.
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
 
+### Metric: `k8s.pod.status.phase`
+
+This metric is [recommended][MetricRecommended].
+
+<!-- semconv metric.k8s.pod.status.phase -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+<!-- markdownlint-capture -->
+<!-- markdownlint-disable -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `k8s.pod.status.phase` | UpDownCounter | `{pod}` | Describes the number of K8s Pods that are currently in a given phase. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) |
+
+**[1]:** All possible pod phases will be reported at each time interval to avoid missing metrics.
+Only the value corresponding to the current phase will be non-zero.
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`k8s.pod.status.phase`](/docs/registry/attributes/k8s.md) | string | The phase for the pod. Corresponds to the `phase` field of the: [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core) | `Pending`; `Running` | `Required` | ![Development](https://img.shields.io/badge/-development-blue) |
+
+---
+
+`k8s.pod.status.phase` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `Failed` | All containers in the pod have terminated, and at least one container has terminated in a failure (exited with a non-zero exit code or was stopped by the system). | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Pending` | The pod has been accepted by the system, but one or more of the containers has not been started. This includes time before being bound to a node, as well as time spent pulling images onto the host. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Running` | The pod has been bound to a node and all of the containers have been started. At least one container is still running or is in the process of being restarted. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Succeeded` | All containers in the pod have voluntarily terminated with a container exit code of 0, and the system is not going to restart any of these containers. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Unknown` | For some reason the state of the pod could not be obtained, typically due to an error in communicating with the host of the pod. | ![Development](https://img.shields.io/badge/-development-blue) |
+
+<!-- markdownlint-restore -->
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endsemconv -->
+
+### Metric: `k8s.pod.status.reason`
+
+This metric is [recommended][MetricRecommended].
+
+<!-- semconv metric.k8s.pod.status.reason -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+<!-- markdownlint-capture -->
+<!-- markdownlint-disable -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `k8s.pod.status.reason` | UpDownCounter | `{pod}` | Describes the number of K8s Pods that are currently in a state for a given reason. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) |
+
+**[1]:** All possible pod status reasons will be reported at each time interval to avoid missing metrics.
+Only the value corresponding to the current reason will be non-zero.
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`k8s.pod.status.reason`](/docs/registry/attributes/k8s.md) | string | The reason for the pod state. Corresponds to the `reason` field of the: [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core) | `Evicted`; `NodeAffinity` | `Required` | ![Development](https://img.shields.io/badge/-development-blue) |
+
+---
+
+`k8s.pod.status.reason` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `Evicted` | The pod is evicted. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `NodeAffinity` | The pod is in its current status because of its node affinity. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `NodeLost` | The reason on a pod when its state cannot be confirmed as kubelet is unresponsive on the node it is (was) running. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `Shutdown` | The node is shut down. | ![Development](https://img.shields.io/badge/-development-blue) |
+| `UnexpectedAdmissionError` | The pod was rejected admission to the node because of an error during admission that could not be categorized. | ![Development](https://img.shields.io/badge/-development-blue) |
+
+<!-- markdownlint-restore -->
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endsemconv -->
+
 ### Metric: `k8s.pod.cpu.time`
 
 This metric is [recommended][MetricRecommended].
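
The note on `k8s.pod.status.phase` in this file (every phase reported, only the current one non-zero) can be sketched as follows, together with how summing those per-pod values yields the number of pods in each phase. The helper names `phase_points` and `pods_per_phase` and the sample pod names are hypothetical; this is plain Python rather than any SDK API, and a phase outside the well-known list would simply produce all-zero values here.

```python
from collections import Counter

# Well-known values of the `k8s.pod.status.phase` attribute, as listed in the docs.
POD_PHASES = ("Pending", "Running", "Succeeded", "Failed", "Unknown")


def phase_points(current_phase: str) -> list[tuple[dict[str, str], int]]:
    """Build one (attributes, value) pair per phase for a single pod."""
    return [
        ({"k8s.pod.status.phase": phase}, 1 if phase == current_phase else 0)
        for phase in POD_PHASES
    ]


def pods_per_phase(current_phases: dict[str, str]) -> Counter:
    """Sum the per-pod points to get the number of pods in each phase."""
    totals = Counter({phase: 0 for phase in POD_PHASES})
    for current in current_phases.values():
        for attributes, value in phase_points(current):
            totals[attributes["k8s.pod.status.phase"]] += value
    return totals


# Hypothetical sample data: pod name -> current phase.
sample = {"pod-a": "Running", "pod-b": "Running", "pod-c": "Pending"}
totals = pods_per_phase(sample)
```

Because zero-valued series are always present, a query for a phase with no pods returns `0` instead of no data, which is the behavior the note describes.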

model/k8s/metrics.yaml

Lines changed: 39 additions & 0 deletions
@@ -15,6 +15,45 @@ groups:
       The actual accuracy would depend on the instrumentation and operating system.
     instrument: gauge
     unit: "s"
+
+  # k8s.pod.status.* metrics
+  - id: metric.k8s.pod.status.reason
+    type: metric
+    metric_name: k8s.pod.status.reason
+    annotations:
+      code_generation:
+        metric_value_type: int
+    stability: development
+    brief: "Describes the number of K8s Pods that are currently in a state for a given reason."
+    entity_associations:
+      - k8s.pod
+    note: |
+      All possible pod status reasons will be reported at each time interval to avoid missing metrics.
+      Only the value corresponding to the current reason will be non-zero.
+    instrument: updowncounter
+    unit: "{pod}"
+    attributes:
+      - ref: k8s.pod.status.reason
+        requirement_level: required
+  - id: metric.k8s.pod.status.phase
+    type: metric
+    metric_name: k8s.pod.status.phase
+    annotations:
+      code_generation:
+        metric_value_type: int
+    stability: development
+    brief: "Describes the number of K8s Pods that are currently in a given phase."
+    entity_associations:
+      - k8s.pod
+    note: |
+      All possible pod phases will be reported at each time interval to avoid missing metrics.
+      Only the value corresponding to the current phase will be non-zero.
+    instrument: updowncounter
+    unit: "{pod}"
+    attributes:
+      - ref: k8s.pod.status.phase
+        requirement_level: required
+
   # k8s.pod.cpu.* metrics
   - id: metric.k8s.pod.cpu.time
     type: metric
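
The `note` on `metric.k8s.pod.status.reason` could be implemented along the lines below. The definition does not say what to report for a pod whose `reason` field is empty or holds a value outside the well-known list; this sketch assumes all-zero values in the first case and one additional series in the second. `reason_points` is a hypothetical helper, not part of any library.

```python
# Well-known values of the `k8s.pod.status.reason` attribute from the registry.
POD_STATUS_REASONS = (
    "Evicted",
    "NodeAffinity",
    "NodeLost",
    "Shutdown",
    "UnexpectedAdmissionError",
)


def reason_points(current_reason: str) -> dict[str, int]:
    """Map each reason attribute value to the metric value for one pod."""
    # Always report every well-known reason so that no series goes missing.
    points = {reason: 0 for reason in POD_STATUS_REASONS}
    if current_reason:
        # Assumption: a custom reason is reported as an extra series.
        points[current_reason] = 1
    return points
```

Under these assumptions, `reason_points("")` reports five zero-valued series, while `reason_points("Evicted")` sets only the `Evicted` series to `1`.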

model/k8s/registry.yaml

Lines changed: 71 additions & 0 deletions
@@ -687,3 +687,74 @@ groups:
           [Kubernetes Resource Quotas documentation](https://kubernetes.io/docs/concepts/policy/resource-quotas/#object-count-quota)
           for more details.
         examples: [ 'count/replicationcontrollers' ]
+      - id: k8s.pod.status.reason
+        type:
+          members:
+            - id: evicted
+              value: 'Evicted'
+              brief: 'The pod is evicted.'
+              stability: development
+            - id: node_affinity
+              value: 'NodeAffinity'
+              brief: 'The pod is in its current status because of its node affinity.'
+              stability: development
+            - id: node_lost
+              value: 'NodeLost'
+              brief: >
+                The reason on a pod when its state cannot be confirmed as kubelet is unresponsive
+                on the node it is (was) running.
+              stability: development
+            - id: shutdown
+              value: 'Shutdown'
+              brief: 'The node is shut down.'
+              stability: development
+            - id: unexpected_admission_error
+              value: 'UnexpectedAdmissionError'
+              brief: >
+                The pod was rejected admission to the node because of an error during admission
+                that could not be categorized.
+              stability: development
+        stability: development
+        brief: >
+          The reason for the pod state. Corresponds to the `reason` field of the:
+          [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core)
+        examples: ['Evicted', 'NodeAffinity']
+      - id: k8s.pod.status.phase
+        type:
+          members:
+            - id: pending
+              value: 'Pending'
+              brief: >
+                The pod has been accepted by the system, but one or more of the containers
+                has not been started. This includes time before being bound to a node, as well as time spent
+                pulling images onto the host.
+              stability: development
+            - id: running
+              value: 'Running'
+              brief: >
+                The pod has been bound to a node and all of the containers have been started.
+                At least one container is still running or is in the process of being restarted.
+              stability: development
+            - id: succeeded
+              value: 'Succeeded'
+              brief: >
+                All containers in the pod have voluntarily terminated
+                with a container exit code of 0, and the system is not going to restart any of these containers.
+              stability: development
+            - id: failed
+              value: 'Failed'
+              brief: >
+                All containers in the pod have terminated, and at least one container has
+                terminated in a failure (exited with a non-zero exit code or was stopped by the system).
+              stability: development
+            - id: unknown
+              value: 'Unknown'
+              brief: >
+                For some reason the state of the pod could not be obtained, typically due
+                to an error in communicating with the host of the pod.
+              stability: development
+        stability: development
+        brief: >
+          The phase for the pod. Corresponds to the `phase` field of the:
+          [K8s PodStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.33/#podstatus-v1-core)
+        examples: [ 'Pending', 'Running' ]
