Application Monitoring on OKD with Prometheus and Grafana

service_monitor.yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: myapp-monitor
  name: myapp-monitor
  namespace: myapp
spec:
  endpoints:
  - interval: 30s
    path: /metrics
    port: 9080-tcp
  namespaceSelector:
    matchNames:
    - myapp
  selector:
    matchLabels:
      app: myapp
The following guide has been tested with OKD 3.11/Kabanero 0.2.0.
For application monitoring on OKD (OpenShift Origin Community Distribution), you need to set up your own Prometheus and Grafana deployments. This guide explores two approaches for setting up Prometheus on OKD: the Prometheus Operator and a legacy deployment.
Deploy a Sample Application with MP Metrics Endpoint
Prior to deploying Prometheus, ensure that there is a running application that has a service endpoint for outputting metrics in Prometheus format.
It is assumed that such an application has been deployed to the OKD cluster inside a project/namespace called myapp, and that the Prometheus metrics endpoint is exposed on the /metrics path.
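The ServiceMonitor used in this guide selects a Service labeled app: myapp that exposes a port named 9080-tcp. As a hedged sketch, such a Service might look like the following; the Service name and target port are assumptions about the sample application.

```yaml
# Hypothetical Service for the sample application; the Service name
# and target port are assumptions. The label (app: myapp) and port
# name (9080-tcp) must match what the ServiceMonitor selects.
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
  labels:
    app: myapp
spec:
  ports:
  - name: 9080-tcp
    port: 9080
    protocol: TCP
    targetPort: 9080
  selector:
    app: myapp
```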
Option A: Deploy Prometheus - Prometheus Operator
prometheus.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus-operator
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus-operator
spec:
  serviceAccountName: prometheus
  serviceMonitorNamespaceSelector:
    matchLabels:
      prometheus: monitoring
  serviceMonitorSelector:
    matchExpressions:
    - key: k8s-app
      operator: Exists
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: prometheus-operator
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
prometheus_snippet.yaml:

serviceMonitorNamespaceSelector:
  matchLabels:
    prometheus: monitoring
serviceMonitorSelector:
  matchExpressions:
  - key: k8s-app
    operator: Exists
The Prometheus Operator is an open-source project originating from CoreOS and exists as part of their Kubernetes Operator offering. The Kubernetes Operator framework is becoming the standard for Prometheus deployments on Kubernetes. With the Prometheus Operator installed on the Kubernetes system, you no longer need to hand-configure the Prometheus configuration. Instead, you create a ServiceMonitor resource for each service endpoint that needs to be monitored, which makes daily maintenance of the Prometheus server a lot easier. An architecture overview of the Prometheus Operator is shown below:
There are two ways to install the Prometheus Operator:
One is through the Operator Lifecycle Manager (OLM), which is still in its technology preview phase in release 3.11 of OKD.
The other is to install the Prometheus Operator by following the guide from the Prometheus Operator git repository.
Since OLM is still at its technology preview stage, this guide shows the installation without OLM. The guide will be updated with the OLM approach when Kabanero officially adopts OKD 4.x.
Prometheus Operator Installation
The following procedure is based on the Prometheus Getting Started guide maintained by the CoreOS team, with the added inclusion of OpenShift commands needed to complete each step.
Clone the Prometheus Operator repository
git clone https://github.com/coreos/prometheus-operator
Create a new namespace for our Prometheus Operator deployment.
oc new-project prometheus-operator
Edit the bundle.yaml file and change all instances of namespace: default to the newly created namespace, namespace: prometheus-operator.
Add the line - --deny-namespaces=openshift-monitoring to the existing containers args section of the Prometheus Operator's Deployment definition in the bundle.yaml file. The --deny-namespaces argument excludes the listed namespaces from being watched by the Prometheus Operator. By default, the Prometheus Operator oversees Prometheus deployments across all namespaces, which can be problematic when there are multiple Prometheus Operator deployments on the OKD cluster. For instance, OKD's Cluster Monitoring feature also deploys a Prometheus Operator in the openshift-monitoring namespace, so that namespace should be excluded by our Prometheus Operator to prevent undesired behavior.
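After the edit, the container's args section in bundle.yaml would look roughly like the following fragment; the existing arguments vary by Prometheus Operator version, so only the added line is definitive.

```yaml
# Fragment of the Prometheus Operator Deployment in bundle.yaml.
# Existing args differ between releases; only --deny-namespaces is added.
containers:
- name: prometheus-operator
  args:
  # ...existing args from the upstream bundle.yaml...
  - --deny-namespaces=openshift-monitoring
```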
Save the bundle.yaml file and deploy the Prometheus Operator using the following command.
oc apply -f bundle.yaml
You may receive an error message like the one below when running the command.
Error creating: pods "prometheus-operator-5b8bfd696-" is forbidden: unable to validate against any security context constraint: [spec.containers.securityContext.securityContext.runAsUser: Invalid value: 65534: must be in the ranges: [1000070000, 1000079999]]
To correct the error, change the runAsUser: 65534 field in the bundle.yaml file to a valid value in the range specified in the error message. In this case, setting runAsUser: 1000070000 in the bundle.yaml file would be within the valid range. Save the bundle.yaml file and re-deploy the Prometheus Operator.
oc delete -f bundle.yaml
oc apply -f bundle.yaml
The service_monitor.yaml file defines a ServiceMonitor resource. A ServiceMonitor describes a service endpoint that needs to be monitored by the Prometheus instance. In this example, an application with the label app: myapp from the namespace myapp is monitored, with the metrics endpoints to be scraped defined in spec.endpoints. If the metrics endpoint is secured, you can define a secured endpoint with authentication configuration by following the Endpoint API documentation of the Prometheus Operator.
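For illustration, a secured endpoint using basic authentication could reference credentials stored in a Secret. This is a hedged sketch: the Secret name (myapp-metrics-auth) and its keys are assumptions.

```yaml
# Sketch of a secured endpoint entry in a ServiceMonitor.
# The referenced Secret (myapp-metrics-auth) and its keys are
# assumptions; create them to match your application's credentials.
endpoints:
- interval: 30s
  path: /metrics
  port: 9080-tcp
  basicAuth:
    username:
      name: myapp-metrics-auth
      key: username
    password:
      name: myapp-metrics-auth
      key: password
```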
Apply the service_monitor.yaml file to create the ServiceMonitor resource.
oc apply -f service_monitor.yaml
Define a Prometheus resource that can scrape the targets defined in the ServiceMonitor resource. Create a prometheus.yaml file that aggregates all the files from the git repository directory prometheus-operator/example/rbac/prometheus/. NOTE: Make sure to change all instances of namespace: default in those files to the newly created namespace, namespace: prometheus-operator. Apply the prometheus.yaml file to deploy the Prometheus service. After all the resources are created, re-apply the Prometheus Operator bundle.
oc apply -f prometheus.yaml
oc apply -f bundle.yaml
Verify that the Prometheus services have successfully started. The prometheus-operated service is created automatically by the prometheus-operator, and is used for registering all deployed Prometheus instances.
oc get svc -n prometheus-operator

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
prometheus            NodePort    172.30.112.199   <none>        9090:30342/TCP   19h
prometheus-operated   ClusterIP   None             <none>        9090/TCP         19h
prometheus-operator   ClusterIP   None             <none>        8080/TCP         21h
Expose the prometheus-operated service to use the Prometheus console externally.
[root@rhel7-okd]# oc expose svc/prometheus-operated -n prometheus-operator
[root@rhel7-okd]# oc get route -n prometheus-operator

NAME         HOST/PORT                                                    PATH   SERVICES     PORT   TERMINATION   WILDCARD
prometheus   prometheus-prometheus-operator.apps.184.108.40.206.nip.io           prometheus   web                  None
Visit the prometheus route and go to the Prometheus targets page. At this point, the page should be empty with no endpoints being discovered.
Look at the prometheus.yaml file and update the serviceMonitorNamespaceSelector and serviceMonitorSelector fields. A ServiceMonitor must satisfy the matching requirements of both selectors before it can be picked up by the Prometheus service, as shown in the prometheus_snippet.yaml file. In this case, our ServiceMonitor already has the k8s-app label, but the target namespace myapp is missing the required prometheus: monitoring label. Re-apply the prometheus.yaml file to reflect any selector changes.
Add the label to the "myapp" namespace.
[root@rhel7-okd]# oc label namespace myapp prometheus=monitoring
Check to see that the Prometheus targets page is picking up the target endpoints. If the service endpoint is discovered, but Prometheus is reporting a DOWN status, you need to make the prometheus-operator project globally accessible.
oc adm pod-network make-projects-global prometheus-operator
Option B: Deploy Prometheus - Legacy deployments
For users who have just migrated their applications to OKD and maintain their own Prometheus configuration file, the Prometheus Operator is not the only option for deploying Prometheus. You can deploy Prometheus by using the example yaml file provided in the OKD GitHub repository.
oc new-project prometheus
Deploy Prometheus using the sample prometheus.yaml file from the OKD GitHub repository.
oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/prometheus/prometheus.yaml -p NAMESPACE=prometheus
Edit the "prometheus" ConfigMap resource from the prometheus namespace.
oc edit configmap/prometheus -n prometheus
Remove all existing jobs and add the following
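The job configuration itself is not reproduced above. As a hedged sketch, an annotation-driven scrape job consistent with the pod annotations this approach relies on could look like the following; adjust it to match your actual ConfigMap.

```yaml
# Hedged sketch of an annotation-driven scrape job. Pods opt in to
# scraping via prometheus.io/* annotations on their metadata.
scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only pods annotated with prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # Override the metrics path from prometheus.io/path, if set
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Override the scrape port from prometheus.io/port, if set
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```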
Kill the existing Prometheus pod, or better yet, reload the Prometheus service gracefully using the command below for the new configuration to take effect.
oc exec prometheus-0 -c prometheus -- curl -X POST http://localhost:9090/-/reload
Make sure the monitored application's pods are started with the following annotations as specified in the prometheus ConfigMap's scrape_configs.
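Assuming the conventional prometheus.io annotation names and the sample application's /metrics endpoint on port 9080, the pod template annotations might look like this; it is a sketch, not the exact contents of the ConfigMap's scrape_configs.

```yaml
# Hedged sketch: conventional prometheus.io annotations on the pod
# template. The path and port match the sample application's
# metrics endpoint described earlier in this guide.
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "9080"
```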
Verify the scrape target is up and available in Prometheus by using Prometheus’s web console as follows: Click Console → Status → Targets.
If the service endpoint is discovered, but Prometheus is reporting a DOWN status, you need to make the prometheus project globally accessible.
oc adm pod-network make-projects-global prometheus
Deploy Grafana
Regardless of which approach was used to deploy Prometheus on OKD, use Grafana dashboards to visualize the metrics. Use the sample grafana.yaml file provided by the OKD GitHub repository to install Grafana. NOTE: Perform the following steps to ensure that Prometheus endpoints are reachable as a data source in Grafana.
Create a new project called grafana.
oc new-project grafana
Deploy Grafana using the grafana.yaml file from the OKD GitHub repository.
oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/grafana/grafana.yaml -p NAMESPACE=grafana
Grant the grafana service account view access to the prometheus namespace. (If you deployed with the Prometheus Operator in Option A, grant access to the prometheus-operator namespace instead.)
oc policy add-role-to-user view system:serviceaccount:grafana:grafana -n prometheus
For Grafana to add existing Prometheus datasources in OKD, define the datasources in a ConfigMap resource under the grafana namespace. Create a ConfigMap yaml file called grafana-datasources.yaml, then apply it to create the ConfigMap resource.
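The contents of grafana-datasources.yaml are not reproduced above. As a hedged sketch, a ConfigMap carrying a Grafana datasource provisioning file could look like the following; the datasource name, Prometheus URL, and bearer token placeholder are assumptions to adapt to your cluster.

```yaml
# Hedged sketch of a Grafana datasource provisioning ConfigMap.
# The URL and token placeholder are assumptions; substitute your
# Prometheus route/service and the grafana service account token.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: grafana
data:
  prometheus.yaml: |-
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      url: https://prometheus-prometheus-operator.apps.184.108.40.206.nip.io
      isDefault: true
      jsonData:
        httpHeaderName1: Authorization
      secureJsonData:
        # Paste the token obtained from `oc sa get-token grafana` here.
        httpHeaderValue1: "Bearer <grafana-service-account-token>"
```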
oc apply -f grafana-datasources.yaml
Acquire the grafana service account token by using the following command.
oc sa get-token grafana
Add the ConfigMap resource to the Grafana deployment and mount it into Grafana's datasource provisioning directory.
Save and test the data source. You should see 'Datasource is working'.
You can now consume all the application metrics gathered by Prometheus on the Grafana dashboard.