Application Monitoring on Red Hat OpenShift Container Platform (RHOCP) 4.2 with Prometheus and Grafana

Duration: 60 minutes

Introduction

The following guide has been tested with RHOCP 4.2/Kabanero 0.3.0.

For application monitoring on RHOCP, you need to set up your own Prometheus and Grafana deployments. Both Prometheus and Grafana can be set up via Operator Lifecycle Manager (OLM).

Deploy an Application with a MicroProfile (MP) Metrics Endpoint

Prior to deploying Prometheus, ensure that there is a running application that has a service endpoint for outputting metrics in Prometheus format.

This guide assumes that such an application is already deployed to the RHOCP cluster in a project (namespace) called myapp, and that its Prometheus metrics endpoint is exposed on the path /metrics.
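
The ServiceMonitor used later in this guide finds the application through its Service rather than through its pods, so the Service needs a label and a named port to match against. A minimal sketch of such a Service, written from the shell, might look like the following. The Service name example-app, the pod selector, and the file name example_app_service.yaml are assumptions for illustration; only the namespace myapp, the label app: example-app, and the port name 9080-tcp come from this guide.

```shell
# Write a hypothetical Service manifest for the sample application.
# Assumed for illustration: the Service name, the pod selector, and the file name.
cat > example_app_service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: example-app          # hypothetical name
  namespace: myapp           # project assumed by this guide
  labels:
    app: example-app         # matched by the ServiceMonitor's spec.selector
spec:
  selector:
    app: example-app         # hypothetical pod label
  ports:
    - name: 9080-tcp         # matched by name in the ServiceMonitor's endpoints
      port: 9080
      targetPort: 9080
      protocol: TCP
EOF

# Show what was written.
cat example_app_service.yaml
```

If the sketch fits your application, it could be applied with oc apply -f example_app_service.yaml.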

Deploy Prometheus - Prometheus Operator

service_monitor.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  labels:
    k8s-app: myapp-monitor
  namespace: prometheus-operator
spec:
  namespaceSelector:
    matchNames:
      - myapp
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - interval: 30s
      path: /metrics
      port: 9080-tcp
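
When the metrics endpoint requires authentication, the endpoints entry of a ServiceMonitor like the one above can reference credentials stored in a Secret. The sketch below writes a variant that scrapes over HTTPS with basic authentication. The Secret name metrics-credentials, its keys, the port name 9443-tcp, and the file name are assumptions, not part of the original guide; the Secret is expected to live in the same namespace as the ServiceMonitor. Skipping certificate verification is shown only to keep the sketch short.

```shell
# Write a hypothetical secured variant of service_monitor.yaml.
cat > service_monitor_secured.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  labels:
    k8s-app: myapp-monitor
  namespace: prometheus-operator
spec:
  namespaceSelector:
    matchNames:
      - myapp
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - interval: 30s
      path: /metrics
      port: 9443-tcp                 # hypothetical name of the HTTPS service port
      scheme: https
      basicAuth:
        username:
          name: metrics-credentials  # hypothetical Secret in prometheus-operator
          key: username
        password:
          name: metrics-credentials
          key: password
      tlsConfig:
        insecureSkipVerify: true     # assumption: self-signed certificate
EOF

# Show what was written.
cat service_monitor_secured.yaml
```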

service_account.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  # Set explicitly so that it matches the ClusterRoleBinding subject below.
  namespace: prometheus-operator
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: prometheus-operator
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus-operator
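
Once the manifest above has been applied, it can be useful to confirm that the binding really grants the service account read access in the application namespace. The helper below is a sketch around oc auth can-i that impersonates the prometheus service account. The function name sa_can and its default namespace argument are assumptions for illustration, and impersonation itself requires sufficient privileges, for example a cluster-admin login.

```shell
# Ask the API server whether the prometheus service account may perform
# a verb on a resource. Usage: sa_can VERB RESOURCE [NAMESPACE]
sa_can() {
  oc auth can-i "$1" "$2" \
    --as="system:serviceaccount:prometheus-operator:prometheus" \
    --namespace="${3:-myapp}"
}
```

For example, sa_can list pods and sa_can watch endpoints should both answer yes if the ClusterRole and ClusterRoleBinding are in effect.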

prometheus.yaml

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    prometheus: k8s
  namespace: prometheus-operator
spec:
  replicas: 2
  serviceAccountName: prometheus
  securityContext: {}
  serviceMonitorSelector:
    matchExpressions:
      - key: k8s-app
        operator: Exists
  ruleSelector:
    matchLabels:
      role: prometheus-rulefiles
      prometheus: k8s
  alerting:
    alertmanagers:
      - namespace: prometheus-operator
        name: alertmanager-main
        port: web

The Prometheus Operator is an open-source project that originated at CoreOS as part of its Kubernetes Operator framework. Using an Operator is the preferred way to deploy Prometheus on Kubernetes. With the Prometheus Operator installed, you no longer need to write the Prometheus configuration by hand. Instead, you create a ServiceMonitor resource for each service endpoint that needs to be monitored, which makes day-to-day maintenance of the Prometheus server much easier. An architecture overview of the Prometheus Operator is shown below:

Figure: Prometheus Operator architecture
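
To make the contrast concrete, the sketch below writes roughly the kind of scrape configuration one would otherwise maintain by hand in the Prometheus configuration file for a single service: endpoints discovery restricted to one namespace, plus relabeling rules that keep only the wanted Service label and port name. The job name and file name are assumptions, the values mirror the myapp example used in this guide, and the configuration the operator actually generates is more elaborate.

```shell
# Write an illustrative hand-maintained scrape configuration
# (not the operator's real output).
cat > handwritten_scrape_config.yaml <<'EOF'
scrape_configs:
  - job_name: myapp                  # hypothetical job name
    scrape_interval: 30s
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - myapp
    relabel_configs:
      # Keep only endpoints whose Service carries the label app=example-app.
      - source_labels: [__meta_kubernetes_service_label_app]
        regex: example-app
        action: keep
      # Keep only the port named 9080-tcp.
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        regex: 9080-tcp
        action: keep
EOF

# Show what was written.
cat handwritten_scrape_config.yaml
```

Every additional service would need another block like this, which is the bookkeeping a ServiceMonitor takes over.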

With Operator Lifecycle Manager (OLM), the Prometheus Operator can be installed and configured from the RHOCP web console.

Install Prometheus Operator Using Operator Lifecycle Manager (OLM)

The following procedure is based on the article "Using the Operator Lifecycle Manager to deploy Prometheus on OpenShift", with the addition of the OpenShift commands needed to complete each step.

  1. Create a new namespace for the Prometheus Operator deployment:

    oc new-project prometheus-operator

  2. In the OpenShift Container Platform web console, click Operators > OperatorHub. OLM lets you pull, install, and subscribe to Operators on the cluster. Ensure that the Project is set to prometheus-operator. Search for Prometheus Operator and install it. Under "A specific namespace on the cluster", choose prometheus-operator, then subscribe.

  3. Click Overview and create a ServiceMonitor instance. A ServiceMonitor defines a service endpoint that the Prometheus instance should monitor.

  4. In the ServiceMonitor YAML file, make sure metadata.namespace is your monitoring namespace, in this case prometheus-operator. Configure spec.namespaceSelector and spec.selector to match the namespace and label of your application deployment. For example, the ServiceMonitor in service_monitor.yaml monitors an application with the label app: example-app in the namespace myapp. If the metrics endpoint is secured, you can define a secured endpoint with authentication configuration by following the endpoint API documentation of the Prometheus Operator.

    Refer to the service_monitor.yaml file.

  5. Create a Service Account with a ClusterRole and a ClusterRoleBinding so that Prometheus has permission to get nodes and pods in other namespaces at cluster scope. Create the YAML file as shown in service_account.yaml and apply it:

    oc apply -f service_account.yaml

  6. Click Overview and create a Prometheus instance. A Prometheus resource scrapes the targets defined in ServiceMonitor resources.

  7. In the Prometheus YAML file, make sure metadata.namespace is prometheus-operator and that spec.serviceAccountName is the name of the Service Account you applied in step 5. Under spec.serviceMonitorSelector.matchExpressions you can set a match expression that selects which ServiceMonitors this Prometheus instance picks up, as in prometheus.yaml.

    Refer to the prometheus.yaml file.

  8. Verify that the Prometheus services have started successfully:

    [root@rhel7-ocp]# oc get svc -n prometheus-operator
    NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    prometheus-operated   ClusterIP   None             <none>        9090/TCP         19h

  9. Check the logs of one of the Prometheus pods to confirm that the server is running properly:

    [root@rhel7-ocp]# oc get pods -n prometheus-operator
    NAME                                   READY     STATUS    RESTARTS   AGE
    prometheus-operator-7fccbd7c74-48m6v   1/1       Running   0          19h
    prometheus-prometheus-0                3/3       Running   1          19h
    prometheus-prometheus-1                3/3       Running   1          19h
    [root@rhel7-ocp]# oc logs prometheus-prometheus-0 -c prometheus -n prometheus-operator

  10. Expose the prometheus-operated service so that the Prometheus console can be used externally:

    [root@rhel7-ocp]# oc expose svc/prometheus-operated -n prometheus-operator
    route.route.openshift.io/prometheus-operated exposed
    [root@rhel7-ocp]# oc get route -n prometheus-operator
    NAME         HOST/PORT                                                 PATH      SERVICES     PORT      TERMINATION   WILDCARD
    prometheus   prometheus-prometheus-operator.apps.9.37.135.153.nip.io             prometheus   web                     None

  11. Open the Prometheus route in a browser and go to the targets page. Check that it lists the target endpoints.

Figure: Prometheus targets page
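
The same check can be made from a terminal through the Prometheus HTTP API, whose /api/v1/targets endpoint reports a health field for every active target. The helper below is a sketch that counts the healthy targets in such a response; the function name is an assumption, and the simple text match stands in for a real JSON parser.

```shell
# Count targets reported as healthy in a /api/v1/targets response read from stdin.
count_up_targets() {
  grep -o '"health": *"up"' | wc -l | tr -d ' '
}
```

Assuming the route created in step 10 is named prometheus-operated, the helper could be used like this:

    curl -s "http://$(oc get route prometheus-operated -n prometheus-operator -o jsonpath='{.spec.host}')/api/v1/targets" | count_up_targets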

Deploy Grafana

grafana_datasource.yaml

apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
  name: grafana-datasource
  namespace: prometheus-operator
spec:
  datasources:
    - access: proxy
      editable: true
      isDefault: true
      jsonData:
        timeInterval: 5s
      name: Prometheus
      type: prometheus
      url: 'http://prometheus-operated:9090'
      version: 1
  name: grafana-datasources.yaml

grafana.yaml

apiVersion: integreatly.org/v1alpha1
kind: Grafana
metadata:
  name: grafana
  namespace: prometheus-operator
spec:
  ingress:
    enabled: true
  config:
    auth:
      disable_signout_menu: true
    auth.anonymous:
      enabled: true
    log:
      level: warn
      mode: console
  dashboardLabelSelector:
    - matchExpressions:
        - key: app
          operator: In
          values:
            - grafana

grafana_dashboard.yaml

apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  labels:
    app: grafana
  name: template-dashboard
  namespace: prometheus-operator
spec:
  json: |
    {
      "__inputs": [
        {
          "name": "Prometheus",
          "label": "Prometheus",
          "description": "",
          "type": "datasource",
          "pluginId": "prometheus",
          "pluginName": "Prometheus"
        }
      ],
      "__requires": [
        {
          "type": "grafana",
          "id": "grafana",
          "name": "Grafana",
          "version": "5.2.0"
        },
        {
          "type": "panel",
          "id": "graph",
          "name": "Graph",
          "version": "5.0.0"
        },
        {
          "type": "datasource",
          "id": "prometheus",
          "name": "Prometheus",
          "version": "5.0.0"
        },
        {
          "type": "panel",
          "id": "table",
          "name": "Table",
          "version": "5.0.0"
        }
      ],
      "annotations": {
        "list": [
          {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "type": "dashboard"
          }
        ]
      },
      "editable": true,
      "gnetId": null,
      "graphTooltip": 0,
      "id": 1,
      "iteration": 1569353980677,
      "links": [],
      "panels": [
        {
          "columns": [],
          "datasource": "Prometheus",
          "fontSize": "100%",
          "gridPos": {
            "h": 6,
            "w": 24,
            "x": 0,
            "y": 2
          },
          "id": 26,
          "links": [],
          "options": {},
          "pageSize": null,
          "scroll": true,
          "showHeader": true,
          "sort": {
            "col": 0,
            "desc": true
          },
          "styles": [
            {
              "alias": "Time",
              "dateFormat": "YYYY-MM-DD HH:mm:ss",
              "pattern": "Time",
              "type": "date"
            },
            {
              "alias": "Namespace",
              "pattern": "namespace",
              "type": "custom"
            },
            {
              "alias": "Service",
              "pattern": "service",
              "type": "custom"
            },
            {
              "alias": "Endpoint",
              "pattern": "endpoint",
              "type": "custom"
            },
            {
              "alias": "",
              "mappingType": 1,
              "pattern": "Value",
              "type": "hidden",
              "unit": "short"
            }
          ],
          "targets": [
            {
              "expr": "count(up) by (namespace, service, endpoint)",
              "format": "table",
              "instant": true,
              "intervalFactor": 1,
              "refId": "A"
            }
          ],
          "title": "Endpoints",
          "transform": "table",
          "type": "table"
        }
      ],
      "refresh": "10s",
      "schemaVersion": 19,
      "style": "dark",
      "tags": [],
      "time": {
        "from": "now-15m",
        "to": "now"
      },
      "timepicker": {
        "refresh_intervals": [
          "1m",
          "5m",
          "15m",
          "1h",
          "1d"
        ]
      },
      "title": "Template-Dashboard",
      "uid": "Template-Dashboard"
    }
  name: template-dashboard.json
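
The table panel in the dashboard above is driven by the query count(up) by (namespace, service, endpoint). Sending the same query straight to Prometheus can help separate dashboard problems from scraping problems. The helper below is a sketch that uses the instant-query endpoint of the Prometheus HTTP API; the function name prom_query and the host argument are assumptions.

```shell
# Run an instant PromQL query against a Prometheus host via the HTTP API.
# Usage: prom_query HOST 'EXPR'
prom_query() {
  # -G sends the URL-encoded data as a query string on a GET request.
  curl -sG "http://$1/api/v1/query" --data-urlencode "query=$2"
}
```

Called with the host of the Prometheus route and the panel's expression, it returns the JSON that the Endpoints table is built from.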

Use Grafana dashboards to visualize the metrics. Perform the following steps to deploy Grafana and ensure that Prometheus endpoints are reachable as a data source in Grafana.

  1. Switch to the namespace used for the Prometheus Operator deployment:

    oc project prometheus-operator

  2. In the OpenShift Container Platform web console, click Operators > OperatorHub. Search for Grafana Operator and install it. Under "A specific namespace on the cluster", choose prometheus-operator, then subscribe.

  3. Click Overview and create a Grafana Data Source instance.

  4. In the Grafana Data Source YAML file, make sure metadata.namespace is prometheus-operator. Set the url field of the entry under spec.datasources to the URL of the target data source. For example, in grafana_datasource.yaml the Prometheus service is prometheus-operated on port 9090, so url is set to 'http://prometheus-operated:9090'.

    Refer to the grafana_datasource.yaml file.

  5. Click Overview and create a Grafana instance.

  6. In the Grafana YAML file, make sure metadata.namespace is prometheus-operator. Under spec.dashboardLabelSelector.matchExpressions you can define a match expression that selects which dashboards Grafana loads. For example, with grafana.yaml, Grafana discovers dashboards whose app label has the value grafana.

    Refer to the grafana.yaml file.

  7. Click Overview and create a Grafana Dashboard instance.

  8. Copy the contents of grafana_dashboard.yaml into the Grafana Dashboard YAML file. This template dashboard lets you check that the Data Source is connected and that the Prometheus endpoints are discoverable.

    Refer to the grafana_dashboard.yaml file.

  9. Click Networking > Routes and open the Grafana location to see the template dashboard. You can now view the application metrics gathered by Prometheus on the Grafana dashboard.

Figure: Template dashboard

  10. To import your own Grafana dashboard, place its JSON under spec.json in the Grafana Dashboard YAML file. Make sure that each name under "__inputs" matches the name of a data source listed under spec.datasources in your Grafana Data Source. For example, in grafana_dashboard.yaml the name is set to "Prometheus".
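
One way to check this before applying a dashboard is to list the data source names its panels refer to and compare them with the name set in the Grafana Data Source. The helper below is a rough sketch that relies on plain text matching rather than a JSON parser; the function name is an assumption.

```shell
# Print the distinct data source names referenced in a dashboard file.
# Usage: panel_datasources FILE
panel_datasources() {
  grep -o '"datasource": *"[^"]*"' "$1" \
    | sed -e 's/^"datasource": *"//' -e 's/"$//' \
    | sort -u
}
```

Run against grafana_dashboard.yaml from this guide, it lists -- Grafana -- (Grafana's built-in annotation source) and Prometheus, the latter matching the name given in grafana_datasource.yaml.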

Git clone this repo to get going right away:

    git clone https://github.com/Kabanero-io/guide-app-monitoring-ocp4.2.git