Monitoring OpenShift pod restarts with Prometheus/AlertManager and kube-state-metrics

Prometheus is becoming the go-to solution for monitoring OpenShift. This article does not cover how to set up Prometheus on OpenShift, since other articles already do. You can check this git repository to see how to install Prometheus on OpenShift with Grafana dashboards and Alert Manager enabled.

When installed on OpenShift, Prometheus can run as a single pod that collects (or “scrapes”, in Prometheus terminology) metrics from different providers (called “exporters”). In this git repository, we set up node-exporter as an exporter so that Prometheus gets node metrics, along with alerts and Grafana dashboards to monitor them. It also comes with some basic alerts that check a node’s filesystem or CPU usage.

When you run OpenShift, it is very valuable to monitor pod restarts, because frequent restarts are often a sign of a malfunction. To do so, we deploy another exporter that exposes a convenient set of metrics from the Kubernetes API. Fortunately, there is a Kubernetes project named kube-state-metrics which exposes these metrics.

kube-state-metrics needs to be deployed as a DeploymentConfig and exposed as a service. Then, annotate this service so that it can be scraped by Prometheus:

oc create -f - << EOF
apiVersion: v1
kind: DeploymentConfig
metadata:
  namespace: monitoring
  name: kube-state-metrics
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: kube-state-metrics
    spec:
      containers:
      - name: kube-state-metrics
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
EOF
oc expose dc kube-state-metrics --port=8080
oc annotate svc kube-state-metrics'true'

Then, you can define the following alert, and you will be notified whenever a pod restarts more than once within the last 2 hours:

pod-restart.rules: |
  ALERT PodRestartingTooOften
    IF rate(kube_pod_container_status_restarts[2h]) * 7200 > 1
    FOR 1m
    LABELS {severity="page"}
    ANNOTATIONS {DESCRIPTION="Pod {{$labels.namespace}}/{{$labels.pod}} restarted more than once during the last 2 hours.",
    SUMMARY="Pod {{$labels.namespace}}/{{$labels.pod}} restarted more than once during the last 2 hours."}
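
To see why the expression multiplies by 7200: rate() returns an average per-second increase over the 2h window, so multiplying by the window length in seconds (2 × 3600 = 7200) gives roughly the number of restarts during that window. Here is a small sketch of that arithmetic, where restarts_in_window is a hypothetical helper name and the sample rate is made up:

```shell
# Hypothetical helper (not part of Prometheus): turns a per-second rate into
# an estimated number of restarts over a window, like "rate(...[2h]) * 7200".
restarts_in_window() {
  # $1 = per-second rate, $2 = window length in seconds
  LC_ALL=C awk -v rate="$1" -v window="$2" 'BEGIN { printf "%.2f\n", rate * window }'
}

# Made-up sample: a rate of 0.0005 restarts per second over 2 hours.
restarts_in_window 0.0005 7200
```

With this sample the estimate is 3.60 restarts, which is above the threshold of 1, so the alert would start pending.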

Disk space not reclaimed after deleting log files

Hello World,

If you run out of disk space, delete log files, and still don’t see the disk space reclaimed, you may have hit an issue that I faced with rsyslog not releasing rotated files.

To be sure:

lsof | grep deleted

In the first column, you will see the process that still holds an open file descriptor on the deleted file, which prevents the disk space from being reclaimed even after deletion.

The solution: systemctl restart rsyslog.service or kill the guilty process.
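
The behaviour can be reproduced on a scratch file. The sketch below assumes a Linux system with /proc mounted; the temporary file and the descriptor number 9 are arbitrary choices for the demonstration and have nothing rsyslog-specific.

```shell
# Reproduce "deleted but still open" on a scratch file (Linux, needs /proc).
tmp=$(mktemp)
exec 9>"$tmp"                # hold a write descriptor open, as a daemon would
echo "some log line" >&9
rm "$tmp"                    # the name disappears, the data blocks do not
target=$(readlink /proc/self/fd/9)
echo "$target"               # the link target now ends with "(deleted)"
exec 9>&-                    # closing the descriptor lets the space be freed
```

Restarting the service, as suggested above, has the same effect as the last line: the process closes its descriptors and the kernel can release the blocks.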

Run sshd and openshift-router on the same port using HAProxy on CentOS7

TL;DHTTW (Don’t Have Time To Write 🙂 )

Disable firewalld and use only iptables, because the two interact in non-trivial ways that complicate things:
sudo systemctl stop firewalld && sudo systemctl start iptables; sudo systemctl start ip6tables

oc cluster up --version=v3.3 --metrics --public-hostname= --use-existing-config

Change router default port:

oc env dc/router ROUTER_SERVICE_HTTPS_PORT=9443

Also edit the router dc and change hostNetwork: true to false and hostPort from 443 to 9443.

Then, here is the haproxy.cfg that you may need:

global
    log local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/
    maxconn     4000
    user        haproxy
    group       haproxy

    stats socket /var/lib/haproxy/stats

defaults
    log                     global
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

listen ssl :443
  tcp-request inspect-delay 4s
  acl is_ssl req_ssl_ver 2:3.1
  tcp-request content accept if is_ssl
  use_backend ssh if !is_ssl
  server www-ssl
  timeout client 2h
backend ssh
  mode tcp
  server ssh :22
  timeout server 2h
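
The listen section works by looking at the first bytes the client sends: a TLS client opens with a handshake record, while anything else (including a client that sends nothing during the 4 second inspection delay) is handed to the ssh backend. The sketch below mimics that decision in a much simplified way. classify_first_byte is a hypothetical helper, and it only checks the TLS record type byte (0x16) rather than the full version test done by req_ssl_ver.

```shell
# Hypothetical helper: reads the start of a client stream on stdin and prints
# the side a simplified version of the rule above would choose.
classify_first_byte() {
  first=$(head -c 1 | od -An -tx1 | tr -d ' \n')
  if [ "$first" = "16" ]; then
    echo ssl    # 0x16 is the TLS "handshake" record type
  else
    echo ssh    # anything else, or no data at all
  fi
}

printf '\026\003\001' | classify_first_byte             # start of a TLS ClientHello
printf 'SSH-2.0-OpenSSH_7.4\r\n' | classify_first_byte  # SSH identification string
```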

And finally, you will need to allow HAProxy to use port 443 by setting the following SELinux boolean:

setsebool -P haproxy_connect_any 1

Improve your build speed: Run a proxy in OpenShift

Many build processes use external source code or library repositories that are only available on the internet. That is the case for NPM (Node Package Manager, used to build NodeJS applications) or Maven (when building Java applications).

Thus, running an HTTP proxy inside OpenShift can be helpful in many cases:
– in a corporate environment, it is not unusual to face a proxy that requires authentication. Even if the OpenShift build mechanism supports it, you will have to put your credentials somewhere, and they may be visible in logs or source code
– your corporate proxy will certainly not cache all the artefacts that you frequently use, so caching them in your own proxy may save you several minutes of build time and several gigabytes of downloads

In this post, we will learn how to set up an HTTP/HTTPS proxy in OpenShift that can forward requests to an upstream corporate HTTP proxy and also act as a cache, but with no persistent volume.

A CentOS/Squid docker image

Unfortunately, I was not able to find a publicly available and reliable docker image that fits my needs, so I decided to write my own based on CentOS7 and squid. The sources are available on my GitHub repository and the image is on Docker Hub.
Some important features of this image:
– it can run as any UID, which is very good for OpenShift
– it exposes port 3128, like a usual squid proxy image
– it accepts an environment variable named CACHE_PEER which holds an upstream proxy URL in the form url_encoded_username:url_encoded_password:proxy_hostname:port
– it allows CONNECT for any traffic, mainly used for SSL, and it does not perform SSL interception (this is why we use CONNECT)
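
Since the username and password must be URL-encoded, building the CACHE_PEER value by hand is error-prone when the password contains characters such as @ or :. Here is a minimal sketch of how the value could be assembled. urlencode and cache_peer are hypothetical helper names, and the credentials and proxy host are made up.

```shell
# Hypothetical helper: percent-encodes every byte outside the unreserved set
# (letters, digits, "-", ".", "_", "~").
urlencode() {
  printf '%s' "$1" | od -An -tx1 -v | tr -s ' ' '\n' | while read -r hex; do
    [ -n "$hex" ] || continue
    case "$hex" in
      3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
        # unreserved byte: print it unchanged (via its octal escape)
        printf "\\$(printf '%03o' "0x$hex")" ;;
      *)
        printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
    esac
  done
}

# Hypothetical helper: assembles the value in the form described above.
cache_peer() {
  # $1 = username, $2 = password, $3 = proxy hostname, $4 = proxy port
  printf '%s:%s:%s:%s\n' "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4"
}

cache_peer 'jdoe' 'p@ss:word' 'proxy.example.com' 8080
```

The result can then be passed to the container as the CACHE_PEER environment variable.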
If you want to test it, simply run it with docker:

 docker run -d --name="proxy" -p 3128:3128 \
            -e "" \

Deploying the image in OpenShift

Now, let’s see how we can run this image in OpenShift so that other pods can use it. We will work in the “default” project to ensure that, whatever your configuration, all pods can access this new service.

oc project default
=> Now using project "default" on server "".

As stated previously, the image accepts an environment variable to target the upstream proxy server.
This is the only thing we need to start it. We just need to use the great oc new-app command with some arguments:

oc new-app --name=proxy \
                     -e ''
--> Found Docker image 95aeb47 (About an hour old) from for ""
    * An image stream will be created as "proxy:latest" that will track this image
    * This image will be deployed in deployment config "proxy"
    * Ports 3128/tcp will be load balanced by service "proxy"
--> Creating resources with label app=proxy ...
    ImageStream "proxy" created
    DeploymentConfig "proxy" created
    Service "proxy" created
--> Success
    Run 'oc status' to view your app.

The installation only takes the few minutes required to pull the image from Docker Hub. Once done, we can see the relevant pod using the oc get pods command:

oc get pods
NAME                    READY     STATUS    RESTARTS   AGE
proxy-1-vz1g3    1/1       Running   0          6m

Accessing your proxy from inside of the cluster

In a multi-tenant OpenShift cluster, pods in different namespaces are isolated and can’t reach each other, thanks to the network isolation feature provided by OpenShift-sdn. There is one exception: pods deployed in the “default” namespace can be reached by all other pods.
Moreover, OpenShift has an internal DNS which allows processes in pods to perform name resolution within the cluster.
Thanks to this mechanism, the cluster IP address of our proxy service can be resolved with the name proxy.default.svc.cluster.local (service name, then project name) and is reachable on port 3128.
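
The internal name follows the pattern service.project.svc.cluster.local, so it can be derived from the service and project names. A minimal sketch, assuming the service "proxy" in project "default" created above and the default cluster.local domain; svc_fqdn is a hypothetical helper name:

```shell
# Hypothetical helper: builds the cluster-internal DNS name of a Service.
svc_fqdn() {
  # $1 = service name, $2 = project (namespace)
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}

# Service "proxy" in project "default", as created by oc new-app above.
proxy_url="http://$(svc_fqdn proxy default):3128"
echo "$proxy_url"
```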

So, if you need, for example, to refer to the proxy in an STI-based build, just put the following lines in the .sti/environment file at the root of your project in git:


Then, enjoy a faster STI build.

Accessing your proxy from everywhere

In other cases, you want your proxy to be reachable from other systems that don’t run on OpenShift, like an external software factory. This scenario needs a special setup.

Indeed, HTTP proxies use a specific form of HTTP communication that cannot itself be relayed through another proxy. Thus, it is not possible to use an OpenShift Route and the openshift-router to expose our brand new proxy to the rest of the world.

However, there is a very powerful feature available in OpenShift to expose non-HTTP/HTTPS/SNI services on all nodes of the cluster: it is called NodePort.
NodePort is a special Service configuration that opens a given port on all OpenShift nodes and redirects traffic to the underlying pods using iptables and kube-proxy.

We will need to create a Service of type NodePort that uses port 31280: OpenShift has a reserved (configurable) port range for nodePort services. The default range is 30000 to 32767.

oc create -f - << EOF
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  creationTimestamp: null
  labels:
    app: proxy
  name: proxy-node-port
spec:
  ports:
  - name: 3128-tcp
    port: 3128
    protocol: TCP
    nodePort: 31280
  selector:
    app: proxy
    deploymentconfig: proxy
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
EOF

=>  You have exposed your service on an external port on all nodes in your
cluster.  If you want to expose this service to the external internet, you may
need to set up firewall rules for the service port(s) (tcp:31280) to serve traffic.

See for more details.
service "proxy-node-port" created

Now, your proxy can be reached on any node, on port 31280. If you do have a VIP or a LoadBalancer in front of your nodes, your service will even be load balanced.
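
If you pick a different port than 31280, it has to fall inside the nodePort range, otherwise the Service will be rejected. A tiny sketch of that check, assuming the default range of 30000 to 32767 has not been reconfigured; is_default_node_port is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds when a port lies in the default nodePort range.
is_default_node_port() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

for port in 3128 31280; do
  if is_default_node_port "$port"; then
    echo "$port: usable as nodePort"
  else
    echo "$port: outside the default nodePort range"
  fi
done
```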

Keep in mind that you may need to restrict access to this service to avoid its usage by unwanted people.

Deploy GitLab on OpenShift

GitLab is a great web-based Git repository application for anyone who wants to run their own Git repositories at home or at the office. Unfortunately, a home-made GitLab installation requires some skills that I don’t have. The good thing is that some Docker images exist on Docker Hub, including one from the GitLab team. In this blog post, we will use the sameersbn docker-gitlab image, which has proven to work very well, supports volumes and also comes with separate containers for postgresql and redis.

Installing postgres

For convenience (and also for support if you are using OpenShift Enterprise), we use the PostgreSQL image provided by the OpenShift team to start our postgresql instance. The image supports persistent volumes and will create the Persistent Volume Claim for you.

Simply use the oc new-app command to get your PostgreSQL instance up and running. Note that there is an issue with this image: it runs with a predefined user, so you will have to allow it to run as any UID by using the corresponding Security Context Constraint.

oc new-app --template=postgresql-persistent \
--> Deploying template postgresql-persistent in project openshift for "postgresql-persistent"
     With parameters:
--> Creating resources ...
    Service "postgresql" created
    Persistentvolumeclaims "postgresql" created
    DeploymentConfig "postgresql" created

Configuring Security Context

Some of the containers that we will use need to run as root (GitLab) or as another fixed user (Postgres has a hardcoded user 26, but this will be fixed soon).
Hence, the project’s service account (here we created a project called gitlab) must be added to the SCC named anyuid.

oc edit scc anyuid
runAsUser:
  type: RunAsAny
users:
- system:serviceaccount:gitlab:default

This configuration will work for both the postgres and gitlab images.

Installing Redis

Use the oc new-app command here again and directly pass it the Docker image name and you will get a redis instance up and running in seconds.

oc new-app  sameersbn/redis
    Service "redis" created
--> Success
    Run 'oc status' to view your app.
The new-app command will create Services, EndPoints and associated pods.
It is still required to add a persistent volume to the Deployment Configuration using the following command:
oc volume dc/redis --add --overwrite -t persistentVolumeClaim \
                        --claim-name=redis-data --name=redis-volume-1 \

Installing GitLab itself

The sameersbn image accepts several parameters to configure the GitLab instance being created.
For some reason, Service name resolution does not work with the provided startup script, although going into the container and pinging the services works.
So, we will inject the PostgreSQL and Redis Service IP addresses manually using these parameters.
To get the Service IP addresses:

oc get svc postgresql redis
NAME         CLUSTER_IP     EXTERNAL_IP   PORT(S)    SELECTOR                                     AGE
postgresql           5432/TCP   app=postgresql,deploymentconfig=postgresql   1h
redis           6379/TCP   app=redis,deploymentconfig=redis             1h

Use these IP addresses to start the GitLab container, again by using the new-app command.
One important thing to note: you need to use the --name parameter and set the name to anything other than gitlab. Otherwise, all the environment variables injected by OpenShift will be named GITLAB_*, and gitlab already uses some of those. In our case, the variables will be named GITLAB_CE_*, which avoids the trouble.

oc new-app sameersbn/gitlab --name=gitlab-ce \
                             -e 'GITLAB_HOST=' \
                             -e 'DB_TYPE=postgres' -e 'DB_HOST=' \
                             -e 'DB_PORT=5432'    -e 'DB_NAME=gitlab'   -e 'DB_USER=admin' \
                             -e 'DB_PASS=admin'   -e 'REDIS_HOST=' -e 'REDIS_PORT=6379' \
                             -e 'GITLAB_SECRETS_DB_KEY_BASE=1234567890' -e 'SMTP_ENABLED=true' \
                             -e '' -e 'SMTP_PORT=25' \
                             -e ''
    Service "gitlab-ce" created
--> Success
    Run 'oc status' to view your app.
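
The remark about --name comes from the way service variables are injected into pods: the service name is upper-cased, dashes become underscores, and suffixes such as _SERVICE_HOST, _SERVICE_PORT and _PORT are appended. The sketch below shows the resulting names for both choices. service_env_prefix is a hypothetical helper, and the idea that a variable like GITLAB_PORT clashes with the image’s own settings is my reading of the note above, not something the image documentation is quoted on here.

```shell
# Hypothetical helper: derives the prefix of the variables injected for a
# Service (upper case, dashes replaced by underscores).
service_env_prefix() {
  printf '%s\n' "$1" | tr 'a-z-' 'A-Z_'
}

for name in gitlab gitlab-ce; do
  prefix=$(service_env_prefix "$name")
  echo "$name -> ${prefix}_SERVICE_HOST, ${prefix}_SERVICE_PORT, ${prefix}_PORT"
done
```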

Of course, do not forget to add the volumes to make your repositories and logs persistent:

oc volumes dc/gitlab-ce --add --claim-name=gitlab-log --mount-path=/var/log/gitlab \
                     -t persistentVolumeClaim --overwrite
oc volumes dc/gitlab-ce --add --claim-name=gitlab-data --mount-path=/home/git/data \
                     -t persistentVolumeClaim --overwrite

A word on persistent volumes

The persistent volumes that you will have to create may require specific configuration, because both postgresql and gitlab use hardcoded uid/gid values and try to chown some files.
If you are using an NFS-backed Persistent Volume, you will run into permission denied errors on chmod and chown.
To work around this, you will have to add supplementalGroups in the DeploymentConfig’s SecurityContext:
- for postgres: add 26
- for gitlab-ce: add 1000
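
One way to apply these values without opening an editor is a patch on each DeploymentConfig. The sketch below only builds and prints the patch documents; the oc patch calls are shown as comments because they need a live cluster. supplemental_groups_patch is a hypothetical helper name, and the dc names are the ones used earlier in this post.

```shell
# Hypothetical helper: prints a patch that sets pod-level supplementalGroups.
supplemental_groups_patch() {
  # $1 = group id to add
  printf '{"spec":{"template":{"spec":{"securityContext":{"supplementalGroups":[%s]}}}}}\n' "$1"
}

supplemental_groups_patch 26      # for the postgresql DeploymentConfig
supplemental_groups_patch 1000    # for the gitlab-ce DeploymentConfig

# Possible use against a cluster (not run here):
#   oc patch dc/postgresql -p "$(supplemental_groups_patch 26)"
#   oc patch dc/gitlab-ce  -p "$(supplemental_groups_patch 1000)"
```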

You will then have to create the two persistent volumes, chown them to those UID/GID values and export them with the all_squash option.

chown -R 26:26 /srv/nfs/pv0001
chown -R 1000:1000 /srv/nfs/pv0002

cat >> /etc/exports << EOF
/srv/nfs/pv0001 *(rw,all_squash)
/srv/nfs/pv0002 *(rw,all_squash)
EOF

exportfs -a

Then create your PV and PVC using the following definitions:

apiVersion: v1
kind: PersistentVolume
metadata:
  creationTimestamp: null
  name: pv0005
spec:
  accessModes:
  - ReadWriteOnce
  - ReadWriteMany
  capacity:
    storage: 8Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: gitlab-data
    namespace: gitlab
  nfs:
    path: /srv/nfs/pv0005
  persistentVolumeReclaimPolicy: Retain
status: {}

And for the PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  name: gitlab-data
spec:
  accessModes:
  - ReadWriteOnce
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
status: {}

Now you have your GitLab running in OpenShift. Enjoy!

Run OpenShift console on port 443

One thing that I really like about OpenShift is that it very often eats its own dog food. In my opinion, that is generally a sign of good design, but that’s another story.
In this post, I want to give a hint on how to make the OpenShift console run on port 443 by using the openshift-router facilities, a Service and Endpoints. This can be very useful, for example, if your network setup prevents access to port 8443, which is often the case on corporate networks.

As a disclaimer, I just want to state that this is not (well, for now) a production-proof design, but at least you can use it for demonstration purposes or simply to understand the way OpenShift external services work.

As you may guess, the idea here is to create an OpenShift external service pointing to the OpenShift master URL, and then create a route that will be served by openshift-router to forward requests to the OpenShift master itself. Along the way, we need to create an OpenShift Endpoint, as stated in the documentation.
The final trick is to change the masterPublicURL and publicURL parameters in the master-config.yaml OpenShift configuration to match the route’s URL.

Here is the configuration. You will need:
– Your master’s internal IP address
– A wildcard entry or DNS entry pointing to your openshift-router nodes (this can also be the master itself if you are running the router on the master)
– That’s all

So, let’s assume the following settings:
My master’s domain name is:
My master’s internal IP address is:
My openshift-router runs on IP and my DNS entry points to it

So you need to create a Service:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  name: openshift-master
spec:
  ports:
  - name: 8443-tcp
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector: {}
status:
  loadBalancer: {}

and manually create the corresponding Endpoints object:

apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: null
  name: openshift-master
subsets:
- addresses:
  - ip:
  ports:
  - name: 8443-tcp
    port: 8443
    protocol: TCP

And then, you need a route with a host entry point to

apiVersion: v1
kind: Route
metadata:
  creationTimestamp: null
  name: openshift-master
spec:
  port:
    targetPort: 8443
  to:
    kind: Service
    name: openshift-master
  tls:
    termination: passthrough
status:
  ingress: null

The last step is to modify your master-config.yaml and change every occurrence of masterPublicURL or publicURL to
Keep in mind that the certificates that you have generated for the console must be valid for the host URL you are pointing to, and that you must update your corsAllowedOrigins to add the new domain.

apiLevels:
- v1
apiVersion: v1
assetConfig:
  extensionDevelopment: false
  extensionScripts: null
  extensionStylesheets: null
  extensions: null
  loggingPublicURL: ""
  logoutURL: ""
  servingInfo:
    bindNetwork: tcp4
    certFile: master.server.crt
    clientCA: ""
    keyFile: master.server.key
    maxRequestsInFlight: 0
    namedCertificates: null
    requestTimeoutSeconds: 0
controllerLeaseTTL: 0
controllers: '*'
corsAllowedOrigins:
- localhost
disabledFeatures: null

Et voilà!
Your OpenShift master console should now be available on port 443.

VPN tunnels through HTTP proxy using SSH

The title of this post is beautiful: 7 words, 3 acronyms evenly distributed, made of 3, 4 and 3 letters.
But that’s not the topic. Instead, today we will learn how to set up a VPN tunnel using SSH when you are behind a proxy.

You will need:
– a first tool: sshuttle
– an SSH client able to receive and process the ProxyCommand directive
– a remote SSH server running on port 443 or 80
– another tool: corkscrew
– a proxy server only allowing HTTP(S) traffic

Let’s describe each in reverse order

Proxy Server

You probably don’t have much control over it, but if the proxy server requires authentication, you need your credentials. Also, if you use corkscrew as we do here, the proxy must support the CONNECT command; otherwise, you should use httptunnel instead.


corkscrew

corkscrew is a simple tool to tunnel TCP connections through an HTTP proxy supporting the CONNECT method. It reads stdin and writes to stdout during the connection, just like netcat.
We will use it to connect to an SSH server running on remote port 443 through the HTTPS proxy. To do so, we need to set corkscrew as the ProxyCommand for our SSH client. If your proxy requires authentication, you have to put the credentials in a separate file, let’s say ~/.ssh/corkscrew-authfile, with the pattern username:password
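
To make the CONNECT method less abstract, here is a sketch of the kind of request such a tunnel sends to the proxy before any SSH traffic flows. It is an illustration of the protocol, not corkscrew’s actual code, and the exact headers corkscrew sends may differ. connect_request is a hypothetical helper, and the host and credentials are made up.

```shell
# Hypothetical helper: prints the kind of request a CONNECT-based tunnel
# sends to the proxy before any SSH traffic flows.
connect_request() {
  # $1 = target host, $2 = target port, $3 = optional "username:password"
  printf 'CONNECT %s:%s HTTP/1.0\r\n' "$1" "$2"
  if [ -n "${3:-}" ]; then
    # Basic proxy authentication is the base64 form of username:password
    printf 'Proxy-Authorization: Basic %s\r\n' "$(printf '%s' "$3" | base64)"
  fi
  printf '\r\n'
}

# Made-up host and credentials, for illustration only.
connect_request ssh.example.com 443 'jdoe:secret'
```

Once the proxy answers with a success status, the connection becomes a plain byte pipe to port 443 of the target, which is why SSH can run through it unchanged.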


SSH Server

A Raspberry Pi hidden at home or even an AWS Free Tier machine should be sufficient. The sshd configuration needs the following:

# What ports, IPs and protocols we listen for
Port 22
Port 443

# Authentication:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile	%h/.ssh/authorized_keys

SSH Client

The SSH client configuration is set up in your .ssh/config file, so you don’t need to type it every time you want to use your tunnel.

  ProxyCommand corkscrew 8080 443  /Users/Akram/.ssh/corkscrew-authfile

Then, every time you run:


You will be automagically connected to your SSH box, because the SSH client delegates its connection management to corkscrew, which connects to the proxy on port 8080 using the credentials in the file /Users/Akram/.ssh/corkscrew-authfile and then wraps the SSH traffic in an HTTP CONNECT request going to the SSH server on port 443.

That was the most difficult part. Once you are connected to your SSH box, the world is then open to you!


sshuttle

sshuttle is the ultimate tool that we will use: it is a transparent VPN proxy through SSH. The sshuttle documentation briefly describes how it works and gives many usage examples. The one that I use is simply this command line:

sshuttle  --dns -r 0/0

Just note here that the argument given to -r is the address of the server for which you have set up the ProxyCommand configuration. Since sshuttle uses SSH under the hood, you have already done the work needed to make the connection succeed (even through an HTTPS proxy).
In my case, I added the --dns option to also send DNS traffic through the tunnel, because corporate DNS traffic is blocked.
If the connection succeeds, you will see the message “client connected”.
Et voilà! All your connections will now go through sshuttle to reach the internet.

OpenShift 3 cheatsheet

Here are a few useful commands that you may very often use on OpenShift 3.

Mark a node as non schedulable: Useful once you’ve created OpenShift router and registry to avoid any other scheduling on these nodes:

oadm manage-node --schedulable=false

Deploy OpenShift integrated docker registry:

 oadm registry --config=/etc/openshift/master/admin.kubeconfig \
    --credentials=/etc/openshift/master/openshift-registry.kubeconfig \

Deploy an OpenShift router:

oadm router myrouter --replicas= \
    --credentials='/etc/openshift/master/openshift-router.kubeconfig' \

Adding/setting insecure-registry to docker machine afterwards

Running docker on non-Linux environments became very convenient and easy with docker-machine, which is the successor of boot2docker.

Basically, docker-machine allows you to manage multiple virtual machines running Linux to host your docker installation, and then allows you to run your containers.
More than a fantastic tool for OSX and Windows, it is also a very clever and practical way to develop multiple container images or several applications (for different projects, for example) using containers.

If you want your docker-machine to use your own in-house registry, or any other, it is not a big issue until the registry uses HTTPS with a certificate that docker does not trust. In most of those cases, you will get the following error:

docker tag -f my-app/my-app-server:v1.0.14-25-gfefb196
docker push
The push refers to a repository [] (len: 1)
unable to ping registry endpoint
v2 ping attempt failed with error: Get x509: certificate signed by unknown authority
 v1 ping attempt failed with error: Get x509: certificate signed by unknown authority

For this case, docker-machine has a fantastic option which is available on creation of a machine:

docker-machine create --driver virtualbox --engine-insecure-registry myregistry:5000 mycompany

But suppose that you want to add another registry once your docker-machine is created: unfortunately, I haven’t found an option yet to edit the existing configuration of a VM.
You will have to edit the configuration file, which is located on your host system (in your OSX or Windows home directory), and add it manually:

vim  ~/.docker/machine/machines/mycompany/config.json

Then, in the config.json file, locate the array named InsecureRegistry and simply append an element to it.
It should look like this:

  "ConfigVersion": 1,
 // Truncated for readability 
  "DriverName": "virtualbox",
  "HostOptions": {
    "Driver": "",
    "Memory": 0,
    "Disk": 0,
    "EngineOptions": {
      "ArbitraryFlags": [],
      "Dns": null,
      "GraphDir": "",
      "Env": [],
      "Ipv6": false,
      "InsecureRegistry": [
      // Truncated for readability 
  "StorePath": "/Users/Akram/.docker/machine/machines/mycompany"