Improve your build speed: Run a proxy in OpenShift

Many build processes uses external source code or library repositories only available in the internet. That is the case for NPM (Node Package Manager, used for NodeJS applications compilation) or Maven (when building Java applications).

Thus running an HTTP Proxy inside of OpenShift could be helpful in many cases:
– in a corporate environment it is not an exception to face proxy that requires authentication. And even if builds mechanism in OpenShift supports it, you will have to put your credentials somewhere and they may be visible on logs or source code
– your corporate proxy will certainly not cache all the artefacts that you frequently use, so doing it inside of your own proxy may save you several minutes for build time and several gigs of downloads

In this blog, we will learn how to setup an HTTP/HTTPS proxy in OpenShift that is able to forward requests to an upstream corporate HTTP Proxy and also act as a cache but with no persistent volume.

A CentOS/Squid docker image

Unfortunately, I was not able to find a publicly available and reliable docker image that fits my needs so I decided to write my own based on CentOS7 and squid. The sources are available on my GitHub repository and the image is on Docker Hub.
Some important features about this image:
– it can run as any UID which is very good for OpenShift
– it exposes port 3128 as an usual squid proxy image
– it accepts an environment variable named CACHE_PEER which can handle an upstream proxy URL in the form url_encoded_username:url_encoded_password:proxy_hostname:port
– it allows CONNECT for any traffic, mainly used for SSL and it does not perform SSL interception (this is why we used CONNECT)
If you want to test it, simply run it with docker:

 docker run -d --name="proxy" -p 3128:3128 \
            -e "CACHE_PEER=user:secret@upstream-proxy.corp.mycompany.com:8080" \
            akrambenaissi/docker-squid

Deploying the image in OpenShift

Now, let’s see how we can run this image in OpenShift so you can make it usable by other pods. We will be working in the “default” project to ensure that whatever your configuration all the pods can have access to this new service.

oc project default
=> Now using project "default" on server "https://paas.mycompany.com:443".

As stated previously, the image accepts an environment variable to target the upstream proxy server.
This is the only thing we need to start it. We just need to use the great oc new-app command with some arguments:

oc new-app docker.io/akrambenaissi/docker-squid --name=proxy \
                     -e 'CACHE_PEER=user:secret@upstream-proxy.corp.mycompany.com:8080'
--> Found Docker image 95aeb47 (About an hour old) from docker.io for "docker.io/akrambenaissi/docker-squid"
    * An image stream will be created as "proxy:latest" that will track this image
    * This image will be deployed in deployment config "proxy"
    * Ports 3128/tcp will be load balanced by service "proxy"
--> Creating resources with label app=proxy ...
    ImageStream "proxy" created
    DeploymentConfig "proxy" created
    Service "proxy" created
--> Success
    Run 'oc status' to view your app.

The installation only takes a few minutes required to pull the image from Docker Hub. Once done, we can see the relevant pod using the oc get pods command:

oc get pods
NAME                    READY     STATUS    RESTARTS   AGE
proxy-1-vz1g3    1/1       Running   0          6m

Accessing your proxy from inside of the cluster

In a multi-tenant OpenShift cluster, pods within different namespaces are isolated and can’t reach other thanks to the network isolation feature give by OpenShift-sdn. There is an exception to this: pods deployed in namespace “default” can be reached by all other pods.
Moreover, OpenShift have an internal DNS which allows processes in pods to performs name resolution within the cluster.
Thanks to this mechanism, our proxy pod cluster IP address will be resolved by the name squid.squid.svc.cluster.local at reachable on port 3128.

So, if you need for exemple to refer to a proxy in an STI based build, just put the following lines in your .sti/environment at the root level of your project on git:

HTTP_PROXY=http://squid.squid.svc.cluster.local:3128
HTTPS_PROXY=http://squid.squid.svc.cluster.local:3128

Then, enjoy a speeder STI build.

Accessing your proxy from everywhere

In other cases, you want your proxy to be reachable from other system that don’t run on OpenShift, like an external Software Factory. For this specific scenario, we need a special setup.

Indeed, HTTP Proxies use a specific communication on HTTP which cannot be relayed across proxies themselves. Thus, it is not possible to use an OpenShift Route and the openshift-router to expose our brand new proxy to the rest of the world.

However, there is a very powerful feature available in OpenShift used to expose non HTTP/HTTPs/SNI services on all nodes of the cluster: it is called NodePort.
NodePort is a special Service configuration that opens a given port on all OpenShift nodes and redirect trafic to the underlying pods using iptables and kube proxy.

We will need to create a Service which does not handle a clusterIP but a nodePort on port 31280: OpenShift has a reserved (configurable) port range for nodePort services. Default values are between 30’000 and 32’767.

oc create -f - << EOF
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  creationTimestamp: null
  labels:
    app: proxy
  name: proxy-node-port
spec:
  ports:
  - name: 3128-tcp
    port: 3128
    protocol: TCP
    nodePort: 31280
  selector:
    app: proxy
    deploymentconfig: proxy
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
EOF

=>  You have exposed your service on an external port on all nodes in your
cluster.  If you want to expose this service to the external internet, you may
need to set up firewall rules for the service port(s) (tcp:31280) to serve traffic.

See http://releases.k8s.io/HEAD/docs/user-guide/services-firewalls.md for more details.
service "proxy-node-port" created

Now, your proxy can be reached on any node, on port 31280. If you do have a VIP or a LoadBalancer in front of your nodes, your service will even be load balanced.

Keep in mind that you may need to restrict access to this service to avoid its usage by unwanted people.