boclog[1] - Set up Traefik, Apache Airflow, and monitoring tools (Prometheus, Grafana, statsd-exporter).

Most of the code in these logs is on my GitHub (mrdvince).

I opted to use Traefik as a reverse proxy for my containers. After looking at alternatives such as Nginx, I settled on Traefik for its straightforward setup, especially since I will be configuring everything in a docker-compose file.

Apache Airflow will handle the automation (DAG runs). Some of the DAGs are going to be:

  1. Spinning up EC2 instances on AWS
  2. Pulling images from an S3 bucket
  3. Monitoring a given database (Redshift in my case) for a given number of new records.
  4. Triggering retraining jobs based on the number of new images.
  5. Triggering training jobs on new images based on data drift using new augmentations and concept drift. (a bit over-ambitious, I know 😂)
  6. Others are too ambitious to post until I know it's actually possible to figure them out.

I am going to monitor these runs and other metrics in Grafana, hence the Prometheus, Grafana, and statsd-exporter setup, all running as Docker containers.

Setting up Traefik as a reverse proxy

The following steps should get you up and running:

  • Create a docker-compose.traefik.yml file and add Traefik as one of the services
version: '3.9'
services:
  traefik:
    image: traefik:v2.4.13
    ports:
      - 80:80
      - 443:443
    restart: always

This pins Traefik to version 2.4.13 (you can use the latest tag to always pull the newest image) and exposes ports 80 and 443, so Traefik receives all incoming HTTP and HTTPS requests and routes them according to the labels set on our services.

  • Traefik commands
    command:
      - --providers.docker
      - --providers.docker.exposedbydefault=false
      - --entrypoints.http.address=:80
      - --entrypoints.https.address=:443
      - --certificatesresolvers.le.acme.email=${LE?Variable not set}
      - --certificatesresolvers.le.acme.storage=/certificates/acme.json
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --accesslog
      - --log
      - --api

We set Docker as the provider (Traefik also supports other providers such as Kubernetes) and explicitly tell Traefik not to expose our services by default; the services we do want exposed will opt in using labels. Traefik also handles our SSL certificates through the certificate resolver named le, while the HTTP to HTTPS redirects are defined using labels.

  • traefik labels
    labels:
      - traefik.enable=true
      - traefik.http.services.traefik-dashboard.loadbalancer.server.port=8080
      - traefik.http.routers.traefik-dashboard-http.entrypoints=http
      - traefik.http.routers.traefik-dashboard-http.rule=Host(`traefik.${DOMAIN?Variable not set}`)
      - traefik.docker.network=traefik-public
      - traefik.http.routers.traefik-dashboard-https.entrypoints=https
      - traefik.http.routers.traefik-dashboard-https.rule=Host(`traefik.${DOMAIN?Variable not set}`)
      - traefik.http.routers.traefik-dashboard-https.tls=true
      - traefik.http.routers.traefik-dashboard-https.tls.certresolver=le
      - traefik.http.routers.traefik-dashboard-https.service=api@internal
      - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
      - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
      - traefik.http.routers.traefik-dashboard-http.middlewares=https-redirect
      - traefik.http.middlewares.admin-auth.basicauth.users=${USERNAME?Variable not set}:${HASHED_PASSWORD?Variable not set}
      - traefik.http.routers.traefik-dashboard-https.middlewares=admin-auth

We can use labels to "tell" Traefik what we want it to do without having to restart it, which makes it really cool. First we enable Traefik for this container (remember we "told" it not to expose our services by default).

Then we expose the dashboard (api@internal) on the HTTPS router and set up the https-redirect middleware, which handles our HTTP to HTTPS redirects.

And since we don't want everyone to have access to our dashboard, we also attach an admin-auth middleware that protects it with basic auth.

  • volumes

      volumes:
        - /var/run/docker.sock:/var/run/docker.sock:ro
        - traefik-public-certificates:/certificates
    

    For Traefik to have access to our running containers we mount the Docker socket (read-only), and we also mount a volume to store the Let's Encrypt SSL certificates.

  • networks

      networks:
        - ${TRAEFIK_PUBLIC_NETWORK?Variable not set}
        - default
    

    We use an external network so that any other services you are running can connect to Traefik through the shared public network, which can be created by running docker network create traefik-public. And just like that, you have Traefik running.
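
One thing the fragments above leave implicit: Compose also expects the named volume and the network to be declared at the top level of the file. A minimal sketch, assuming TRAEFIK_PUBLIC_NETWORK is set to traefik-public (the name used in the traefik.docker.network label and in the docker network create command):

```yaml
# Top-level keys, at the same indentation as `services:`.
volumes:
  # Named volume mounted at /certificates, where acme.json is stored.
  traefik-public-certificates:

networks:
  # Assumes TRAEFIK_PUBLIC_NETWORK=traefik-public.
  # `external: true` means Compose will not create this network;
  # it must already exist (docker network create traefik-public).
  traefik-public:
    external: true
```

Marking the network as external is what lets other compose projects join the same network and be routed by this Traefik instance.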

Setting up Apache Airflow and monitoring tools

To set this up I found a really nice GitHub repo at this link. Check it out and follow it to get Airflow running.
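
As a rough picture of how the monitoring pieces fit together, here is a sketch under assumptions, not the linked repo's actual files. The service names, the untagged images, and the prometheus.yml path are all illustrative, and the environment variables assume Airflow 2.x, where StatsD settings live in the [metrics] section:

```yaml
# Sketch only: names, tags, and paths below are assumptions.
services:
  statsd-exporter:
    image: prom/statsd-exporter
    # By default it listens for StatsD metrics on port 9125
    # and serves them to Prometheus on port 9102.
    restart: always

  prometheus:
    image: prom/prometheus
    volumes:
      # Hypothetical local file; it would list statsd-exporter:9102
      # as a scrape target.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    restart: always

  grafana:
    image: grafana/grafana
    restart: always

  airflow-scheduler:  # hypothetical name; merge into your existing Airflow services
    environment:
      # Tell Airflow to emit StatsD metrics to the exporter.
      - AIRFLOW__METRICS__STATSD_ON=True
      - AIRFLOW__METRICS__STATSD_HOST=statsd-exporter
      - AIRFLOW__METRICS__STATSD_PORT=9125
```

The flow is Airflow to statsd-exporter to Prometheus, with Grafana reading from Prometheus as a data source.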

To have Airflow and its services managed by Traefik, just include Traefik labels in the docker-compose file and Traefik will "automagically" pick them up without you having to restart it.
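
For example, labels for the Airflow webserver could mirror the dashboard labels from earlier. This is a sketch, assuming a service called airflow-webserver that listens on port 8080 and an airflow subdomain; the service name, router names, and subdomain are my own placeholders:

```yaml
services:
  airflow-webserver:  # hypothetical service name
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik-public
      # Port the webserver listens on inside the container (assumed 8080).
      - traefik.http.services.airflow.loadbalancer.server.port=8080
      # Plain HTTP router: only redirects to HTTPS, reusing the
      # https-redirect middleware defined on the Traefik container.
      - traefik.http.routers.airflow-http.entrypoints=http
      - traefik.http.routers.airflow-http.rule=Host(`airflow.${DOMAIN?Variable not set}`)
      - traefik.http.routers.airflow-http.middlewares=https-redirect
      # HTTPS router: serves the UI with a certificate from the le resolver.
      - traefik.http.routers.airflow-https.entrypoints=https
      - traefik.http.routers.airflow-https.rule=Host(`airflow.${DOMAIN?Variable not set}`)
      - traefik.http.routers.airflow-https.tls=true
      - traefik.http.routers.airflow-https.tls.certresolver=le
    networks:
      # Must be declared as an external network in this compose file too.
      - traefik-public
      - default
```

The service has to join the traefik-public network, otherwise Traefik can see the labels but cannot reach the container.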

Next boclog - the next log will probably be about how this was going to be a classifier, but I instead opted to go the object detection path and now I have to label a bunch of images.

I registered for the PyTorch hackathon; this should give me a little motivation to keep this going 😂😂.
