boclog[1] - Set up Traefik, Apache Airflow, and monitoring tools (Prometheus, Grafana, statsd-exporter).
Most of the code from these logs is on my GitHub (mrdvince).
I opted to use Traefik as a reverse proxy for my containers. After looking at alternatives such as Nginx, I settled on Traefik for its straightforward setup, especially since I will be configuring everything in the docker-compose file.
Apache Airflow is for automation (DAG runs). Some of the DAGs are going to be:
- Spinning up EC2 instances on AWS
- Pulling images from an S3 bucket
- Monitoring a given database (Redshift in my case) for a given number of new records
- Triggering retraining jobs based on the number of new images
- Triggering training jobs on new images based on data drift (using new augmentations) and concept drift (a bit over-ambitious, I know 😂)
- Others are too ambitious to post until I know it's actually possible to figure them out.
I'm going to be monitoring these runs and other metrics using Grafana, hence the Prometheus, Grafana, and statsd-exporter setup, all running as Docker containers.
Setting up Traefik as a reverse proxy
The following steps should get you up and running:
- Create a docker-compose.traefik.yml file and add Traefik as one of the services
    version: '3.9'
    services:
      traefik:
        image: traefik:v2.4.13
        ports:
          - 80:80
          - 443:443
        restart: always
This uses Traefik version 2.4.13 (you can use the latest tag to always pull the newest image) and exposes ports 80 and 443, so Traefik can handle all incoming HTTP and HTTPS requests and route them according to the labels set on our services.
- Traefik commands
        command:
          - --providers.docker
          - --providers.docker.exposedbydefault=false
          - --entrypoints.http.address=:80
          - --entrypoints.https.address=:443
          - --certificatesresolvers.le.acme.email=${LE?Variable not set}
          - --certificatesresolvers.le.acme.storage=/certificates/acme.json
          - --certificatesresolvers.le.acme.tlschallenge=true
          - --accesslog
          - --log
          - --api
We set Docker as the provider (Traefik also supports other providers, such as Kubernetes) and explicitly tell Traefik not to expose all our services by default; we will expose the intended services using labels. Traefik will also handle our SSL certificates through the le certificate resolver (Let's Encrypt), and the HTTP to HTTPS redirects are defined using labels.
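To make the flags easier to read, here is roughly what the same static configuration would look like as a Traefik v2 traefik.yml file instead of CLI arguments. This is only a sketch for comparison, not what I'm running: the email address is a placeholder, because docker-compose does not substitute variables inside a mounted config file.

```yaml
# Sketch of a traefik.yml equivalent to the command flags (Traefik v2 syntax).
providers:
  docker:
    exposedByDefault: false   # only containers with traefik.enable=true are routed

entryPoints:
  http:
    address: ":80"
  https:
    address: ":443"

certificatesResolvers:
  le:
    acme:
      email: you@example.com          # placeholder, replaces ${LE}
      storage: /certificates/acme.json
      tlsChallenge: {}

accessLog: {}
log:
  level: ERROR   # Traefik's default log level
api: {}
```

Keeping everything as flags in the compose file means one file fewer to mount, which is why the flags suit this setup.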
- Traefik labels
        labels:
          - traefik.enable=true
          - traefik.http.services.traefik-dashboard.loadbalancer.server.port=8080
          - traefik.http.routers.traefik-dashboard-http.entrypoints=http
          - traefik.http.routers.traefik-dashboard-http.rule=Host(`traefik.${DOMAIN?Variable not set}`)
          - traefik.docker.network=traefik-public
          - traefik.http.routers.traefik-dashboard-https.entrypoints=https
          - traefik.http.routers.traefik-dashboard-https.rule=Host(`traefik.${DOMAIN?Variable not set}`)
          - traefik.http.routers.traefik-dashboard-https.tls=true
          - traefik.http.routers.traefik-dashboard-https.tls.certresolver=le
          - traefik.http.routers.traefik-dashboard-https.service=api@internal
          - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
          - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
          - traefik.http.routers.traefik-dashboard-http.middlewares=https-redirect
          - traefik.http.middlewares.admin-auth.basicauth.users=${USERNAME?Variable not set}:${HASHED_PASSWORD?Variable not set}
          - traefik.http.routers.traefik-dashboard-https.middlewares=admin-auth
We can use labels to "tell" Traefik what we want it to do without having to restart it, which is really cool. First we enable Traefik for this container (remember, we "told" it not to expose our services by default).
Then we enable the dashboard and set up the https-redirect middleware, which handles our HTTP to HTTPS redirects.
And since we don't want everyone to have access to our dashboard, we also include an admin-auth middleware that puts basic auth in front of it.
- Volumes
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - traefik-public-certificates:/certificates
For Traefik to see our running containers, we mount the Docker socket (read-only), and we also mount a volume for the Let's Encrypt SSL certificates.
- Networks
        networks:
          - ${TRAEFIK_PUBLIC_NETWORK?Variable not set}
          - default
I'm using an external network so that any other services you run can join the Traefik network through the shared public network, which can be created by running:
    docker network create traefik-public
And just like that, you have Traefik running.
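The snippets above only cover the traefik service itself. The named volume and the external network also need top-level declarations in the compose file. A minimal sketch, assuming TRAEFIK_PUBLIC_NETWORK is set to traefik-public (compose does not substitute variables in keys, so the network name is written out literally here):

```yaml
services:
  traefik:
    # ... image, ports, command, and labels as configured above ...
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik-public-certificates:/certificates
    networks:
      - traefik-public
      - default

# Named volume that keeps acme.json (the issued certificates) across restarts.
volumes:
  traefik-public-certificates:

# Existing network created beforehand with `docker network create traefik-public`.
networks:
  traefik-public:
    external: true
```

Marking the network as external tells compose to attach to it instead of creating a project-scoped one, which is what lets containers from other compose files share it.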
Setting up Apache Airflow and monitoring tools
To set this up I found a really nice GitHub repo from this link. Check it out and follow it to get Airflow running.
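For a rough idea of how the monitoring pieces fit together, here is a compose sketch. It is not taken from the linked repo: the service names, the prometheus.yml path, and the choice of the scheduler service are my assumptions, and the environment variables assume Airflow 2.x with the statsd extra installed.

```yaml
services:
  statsd-exporter:
    image: prom/statsd-exporter
    # Receives StatsD packets on 9125 and serves Prometheus metrics on 9102.
    restart: always

  prometheus:
    image: prom/prometheus
    volumes:
      # Hypothetical local file; it needs a scrape job targeting statsd-exporter:9102.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    restart: always

  grafana:
    image: grafana/grafana
    restart: always

  airflow-scheduler:
    # ... existing Airflow settings stay unchanged ...
    environment:
      AIRFLOW__METRICS__STATSD_ON: "True"
      AIRFLOW__METRICS__STATSD_HOST: statsd-exporter
      AIRFLOW__METRICS__STATSD_PORT: "9125"
      AIRFLOW__METRICS__STATSD_PREFIX: airflow
```

The flow is Airflow to statsd-exporter (StatsD), statsd-exporter to Prometheus (scrape), and Prometheus to Grafana (data source).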
To have Airflow and the other services managed by Traefik, just include Traefik labels in their docker-compose file, and Traefik will "automagically" pick them up without you having to restart it.
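As an example, the labels for the Airflow webserver could look something like this. It is a sketch under a few assumptions: the service is called airflow-webserver, it listens on Airflow's default port 8080, it should be served at an airflow subdomain, and the router and service names are just ones I made up. It reuses the le resolver and the https-redirect middleware defined on the Traefik container.

```yaml
services:
  airflow-webserver:
    # ... existing image, environment, and volumes stay unchanged ...
    networks:
      - traefik-public
      - default
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik-public
      - traefik.http.services.airflow.loadbalancer.server.port=8080
      # Plain HTTP router: only there to redirect to HTTPS.
      - traefik.http.routers.airflow-http.entrypoints=http
      - traefik.http.routers.airflow-http.rule=Host(`airflow.${DOMAIN?Variable not set}`)
      - traefik.http.routers.airflow-http.middlewares=https-redirect
      # HTTPS router with a Let's Encrypt certificate.
      - traefik.http.routers.airflow-https.entrypoints=https
      - traefik.http.routers.airflow-https.rule=Host(`airflow.${DOMAIN?Variable not set}`)
      - traefik.http.routers.airflow-https.tls=true
      - traefik.http.routers.airflow-https.tls.certresolver=le

networks:
  traefik-public:
    external: true
```

Grafana or Prometheus can be exposed the same way by swapping the names, the subdomain, and the port.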
Next boclog - the next log will probably be about how this was going to be a classifier, but I opted to go down the object detection path instead, and now I have to label a bunch of images.
I registered for the PyTorch hackathon, which should give me a little motivation to keep this going 😂😂.