Let’s start out by saying that sharing a database between microservices is bad. If you have two services talking to the same data source, they immediately become coupled to its schema. If one of them has different needs and would like to adjust that schema, it has to consider all the other users so they continue to operate properly.

Of course, this makes it impossible to have independent releases which, in my opinion, are the main selling point of doing microservices at all. Sharing databases means you are essentially building what’s called a “distributed monolith”. So, if you want to do microservices, you need to isolate databases between services.

How do you run databases in k8s?

Lucky for us, Kubernetes makes it damn easy to spin up databases thanks to StatefulSets. You could have a database deployed side by side with your application, so that each service has its own datasource. While this sounds awesome in theory, you’ll probably quickly run into some practical issues, the main one being: available resources in the cluster.
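Just to illustrate what I mean, here is a minimal sketch of such a per-service database StatefulSet (all names and sizes here are hypothetical):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: orders-mongodb   # hypothetical: a mongo instance dedicated to the "orders" service
spec:
  serviceName: orders-mongodb
  replicas: 1
  selector:
    matchLabels:
      app: orders-mongodb
  template:
    metadata:
      labels:
        app: orders-mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:4.4
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi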

The thing is, databases love RAM. Most of them will attempt to consume as much memory as is available on the machine. Even when you configure the database to limit this amount, if you’d like to deploy a database per microservice, it quickly adds up to an enormous amount of memory. Let’s say we are dealing with MongoDb, which consumes at least 2Gi of RAM per instance. If you have 3 environments and 5 services, that’s 15 instances, so we are already talking about 30 gigabytes. And that’s just to run processes which will mostly do nothing, since these are development environments.

How do I manage mongo databases?

So, I’ve been puzzling over this problem for a while when deploying services using MongoDb to a Kubernetes cluster. How could I ensure the isolation, where each service owns its own database, while also running a single mongo cluster?

The easiest option was, of course, to just use mongo’s access control and create a user per service. That user would then have permissions to access a single database only. However, I didn’t enjoy the fact that this was a very manual process: for each service, somebody would need to manually maintain the state of users and databases. That’s not very much in the devops spirit, is it?
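For reference, this is roughly what that manual step looks like, sketched with pymongo (the admin connection string, user names and passwords here are made up):

# A sketch of the manual approach: one user scoped to a single database.
from pymongo import MongoClient

# Hypothetical admin credentials and host - adjust to your setup.
client = MongoClient("mongodb://admin:secret@mongo:27017/admin")

# Create a user that can only read and write the "orders" database.
client["orders"].command(
    "createUser",
    "orders-service",
    pwd="orders-secret",
    roles=[{"role": "readWrite", "db": "orders"}],
)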

Then, I started thinking whether I could automate this process somehow. I wanted to have a Kubernetes resource declaring the intent of needing a database. With such an intent (or resource) in place, there’d be something taking care of actually doing the work and provisioning things in mongo.

Thankfully, there is an option to define custom resources in k8s! Just like you can declare a Deployment or a ConfigMap, you can have your own resources in the cluster, like MongoDatabase. It seemed like I was getting closer!
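A CustomResourceDefinition for such a MongoDatabase kind could look roughly like this (the group and version match the resource shown later; the plural and short names are my assumption):

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # must be <plural>.<group>
  name: mongodatabases.mongo-db-provider.urbanonsoftware.com
spec:
  group: mongo-db-provider.urbanonsoftware.com
  scope: Namespaced
  names:
    kind: MongoDatabase
    plural: mongodatabases
    singular: mongodatabase
    shortNames:
      - mongodb   # assumed; this is what makes "kubectl get mongodb" work
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object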

Still, I needed something simple to listen for the custom resources and actually create the databases and users. My first thought was to use the operator SDK and code something simple in Go. However, for my taste, it just seemed too complex for such a simple thing. Then I stumbled upon the Kopf Python library made by Zalando, and it turned out to work just beautifully for this use case!
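To give you an idea of how little code this takes, here is a stripped-down Kopf handler sketch (not the actual project code; the provisioning logic is only hinted at in comments):

import kopf

# React to MongoDatabase resources being created in the cluster.
@kopf.on.create("mongo-db-provider.urbanonsoftware.com", "v1alpha1", "mongodatabases")
def create_database(name, namespace, logger, **kwargs):
    # Here you would discover the mongo service in the namespace,
    # create the database plus reader/writer users, and write the secrets.
    logger.info(f"Provisioning mongo database {name} in {namespace}")

# ...and to their deletion.
@kopf.on.delete("mongo-db-provider.urbanonsoftware.com", "v1alpha1", "mongodatabases")
def delete_database(name, namespace, logger, **kwargs):
    # Here you would drop the database and clean up the secrets.
    logger.info(f"Dropping mongo database {name} in {namespace}")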

Introducing k8s-mongo-db-provider

This is how I ended up with a small project in Python. You can find the sources on GitHub. It works out of the box with the mongo replicaset Helm chart: it autodiscovers the MongoDb service within a given namespace and finds the secret containing credentials. Then, it listens for creation/deletion of the custom resource kind MongoDatabase. When it detects one, it performs the necessary operations in the mongo replicaset using the credentials from the secret.

So, if you place the following resource in your YAML:

apiVersion: mongo-db-provider.urbanonsoftware.com/v1alpha1
kind: MongoDatabase
metadata:
  name: my-mongo-database

It’ll create my-mongo-database for you, using mongo from within the namespace of the resource itself. Besides, it’ll create two secrets as well: my-mongo-database-reader and my-mongo-database-writer. The reader will have read-only permissions, while the writer will have… well, the writer permissions. These secrets contain two things: ConnectionString and DbName.
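The writer secret, for example, ends up with roughly this shape (sketched with stringData for readability; the exact connection string format is up to the provider, so the value below is made up):

apiVersion: v1
kind: Secret
metadata:
  name: my-mongo-database-writer
type: Opaque
stringData:
  # illustrative values only
  ConnectionString: mongodb://<writer-user>:<password>@mongo:27017/my-mongo-database
  DbName: my-mongo-database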

You can use them like so (the double underscore in MongoDb__ConnectionString is the ASP.NET Core convention for nested configuration keys):

containers:
  - name: kubernetes-hello
    image: mdb_provider_test:local
    imagePullPolicy: Never
    ports:
      - containerPort: 80
    env:
      - name: ASPNETCORE_ENVIRONMENT
        value: "Development"
      - name: MongoDb__ConnectionString
        valueFrom:
          secretKeyRef:
            name: my-mongo-database-writer
            key: ConnectionString
      - name: MongoDb__DbName
        valueFrom:
          secretKeyRef:
            name: my-mongo-database-writer
            key: DbName

You can also list these resources just like any other kind of resource within Kubernetes:

kubectl get mongodb

Or drop the database along with the secrets:

kubectl delete mongodb my-mongo-database

Summary

Maintaining database isolation is important and can be tedious when done manually. If you are dealing with MongoDb, using the little Python script I created might be an option for you as well (and any contributions are greatly welcome!). If you have resources needed by services in the cluster which you currently provision manually, I highly recommend taking a look at custom resource definitions and Kopf. With just a few lines of code, you can automate it away!