Data stack anatomy

There are various ways to represent the anatomy of a data stack. As a textual representation, I like Docker Compose, a Docker tool that lets you snap services together like LEGO bricks. Of course, it is not always convenient to run every component of an enterprise data stack in Docker.

The first example is n8n, an alternative to Zapier. It belongs more to IT than to statistics, because it lets you, for example, exchange data between applications. This simple example contains a database, which is not strictly necessary, and the n8n application itself, where you design and run your own automations, which are of course stored.

 

version: '3.8'

volumes:
  db_storage:
  n8n_storage:

services:
  postgres:
    image: postgres:16
    restart: always
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_NON_ROOT_USER
      - POSTGRES_NON_ROOT_PASSWORD
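      # security settings; the NON_ROOT pair is not read by the postgres image itself,
      # so that user must be created separately (n8n's upstream example mounts an init script for this)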
    volumes:
      - db_storage:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -h localhost -U ${POSTGRES_USER} -d ${POSTGRES_DB}']
      interval: 5s
      timeout: 5s
      retries: 10
      # another security setting 

  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_NON_ROOT_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_NON_ROOT_PASSWORD}
      # configuration for accessing the PostgreSQL database
    ports:
      - 5678:5678
      # how the service is reached internally; in production it should be mapped to a domain,
      # for example localhost:5678 --> n8n.mybusiness.com
    links:
      - postgres
    volumes:
      - n8n_storage:/home/node/.n8n
    depends_on:
      postgres:
        condition: service_healthy
        # safety

 

 

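Volumes are persistent storage: the data in them survives when a container is removed and recreated. This is why the metaphor of the database as the service's warehouse can be confusing, since the database itself keeps its files in a volume. Note that the database here is PostgreSQL, already mentioned in the article and elsewhere.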
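
To make the idea concrete, here is a minimal sketch, not taken from any of the files in this article, that contrasts the two kinds of storage used in these examples: a named volume (like db_storage) and a bind mount (like ./shiny_app). The ./backups folder and the placeholder password are assumptions for illustration only.

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder, only so the fragment can start
    volumes:
      # named volume: managed by Docker, kept when the container is recreated
      - db_storage:/var/lib/postgresql/data
      # bind mount (hypothetical folder): a host directory made visible in the container
      - ./backups:/backups

# named volumes must be declared at the top level
volumes:
  db_storage:
```

With a plain `docker compose down` the named volume is kept; it is deleted only if you add the `-v` flag.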

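Here is also a rather unusual example of Shiny connecting to a MySQL database. Usually the connection is made from the R code of the Shiny app rather than at the system level, i.e., in docker-compose.yml. In this file, Compose only puts the two services on the same network, where the app can reach the database under the hostname db.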

 

version: '3.8'

services:
  shiny:
    image: rocker/shiny
    # data dashboard service based on R
    ports:
      - "3838:3838"
    volumes:
      - ./shiny_app:/srv/shiny-server
    depends_on:  
      - db

  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: shinydb
      MYSQL_USER: user
      MYSQL_PASSWORD: userpassword
      # there is a smarter way to keep these secrets
    volumes:
      - mysql_data:/var/lib/mysql

# named volume used by the db service; it must be declared at the top level
volumes:
  mysql_data:
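
One possible reading of the "smarter way" mentioned in the comment above is Compose secrets. The sketch below assumes that the official mysql image accepts the `_FILE` variants of its password variables, and that the two text files exist under a hypothetical ./secrets folder kept out of version control. Treat it as an illustration rather than a hardened setup.

```yaml
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: shinydb
      MYSQL_USER: user
      # the image reads each password from the file named here
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/mysql_root_password
      MYSQL_PASSWORD_FILE: /run/secrets/mysql_password
    secrets:
      - mysql_root_password
      - mysql_password
    volumes:
      - mysql_data:/var/lib/mysql

# Compose mounts each secret inside the container at /run/secrets/<name>
secrets:
  mysql_root_password:
    file: ./secrets/mysql_root_password.txt   # hypothetical path
  mysql_password:
    file: ./secrets/mysql_password.txt        # hypothetical path

volumes:
  mysql_data:
```

The passwords no longer appear in docker-compose.yml, so the file can be shared or committed without exposing them.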

 

Here is an example of an open-source CRM, Odoo. It shows more directly why I call CRMs a particular kind of database. I have mentioned other open-source CRMs in the past, such as vTiger and Mautic.

 

version: '3.1'
services:
  web:
    image: odoo:17.0
    depends_on:
      - mydb
    ports:
      - "8069:8069"
    environment:
      - HOST=mydb
      - USER=odoo
      - PASSWORD=myodoo
  mydb:
    image: postgres:15
    # here is our database
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_PASSWORD=myodoo
      - POSTGRES_USER=odoo

 

A keen eye will have noticed that each image has a name and, in most cases, a version tag. How does one update the system to the latest versions of its components? One runs another container, named Watchtower (see the first article mentioned), which automates this rather delicate procedure.
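
As a rough sketch of what that looks like, the service below could be added to any of the Compose files above. The image name containrrr/watchtower, the daily interval and the cleanup flag are my assumptions for illustration; check the Watchtower documentation before relying on them.

```yaml
services:
  watchtower:
    image: containrrr/watchtower
    restart: always
    volumes:
      # Watchtower talks to the Docker daemon through its socket
      - /var/run/docker.sock:/var/run/docker.sock
    # check for new images once a day (interval in seconds) and remove the old ones
    command: --interval 86400 --cleanup
```

Note that Watchtower pulls the newest image published under the same tag. A service pinned to postgres:16 gets newer builds of version 16, not a jump to 17, which for a database is usually what you want.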

In reality, of course, an enterprise data stack is much more complex than the docker-compose.yml files I have shown you. And open-source solutions do not exist for every component. For data visualization, for example, some companies prefer Tableau or MicroStrategy, or something free but not dockerizable such as Google Looker Studio.

And what data stack does your company have in place? Let's review it together in a free first call. Maybe some of its components could cost less. Or maybe components are missing that would help your turnover.

 

 

 
