Dataristix™ for Kubernetes
This article describes configuration steps and considerations when deploying Dataristix on Kubernetes. We assume that you already have a working cluster. For deployment into a Microsoft Azure container group instead, please see the example here.
Please also see our announcement of Dataristix for Azure™ Kubernetes Services!
Dataristix instances
Each Dataristix instance is installed as a singleton in a StatefulSet with an attached persistent volume containing configuration data, including the identity of the instance. The identity of each instance comprises instance-specific identifiers and certificates. In many applications Dataristix initiates the connection as a client and uses certificates to identify itself to the external service. Dataristix may also act as the server (OPC UA reverse-connect server or MQTT broker), in which case the specific instance needs to be reachable by external clients; keep this in mind when configuring your ingress controller. Ingress and external access configuration is not covered in this article; please refer to your ingress controller documentation.
Scaling
Depending on allocated resources, a single Dataristix instance may be able to process tens of thousands of data points per second. To scale out, deploy additional instances, each with its own identity. In simple terms, you can export the project from the first instance, import it into the second instance, and then run half of the tasks on each instance. If inter-instance communication is required, use MQTT.
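As an illustrative sketch (assuming the manifest file names used later in this article), a second instance can be prepared by copying the first instance's manifests and renaming every resource; the new instance will then acquire its own identity on its own persistent volume:

```shell
# Hypothetical sketch: generate manifests for a second instance by renaming
# every "dataristix-1" reference. Review the generated files before applying.
for f in service volume volume-claim pod; do
  sed 's/dataristix-1/dataristix-2/g' "dataristix-1-$f.yml" > "dataristix-2-$f.yml"
done
# Then apply, e.g.:
# kubectl apply -f dataristix-2-service.yml -f dataristix-2-volume.yml \
#   -f dataristix-2-volume-claim.yml -f dataristix-2-pod.yml
```

After both instances are running, split the tasks between them as described above.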
Redundancy
Redundancy is achieved by attaching a redundant persistent volume to each Dataristix instance. Should a node fail, the failed instance can be reinstated from the persistent volume on another node. Tasks that are configured to start automatically will resume data processing on the new instance.
Service configuration
Each instance has its own service configuration. Here we simply call it dataristix-1, in anticipation of adding further instances dataristix-2, dataristix-3, and so forth, in the future.
Create a file dataristix-1-service.yml and edit as follows.
apiVersion: v1
kind: Service
metadata:
  name: dataristix-1
  labels:
    app: dataristix-1
spec:
  ports:
    - port: 8282
  clusterIP: None
  selector:
    app: dataristix-1
Apply to create the service.
kubectl apply -f dataristix-1-service.yml
Persistent volume configuration
Your preferred persistent volume configuration will depend on your environment. Adjust the configuration so that your chosen persistent volume is redundant and secure. In particular, the dataristix-secret volume mount used in Dataristix pods (see below) may contain sensitive data, and the dataristix-data volume mount should have restricted access. In this example, we define the persistent volume file dataristix-1-volume.yml with a simple hostPath as follows. Notably, we use the ReadWriteOncePod access mode to ensure that only a single Dataristix instance has access to the volume. This feature requires Kubernetes 1.22 or later.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dataristix-1-volume
  labels:
    type: local
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOncePod
  hostPath:
    path: /mnt/data
Apply to create the persistent volume.
kubectl apply -f dataristix-1-volume.yml
Persistent volume claim configuration
We claim the volume for the single Dataristix instance in file dataristix-1-volume-claim.yml as follows.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dataristix-1-volume-claim
spec:
  accessModes:
    - ReadWriteOncePod
  resources:
    requests:
      storage: 1Gi
Apply to create the persistent volume claim.
kubectl apply -f dataristix-1-volume-claim.yml
Pod configuration
Dataristix uses multi-container pods, consisting of the Core and Proxy containers plus selected connector module containers. The following example configures most available connectors, though you will likely need only a subset. Remove or comment out any connector modules that are not required to save resources. The dataristix-1-pod.yml file contains:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dataristix-1-instance
spec:
  selector:
    matchLabels:
      app: dataristix-1 # has to match .spec.template.metadata.labels
  serviceName: "dataristix-1"
  replicas: 1 # by default is 1
  minReadySeconds: 10 # by default is 0
  template:
    metadata:
      labels:
        app: dataristix-1 # has to match .spec.selector.matchLabels
    spec:
      securityContext:
        runAsNonRoot: true
      terminationGracePeriodSeconds: 10
      volumes:
        - name: dataristix-1-volume
          persistentVolumeClaim:
            claimName: dataristix-1-volume-claim
      containers:
        - name: dataristix-core
          image: docker.io/dataristix/dataristix-core:latest
          # Define connector modules that Dataristix should expect to be available
          # and include corresponding containers in this configuration.
          # Remove module arguments and corresponding containers that are not required:
          args:
            - --modules="CSV, E-Mail, Excel, Google Sheets, MySQL, MQTT, OPC UA, Oracle, PostgreSQL, Power BI, REST, Script, SQL Server, SQLite"
          volumeMounts: &commonVolumeMounts
            - name: dataristix-1-volume
              mountPath: /dataristix-data
            - name: dataristix-1-volume
              mountPath: /dataristix-secret
        - name: dataristix-proxy
          image: docker.io/dataristix/dataristix-proxy:latest
          ports:
            - containerPort: 8282
              name: dataristix-port
          volumeMounts: *commonVolumeMounts
        # CSV
        - name: dataristix-for-csv
          image: docker.io/dataristix/dataristix-for-csv:latest
          volumeMounts: *commonVolumeMounts
        # E-Mail
        - name: dataristix-for-email
          image: docker.io/dataristix/dataristix-for-email:latest
          volumeMounts: *commonVolumeMounts
        # Excel
        - name: dataristix-for-excel
          image: docker.io/dataristix/dataristix-for-excel:latest
          # Map remote RTD server port if required
          # ports:
          #   - containerPort: 22783
          #     name: dx-excel-rtd
          volumeMounts: *commonVolumeMounts
        # Google Sheets
        - name: dataristix-for-googlesheets
          image: docker.io/dataristix/dataristix-for-googlesheets:latest
          volumeMounts: *commonVolumeMounts
        # MQTT
        - name: dataristix-for-mqtt
          image: docker.io/dataristix/dataristix-for-mqtt:latest
          ports:
            - containerPort: 1883
              name: dx-mqtt-tcp
            - containerPort: 8883
              name: dx-mqtt-tls
            # add WebSockets ports if required
          volumeMounts: *commonVolumeMounts
        # MySQL
        - name: dataristix-for-mysql
          image: docker.io/dataristix/dataristix-for-mysql:latest
          volumeMounts: *commonVolumeMounts
        # OPC UA
        - name: dataristix-for-opcua
          image: docker.io/dataristix/dataristix-for-opcua:latest
          # Map reverse-connect port if required
          # ports:
          #   - containerPort: 7999
          #     name: dx-opcua
          volumeMounts: *commonVolumeMounts
        # Oracle
        - name: dataristix-for-oracle
          image: docker.io/dataristix/dataristix-for-oracle:latest
          volumeMounts: *commonVolumeMounts
        # PostgreSQL
        - name: dataristix-for-postgresql
          image: docker.io/dataristix/dataristix-for-postgresql:latest
          volumeMounts: *commonVolumeMounts
        # Power BI
        - name: dataristix-for-powerbi
          image: docker.io/dataristix/dataristix-for-powerbi:latest
          volumeMounts: *commonVolumeMounts
        # REST
        - name: dataristix-for-rest
          image: docker.io/dataristix/dataristix-for-rest:latest
          volumeMounts: *commonVolumeMounts
        # Script
        - name: dataristix-for-script
          image: docker.io/dataristix/dataristix-for-script:latest
          volumeMounts: *commonVolumeMounts
        # SQL Server
        - name: dataristix-for-sqlserver
          image: docker.io/dataristix/dataristix-for-sqlserver:latest
          volumeMounts: *commonVolumeMounts
        # SQLite
        - name: dataristix-for-sqlite
          image: docker.io/dataristix/dataristix-for-sqlite:latest
          volumeMounts: *commonVolumeMounts
Apply to create the pod.
kubectl apply -f dataristix-1-pod.yml
Ingress and port forwarding
The Dataristix pod is now available at port 8282. You may already have an ingress controller that is suitable as a reverse proxy to forward requests to the Dataristix service. Note that any proxy must also support WebSockets.
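As an illustrative sketch only (assuming the ingress-nginx controller and a hypothetical hostname dataristix.example.com), an Ingress forwarding to the dataristix-1 service might look like this; the timeout annotations keep long-lived WebSocket connections from being closed by the proxy:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dataristix-1-ingress
  annotations:
    # Extend proxy timeouts so idle WebSocket connections stay open (ingress-nginx)
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  rules:
    - host: dataristix.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dataristix-1
                port:
                  number: 8282
```

Adjust the hostname, class name, and annotations for your ingress controller; add a TLS section as appropriate for your environment.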
For testing in a local setup (for example, minikube), you can simply use port forwarding:
kubectl port-forward dataristix-1-instance-0 8282:8282
Browse to http://localhost:8282 to view your Dataristix instance!
Helm charts
We hope to provide Helm charts here soon. Stay tuned!
Feedback
We welcome any feedback you may have. Please contact support@rensen.io.
Available connector modules
The following modules are available as containers for use in Kubernetes or Docker Compose deployments:
| Connector for | Container Support |
| --- | --- |
| CSV | ✓ |
| E-Mail | ✓ |
| Excel | ✓ |
| Google Sheets | ✓ |
| InfluxDB™ | TBA |
| IoT Devices | ✓ |
| Kafka | ✓ |
| MQTT | ✓ |
| MySQL™ | ✓ |
| ODBC | - |
| ODBC (32-bit) | - |
| OPC DA | - |
| OPC UA | ✓ |
| Oracle™ | ✓ |
| PostgreSQL™ | ✓ |
| Power BI™ | ✓ |
| REST | ✓ |
| Script | ✓ |
| SOAP | - |
| SQL Server™ | ✓ |
| SQLite | ✓ |
Dataristix is a trademark of Rensen Information Services Limited. Microsoft, Azure, Microsoft Access, Excel, Power BI, and SQL Server are trademarks of Microsoft Corporation. Oracle and MySQL are registered trademarks of Oracle. PostgreSQL is a registered trademark of the PostgreSQL Community Association of Canada. SAP and SAP HANA are registered trademarks of SAP. IBM and IBM DB2 are trademarks of IBM. InfluxDB is a trademark of InfluxData. All other product names, trademarks and registered trademarks are the property of their respective owners.
