SkyWalking oap-server init mode

While experimenting with SkyWalking in our lab, every now and then we come back to a broken installation.
All components appear to be running, but no data is displayed in the UI. Further investigation shows this error in the oap-server log:
2022-07-22 15:28:51,809 org.apache.skywalking.oap.server.core.storage.model.ModelInstaller 43 [main] INFO [] - table: alarm_record does not exist. OAP is running in 'no-init' mode, waiting... retry 3s later.
This points us to the “no-init” mode of running the oap-server.
Digging further into the documentation, it turns out that the SkyWalking OAP server should run in init mode when it starts for the first time against a new datastore. In init mode, the OAP server creates all the indexes and structures it needs to store its data.
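As a rough illustration, the mode is selected with a JVM system property, which in a containerized setup is typically passed through the JAVA_OPTS environment variable. The fragment below is only a sketch of the two env entries, not a complete manifest; which workload each entry belongs to is an assumption about a typical setup.

```yaml
# One-off initializer: creates the tables/indexes in storage.
- name: JAVA_OPTS
  value: "-Dmode=init"
---
# Long-running OAP nodes: do not create storage structures themselves;
# they wait and retry (as in the log line above) until the tables exist.
- name: JAVA_OPTS
  value: "-Dmode=no-init"
```

With this split, only the initializer ever writes the storage schema, and the regular OAP nodes simply wait for it.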
By the looks of it, the Elasticsearch deployment we use in our lab does not persist data by default. We use it mostly for POCs, which rarely exist for longer than a few hours.
At some point, Kubernetes restarted Elasticsearch on a different node, and all its data was lost along with the indexes and metadata.
It’s easy enough to recover from this situation. We just need to run the oap-server in init mode again to recreate all the necessary elements in Elasticsearch.
Kubernetes Job to the rescue:
apiVersion: batch/v1
kind: Job
metadata:
  name: oap-init-job # @feature: cluster; set up an init job to initialize ES templates and indices
spec:
  template:
    metadata:
      name: oap-init-job
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      serviceAccountName: skywalking-oap-sa-cluster
      restartPolicy: Never
      initContainers:
        - name: wait-for-es
          image: busybox:1.30
          command:
            - 'sh'
            - '-c'
            - 'for i in $(seq 1 60); do nc -z -w3 elasticsearch 9200 && exit 0 || sleep 5; done; exit 1'
      containers:
        - name: oap-init
          image: ghcr.io/apache/skywalking/oap:b695983fc58ae17bc6993898afb671ff1e19be12
          imagePullPolicy: Always
          env: # @feature: cluster; make sure all env vars are the same with the cluster nodes as this will affect templates / indices
            - name: JAVA_OPTS
              value: "-Dmode=init" # @feature: cluster; set the OAP mode to "init" so the job can complete
            - name: SW_OTEL_RECEIVER
              value: default
            - name: SW_OTEL_RECEIVER_ENABLED_OC_RULES
              value: vm,oap
            - name: SW_STORAGE
              value: elasticsearch
            - name: SW_STORAGE_ES_CLUSTER_NODES
              value: elasticsearch:9200
            - name: SW_STORAGE_ES_INDEX_REPLICAS_NUMBER
              value: "0"
            - name: SW_TELEMETRY
              value: prometheus
          volumeMounts:
            - name: config-volume
              mountPath: /skywalking/ext-config
      volumes:
        - name: config-volume
          configMap:
            name: oap-static-config
After the job completes, its pod is listed as Completed:
oap-init-job-kbn9s 0/1 Completed 0 20m
SkyWalking is running again and storing data in Elasticsearch!
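Since the lab storage can be wiped again, the Job may have to be rerun at some point. A finished Job stays around under its name until it is deleted, and re-applying the same unchanged manifest does not start a new run. One possible way around this, sketched below, is to let Kubernetes clean up the finished Job automatically. This assumes the cluster supports the ttlSecondsAfterFinished field of batch/v1 Jobs; the value of 600 seconds is an arbitrary choice for illustration.

```yaml
# Fragment to merge into the spec of the oap-init-job manifest.
spec:
  # Assumption: remove the Job and its pod 10 minutes after it finishes,
  # so the same manifest can simply be applied again after the next data loss.
  ttlSecondsAfterFinished: 600
```

Without such a setting, the completed Job has to be deleted by hand before the manifest is applied again.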