oneuptime/config.example.env
Nawaz Dhandala f49b1995df
feat(telemetry): add new Telemetry service (OTel, Syslog, Fluent, Metrics, Traces) and unified ingestion pipeline
- Add Telemetry service entrypoint
  - Telemetry/Index.ts: app bootstrap, routes mounting, infrastructure init and Telemetry SDK init.

- Unified queue + worker
  - Telemetry/Jobs/TelemetryIngest/ProcessTelemetry.ts: single worker that dispatches queued jobs to specific processors (logs, traces, metrics, syslog, fluent logs).
  - Telemetry/Services/Queue/TelemetryQueueService.ts: central queue API and job payload types.
  - Per-type Queue wrappers (LogsQueueService, MetricsQueueService, TracesQueueService, FluentLogsQueueService, SyslogQueueService).

- OpenTelemetry ingestion middleware and proto support
  - Telemetry/Middleware/OtelRequestMiddleware.ts: detect OTLP endpoint (logs/traces/metrics), decode protobuf bodies using protobufjs and set product type.
  - Telemetry/ProtoFiles/OTel/v1/*.proto: include common.proto, logs.proto, metrics.proto, resource.proto, traces.proto for OTLP v1 messages.

- Ingest services
  - Telemetry/Services/OtelLogsIngestService.ts: parse incoming OTLP logs, map attributes, convert timestamps, batch insert logs.
  - Telemetry/Services/OtelTracesIngestService.ts: parse OTLP traces, build span rows, extract exceptions, batch insert spans and exceptions, save telemetry exception summary.
  - Telemetry/Services/OtelMetricsIngestService.ts: parse OTLP metrics, normalize datapoints, batch insert metrics and index metric name -> service map.
  - Telemetry/Services/SyslogIngestService.ts: syslog ingestion endpoints, parser integration, map syslog fields to attributes and logs.
  - Telemetry/Services/FluentLogsIngestService.ts: ingest Fluentd style logs, normalize entries and insert into log backend.
  - Telemetry/Services/OtelIngestBaseService.ts: helpers to resolve service name from attributes/headers.

- Syslog parser and utilities
  - Telemetry/Utils/SyslogParser.ts: robust RFC5424 and RFC3164 parser, structured data extraction and sanitization.
  - Telemetry/Tests/Utils/SyslogParser.test.ts: unit tests for parser behavior.

- Telemetry exception utilities
  - Telemetry/Utils/Exception.ts: generate exception fingerprint and upsert telemetry exception status (saveOrUpdateTelemetryException).

- Queue & job integration
  - New integration with Common/Server/Infrastructure/Queue and QueueWorker, job id generation and telemetry job types.
  - Telemetry services add ingestion jobs instead of processing synchronously.

- Config, build and dev tooling
  - Add Telemetry/package.json, package-lock.json, tsconfig.json, nodemon.json, jest config.
  - New script configs and dependencies (protobufjs, ts-node, jest, nodemon, etc).

- Docker / environment updates
  - docker-compose.base.yml, docker-compose.dev.yml, docker-compose.yml: rename service from open-telemetry-ingest -> telemetry and wire TELEMETRY_* envs.
  - config.example.env: rename and consolidate environment variables (OPEN_TELEMETRY_* -> TELEMETRY_*, update hostnames and ports).
  - Tests/Scripts/status-check.sh: update ready-check target to telemetry/status/ready.

- Other
  - Telemetry/Services/Queue/*: export helpers and legacy-compatible job interface shims.
  - Memory cleanup and batching safeguards across ingest services.
  - Logging and capture spans added to key code paths.

BREAKING CHANGES / MIGRATION NOTES:
- Environment variables and docker service names changed:
  - Replace OPEN_TELEMETRY_... vars with TELEMETRY_... (PORT, HOSTNAME, CONCURRENCY, DISABLE_TELEMETRY, etc).
  - docker-compose entries moved from "open-telemetry-ingest" to "telemetry" and image name changed to oneuptime/telemetry.
  - Update any deployment automation and monitoring checks referencing the old service name or endpoints.
- Consumers: OTLP endpoints and behavior remain supported, but ingestion is now queued and processed asynchronously.
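
One way to find leftover references before upgrading is to search deployment files for the old names. The sketch below is a hypothetical helper (`find_stale_telemetry_refs` is not part of the repository). It only reports matches and leaves the renaming to manual review, since the notes above do not list the exact old variable names.

```shell
# Hypothetical helper: list lines that still use the old naming.
# The two patterns come from the migration notes above; the file list is up to the caller.
find_stale_telemetry_refs() {
  # -n prints line numbers, -H prints file names, -E enables the alternation.
  grep -nHE 'OPEN_TELEMETRY_|open-telemetry-ingest' "$@"
}
```

For example, `find_stale_telemetry_refs config.env docker-compose.yml` prints each offending line. Like `grep`, it returns a non-zero status when nothing matches.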

Testing / Running:
- Install deps in Telemetry/ (npm install) after syncing Common workspace.
- Run dev: npx nodemon (nodemon.json) or build & start using provided scripts.
- Run tests with jest (Telemetry test suite includes SyslogParser unit tests).

Files added/modified (high level):
- Added many files under Telemetry/: Index, Jobs, Middleware, ProtoFiles, Services, Utils, Tests, package and config artifacts.
- Modified docker-compose.* and config.example.env and status check script to use new TELEMETRY service/vars.
2025-11-07 21:36:47 +00:00


#!/usr/bin/env bash
# Change this to the domain of the server where OneUptime is hosted.
HOST=localhost
PROVISION_SSL=false
# OneUptime port. This is the port OneUptime will be served on.
ONEUPTIME_HTTP_PORT=80
# ==============================================
# SETTING UP TLS/SSL CERTIFICATES
# ==============================================
# OneUptime can automatically provision SSL certificates for the HOST when PROVISION_SSL=true.
# This requires port 80/443 to be reachable for Let's Encrypt validation and the HOST domain pointing to this server.
# If you prefer to terminate TLS on an external reverse proxy, leave PROVISION_SSL=false and manage certificates yourself.
HTTP_PROTOCOL=http
# Secrets - PLEASE CHANGE THESE to random values. Each one can be a different value.
ONEUPTIME_SECRET=please-change-this-to-random-value
DATABASE_PASSWORD=please-change-this-to-random-value
CLICKHOUSE_PASSWORD=please-change-this-to-random-value
REDIS_PASSWORD=please-change-this-to-random-value
ENCRYPTION_SECRET=please-change-this-to-random-value
GLOBAL_PROBE_1_KEY=please-change-this-to-random-value
GLOBAL_PROBE_2_KEY=please-change-this-to-random-value
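
Each placeholder above should be replaced with its own random value. A minimal sketch, assuming a shell where `head -c` and `/dev/urandom` are available. `gen_secret` is a hypothetical helper name, and the 32-byte default is an arbitrary choice rather than a OneUptime requirement.

```shell
# Hypothetical helper: print N random bytes (default 32) as lowercase hex.
gen_secret() {
  # od renders the raw bytes as hex pairs; tr strips the spaces and newlines od adds.
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

A value can then be generated per variable, for example `gen_secret` for `ONEUPTIME_SECRET` and a second call for `DATABASE_PASSWORD`.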
# If you are connecting status pages to custom domains, this is the port the status page will be served on.
# This should be an HTTPS port because OneUptime automatically generates SSL certificates from Let's Encrypt.
STATUS_PAGE_HTTPS_PORT=443
# If you would like to attach a status page to a custom domain, use this setting.
# For example, to host the status page on status.yourcompany.com:
# 1. Create an A record in your DNS provider with the name "oneuptime.yourcompany.com" pointing to the public IP of the server OneUptime is deployed on.
# 2. Set STATUS_PAGE_CNAME_RECORD to "oneuptime.yourcompany.com".
# 3. Create a CNAME record in your DNS provider with the name "status.yourcompany.com" and value "oneuptime.yourcompany.com".
STATUS_PAGE_CNAME_RECORD=oneuptime.yourcompany.com
# --------------------------------------------- #
# You can safely ignore anything below this line. Keep them as default to make things work.
# --------------------------------------------- #
# This supports test | production | development | ci.
# Development is used for local development. Test is used for insider / beta / staging builds. Production is used for production ready app. ci is for testing in the CI/CD.
ENVIRONMENT=production
# Which image tag to pull from Docker Hub. This only applies when ENVIRONMENT is production or test.
APP_TAG=release
# Change this to true if you are using enterprise edition. Keep it false if you are using community edition.
IS_ENTERPRISE_EDITION=false
# What is the name of the docker compose project. This is used to prefix the docker containers.
COMPOSE_PROJECT_NAME=oneuptime
# OTEL HOST - if you would like the collector to be hosted on a different server, change this to the IP of that server.
OTEL_COLLECTOR_HOST=
# Clickhouse Settings
CLICKHOUSE_USER=default
CLICKHOUSE_DATABASE=oneuptime
CLICKHOUSE_HOST=clickhouse
CLICKHOUSE_PORT=8123
# Postgres DB Settings.
DATABASE_PORT=5432
DATABASE_USERNAME=postgres
DATABASE_NAME=oneuptimedb
DATABASE_HOST=postgres
# Used to connect to managed postgres providers.
# Fill only what your provider needs.
DATABASE_SSL_REJECT_UNAUTHORIZED=false
DATABASE_SSL_CA=
DATABASE_SSL_KEY=
DATABASE_SSL_CERT=
# Redis DB Settings.
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_USERNAME=default
REDIS_IP_FAMILY=
REDIS_TLS_CA=
REDIS_TLS_SENTINEL_MODE=false
# Hostnames. Usually does not need to change.
PROBE_INGEST_HOSTNAME=probe-ingest:3400
INCOMING_REQUEST_INGEST_HOSTNAME=incoming-request-ingest:3402
TELEMETRY_HOSTNAME=telemetry:3403
SERVER_ACCOUNTS_HOSTNAME=accounts
SERVER_APP_HOSTNAME=app
SERVER_PROBE_INGEST_HOSTNAME=probe-ingest
SERVER_SERVER_MONITOR_INGEST_HOSTNAME=server-monitor-ingest
SERVER_TELEMETRY_HOSTNAME=telemetry
SERVER_INCOMING_REQUEST_INGEST_HOSTNAME=incoming-request-ingest
SERVER_TEST_SERVER_HOSTNAME=test-server
SERVER_STATUS_PAGE_HOSTNAME=status-page
SERVER_DASHBOARD_HOSTNAME=dashboard
SERVER_ADMIN_DASHBOARD_HOSTNAME=admin-dashboard
SERVER_OTEL_COLLECTOR_HOSTNAME=otel-collector
SERVER_API_REFERENCE_HOSTNAME=reference
SERVER_WORKER_HOSTNAME=worker
SERVER_DOCS_HOSTNAME=docs
# Ports. Usually they don't need to change.
APP_PORT=3002
PROBE_INGEST_PORT=3400
SERVER_MONITOR_INGEST_PORT=3404
TELEMETRY_PORT=3403
INCOMING_REQUEST_INGEST_PORT=3402
TEST_SERVER_PORT=3800
ACCOUNTS_PORT=3003
STATUS_PAGE_PORT=3105
DASHBOARD_PORT=3009
ADMIN_DASHBOARD_PORT=3158
OTEL_COLLECTOR_HTTP_PORT=4318
ISOLATED_VM_PORT=4572
HOME_PORT=1444
WORKER_PORT=1445
WORKFLOW_PORT=3099
API_REFERENCE_PORT=1446
DOCS_PORT=1447
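
In the defaults above, each `*_HOSTNAME` entry that carries a port (for example `TELEMETRY_HOSTNAME=telemetry:3403`) matches the corresponding `SERVER_*_HOSTNAME` and `*_PORT` values. If a port is changed, a quick check along these lines can catch a mismatch. `check_host_port` is a hypothetical helper, and the rule it checks is inferred from the defaults rather than stated by the project.

```shell
# Hypothetical helper: succeed only if "$1" equals "<$2>:<$3>".
check_host_port() {
  [ "$1" = "$2:$3" ]
}

# Values copied from the defaults in this file.
TELEMETRY_HOSTNAME=telemetry:3403
SERVER_TELEMETRY_HOSTNAME=telemetry
TELEMETRY_PORT=3403

if check_host_port "$TELEMETRY_HOSTNAME" "$SERVER_TELEMETRY_HOSTNAME" "$TELEMETRY_PORT"; then
  echo "telemetry host/port consistent"
else
  echo "telemetry host/port mismatch" >&2
fi
```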
# Plans
# This is in the format of PlanName,MonthlyPlanIdFromBillingProvider,YearlyPlanIdFromBillingProvider,MonthlySubscriptionPlanAmountInUSD,YearlySubscriptionPlanAmountInUSD,Order,TrialPeriodInDays
# Enterprise plan will have -1 which means custom pricing.
SUBSCRIPTION_PLAN_BASIC=Basic,priceMonthlyId,priceYearlyId,0,0,1,0
SUBSCRIPTION_PLAN_GROWTH=Growth,priceMonthlyId,priceYearlyId,0,0,2,14
SUBSCRIPTION_PLAN_SCALE=Scale,priceMonthlyId,priceYearlyId,0,0,3,0
SUBSCRIPTION_PLAN_ENTERPRISE=Enterprise,priceMonthlyId,priceYearlyId,-1,-1,4,14
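
Each plan value above carries seven comma-separated fields. The sketch below splits one entry in a POSIX shell. The field names are an assumed reading of the values (name, monthly price ID, yearly price ID, monthly amount, yearly amount, order, trial days), not names confirmed by the OneUptime code, and `parse_plan` is a hypothetical helper.

```shell
# Hypothetical helper: split a SUBSCRIPTION_PLAN_* value into labelled fields.
parse_plan() {
  # IFS=, applies only to this read, so the caller's IFS is left untouched.
  IFS=, read -r name monthly_id yearly_id monthly_usd yearly_usd order trial_days <<EOF
$1
EOF
  printf 'name=%s monthly_usd=%s yearly_usd=%s order=%s trial_days=%s\n' \
    "$name" "$monthly_usd" "$yearly_usd" "$order" "$trial_days"
}

parse_plan "Growth,priceMonthlyId,priceYearlyId,0,0,2,14"
# prints: name=Growth monthly_usd=0 yearly_usd=0 order=2 trial_days=14
```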
# If you want to run the backup script, then you need to fill these values.
DATABASE_BACKUP_DIRECTORY=/Backups
DATABASE_BACKUP_HOST=localhost
DATABASE_BACKUP_PORT=5400
DATABASE_BACKUP_NAME=oneuptimedb
DATABASE_BACKUP_USERNAME=postgres
DATABASE_BACKUP_PASSWORD=${DATABASE_PASSWORD}
# If you want to run the restore script, then you need to fill these values. Use host.docker.internal if you want to use the host machine's IP.
DATABASE_RESTORE_HOST=host.docker.internal
DATABASE_RESTORE_DIRECTORY=/Backups
DATABASE_RESTORE_PORT=5400
DATABASE_RESTORE_NAME=oneuptimedb
DATABASE_RESTORE_USERNAME=postgres
DATABASE_RESTORE_PASSWORD=${DATABASE_PASSWORD}
DATABASE_RESTORE_FILENAME=db-31.backup
ANALYTICS_KEY=
ANALYTICS_HOST=
DATABASE_MIGRATIONS_HOST=localhost
DATABASE_MIGRATIONS_PORT=5400
# Global Probes
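# Each global probe is configured through its own GLOBAL_PROBE_<N>_* variables (name, description, workers, URL, port). The matching key is GLOBAL_PROBE_<N>_KEY in the secrets section above.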
GLOBAL_PROBE_1_NAME="Probe-1"
GLOBAL_PROBE_1_DESCRIPTION="Global probe to monitor oneuptime resources"
GLOBAL_PROBE_1_MONITORING_WORKERS=5
GLOBAL_PROBE_1_MONITOR_FETCH_LIMIT=10
GLOBAL_PROBE_1_ONEUPTIME_URL=http://localhost
GLOBAL_PROBE_1_SYNTHETIC_MONITOR_SCRIPT_TIMEOUT_IN_MS=60000
GLOBAL_PROBE_1_CUSTOM_CODE_MONITOR_SCRIPT_TIMEOUT_IN_MS=60000
GLOBAL_PROBE_1_PORT=3874
# (Optional) If you want to use a proxy for the probe, then you can set the proxy URL here. For example, if you're using a proxy server like Caddy or Nginx, then you can set the proxy URL here.
GLOBAL_PROBE_1_PROXY_URL=
GLOBAL_PROBE_2_NAME="Probe-2"
GLOBAL_PROBE_2_DESCRIPTION="Global probe to monitor oneuptime resources"
GLOBAL_PROBE_2_MONITORING_WORKERS=5
GLOBAL_PROBE_2_MONITOR_FETCH_LIMIT=10
GLOBAL_PROBE_2_ONEUPTIME_URL=http://localhost
GLOBAL_PROBE_2_SYNTHETIC_MONITOR_SCRIPT_TIMEOUT_IN_MS=60000
GLOBAL_PROBE_2_CUSTOM_CODE_MONITOR_SCRIPT_TIMEOUT_IN_MS=60000
GLOBAL_PROBE_2_PORT=3875
# (Optional) If you want to use a proxy for the probe, then you can set the proxy URL here. For example, if you're using a proxy server like Caddy or Nginx, then you can set the proxy URL here.
GLOBAL_PROBE_2_PROXY_URL=
SMS_DEFAULT_COST_IN_CENTS=
CALL_DEFAULT_COST_IN_CENTS_PER_MINUTE=
SMS_HIGH_RISK_COST_IN_CENTS=
WHATSAPP_TEXT_DEFAULT_COST_IN_CENTS=
CALL_HIGH_RISK_COST_IN_CENTS_PER_MINUTE=
# Whether billing is enabled for this installation.
BILLING_ENABLED=false
# Public and private key for billing provider, usually stripe.
BILLING_PUBLIC_KEY=
BILLING_PRIVATE_KEY=
# Average telemetry row sizes in bytes used to estimate usage when reporting to the billing provider.
AVERAGE_SPAN_ROW_SIZE_IN_BYTES=1024
AVERAGE_LOG_ROW_SIZE_IN_BYTES=1024
AVERAGE_METRIC_ROW_SIZE_IN_BYTES=1024
AVERAGE_EXCEPTION_ROW_SIZE_IN_BYTES=1024
# Use this when you want to disable incident creation.
DISABLE_AUTOMATIC_INCIDENT_CREATION=false
# Use this when you want to disable alert creation.
DISABLE_AUTOMATIC_ALERT_CREATION=false
# If you're using an external OpenTelemetry collector, you can set the endpoint here - both server and client endpoint can be the same in this case.
# You can set the env var to http://otel-collector:4318 if you want instrumentation to be sent to otel collector.
OPENTELEMETRY_EXPORTER_OTLP_ENDPOINT=
# You can set the env var to "x-oneuptime-token=<YOUR_ONEUPTIME_TELEMETRY_INGEST_TOKEN>"
OPENTELEMETRY_EXPORTER_OTLP_HEADERS=
# This can be one of ERROR, WARN, INFO, DEBUG
LOG_LEVEL=ERROR
# These env vars are for E2E tests
E2E_TEST_IS_USER_REGISTERED=false
E2E_TEST_REGISTERED_USER_EMAIL=
E2E_TEST_REGISTERED_USER_PASSWORD=
# If you want to run the E2E tests on a status page, then you need to fill in the URL.
E2E_TEST_STATUS_PAGE_URL=
# This URL will be called when the E2E tests fail. This should be a GET endpoint.
E2E_TESTS_FAILED_WEBHOOK_URL=
# This is the timeout for the workflow script in milliseconds.
# How long do we wait for "Scripts" (like Custom Code Components) running in workflow to complete.
WORKFLOW_SCRIPT_TIMEOUT_IN_MS=5000
# How long do we wait for entire workflow to complete.
WORKFLOW_TIMEOUT_IN_MS=5000
# Concurrency settings
# Max number of telemetry jobs processed concurrently by the Telemetry worker
TELEMETRY_CONCURRENCY=100
# Max number of jobs processed concurrently by Fluent Logs worker
FLUENT_LOGS_CONCURRENCY=100
# Max number of jobs processed concurrently by Incoming Request Ingest worker
INCOMING_REQUEST_INGEST_CONCURRENCY=100
# Max number of jobs processed concurrently by Server Monitor Ingest worker
SERVER_MONITOR_INGEST_CONCURRENCY=100
# Max number of jobs processed concurrently by Probe Ingest worker
PROBE_INGEST_CONCURRENCY=100
# Max number of jobs processed concurrently by Worker service
WORKER_CONCURRENCY=100
# Let's Encrypt notification email. This email will be used when certs are about to expire.
LETS_ENCRYPT_NOTIFICATION_EMAIL=
# Generate a private key via openssl, encode it to base64 and paste it here.
# Example: "LS0tLS....1cbg=="
LETS_ENCRYPT_ACCOUNT_KEY=
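
The comment above asks for an openssl-generated private key, base64-encoded. A minimal sketch of that step, assuming an RSA 2048-bit key (the file does not say which key type or size is expected) and a hypothetical helper name `gen_account_key_b64`:

```shell
# Assumption: RSA 2048-bit; adjust if your setup expects a different key type.
gen_account_key_b64() {
  # genrsa writes a PEM key to stdout; base64 output is joined onto a single line.
  openssl genrsa 2048 2>/dev/null | base64 | tr -d '\n'
  echo
}
```

Because a PEM file starts with `-----BEGIN`, the encoded value starts with `LS0tLS`, which matches the shape of the example in the comment.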
# This is the number of active monitors allowed in the free plan.
ALLOWED_ACTIVE_MONITOR_COUNT_IN_FREE_PLAN=10
# Notifications Webhook (Slack)
# This webhook notifies Slack when a new user signs up or is created.
NOTIFICATION_SLACK_WEBHOOK_ON_CREATED_USER=
# This webhook notifies Slack when a new project is created.
NOTIFICATION_SLACK_WEBHOOK_ON_CREATED_PROJECT=
# This webhook notifies Slack when a project is deleted.
NOTIFICATION_SLACK_WEBHOOK_ON_DELETED_PROJECT=
# This webhook notifies Slack when a subscription is updated.
NOTIFICATION_SLACK_WEBHOOK_ON_SUBSCRIPTION_UPDATE=
# VAPID keys for Web Push Notifications
# Generate using: npx web-push generate-vapid-keys
VAPID_PUBLIC_KEY=
VAPID_PRIVATE_KEY=
VAPID_SUBJECT=mailto:support@oneuptime.com
# Copilot Environment Variables
COPILOT_ONEUPTIME_URL=http://localhost
COPILOT_ONEUPTIME_REPOSITORY_SECRET_KEY=
COPILOT_CODE_REPOSITORY_PASSWORD=
COPILOT_CODE_REPOSITORY_USERNAME=
COPILOT_ONEUPTIME_LLM_SERVER_URL=
# Set this to false if you want to enable copilot.
DISABLE_COPILOT=true
COPILOT_OPENAI_API_KEY=
# LLM Environment Variables
# Hugging Face Token for LLM Server to download models from Hugging Face
LLM_SERVER_HUGGINGFACE_TOKEN=
# Hugging Face Model Name for LLM Server to download.
LLM_SERVER_HUGGINGFACE_MODEL_NAME=
# By default telemetry is disabled for all services in docker compose. If you want to enable telemetry for a service, then set the env var to false.
DISABLE_TELEMETRY_FOR_ACCOUNTS=true
DISABLE_TELEMETRY_FOR_APP=true
DISABLE_TELEMETRY_FOR_PROBE_INGEST=true
DISABLE_TELEMETRY_FOR_TELEMETRY=true
DISABLE_TELEMETRY_FOR_FLUENT_LOGS=true
DISABLE_TELEMETRY_FOR_INCOMING_REQUEST_INGEST=true
DISABLE_TELEMETRY_FOR_TEST_SERVER=true
DISABLE_TELEMETRY_FOR_STATUS_PAGE=true
DISABLE_TELEMETRY_FOR_DASHBOARD=true
DISABLE_TELEMETRY_FOR_PROBE=true
DISABLE_TELEMETRY_FOR_ADMIN_DASHBOARD=true
DISABLE_TELEMETRY_FOR_OTEL_COLLECTOR=true
DISABLE_TELEMETRY_FOR_ISOLATED_VM=true
DISABLE_TELEMETRY_FOR_INGRESS=true
DISABLE_TELEMETRY_FOR_WORKER=true
DISABLE_TELEMETRY_FOR_SERVER_MONITOR_INGEST=true
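
Since every flag above defaults to `true`, it can be handy to list which services have telemetry switched on in a given env file. A small sketch, where `telemetry_enabled_services` is a hypothetical helper and the file path is supplied by the caller:

```shell
# Hypothetical helper: print the service suffix of every flag set to false.
telemetry_enabled_services() {
  # -n suppresses default output; p prints only lines where the substitution matched.
  sed -n 's/^DISABLE_TELEMETRY_FOR_\([A-Z_]*\)=false$/\1/p' "$1"
}
```

Running `telemetry_enabled_services config.env` prints one suffix per line (for example `APP`), or nothing if telemetry is disabled everywhere.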
# OPENTELEMETRY_COLLECTOR env vars
OPENTELEMETRY_COLLECTOR_SENDING_QUEUE_ENABLED=true
OPENTELEMETRY_COLLECTOR_SENDING_QUEUE_SIZE=1000
OPENTELEMETRY_COLLECTOR_SENDING_QUEUE_NUM_CONSUMERS=3
# Connect OneUptime with Slack App
SLACK_APP_CLIENT_ID=
SLACK_APP_CLIENT_SECRET=
SLACK_APP_SIGNING_SECRET=
# Example -
# IPv6 only:
# NGINX_LISTEN_ADDRESS=[::]:
# NGINX_LISTEN_OPTIONS=
# dual stack:
# NGINX_LISTEN_ADDRESS=[::]:
# NGINX_LISTEN_OPTIONS=ipv6only=off
NGINX_LISTEN_ADDRESS=
NGINX_LISTEN_OPTIONS=
# Microsoft Teams / Azure AD App Configuration
# IMPORTANT: Use the SECRET VALUE, not the SECRET ID from Azure App Registration
# The secret value is typically longer and includes more characters
MICROSOFT_TEAMS_APP_CLIENT_ID=
MICROSOFT_TEAMS_APP_CLIENT_SECRET=