Commit graph

94 commits

Author SHA1 Message Date
Nawaz Dhandala
f49b1995df
feat(telemetry): add new Telemetry service (OTel, Syslog, Fluent, Metrics, Traces) and unified ingestion pipeline
- Add Telemetry service entrypoint
  - Telemetry/Index.ts: app bootstrap, routes mounting, infrastructure init and Telemetry SDK init.

- Unified queue + worker
  - Telemetry/Jobs/TelemetryIngest/ProcessTelemetry.ts: single worker that dispatches queued jobs to specific processors (logs, traces, metrics, syslog, fluent logs).
  - Telemetry/Services/Queue/TelemetryQueueService.ts: central queue API and job payload types.
  - Per-type Queue wrappers (LogsQueueService, MetricsQueueService, TracesQueueService, FluentLogsQueueService, SyslogQueueService).
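
The single-worker dispatch described above can be sketched roughly as follows. This is an illustration only, not the code in ProcessTelemetry.ts: the job type strings, the `TelemetryJob` shape, and the stub processors are all hypothetical names chosen for the example.

```typescript
// Hypothetical job types; the real payload types live in TelemetryQueueService.ts.
type TelemetryJobType = "logs" | "traces" | "metrics" | "syslog" | "fluent-logs";

interface TelemetryJob {
  type: TelemetryJobType;
  payload: unknown;
}

type Processor = (payload: unknown) => Promise<string>;

// One processor per job type. Each stub only reports what it would do.
const processors = new Map<TelemetryJobType, Processor>([
  ["logs", async () => "processed logs"],
  ["traces", async () => "processed traces"],
  ["metrics", async () => "processed metrics"],
  ["syslog", async () => "processed syslog"],
  ["fluent-logs", async () => "processed fluent logs"],
]);

// Look up the processor registered for a job type, failing loudly if none exists.
function selectProcessor(type: TelemetryJobType): Processor {
  const processor = processors.get(type);
  if (processor === undefined) {
    throw new Error(`Unknown telemetry job type: ${String(type)}`);
  }
  return processor;
}

// Single worker entry point: every queued job passes through here.
async function processTelemetryJob(job: TelemetryJob): Promise<string> {
  return selectProcessor(job.type)(job.payload);
}
```

Keeping the lookup in one table means a new telemetry type needs only a new entry rather than a new worker.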

- OpenTelemetry ingestion middleware and proto support
  - Telemetry/Middleware/OtelRequestMiddleware.ts: detect OTLP endpoint (logs/traces/metrics), decode protobuf bodies using protobufjs and set product type.
  - Telemetry/ProtoFiles/OTel/v1/*.proto: include common.proto, logs.proto, metrics.proto, resource.proto, traces.proto for OTLP v1 messages.
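
As a rough illustration of the endpoint detection step, the sketch below maps a request path to a product type. It assumes the standard OTLP/HTTP path suffixes (/v1/logs, /v1/traces, /v1/metrics); the function and type names are hypothetical, and protobuf decoding is left out.

```typescript
type ProductType = "logs" | "traces" | "metrics";

// Return the product type for an OTLP/HTTP request path, or null if the
// path does not look like an OTLP endpoint.
function detectOtlpProductType(requestPath: string): ProductType | null {
  // Drop any query string and trailing slashes before matching.
  const path = (requestPath.split("?")[0] ?? "").replace(/\/+$/, "");
  const match = /\/v1\/(logs|traces|metrics)$/.exec(path);
  return match ? (match[1] as ProductType) : null;
}
```

A middleware could call this first and reject or pass through requests for which it returns null.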

- Ingest services
  - Telemetry/Services/OtelLogsIngestService.ts: parse incoming OTLP logs, map attributes, convert timestamps, batch insert logs.
  - Telemetry/Services/OtelTracesIngestService.ts: parse OTLP traces, build span rows, extract exceptions, batch insert spans and exceptions, save telemetry exception summary.
  - Telemetry/Services/OtelMetricsIngestService.ts: parse OTLP metrics, normalize datapoints, batch insert metrics and index metric name -> service map.
  - Telemetry/Services/SyslogIngestService.ts: syslog ingestion endpoints, parser integration, map syslog fields to attributes and logs.
  - Telemetry/Services/FluentLogsIngestService.ts: ingest Fluentd style logs, normalize entries and insert into log backend.
  - Telemetry/Services/OtelIngestBaseService.ts: helpers to resolve service name from attributes/headers.

- Syslog parser and utilities
  - Telemetry/Utils/SyslogParser.ts: robust RFC5424 and RFC3164 parser, structured data extraction and sanitization.
  - Telemetry/Tests/Utils/SyslogParser.test.ts: unit tests for parser behavior.
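
For context, both RFC 5424 and RFC 3164 messages start with a `<PRI>` field whose value is facility * 8 + severity. The sketch below decodes that field and guesses the format from whether a version number follows it. It is a simplified heuristic with hypothetical names, not the logic in SyslogParser.ts.

```typescript
interface SyslogPriority {
  facility: number;
  severity: number;
  format: "RFC5424" | "RFC3164";
}

// Decode the leading <PRI> field of a syslog line.
// Returns null when the line has no valid priority header.
function parseSyslogPriority(line: string): SyslogPriority | null {
  // RFC 5424 puts a version number and a space right after <PRI>;
  // RFC 3164 continues with a timestamp such as "Feb  5 17:32:18".
  const match = /^<(\d{1,3})>(\d{1,2} )?/.exec(line);
  if (!match) {
    return null;
  }
  const pri = Number(match[1]);
  if (pri > 191) {
    return null; // Highest valid value: facility 23, severity 7.
  }
  return {
    facility: Math.floor(pri / 8),
    severity: pri % 8,
    format: match[2] ? "RFC5424" : "RFC3164",
  };
}
```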

- Telemetry exception utilities
  - Telemetry/Utils/Exception.ts: generate exception fingerprint and upsert telemetry exception status (saveOrUpdateTelemetryException).
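
One plausible way to build such a fingerprint is to hash the fields that identify an exception, so repeated occurrences collapse into one record. The sketch below assumes SHA-256 over the service ID, exception type, message, and stack trace. The field names and the choice of inputs are assumptions for illustration and may differ from Exception.ts.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape; the real exception model may carry other fields.
interface ExceptionInfo {
  serviceId: string;
  exceptionType?: string;
  message?: string;
  stackTrace?: string;
}

// Produce a stable hex digest for an exception. Identical inputs always
// yield the same fingerprint, which makes it usable as an upsert key.
function exceptionFingerprint(info: ExceptionInfo): string {
  const parts = [
    info.serviceId,
    info.exceptionType ?? "",
    info.message ?? "",
    info.stackTrace ?? "",
  ];
  return createHash("sha256").update(parts.join("\n")).digest("hex");
}
```

An upsert keyed on this value can then increment an occurrence count instead of inserting a new row for each repeat.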

- Queue & job integration
  - New integration with Common/Server/Infrastructure/Queue and QueueWorker, job id generation and telemetry job types.
  - Telemetry services add ingestion jobs instead of processing synchronously.

- Config, build and dev tooling
  - Add Telemetry/package.json, package-lock.json, tsconfig.json, nodemon.json, jest config.
  - New script configs and dependencies (protobufjs, ts-node, jest, nodemon, etc).

- Docker / environment updates
  - docker-compose.base.yml, docker-compose.dev.yml, docker-compose.yml: rename service from open-telemetry-ingest -> telemetry and wire TELEMETRY_* envs.
  - config.example.env: rename and consolidate environment variables (OPEN_TELEMETRY_* -> TELEMETRY_*, update hostnames and ports).
  - Tests/Scripts/status-check.sh: update ready-check target to telemetry/status/ready.

- Other
  - Telemetry/Services/Queue/*: export helpers and legacy-compatible job interface shims.
  - Memory cleanup and batching safeguards across ingest services.
  - Logging and capture spans added to key code paths.

BREAKING CHANGES / MIGRATION NOTES:
- Environment variables and docker service names changed:
  - Replace OPEN_TELEMETRY_... vars with TELEMETRY_... (PORT, HOSTNAME, CONCURRENCY, DISABLE_TELEMETRY, etc).
  - docker-compose entries moved from "open-telemetry-ingest" to "telemetry" and image name changed to oneuptime/telemetry.
  - Update any deployment automation and monitoring checks referencing the old service name or endpoints.
- Consumers: OTLP endpoints and behavior remain supported, but ingestion is now queued and processed asynchronously.
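
When migrating a deployment, it may help to check for leftover variables with the old prefix. The helper below is a hypothetical sketch: it only lists keys starting with OPEN_TELEMETRY_ and does not assume what each one was renamed to.

```typescript
// Return the names of environment variables that still use the legacy
// OPEN_TELEMETRY_ prefix, sorted for stable output.
function findLegacyTelemetryVars(
  env: Record<string, string | undefined>,
): string[] {
  return Object.keys(env)
    .filter((key) => key.startsWith("OPEN_TELEMETRY_"))
    .sort();
}
```

Passing `process.env` at startup and logging any names returned would surface stale configuration early.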

Testing / Running:
- Install deps in Telemetry/ (npm install) after syncing Common workspace.
- Run dev: npx nodemon (nodemon.json) or build & start using provided scripts.
- Run tests with jest (Telemetry test suite includes SyslogParser unit tests).

Files added/modified (high level):
- Added many files under Telemetry/: Index, Jobs, Middleware, ProtoFiles, Services, Utils, Tests, package and config artifacts.
- Modified docker-compose.* and config.example.env and status check script to use new TELEMETRY service/vars.
2025-11-07 21:36:47 +00:00
Simon Larsen
a80b7ba88c
chore(fluent-ingest): migrate fluent log ingest into open-telemetry-ingest and remove legacy fluent-ingest service
- Move Fluent/Fluent Bit logs ingestion into open-telemetry-ingest:
  - Add OpenTelemetryIngest/API/Fluent.ts (routes for /fluentd and queue endpoints)
  - Add Queue service, job worker and processor:
    - OpenTelemetryIngest/Services/Queue/FluentLogsQueueService.ts
    - OpenTelemetryIngest/Jobs/TelemetryIngest/ProcessFluentLogs.ts
  - Register Fluent API and job processing in OpenTelemetryIngest/Index.ts
  - Introduce QueueName.FluentLogs and related queue usage

- Remove legacy FluentIngest service and configuration:
  - Delete fluent-ingest docker-compose/dev/base entries and docker-compose.yml service
  - Remove fluent-ingest related helm values, KEDA scaledobject, ingress host and schema entries
  - Remove FLUENTD_HOST env/values and replace FLUENT_INGEST_HOSTNAME -> FLUENT_LOGS_HOSTNAME (pointing to open-telemetry-ingest)
  - Update config.example.env keys (FLUENT_LOGS_CONCURRENCY, DISABLE_TELEMETRY_FOR_FLUENT_LOGS)
  - Remove FluentIngestRoute and FLUENT_INGEST_URL/hostname usages from UI config/templates
  - Remove VSCode launch debug config for Fluent Ingest
  - Remove Fluent ingest E2E status check entry in Tests/Scripts/status-check.sh
  - Update docs/architecture diagram and Helm templates to reflect "FluentLogs" / Fluent Bit flow

- Misc:
  - Remove FLUENTD_HOST environment injection from docker-compose.base.yml
  - Cleanup related values.schema.json and values.yaml entries

This consolidates log ingestion under the OpenTelemetry ingest service and removes the separate FluentIngest service and its configuration.
2025-11-07 19:37:31 +00:00
Simon Larsen
1ac6e71f7e
chore(config,docker,ci,ui): rename IS_ENTERPRISE to IS_ENTERPRISE_EDITION across env, Dockerfiles, compose and workflows 2025-11-03 11:25:12 +00:00
Nawaz Dhandala
1c1a48b78f
chore(ci): build/publish enterprise image variants and add IS_ENTERPRISE arg to Dockerfiles 2025-10-31 14:49:07 +00:00
Nawaz Dhandala
987f30e5c7
feat: add PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD environment variable to Dockerfiles for improved build performance 2025-10-06 19:45:46 +01:00
Simon Larsen
77de0a1116
feat: Upgrade Node.js base image to version 24.9-alpine3.21 in multiple Dockerfiles and remove debugging flag from nodemon configurations 2025-09-30 11:27:03 +01:00
Nico Aymet
b7ea97c246
Set permissions to write logs and cache to /tmp/npm in case the container runs as non-root 2025-06-10 19:11:37 +01:00
Simon Larsen
c667fddc64
chore: update Node.js base image to version 23.8 in Dockerfile templates 2025-02-21 22:38:02 +00:00
Simon Larsen
325fa0eb7a
Add SERVER_OPEN_TELEMETRY_INGEST_HOSTNAME to Helm template and update tag replacement in change-release-to-test-tag script 2024-11-22 10:23:56 +00:00
Simon Larsen
815ae7161d
Rename Ingestor to ProbeIngest; update configurations, routes, and Docker support; add new request types and workflows 2024-11-21 17:18:22 +00:00
Simon Larsen
3a1f5c7120
Refactor OpenTelemetry Ingest Dockerfile and configuration; update environment variables and docker-compose for new service integration 2024-11-21 17:08:35 +00:00
Simon Larsen
945cef653c
Add Incoming Request Ingest service with configuration, Docker support, and tests 2024-11-21 14:41:37 +00:00
Simon Larsen
b9b5ca3325
switch base image to ecr 2024-10-16 15:54:57 +01:00
Simon Larsen
e7377f6c8f
refactor: Enable lazy loading for images in BlogPostUtil and remove unnecessary whitespace in Copilot/Init.ts 2024-09-05 14:30:30 +01:00
Jack Veney
17a0b65a4b
Update endpoint-status.sh
Included the -L option in the curl command, ensuring that it will follow any 301 redirects until the final URL is reached.
2024-06-20 15:24:27 -04:00
Simon Larsen
a66a04456b
refactor: Update endpoint URLs for status check script 2024-06-13 21:44:49 +01:00
Simon Larsen
f42291a428
refactor: Update status check URLs for Dashboard, Status Page, and Accounts
The status-check.sh script has been modified to update the endpoint URLs for checking the status of the Dashboard, Status Page, and Accounts services. The URLs now include "/status/ready" at the end, ensuring that the services are ready to handle requests. This change improves the accuracy and reliability of the status check script.
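
The readiness check described here can be sketched as below. The actual script is a shell script using curl, so this TypeScript version is only an illustration: the function names, retry count, and delay are assumptions, and the HTTP call is injected as a `probe` function that returns a status code.

```typescript
// Append the readiness path to a service base URL, avoiding double slashes.
function buildReadyUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/status/ready";
}

// Poll a URL until the probe reports HTTP 200 or the attempts run out.
async function waitUntilReady(
  url: string,
  probe: (url: string) => Promise<number>,
  maxAttempts = 5,
  delayMs = 1000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Treat network errors like a non-200 response and retry.
    const status = await probe(url).catch(() => 0);
    if (status === 200) {
      return true;
    }
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```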
2024-06-13 21:05:37 +01:00
Simon Larsen
8325c06ca3
refactor: Update status check URLs for Dashboard and Status Page
The status-check.sh script has been modified to update the endpoint URLs for checking the status of the Dashboard and Status Page services. The URLs now include "/status/ready" at the end, ensuring that the services are ready to handle requests. This change improves the accuracy and reliability of the status check script.
2024-06-12 17:27:22 +01:00
Simon Larsen
b66f56bc12
refactor: Update endpoint status check URLs to include "/ready"
The status-check.sh script has been updated to modify the endpoint URLs for checking the status of various services. The URLs now include "/ready" at the end, indicating that the services are ready to handle requests. This change ensures that the status check accurately reflects the readiness of the services and improves the reliability of the script.
2024-06-12 17:25:08 +01:00
Simon Larsen
597aeb74f4
add e2e to test release 2024-06-09 19:29:23 +01:00
Simon Larsen
a24bf077ce
refactor: Improve error message in endpoint-status.sh
This code change updates the endpoint-status.sh script to improve the error message when the endpoint returns an HTTP status other than 200. The previous message mentioned that it usually takes a few minutes for the app to boot, which is not accurate. The updated message removes this misleading information and provides a more accurate description of the retry behavior.
2024-06-09 14:49:48 +01:00
snyk-bot
65f024c3f6
fix: Tests/Dockerfile.tpl to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE318-OPENSSL-6593964
2024-05-09 21:10:47 +00:00
Simon Larsen
1be827741e
Update E2E Config.ts to use E2E_TEST_STATUS_PAGE_URL instead of E2E_TEST_REGISTERED_USER_PASSWORD 2024-04-27 16:55:22 +01:00
Simon Larsen
445a8d3f35
Update Dockerfile.tpl files to set APP_VERSION to 1.0.0 if not set 2024-04-09 12:53:42 +01:00
Simon Larsen
356bacf9a0
Update Dockerfile.tpl files to set APP_VERSION to 2.0.0 if not set 2024-04-09 12:53:13 +01:00
snyk-bot
59fff01663
fix: Tests/Dockerfile.tpl to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-6564548
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-6564550
2024-04-05 20:35:00 +00:00
Simon Larsen
519daba294
Remove root user from Dockerfiles 2024-02-16 07:40:22 +00:00
snyk-bot
033f3503f1
fix: Tests/Dockerfile.tpl to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE318-OPENSSL-6152404
- https://snyk.io/vuln/SNYK-ALPINE318-OPENSSL-6160000
2024-01-27 11:15:48 +00:00
Simon Larsen
580fd97030
Update retry interval for endpoint status script 2024-01-25 11:32:16 +00:00
Simon Larsen
224943206e
Remove unnecessary endpoint status checks 2024-01-25 11:30:48 +00:00
Simon Larsen
04dcb80124
Refactor endpoint status checks in status-check.sh script 2023-12-29 14:06:46 +00:00
Simon Larsen
4d2afa7cf4
Add new endpoint status checks 2023-12-28 16:51:21 +00:00
Simon Larsen
284752631e
Remove unnecessary endpoint status checks 2023-12-25 13:30:46 +00:00
Simon Larsen
526df139b1
Update Dockerfile and launch configurations 2023-12-11 19:37:18 +00:00
snyk-bot
737ee28528
fix: Tests/Dockerfile.tpl to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE318-OPENSSL-6032386
- https://snyk.io/vuln/SNYK-ALPINE318-OPENSSL-6055795
2023-12-01 17:36:21 +00:00
Simon Larsen
640ce525c5
Update Probe npm install and test commands 2023-11-21 17:50:09 +00:00
snyk-bot
d26b348f0d
fix: Tests/Dockerfile.tpl to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-3314647
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-3326683
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-5741793
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-5843454
- https://snyk.io/vuln/SNYK-UPSTREAM-NODE-5848038
2023-11-20 20:36:37 +00:00
Simon Larsen
6804e94850
add ingestor status check 2023-10-16 12:55:54 +01:00
Simon Larsen
ed7708ba7c
remove change in config from npm 2023-10-04 19:11:51 +01:00
Jordan Jones
515b8ba94c
chore(tests): sneak in the tiny misspelling 2023-10-01 08:29:47 -07:00
Simon Larsen
c06c0f8b38
fix helm test 2023-10-01 09:13:44 +00:00
Simon Larsen
0d09047454
add endpoint telemetry 2023-09-29 14:26:09 +01:00
Simon Larsen
8a3b893521
add status check script 2023-09-29 14:24:59 +01:00
Simon Larsen
36860e6ee9
fix status check script 2023-09-29 14:16:09 +01:00
Simon Larsen
53efbaf7a0
add bash and curl to test docker 2023-09-29 13:59:08 +01:00
Simon Larsen
8863c6a209
add chmod to scripts 2023-09-29 13:52:56 +01:00
Simon Larsen
d7081c1bae
add bash 2023-09-28 21:27:11 +01:00
Simon Larsen
36cfc317a4
fix typo in file 2023-09-28 18:57:15 +01:00
Simon Larsen
3714c2c91a
add test container 2023-09-28 11:16:50 +00:00
Simon Larsen
5ac2786f7e
fix changes 2022-09-27 12:24:19 +01:00