Today we are open sourcing one of our internal tools, kafka-schema-sync.

One of the problems we face at Astradot is keeping our staging and prod environments in sync with respect to Kafka topics. Engineers can create tons of Kafka topics, each with a different configuration. We wanted a central place for engineers to store the names and configurations of the topics, so that nobody has to query Kafka directly to get that information. We also wanted to easily see exactly which topics and configs are deployed in each of our environments, such as staging and prod.

kafka-schema-sync solves that problem by taking in a simple YAML file that describes all the Kafka topics in a cluster. It then runs against your Kafka cluster and creates or updates the topics as described in the YAML. We keep a separate YAML file for each environment, e.g. staging.yml and prod.yml, which lets us easily see what topics are deployed in each environment. Here is a sample YAML config:

bootstrapServers: "localhost:9092"
replication: 3
defaultCompression: "zstd"
topics:
  - name: server-metrics
    retentionHours: 6
  - name: user-events
    retentionHours: 48
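
To make the create/update step concrete, here is a minimal TypeScript sketch of how a config could be compared against the topics already on a cluster. It is not taken from the kafka-schema-sync source: the `TopicSpec` and `Plan` types and the `planTopicChanges` function are hypothetical names invented for illustration, and the sketch assumes the existing topics have already been fetched from Kafka into the same shape as the config entries.

```typescript
// Hypothetical types: the field names mirror the sample YAML, but describing
// the cluster's existing topics in the same shape is an assumption.
interface TopicSpec {
  name: string;
  retentionHours: number;
}

interface Plan {
  create: TopicSpec[];   // in the config, missing on the cluster
  update: TopicSpec[];   // on the cluster, but with a different retention
  unchanged: string[];   // already matches the config
}

// Compare the topics described in the config with those found on the
// cluster and decide which ones to create and which ones to update.
function planTopicChanges(desired: TopicSpec[], existing: TopicSpec[]): Plan {
  const current = new Map<string, TopicSpec>();
  for (const topic of existing) {
    current.set(topic.name, topic);
  }

  const plan: Plan = { create: [], update: [], unchanged: [] };
  for (const want of desired) {
    const have = current.get(want.name);
    if (have === undefined) {
      plan.create.push(want);
    } else if (have.retentionHours !== want.retentionHours) {
      plan.update.push(want);
    } else {
      plan.unchanged.push(want.name);
    }
  }
  return plan;
}

// Example: the config lists two topics, the cluster has one of them with a
// different retention setting.
const plan = planTopicChanges(
  [
    { name: "server-metrics", retentionHours: 6 },
    { name: "user-events", retentionHours: 48 },
  ],
  [{ name: "server-metrics", retentionHours: 12 }],
);
console.log(plan);
```

In this sketch, topics that exist on the cluster but are absent from the config are simply left alone; whether a real tool should delete them is a separate design decision.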

It is super easy to install:

npm install --global @astradot/kafka-schema-sync

To run:

kafka-schema-sync --config staging.yml

Feel free to send feedback and PRs to our GitHub repo!