diff --git a/antora.yml b/antora.yml index a4b4008a..afb5c0b4 100644 --- a/antora.yml +++ b/antora.yml @@ -8,7 +8,13 @@ nav: asciidoc: attributes: + product: 'Luna Streaming' luna-version: '2.10' pulsar-version: '2.10' admin-console-version: '2.0.4' heartbeat-version: '1.0.12' + starlight-kafka: 'Starlight for Kafka' + starlight-rabbitmq: 'Starlight for RabbitMQ' + pulsar-reg: 'Apache Pulsar(TM)' + pulsar: 'Apache Pulsar' + pulsar-short: 'Pulsar' \ No newline at end of file diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 735e9ab4..6ba9b6ec 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -16,8 +16,7 @@ * xref:components:heartbeat-vm.adoc[] * xref:components:pulsar-beam.adoc[] * xref:components:pulsar-sql.adoc[] -* xref:components:starlight-for-kafka.adoc[] -* xref:components:starlight-for-rabbitmq.adoc[] +* xref:components:starlight.adoc[] .Operations * xref:operations:auth.adoc[] diff --git a/modules/ROOT/pages/faqs.adoc b/modules/ROOT/pages/faqs.adoc index 216e283f..2706b796 100644 --- a/modules/ROOT/pages/faqs.adoc +++ b/modules/ROOT/pages/faqs.adoc @@ -1,124 +1,111 @@ = Luna Streaming FAQs :navtitle: FAQs -If you are new to DataStax Luna Streaming and its Apache Pulsar enhancements, these FAQs are for you. +If you are new to {company} Luna Streaming and its {pulsar} enhancements, these FAQs are for you. -== Introduction +== What is {company} Luna Streaming? -=== What is DataStax Luna Streaming? +{company} Luna Streaming is a new Kubernetes-based distribution of {pulsar}, based on the technology that https://kesque.com/[Kesque] built to run its {pulsar-short}-as-a-service. -DataStax Luna Streaming is a new Kubernetes-based distribution of Apache Pulsar, based on the technology that https://kesque.com/[Kesque] built to run its Pulsar-as-a-service. +== What components and features are provided by {company} Luna Streaming? -=== What components and features are provided by DataStax Luna Streaming? 
-
-In addition to Apache Pulsar itself, DataStax Luna Streaming provides:
+In addition to {pulsar} itself, {company} Luna Streaming provides:
* An installer that can stand up a dev or production cluster on bare metal or VMs without a pre-existing Kubernetes environment
-* A helm chart that can deploy and manage Pulsar on your current Kubernetes infrastructure
+* A Helm chart that can deploy and manage {pulsar-short} on your current Kubernetes infrastructure
* Cassandra, Elastic, Kinesis, Kafka, and JDBC connectors
* A management dashboard
* A monitoring and alerting system
-=== On which version of Apache Pulsar is DataStax Luna Streaming based?
+== On which version of {pulsar} is {company} Luna Streaming based?
-DataStax Luna Streaming {luna-version} is based on its distribution of Apache Pulsar {pulsar-version}, plus features and additional enhancements from DataStax contributors.
+{company} Luna Streaming {luna-version} is based on its distribution of {pulsar} {pulsar-version}, plus features and additional enhancements from {company} contributors.
-=== What does DataStax Luna Streaming provide that I cannot get with open-source Apache Pulsar?
+== What does {company} Luna Streaming provide that I cannot get with open-source {pulsar}?
-DataStax Luna Streaming is a hardened version of Apache Pulsar that been run through additional testing to ensure it is ready for production use. It also includes additional tooling to help monitor your system, including an enhanced Admin Console and a Heartbeat service to monitor the system health.
+{company} Luna Streaming is a hardened version of {pulsar} that has been run through additional testing to ensure it is ready for production use. It also includes additional tooling to help monitor your system, including an enhanced Admin Console and a Heartbeat service that monitors system health.
-=== Is DataStax Luna Streaming an open-source project?
+== Is {company} Luna Streaming an open-source project?
-Yes, DataStax Luna Streaming is open source. See the <<gitHubRepos>>.
+Yes, {company} Luna Streaming is open source. See the <<gitHubRepos>>.
-=== Which Kubernetes platforms are supported by DataStax Luna Streaming?
+== Which Kubernetes platforms are supported by {company} Luna Streaming?
They include Minikube, K3d, Kind, Google Kubernetes Engine (GKE), Microsoft Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), and other commonly used platforms.
[#gitHubRepos]
-=== Where are the DataStax Luna Streaming public GitHub repos?
+== Where are the {company} Luna Streaming public GitHub repos?
There are several public repos, each with a different purpose. See:
* https://github.com/datastax/pulsar[https://github.com/datastax/pulsar] : This is the distro repo (a fork of apache/pulsar).
-* https://github.com/datastax/pulsar-admin-console[https://github.com/datastax/pulsar-admin-console] : This is the repo for the Pulsar admin console, which allows for the configuration and monitoring of Pulsar.
-* https://github.com/datastax/pulsar-heartbeat[https://github.com/datastax/pulsar-heartbeat] : This is a monitoring/observability tool for Pulsar that tracks the health of the cluster and can generate alerts in Slack and OpsGenie.
-* https://github.com/datastax/pulsar-helm-chart[https://github.com/datastax/pulsar-helm-chart] : This is the Helm chart for deploying the DataStax Pulsar Distro in an existing Kubernetes cluster.
-* https://github.com/datastax/pulsar-sink[https://github.com/datastax/pulsar-sink] : This is the DataStax Apache Pulsar Connector (`pulsar-sink` for Cassandra) repo.
-* https://github.com/datastax/burnell[https://github.com/datastax/burnell] : This is a utility for Pulsar that provides various functions, such as key initialization for authentication, and JWT token creation API.
- -== Installation +* https://github.com/datastax/pulsar-admin-console[https://github.com/datastax/pulsar-admin-console] : This is the repo for the {pulsar-short} admin console, which allows for the configuration and monitoring of {pulsar-short}. +* https://github.com/datastax/pulsar-heartbeat[https://github.com/datastax/pulsar-heartbeat] : This is a monitoring/observability tool for {pulsar-short} that tracks the health of the cluster and can generate alerts in Slack and OpsGenie. +* https://github.com/datastax/pulsar-helm-chart[https://github.com/datastax/pulsar-helm-chart] : This is the Helm chart for deploying the {company} {pulsar-short} Distro in an existing Kubernetes cluster. +* https://github.com/datastax/pulsar-sink[https://github.com/datastax/pulsar-sink] : This is the {company} {pulsar} Connector (`pulsar-sink` for Cassandra) repo. +* https://github.com/datastax/burnell[https://github.com/datastax/burnell] : This is a utility for {pulsar-short} that provides various functions, such as key initialization for authentication, and JWT token creation API. -=== Is there a prerequisite version of Java needed for the DataStax Luna Streaming installation? +== Is there a prerequisite version of Java needed for the {company} Luna Streaming installation? -The DataStax Luna Streaming distribution is designed for Java 11. However, because the product releases Docker images, you do not need to install Java (8 or 11) in advance. Java 11 is bundled in the Docker image. +The {company} Luna Streaming distribution is designed for Java 11. However, because the product releases Docker images, you do not need to install Java (8 or 11) in advance. Java 11 is bundled in the Docker image. -=== What are the install options for DataStax Luna Streaming? +== What are the install options for {company} Luna Streaming? 
-* Use the Helm chart provided at https://github.com/apache/pulsar-helm-chart[https://github.com/datastax/pulsar-helm-chart] to install DataStax Luna Streaming in an existing Kubernetes cluster on your laptop or hosted by a cloud provider.
-* Use the tarball provided at https://github.com/datastax/pulsar/releases[https://github.com/datastax/pulsar/releases] to install DataStax Luna Streaming on a server or VM.
-* Use the DataStax Ansible scripts provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible] to install DataStax Luna Streaming on a server or VM with our provided playbooks.
+* Use the Helm chart provided at https://github.com/datastax/pulsar-helm-chart[https://github.com/datastax/pulsar-helm-chart] to install {company} Luna Streaming in an existing Kubernetes cluster on your laptop or hosted by a cloud provider.
+* Use the tarball provided at https://github.com/datastax/pulsar/releases[https://github.com/datastax/pulsar/releases] to install {company} Luna Streaming on a server or VM.
+* Use the {company} Ansible scripts provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible] to install {company} Luna Streaming on a server or VM with our provided playbooks.
-=== How do I install DataStax Luna Streaming in my Kubernetes cluster?
+== How do I install {company} Luna Streaming in my Kubernetes cluster?
Follow the full instructions in xref:install-upgrade:quickstart-helm-installs.adoc[Quick Start for Helm Chart installs].
-=== How do I install DataStax Luna Streaming on my server or VM?
+== How do I install {company} Luna Streaming on my server or VM?
Follow the full instructions in xref:install-upgrade:quickstart-server-installs.adoc[Quick Start for Server/VM installs].
-== What task can I perform in the DataStax Luna Streaming Admin Console?
+== What tasks can I perform in the {company} Luna Streaming Admin Console?
From the Admin Console, you can: -* Add and run Pulsar clients +* Add and run {pulsar-short} clients * Establish credentials for secure connections * Define topics that can be published for streaming apps -* Set up Pulsar sinks that publish topics and make them available to subscribers, such as for a Cassandra database table -* Control namespaces used by Pulsar +* Set up {pulsar-short} sinks that publish topics and make them available to subscribers, such as for a Cassandra database table +* Control namespaces used by {pulsar-short} * Use the Admin API -== What is Pulsar Heartbeat? - -https://github.com/datastax/pulsar-heartbeat[Pulsar Heartbeat] monitors the availability, tracks the performance, and reports failures of the Pulsar cluster. It produces synthetic workloads to measure end-to-end message pubsub latency. Pulsar Heartbeat is a cloud-native application that can be installed by Helm within the Pulsar Kubernetes cluster. - -== What is Prometheus? +== What is {pulsar-short} Heartbeat? -https://prometheus.io/docs/introduction/overview/[Prometheus] is an open-source tool to collect metrics on a running app, providing real-time monitoring and alerts. +https://github.com/datastax/pulsar-heartbeat[{pulsar-short} Heartbeat] monitors the availability, tracks the performance, and reports failures of the {pulsar-short} cluster. It produces synthetic workloads to measure end-to-end message pubsub latency. {pulsar-short} Heartbeat is a cloud-native application that can be installed by Helm within the {pulsar-short} Kubernetes cluster. -== What is Grafana? +== What are the features provided by {company} {pulsar} Connector (`pulsar-sink`) that are not supported in `kafka-sink`? -https://grafana.com/[Grafana] is a visualization tool that helps you make sense of metrics and related data coming from your apps via Prometheus, for example. 
+The https://pulsar.apache.org/docs/en/io-overview/[{pulsar-short} IO framework] provides many features that are not possible in Kafka, and has different compression formats and auth/security features. The features are handled by {pulsar-short}. For more, see xref:operations:io-connectors.adoc[Luna Streaming IO Connectors]. -== Pulsar Connector +The {company} {pulsar} Connector allows single-record acknowledgement and negative acknowledgements. -=== What are the features provided by DataStax Apache Pulsar Connector (`pulsar-sink`) that are not supported in `kafka-sink`? - -The https://pulsar.apache.org/docs/en/io-overview/[Pulsar IO framework] provides many features that are not possible in Kafka, and has different compression formats and auth/security features. The features are handled by Pulsar. For more, see xref:operations:io-connectors.adoc[Luna Streaming IO Connectors]. - -The DataStax Apache Pulsar Connector allows single-record acknowledgement and negative acknowledgements. - -=== What features are missing in DataStax Apache Pulsar Connector (`pulsar-sink`) compared with `kafka-sink`? +== What features are missing in {company} {pulsar} Connector (`pulsar-sink`) compared with `kafka-sink`? * No support for `tinyint` (`int8bit`) and `smallint` (`int16bit`). -* The key is always a String, but you can write JSON inside it; the support is implemented in pulsar-sink, but not in Pulsar IO. +* The key is always a String, but you can write JSON inside it; the support is implemented in pulsar-sink, but not in {pulsar-short} IO. * The “value” of a “message property” is always a String; for example, you cannot map the message property to `__ttl` or to `__timestamp`. * Field names inside structures must be valid for Avro, even in case of JSON structures. For example, field names like `Int.field` (with dot) or `int field` (with space) are not valid. -=== How is DataStax Apache Pulsar Connector distributed? +== How is {company} {pulsar} Connector distributed? 
There are two packages:
-* The `pulsar-sink` functionality of DataStax Apache Pulsar Connector is included with DataStax Luna Streaming. It's built in!
-* You can optionally download the DataStax Apache Pulsar Connector tarball from the https://downloads.datastax.com/#pulsar-sink[DataStax Downloads] site, and then use it as its own product with your open-source Apache Pulsar install.
-
-If you're using open-source software (OSS) Apache Pulsar, you can use DataStax Apache Pulsar Connector with the OSS to take advantage of this `pulsar-sink` for Cassandra.
-See the xref:pulsar-connector:ROOT:index.adoc[DataStax Apache Pulsar Connector documentation].
+* The `pulsar-sink` functionality of {company} {pulsar} Connector is included with {company} Luna Streaming. It's built in!
+* You can optionally download the {company} {pulsar} Connector tarball from the https://downloads.datastax.com/#pulsar-sink[{company} Downloads] site, and then use it as its own product with your open-source {pulsar} install.
-== APIs
+If you're using open-source software (OSS) {pulsar}, you can use {company} {pulsar} Connector with the OSS to take advantage of this `pulsar-sink` for Cassandra.
+See the xref:pulsar-connector:ROOT:index.adoc[{company} {pulsar} Connector documentation].
-=== What client APIs does DataStax Luna Streaming provide?
+== What is the {company} Change Data Capture (CDC) for Cassandra connector?
-The same as for Apache Pulsar. See https://pulsar.apache.org/docs/en/client-libraries/.
+This source connector streams data changes from Cassandra tables to {pulsar-short} topics.
+For more information, see the xref:cdc-for-cassandra:ROOT:index.adoc[{company} CDC for Cassandra connector documentation].
== What client APIs does {company} Luna Streaming provide?
The same as for {pulsar}. See https://pulsar.apache.org/docs/en/client-libraries/.
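As a quick illustration of those client libraries, here is a minimal sketch using the Python client (installed with `pip install pulsar-client`). The broker hostname, token path, and helper names (`service_url`, `read_token`, `make_client`) are placeholders of ours, not part of the {pulsar-short} API; only `pulsar.Client` and `pulsar.AuthenticationToken` come from the library.

```python
def service_url(host: str, tls: bool = True) -> str:
    # Luna Streaming exposes the Pulsar binary protocol on 6651 (TLS)
    # or 6650 (plaintext).
    scheme, port = ("pulsar+ssl", 6651) if tls else ("pulsar", 6650)
    return f"{scheme}://{host}:{port}"

def read_token(path: str) -> str:
    # Reading the JWT from a file keeps it out of source control,
    # as the Admin Console docs recommend.
    with open(path) as f:
        return f.read().strip()

def make_client(host: str, token_path: str):
    # Imported lazily so the URL/token helpers above can be used
    # without the pulsar-client package installed.
    import pulsar
    return pulsar.Client(
        service_url(host),
        authentication=pulsar.AuthenticationToken(read_token(token_path)),
    )
```

With a running cluster, `make_client("broker.example.com", "/path/to/token.jwt").create_producer("persistent://public/default/my-topic")` returns a producer whose `send(b"hello world")` publishes a message, mirroring the Test Clients flow in the Admin Console.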
\ No newline at end of file diff --git a/modules/ROOT/pages/index.adoc b/modules/ROOT/pages/index.adoc index 70590f35..adbf15de 100644 --- a/modules/ROOT/pages/index.adoc +++ b/modules/ROOT/pages/index.adoc @@ -1,27 +1,27 @@ -= Welcome to DataStax Luna Streaming += Welcome to {company} Luna Streaming :navtitle: Luna Streaming -DataStax Luna Streaming is a production-ready distribution of Apache Pulsar built to run seamlessly on any CNCF conformant version of Kubernetes. DataStax Luna Streaming provides all of the core capabilities included in the Apache Community version of Apache Pulsar, plus a number of additional tools and features to facilitate administration and operational tasks associated with running Apache Pulsar in production. +{company} Luna Streaming is a production-ready distribution of {pulsar} built to run seamlessly on any CNCF conformant version of Kubernetes. {company} Luna Streaming provides all of the core capabilities included in the Apache Community version of {pulsar}, plus a number of additional tools and features to facilitate administration and operational tasks associated with running {pulsar} in production. == Release notes -The latest release of DataStax Luna Streaming is {luna-version}, which matches the supported, distributed Pulsar version numbers. +The latest release of {company} Luna Streaming is {luna-version}, which matches the supported, distributed {pulsar-short} version numbers. -Refer to the DataStax Luna Streaming https://github.com/datastax/release-notes/blob/master/Luna_Streaming_2.10_Release_Notes.md[release notes], which are hosted in our public GitHub repo, for information & linked commit IDs that were implemented in the latest Luna Streaming {luna-version} release. 
+Refer to the {company} Luna Streaming https://github.com/datastax/release-notes/blob/master/Luna_Streaming_2.10_Release_Notes.md[release notes], which are hosted in our public GitHub repo, for information and linked commit IDs for the changes implemented in the latest Luna Streaming {luna-version} release.
== Components
-In addition to the distribution of https://pulsar.apache.org/en/versions/[Apache Pulsar {pulsar-version}], DataStax Luna Streaming provides:
+In addition to the distribution of https://pulsar.apache.org/en/versions/[{pulsar} {pulsar-version}], {company} Luna Streaming provides:
-* A xref:install-upgrade:quickstart-helm-installs.adoc[Helm chart] that deploys and manages Pulsar on your current CNCF-conformant Kubernetes infrastructure
+* A xref:install-upgrade:quickstart-helm-installs.adoc[Helm chart] that deploys and manages {pulsar-short} on your current CNCF-conformant Kubernetes infrastructure
* Cassandra, Elastic, Kinesis, Kafka, and JDBC xref:operations:io-connectors.adoc[connectors]
-* The xref:streaming-learning:use-cases-architectures:starlight/index.adoc[Starlight] suite of Pulsar protocol handlers for Kafka, RabbitMQ, and JMS
+* The xref:components:starlight.adoc[Starlight suite of {pulsar-short} protocol handlers for Kafka, RabbitMQ, and JMS]
-* xref:components:admin-console-vm.adoc[Pulsar Admin Console] for simplified administration of your Pulsar environment
+* xref:components:admin-console-vm.adoc[{pulsar-short} Admin Console] for simplified administration of your {pulsar-short} environment
-* xref:components:heartbeat-vm.adoc[Pulsar Heartbeat] to observe and monitor your Pulsar instances
+* xref:components:heartbeat-vm.adoc[{pulsar-short} Heartbeat] to observe and monitor your {pulsar-short} instances
== Features
@@ -57,5 +57,5 @@ In addition to the distribution of https://pulsar.apache.org/en/versions/[Apache
* If you have an existing Kubernetes environment, deploy Luna Streaming with a
xref:install-upgrade:quickstart-helm-installs.adoc[Helm Installation]. * If you have a bare metal or a cloud environment, see xref:install-upgrade:quickstart-server-installs.adoc[Server/VM Installation]. -* If you want to learn about monitoring with Pulsar Heartbeat, see xref:components:pulsar-monitor.adoc[Pulsar Heartbeat]. +* If you want to learn about monitoring with {pulsar-short} Heartbeat, see xref:components:pulsar-monitor.adoc[{pulsar-short} Heartbeat]. * If you have questions about Luna Streaming, see xref::faqs.adoc[Luna Streaming FAQs]. \ No newline at end of file diff --git a/modules/components/pages/admin-console-tutorial.adoc b/modules/components/pages/admin-console-tutorial.adoc index b2de6cac..ace206e4 100644 --- a/modules/components/pages/admin-console-tutorial.adoc +++ b/modules/components/pages/admin-console-tutorial.adoc @@ -1,6 +1,6 @@ -= Pulsar Admin Console += {pulsar-short} Admin Console -The *DataStax Admin Console for Apache Pulsar(R)* is a web-based UI from DataStax that administers topics, namespaces, sources, sinks, and various aspects of Apache Pulsar features. +The *{company} Admin Console for {pulsar-reg}* is a web-based UI from {company} that administers topics, namespaces, sources, sinks, and various aspects of {pulsar} features. * xref:components:admin-console-tutorial.adoc#getting-started[] * xref:components:admin-console-tutorial.adoc#features[] @@ -11,11 +11,11 @@ The *DataStax Admin Console for Apache Pulsar(R)* is a web-based UI from DataSta * xref:components:admin-console-tutorial.adoc#video[] [#getting-started] -== Getting Started in Pulsar Admin Console +== Getting Started in {pulsar-short} Admin Console -In the *Luna Streaming Pulsar Admin Console*, you can use Pulsar clients to send and receive pub/sub messages. +In the *Luna Streaming {pulsar-short} Admin Console*, you can use {pulsar-short} clients to send and receive pub/sub messages. 
-If you installed the Admin console with the xref:install-upgrade:quickstart-helm-installs.adoc[DataStax Helm chart], access the Admin console via the `pulsar-adminconsole` external load balancer endpoint in your cloud provider: +If you installed the Admin console with the xref:install-upgrade:quickstart-helm-installs.adoc[{company} Helm chart], access the Admin console via the `pulsar-adminconsole` external load balancer endpoint in your cloud provider: image::GCP-all-pods.png[GCP Pods] @@ -24,9 +24,9 @@ Log in with username `admin`. If you're running a xref:install-upgrade:quickstart-server-installs.adoc[server or VM] deployment, see xref:admin-console-vm.adoc[Admin Console on Server/VM] for instructions on deploying and accessing the Admin console. [#features] -== Pulsar Admin Console features +== {pulsar-short} Admin Console features -To try out your service, use the built-in WebSocket test clients on the Pulsar Admin Console's *Test Clients* page. +To try out your service, use the built-in WebSocket test clients on the {pulsar-short} Admin Console's *Test Clients* page. To see currently available namespaces, go to *Namespaces*, or select the button in the upper right corner. @@ -39,13 +39,13 @@ image::luna-streaming-admin-console.png[Luna Streaming Admin Console] For interactive code samples, go to *Code Samples*. [#send-receive] -== Sending and receiving Pulsar messages +== Sending and receiving {pulsar-short} messages -Go to the Pulsar Admin Console's **Test Clients** page. +Go to the {pulsar-short} Admin Console's **Test Clients** page. The quickest way to try your service is to use the test clients and send messages from one client to the other. In the WebSocket Test Client 1 section, click **Connect**. -This action creates a connection from the Pulsar Admin Console that's running in your browser to the Pulsar instance on your server. 
+This action creates a connection from the {pulsar-short} Admin Console that's running in your browser to the {pulsar-short} instance on your server. Scroll down to the Consume tab. In this simple example, which verifies that the service is running properly, add a `hello world` message and click Send. For example: @@ -54,7 +54,7 @@ image::test-message.png[Send a message using a test client] In doing so, you published a message to your server, and in the Test Client you're listening to your own topic. -Your client is working with the Pulsar server. +Your client is working with the {pulsar-short} server. [#create-topics] == Create new topics and tenants @@ -73,20 +73,20 @@ To see detailed information about your topics, go to *Topics*. [#code-samples] == Code samples -On the Pulsar Admin Console's *Code Samples* page, there are examples for Java, Python, Golang, Node.js, WebSocket, and HTTP clients. +On the {pulsar-short} Admin Console's *Code Samples* page, there are examples for Java, Python, Golang, Node.js, WebSocket, and HTTP clients. Each example shows Producer, Consumer, and Reader code, plus language-specific examples of setting project properties and dependencies. -For example, selecting Java will show you how to connect your Java project to Pulsar by modifying your Maven's `pom.xml` file. +For example, selecting Java will show you how to connect your Java project to {pulsar-short} by modifying your Maven's `pom.xml` file. [#connect-to-pulsar] -== Connecting to Pulsar +== Connecting to {pulsar-short} -This section describes how to connect Pulsar components to the Admin console. +This section describes how to connect {pulsar-short} components to the Admin console. === Creating and showing credentials -When connecting clients, you'll need to provide your connect token to identify your account. In the Pulsar APIs, you specify the token when creating the client object. The token is your password to your account, so keep it safe. 
+When connecting clients, you'll need to provide your connect token to identify your account. In the {pulsar-short} APIs, you specify the token when creating the client object. The token is your password to your account, so keep it safe.
The code samples automatically add your client token as part of the source code for convenience. However, a more secure practice would be to read the token from an environment variable or a file.
@@ -96,7 +96,7 @@ If you previously created a token, use the Credentials page to get its value.
=== Connecting Clients
-To connect using the Pulsar binary protocol, use the following URL format with port 6651:
+To connect using the {pulsar-short} binary protocol, use the following URL format with port 6651:
`pulsar+ssl://<hostname>:6651`
@@ -124,7 +124,7 @@ For example:
`https://ip-10-101-32-250.srv101.dsinternal.org:8085`
-=== Connect to Pulsar admin API
+=== Connect to {pulsar-short} admin API
To connect to the admin API, use the following URL format with port 8443:
@@ -147,17 +147,17 @@ pulsar-admin --admin-url https://ip-10-101-32-250.srv101.dsinternal.org:8443 \
--auth-params file:///token.jwt
----
-You can get the token from the Pulsar Admin Console's *Credentials* page.
+You can get the token from the {pulsar-short} Admin Console's *Credentials* page.
Alternatively, you can save the URL authentication parameters in your `client.conf` file.
[#video]
== Admin console video
-You can also follow along with this video from our *Five Minutes About Pulsar* series to get started with the admin console.
+You can also follow along with this video from our *Five Minutes About {pulsar-short}* series to get started with the admin console.
video::1IwblLfPiPQ[youtube, list=PL2g2h-wyI4SqeKH16czlcQ5x4Q_z-X7_m]
== Next steps
-For more on building and running a standalone Pulsar Admin console, see the xref:admin-console-vm.adoc[Admin Console on Server/VM] or the https://github.com/datastax/pulsar-admin-console#dev[Pulsar Admin console README].
\ No newline at end of file
+For more on building and running a standalone {pulsar-short} Admin console, see the xref:admin-console-vm.adoc[Admin Console on Server/VM] or the https://github.com/datastax/pulsar-admin-console#dev[{pulsar-short} Admin console README].
\ No newline at end of file
diff --git a/modules/components/pages/admin-console-vm.adoc b/modules/components/pages/admin-console-vm.adoc
index f4b622fc..799034a7 100644
--- a/modules/components/pages/admin-console-vm.adoc
+++ b/modules/components/pages/admin-console-vm.adoc
@@ -1,6 +1,6 @@
-= Install Pulsar Admin Console on Server/VM
+= Install {pulsar-short} Admin Console on Server/VM
-*Pulsar Admin Console* is a web-based UI that administrates topics, namespaces, sources, sinks and various aspects of Apache Pulsar(TM) features.
+*{pulsar-short} Admin Console* is a web-based UI that administers topics, namespaces, sources, sinks, and various aspects of {pulsar-reg} features.
The Admin Console is a VueJS application that runs in a browser. It also includes a web server that serves up the files for the Admin Console as well as providing configuration and authentication services.
@@ -15,7 +15,7 @@ This document covers:
* <>
[#install]
-== Install Pulsar Admin Console
+== Install {pulsar-short} Admin Console
. Ensure Node version 14.18 or higher is installed. You can find the most recent Node release https://nodejs.org/en/download/[here], or use wget:
+
----
wget https://nodejs.org/dist/v14.18.3/node-v14.18.3-linux-x64.tar.xz && tar -xf node-v14.18.3-linux-x64.tar.xz
----
-. Download and install the Pulsar Admin console tarball to the VM. You can find the most recent Pulsar Admin Console release https://github.com/datastax/pulsar-admin-console/releases[here].
+. Download and install the {pulsar-short} Admin console tarball to the VM. You can find the most recent {pulsar-short} Admin Console release https://github.com/datastax/pulsar-admin-console/releases[here].
+ The tarball is also available with `wget`:
+
@@ -59,11 +59,11 @@ To change the default Admin Console configuration, see <<configuration>>.
-You need to configure `pulsar_url` to point to one of your brokers or a proxy/loadbalancer in front of the brokers (can be Pulsar proxy). The Admin Console server must be able to directly reach each broker by the IP/hostname that is returned by the Pulsar CLI command `pulsar-admin brokers list <cluster-name>`.
+You need to configure `pulsar_url` to point to one of your brokers or a proxy/loadbalancer in front of the brokers (this can be the {pulsar-short} proxy). The Admin Console server must be able to directly reach each broker by the IP/hostname that is returned by the {pulsar-short} CLI command `pulsar-admin brokers list <cluster-name>`.
[NOTE]
====
@@ -81,24 +81,24 @@ These values can be modified in the JSON configuration file.
|===
|Setting | Default | Description
-| api_version | 2.8.3 | Version of the Pulsar client API to recommend under Samples.
+| api_version | 2.8.3 | Version of the {pulsar-short} client API to recommend under Samples.
| auth_mode | none | Authentication mode. One of `none`, `user`, `k8s`, or `openidconnect`. See <<auth-modes>> for details.
| ca_certificate | | String of CA certificate to display in the console under Credentials.
-| clients_disabled | false | Disable test clients. Test clients depend on WebSocket proxy, so if this is not running in Pulsar cluster you may want to disable them.
-| cluster_name | standalone | Name of Pulsar cluster connecting to. The cluster name can be retrieved with the CLI command `pulsar-admin clusters list`.
+| clients_disabled | false | Disable test clients. Test clients depend on the WebSocket proxy, so if it is not running in the {pulsar-short} cluster, you may want to disable them.
+| cluster_name | standalone | Name of the {pulsar-short} cluster to connect to. The cluster name can be retrieved with the CLI command `pulsar-admin clusters list`.
| functions_disabled | false | If functions are not enabled in the cluster, disable the function sections (Functions, Sinks, Sources).
| grafana_url | | If `render_monitoring_tab` is enabled, URL for Grafana.
-| host_overrides.http | \http://localhost:8964 | URL to display in console to connect to Pulsar Beam HTTP proxy.
-| host_overrides.pulsar | \http://localhost:6650 | URL to display in console to connect to Pulsar. +| host_overrides.http | \http://localhost:8964 | URL to display in console to connect to {pulsar-short} Beam HTTP proxy. +| host_overrides.pulsar | \http://localhost:6650 | URL to display in console to connect to {pulsar-short}. | host_overrides.ws | //localhost:8080 | URL to display in console to connect to WebSocket proxy. | notice_text | | Custom notice to appear at top of console. | oauth_client_id || This is the client ID that the console will use when authenticating with authentication provider. -| polling_interval | 10000 | How often the console polls Pulsar for updated values. In milliseconds. +| polling_interval | 10000 | How often the console polls {pulsar-short} for updated values. In milliseconds. | render_monitoring_tab | false | Enable tab that includes links to Grafana dashboards. -| server_config.admin_token | | When using `user` or `k8s` auth mode, a Pulsar token is used to connect to the Pulsar cluster. This specifies the token as a string. For full access, a superuser token is recommended. The `token_path` setting will override this value if present. +| server_config.admin_token | | When using `user` or `k8s` auth mode, a {pulsar-short} token is used to connect to the {pulsar-short} cluster. This specifies the token as a string. For full access, a superuser token is recommended. The `token_path` setting will override this value if present. | server_config.log_level | info | Log level for the console server. | server_config.port | 6454 | The listen port for the console server. -| server_config.pulsar_url | \http://localhost:8080 | URL for connecting to the Pulsar cluster. Should point to either a broker or Pulsar proxy. The console server must be able to reach this URL. +| server_config.pulsar_url | \http://localhost:8080 | URL for connecting to the {pulsar-short} cluster. Should point to either a broker or {pulsar-short} proxy. 
The console server must be able to reach this URL. | server_config.ssl.ca_path | | Path to the CA certificate. To enable HTTPS, `ca_path`, `cert_path`, and `key_path` must all be set. | server_config.ssl.cert_path | | Path to the server certificate. To enable HTTPS, `ca_path`, `cert_path`, and `key_path` must all be set. | server_config.ssl.hostname_validation | | Verify hostname matches the TLS certificate. @@ -107,12 +107,12 @@ These values can be modified in the JSON configuration file. | server_config.kubernetes.k8s_namespace | pulsar | When using `k8s` auth_mode, Kubernetes namespace that contains the username/password secrets. | server_config.kubernetes.service_host| | When using `k8s` auth_mode, specify a custom Kubernetes host name. | server_config.kubernetes.service_port | | When using `k8s` auth_mode, specify a custom Kubernetes port. -| server_config.token_path | | When using `user` or `k8s` auth mode, a Pulsar token is used to connect to the Pulsar cluster. This specifies the path to a file that contains the token to use. For full access, a superuser token is recommended. Alternatively, use `admin_token`. +| server_config.token_path | | When using `user` or `k8s` auth mode, a {pulsar-short} token is used to connect to the {pulsar-short} cluster. This specifies the path to a file that contains the token to use. For full access, a superuser token is recommended. Alternatively, use `admin_token`. | server_config.token_secret| | Secret used when signing access token for logging into the console. If not specified, a default secret is used. | server_config.user_auth.username | | When using `user` auth_mode, the login user name. | server_config.user_auth.password | | When using `user` auth_mode, the login password. -| server_config.websocket_url | https://websocket.example.com:8500 | URL for WebSocket proxy. Used by Test Clients to connect to Pulsar. The console server must be able to reach this URL. 
-| tenant | public | The default Pulsar tenant to view when starting the console. +| server_config.websocket_url | https://websocket.example.com:8500 | URL for WebSocket proxy. Used by Test Clients to connect to {pulsar-short}. The console server must be able to reach this URL. +| tenant | public | The default {pulsar-short} tenant to view when starting the console. |=== [#auth-modes] @@ -122,12 +122,12 @@ The `auth_mode` setting has four available configurations. === "auth_mode": "none" -No login screen is presented. Authentication must be disabled in Pulsar because the Admin Console will not attempt to authenticate. +No login screen is presented. Authentication must be disabled in {pulsar-short} because the Admin Console will not attempt to authenticate. === "auth_mode": "user" The Admin Console is protected by a login screen. Credentials are configured using the `username` and `password` settings in the `/config/default.json` file. -Once authenticated with these credentials, the token for connecting to Pulsar is retrieved from the server (configured using `token_path` or `admin_token`) and used to authenticate with the Pulsar cluster. +Once authenticated with these credentials, the token for connecting to {pulsar-short} is retrieved from the server (configured using `token_path` or `admin_token`) and used to authenticate with the {pulsar-short} cluster. === "auth_mode": "k8s" @@ -142,20 +142,20 @@ The password must be stored in the secret with a key of `password` and a value o Multiple secrets with the prefix can be configured to set up multiple users for the Admin Console. A password can be reset by patching the corresponding Kubernetes secret. -Once the user is authenticated using one of the Kubernetes secrets, the token for connecting to Pulsar is retrieved from the server (configured using `token_path` or `admin_token`) and used to authenticate with the Pulsar cluster.
+Once the user is authenticated using one of the Kubernetes secrets, the token for connecting to {pulsar-short} is retrieved from the server (configured using `token_path` or `admin_token`) and used to authenticate with the {pulsar-short} cluster. === "auth_mode": "openidconnect" In this auth mode, the dashboard will use your login credentials to retrieve a JWT from an authentication provider. -In the *DataStax Pulsar Helm Chart*, this is implemented by integrating the Pulsar Admin Console with Keycloak. Upon successful retrieval of the JWT, the Admin Console will use the retrieved JWT as the bearer token when making calls to Pulsar. +In the *{company} {pulsar-short} Helm Chart*, this is implemented by integrating the {pulsar-short} Admin Console with Keycloak. Upon successful retrieval of the JWT, the Admin Console will use the retrieved JWT as the bearer token when making calls to {pulsar-short}. -In addition to configuring the `auth_mode`, you must also configure the `oauth_client_id` (see <>). This is the client id that the Console will use when authenticating with Keycloak. Note that in Keycloak, it is important that this client exists and that it has the sub claim properly mapped to your desired Pulsar subject. Otherwise, the JWT won't work as desired. +In addition to configuring the `auth_mode`, you must also configure the `oauth_client_id` (see <>). This is the client id that the Console will use when authenticating with Keycloak. Note that in Keycloak, it is important that this client exists and that it has the sub claim properly mapped to your desired {pulsar-short} subject. Otherwise, the JWT won't work as desired. ==== Connecting to an OpenID Connect Auth/Identity Provider When opening the Admin Console, the first page is the login page. When using the `openidconnect` auth mode, the auth call needs to go to the Provider's server. -In the current design, nginx must be configured to route the call to the provider. 
The *DataStax Pulsar Helm Chart* does this automatically. +In the current design, nginx must be configured to route the call to the provider. The *{company} {pulsar-short} Helm Chart* does this automatically. == Next steps diff --git a/modules/components/pages/heartbeat-vm.adoc b/modules/components/pages/heartbeat-vm.adoc index 306b1815..039f878a 100644 --- a/modules/components/pages/heartbeat-vm.adoc +++ b/modules/components/pages/heartbeat-vm.adoc @@ -1,6 +1,6 @@ = Heartbeat on VM/Server -This document describes how to install Pulsar Heartbeat on a virtual machine (VM) or server. For installation with the Docker image, see xref:install-upgrade:quickstart-helm-installs.adoc[Helm Chart Installation]. +This document describes how to install {pulsar-short} Heartbeat on a virtual machine (VM) or server. For installation with the Docker image, see xref:install-upgrade:quickstart-helm-installs.adoc[Helm Chart Installation]. == Install Heartbeat Binary @@ -23,7 +23,7 @@ ls ~/Downloads/pulsar-heartbeat-{heartbeat-version}-linux-amd64 == Execute Heartbeat binary -The Pulsar Heartbeat configuration is defined by a `.yaml` file. A yaml template for Heartbeat is available at https://github.com/datastax/pulsar-heartbeat/blob/master/config/runtime-template.yml[]. In this file, the environmental variable `PULSAR_OPS_MONITOR_CFG` tells the application where to source the file. +The {pulsar-short} Heartbeat configuration is defined by a `.yaml` file. A YAML template for Heartbeat is available at https://github.com/datastax/pulsar-heartbeat/blob/master/config/runtime-template.yml[]. The environment variable `PULSAR_OPS_MONITOR_CFG` tells the application where to find this file. Run the binary file `pulsar-heartbeat-<version>-<os>-<arch>`.
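The lookup that `PULSAR_OPS_MONITOR_CFG` drives can be sketched as a small resolver. The precedence (environment variable first, then a `-config` command-line argument) follows the Heartbeat configuration docs; the fallback default path is an assumption for illustration only:

```python
import os
from typing import Optional

def resolve_heartbeat_config(cli_arg: Optional[str] = None,
                             default_path: str = "./config/runtime.yml") -> str:
    """Mirror the documented lookup order: the PULSAR_OPS_MONITOR_CFG
    environment variable wins, then a -config command-line argument,
    then a conventional default path (an assumption, not a documented value)."""
    env_path = os.environ.get("PULSAR_OPS_MONITOR_CFG")
    if env_path:
        return env_path
    if cli_arg:
        return cli_arg
    return default_path
```

In other words, exporting `PULSAR_OPS_MONITOR_CFG` before launching the binary overrides any other way of pointing Heartbeat at its runtime YAML.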
diff --git a/modules/components/pages/pulsar-beam.adoc b/modules/components/pages/pulsar-beam.adoc index ae75a3a0..36dd1048 100644 --- a/modules/components/pages/pulsar-beam.adoc +++ b/modules/components/pages/pulsar-beam.adoc @@ -1,13 +1,13 @@ -= Pulsar Beam with Luna Streaming -:navtitle: Pulsar Beam -:description: Install a minimal Luna Streaming helm chart that includes Pulsar Beam += {pulsar-short} Beam with Luna Streaming +:navtitle: {pulsar-short} Beam +:description: Install a minimal Luna Streaming Helm chart that includes {pulsar-short} Beam :helmValuesPath: https://raw.githubusercontent.com/datastaxdevs/luna-streaming-examples/main/beam/values.yaml -The https://github.com/kafkaesque-io/pulsar-beam[Pulsar Beam] project is an HTTP-based streaming and queueing system for use with Apache Pulsar. +The https://github.com/kafkaesque-io/pulsar-beam[{pulsar-short} Beam] project is an HTTP-based streaming and queueing system for use with {pulsar}. -With Pulsar Beam, you can send messages over HTTP, push messages to a webhook or cloud function, chain webhooks and functions together, or stream messages through server-sent events (SSE). +With {pulsar-short} Beam, you can send messages over HTTP, push messages to a webhook or cloud function, chain webhooks and functions together, or stream messages through server-sent events (SSE). -In this guide, you'll install a minimal DataStax Pulsar Helm chart that includes Pulsar Beam. +In this guide, you'll install a minimal {company} {pulsar-short} Helm chart that includes {pulsar-short} Beam. == Prerequisites @@ -28,7 +28,7 @@ In a separate terminal window, port forward the Beam endpoint service: kubectl port-forward -n datastax-pulsar service/pulsar-proxy 8085:8085 ---- -The forwarding service will map the URL:PORT https://127.0.0.1:8085 to Pulsar Proxy running in the new cluster. +The forwarding service will map the URL:PORT https://127.0.0.1:8085 to {pulsar-short} Proxy running in the new cluster. 
Because Beam was enabled, the Proxy knows to forward on to the Beam service. [source,shell] @@ -75,7 +75,7 @@ id: {9 0 0 0 0xc002287ad0} data: Hi there ---- -You have now completed the basics of using Beam in a Pulsar Cluster. Refer to the project's https://github.com/kafkaesque-io/pulsar-beam/blob/master/README.md[readme] to see all the possibilities! +You have now completed the basics of using Beam in a {pulsar-short} Cluster. Refer to the project's https://github.com/kafkaesque-io/pulsar-beam/blob/master/README.md[readme] to see all the possibilities! == A Python producer and consumer @@ -157,6 +157,6 @@ include::partial$cleanup-terminal-and-helm.adoc[] Here are links to resources and guides you might be interested in: -* https://github.com/kafkaesque-io/pulsar-beam[Learn more] about the Pulsar Beam project -* https://kafkaesque-io.github.io/pulsar-beam-swagger[Pulsar Beam API] +* https://github.com/kafkaesque-io/pulsar-beam[Learn more] about the {pulsar-short} Beam project +* https://kafkaesque-io.github.io/pulsar-beam-swagger[{pulsar-short} Beam API] * xref:pulsar-sql.adoc[] \ No newline at end of file diff --git a/modules/components/pages/pulsar-monitor.adoc b/modules/components/pages/pulsar-monitor.adoc index 36a67d13..767d226d 100644 --- a/modules/components/pages/pulsar-monitor.adoc +++ b/modules/components/pages/pulsar-monitor.adoc @@ -1,26 +1,26 @@ -= Pulsar Heartbeat += {pulsar-short} Heartbeat -Pulsar Heartbeat monitors the availability, tracks the performance, and reports failures of the Pulsar cluster. +{pulsar-short} Heartbeat monitors the availability, tracks the performance, and reports failures of the {pulsar-short} cluster. It produces synthetic workloads to measure end-to-end message pubsub latency. -Pulsar Heartbeat is a cloud native application that can be installed by Helm within a Pulsar Kubernetes cluster. It can also monitor multiple Pulsar clusters. 
+{pulsar-short} Heartbeat is a cloud native application that can be installed by Helm within a {pulsar-short} Kubernetes cluster. It can also monitor multiple {pulsar-short} clusters. -TIP: Pulsar Heartbeat is installed automatically for server/VM installations as described in xref:install-upgrade:quickstart-server-installs.adoc[]. +TIP: {pulsar-short} Heartbeat is installed automatically for server/VM installations as described in xref:install-upgrade:quickstart-server-installs.adoc[]. -Pulsar Heartbeat supports the following features: +{pulsar-short} Heartbeat supports the following features: * Monitor message pubsub and admin REST API endpoint * Measure end-to-end message latency from producing to consuming messages -* Measure message latency over the websocket interface, and Pulsar function -* Monitor instance availability of broker, proxy, bookkeeper, and zookeeper in a Pulsar Kubernetes cluster -* Monitor individual Pulsar broker's health +* Measure message latency over the websocket interface, and {pulsar-short} function +* Monitor instance availability of broker, proxy, bookkeeper, and zookeeper in a {pulsar-short} Kubernetes cluster +* Monitor individual {pulsar-short} broker's health * Incident alert integration with OpsGenie * Customer configurable alert thresholds and probe test intervals * Slack alerts == Configuration -Pulsar Heartbeat is a data driven tool that sources configuration from a YAML or JSON file. The configuration JSON file can be specified in the following order of precedence: +{pulsar-short} Heartbeat is a data driven tool that sources configuration from a YAML or JSON file. 
The configuration JSON file can be specified in the following order of precedence: * An environment variable `PULSAR_OPS_MONITOR_CFG` * A command line argument `./pulsar-heartbeat -config /path/to/runtime.yml` @@ -30,7 +30,7 @@ You can download a template https://github.com/datastax/pulsar-heartbeat/blob/ma == Observability -Pulsar Heartbeat exposes Prometheus compliant metrics at the `\metrics` endpoint for scraping. The exported metrics are: +{pulsar-short} Heartbeat exposes Prometheus compliant metrics at the `\metrics` endpoint for scraping. The exported metrics are: [cols=3] |=== @@ -75,13 +75,13 @@ Pulsar Heartbeat exposes Prometheus compliant metrics at the `\metrics` endpoint == In-cluster monitoring -Pulsar Heartbeat can be deployed within the same Pulsar Kubernetes cluster. +{pulsar-short} Heartbeat can be deployed within the same {pulsar-short} Kubernetes cluster. Kubernetes' pod and service, and individual broker monitoring are only supported within the same Kubernetes cluster deployment. == Docker -Pulsar Heartbeat's official docker image can be pulled https://hub.docker.com/r/datastax/pulsar-heartbeat[here]. +{pulsar-short} Heartbeat's official docker image can be pulled https://hub.docker.com/r/datastax/pulsar-heartbeat[here]. === Docker compose @@ -90,7 +90,7 @@ Pulsar Heartbeat's official docker image can be pulled https://hub.docker.com/r/ docker-compose up ---- -`./config/runtime.yml` or `./config/runtime.json` must have a Pulsar jwt and must be configured properly. +`./config/runtime.yml` or `./config/runtime.json` must have a {pulsar-short} jwt and must be configured properly. === Docker example @@ -101,4 +101,4 @@ docker run -d -it -v ./config/runtime.yml:/config/runtime.yml -v /etc/pki/ca-tru The `runtime.yml/yaml` or `runtime.json` file must be mounted to `/config/runtime.yml` as the default configuration path. -Run docker container with Pulsar CA certificate if TLS is enabled and expose Prometheus metrics for collection. 
\ No newline at end of file +Run docker container with {pulsar-short} CA certificate if TLS is enabled and expose Prometheus metrics for collection. \ No newline at end of file diff --git a/modules/components/pages/pulsar-sql.adoc b/modules/components/pages/pulsar-sql.adoc index f500858c..0fe77463 100644 --- a/modules/components/pages/pulsar-sql.adoc +++ b/modules/components/pages/pulsar-sql.adoc @@ -1,20 +1,20 @@ -= Using Pulsar SQL with Luna Streaming -:navtitle: Pulsar SQL -:description: This guide installs the luna streaming helm chart using minimum values for a working Pulsar cluster that includes SQL workers += Using {pulsar-short} SQL with Luna Streaming +:navtitle: {pulsar-short} SQL +:description: This guide installs the Luna Streaming Helm chart using minimum values for a working {pulsar-short} cluster that includes SQL workers :helmValuesPath: https://raw.githubusercontent.com/datastaxdevs/luna-streaming-examples/main/pulsar-sql/values.yaml -Pulsar SQL allows enterprises to query Apache Pulsar topic data with SQL. +{pulsar-short} SQL allows enterprises to query {pulsar} topic data with SQL. This is a powerful feature for an Enterprise, and SQL is a language they're likely familiar with. Stream processing, real-time analytics, and highly customized dashboards are just a few of the possibilities. -Pulsar offers a pre-made plugin for Trino that is included in its distribution. -Additionally, Pulsar has built-in options to create Trino workers and automatically configure the communications between Pulsar's ledger and Trino. +{pulsar-short} offers a pre-made plugin for Trino that is included in its distribution. +Additionally, {pulsar-short} has built-in options to create Trino workers and automatically configure the communications between {pulsar-short}'s ledger and Trino. -In this guide, we will use the DataStax Pulsar Helm Chart to install a Pulsar cluster with Pulsar SQL. 
+In this guide, we will use the {company} {pulsar-short} Helm Chart to install a {pulsar-short} cluster with {pulsar-short} SQL. The Trino coordinator and desired number of workers will be created directly in the cluster. == Prerequisites -* Pulsar CLI +* {pulsar-short} CLI * https://prestodb.io/docs/current/installation/cli.html[Presto CLI] (this example uses version 0.278.1) * https://helm.sh/docs/intro/install/[Helm 3 CLI] (this example uses version 3.8.0) * https://kubernetes.io/docs/tasks/tools/[Kubectl CLI] (this example uses version 1.23.4) @@ -22,7 +22,7 @@ [IMPORTANT] ==== -PrestoDB has been replaced by Trino, but Apache Pulsar is using Presto's version. +PrestoDB has been replaced by Trino, but {pulsar} is using Presto's version. The Trino CLI uses the "X-TRINO-USER" header for authentication, but Presto expects "X-PRESTO-USER", which is why we use the Presto CLI. ==== @@ -34,7 +34,7 @@ include::partial$install-helm.adoc[] You'll need to interact with services in the K8s cluster. Map a few ports to those services. -There's no need to forward Pulsar's messaging service ports. +There's no need to forward {pulsar-short}'s messaging service ports. In a new terminal window, port forward the Presto SQL service: [source,shell] ---- kubectl port-forward -n datastax-pulsar service/pulsar-sql 8090:8090 ---- -include::partial$port-forward-web-service.adoc[] +In a separate terminal, port forward {pulsar-short}'s admin service: + +[source,shell] +---- +kubectl port-forward -n datastax-pulsar service/pulsar-broker 8080:8080 +---- == Confirm Presto is available @@ -62,11 +67,11 @@ image::presto-sql-dashboard.png[Presto SQL dashboard] == Fill a topic with the data-generator source In this example, we will use the "data-generator" source connector to create a topic and add sample data simultaneously.
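The guide later notes that queries can also go through the Presto client REST API, whose responses carry a `nextUri` value to follow for the next page of results. A minimal Python sketch of that loop, assuming the `localhost:8090` port-forward above is active; the user name `admin` is arbitrary, since Presto only reads it from the `X-Presto-User` header:

```python
import json
import urllib.request

PRESTO_URL = "http://localhost:8090"  # the port-forwarded pulsar-sql service

def build_statement_request(sql: str, user: str = "admin") -> urllib.request.Request:
    # Presto identifies the caller via the X-Presto-User header.
    return urllib.request.Request(
        f"{PRESTO_URL}/v1/statement",
        data=sql.encode("utf-8"),
        headers={"X-Presto-User": user},
        method="POST",
    )

def run_query(sql: str) -> list:
    """POST the statement, then follow nextUri until no pages remain."""
    rows = []
    target = build_statement_request(sql)
    while target is not None:
        with urllib.request.urlopen(target) as resp:
            page = json.load(resp)
        rows.extend(page.get("data", []))  # rows arrive only on some pages
        target = page.get("nextUri")       # plain URL for the follow-up GET
    return rows

# Example (requires the running, port-forwarded cluster):
# run_query('select * from pulsar."public/default".mytopic limit 10')
```

The same quoting rule applies here as in the CLI: the tenant/namespace pair must stay a single quoted schema string inside the SQL text.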
-The minimalist Helm chart values use the https://github.com/datastax/release-notes/blob/master/Luna_Streaming_2.10_Release_Notes.md#lunastreaming-all-distribution[datastax/lunastreaming-all] image, which includes all supported Pulsar connectors. +The minimalist Helm chart values use the https://github.com/datastax/release-notes/blob/master/Luna_Streaming_2.10_Release_Notes.md#lunastreaming-all-distribution[datastax/lunastreaming-all] image, which includes all supported {pulsar-short} connectors. This example uses the "public" tenant and "default" namespace. -These are created by default in Pulsar, but you can use whatever tenant and namespace you are comfortable with. +These are created by default in {pulsar-short}, but you can use whatever tenant and namespace you are comfortable with. -. Download the minimalist Pulsar client. +. Download the minimalist {pulsar-short} client. This "client.conf" assumes the port forwarding addresses we will perform in the next step. + [source,shell] @@ -82,8 +87,8 @@ wget https://raw.githubusercontent.com/datastaxdevs/luna-streaming-examples/main export PULSAR_CLIENT_CONF= ---- -. Navigate to the Pulsar home folder and run the following command. -The CLI will use the environment variable's value as configuration for interacting with the Pulsar cluster. +. Navigate to the {pulsar-short} home folder and run the following command. +The CLI will use the environment variable's value as configuration for interacting with the {pulsar-short} cluster. 
+ [source,shell] ---- @@ -129,7 +134,7 @@ The user can match the name you used to login earlier in this guide, but doesn't presto> show catalogs; ---- + -Notice the similarities between your Pulsar tenant/namespaces and Presto's output: +Notice the similarities between your {pulsar-short} tenant/namespaces and Presto's output: + .Result [source,console] @@ -170,7 +175,7 @@ Query 20230103_163355_00001_zvk84, FINISHED, 2 nodes presto> select * from pulsar."public/default".mytopic limit 10; ---- + -The output should be the 10 messages that were added to the Pulsar topic previously. +The output should be the 10 messages that were added to the {pulsar-short} topic previously. + If you prefer, you can query your table with the Presto client REST API. The response will include a `nextUri` value. @@ -192,7 +197,7 @@ select * from pulsar."public/default".mytopic limit 10 presto> exit ---- -You have successfully interacted with a Pulsar Cluster via SQL. +You have successfully interacted with a {pulsar-short} Cluster via SQL. Want to put your new learnings to the test? Try using the Presto plugin in https://redash.io/data-sources/presto[Redash] or https://superset.apache.org/docs/databases/presto/[Superset] to create useful dashboards. @@ -200,11 +205,11 @@ Want to put your new learnings to the test? Try using the Presto plugin in https === Why are there quotes around the schema name? You might wonder why there are quotes ("") around the schema name. -This is a result of mapping Presto primitives to Pulsar's primitives. +This is a result of mapping Presto primitives to {pulsar-short}'s primitives. Presto has catalogs, schemas, and tables. -Pulsar has tenants, namespaces, and topics. -The Pulsar Presto plugin assumes the catalog name which leaves schema and table, so the tenant and namespace are combined with a forward slash delimited string. Presto has to see that combination as a single string, which means it needs to be wrapped in quotes. 
+{pulsar-short} has tenants, namespaces, and topics. +The {pulsar-short} Presto plugin assumes the catalog name which leaves schema and table, so the tenant and namespace are combined with a forward slash delimited string. Presto has to see that combination as a single string, which means it needs to be wrapped in quotes. == Connect with JDBC diff --git a/modules/components/pages/starlight-for-kafka.adoc b/modules/components/pages/starlight-for-kafka.adoc deleted file mode 100644 index 330ba842..00000000 --- a/modules/components/pages/starlight-for-kafka.adoc +++ /dev/null @@ -1,116 +0,0 @@ -= Using Starlight for Kafka with Luna Streaming -:navtitle: Starlight for Kafka -:description: This guide will take you step-by-step through deploying DataStax Luna Streaming helm chart with the Starlight for Kafka protocol handler extension -:helmValuesPath: https://raw.githubusercontent.com/datastaxdevs/luna-streaming-examples/main/starlight-for-kafka/values.yaml - -Starlight for Kafka brings the native Apache Kafka protocol support to Apache Pulsar by introducing a Kafka protocol handler on Pulsar brokers. -By adding the Starlight for Kafka protocol handler to your Pulsar cluster, you can migrate your existing Kafka applications and services to Pulsar without modifying the code. - -== Prerequisites - -* https://helm.sh/docs/intro/install/[Helm 3 CLI] (we used version 3.8.0) -* https://www.apache.org/dyn/closer.cgi?path=/kafka/3.3.1/kafka_2.13-3.3.1.tgz[Kafka CLI] (we used version 3.3.1) -* https://kubernetes.io/docs/tasks/tools/[Kubectl CLI] (we used version 1.23.4) -* Enough access to a K8s cluster to create a namespace, deployments, and pods - -== Install Luna Streaming helm chart - -include::partial$install-helm.adoc[] - -== Forward service port - -You'll need to interact with a few of the services in the K8s cluster. -Map a few ports to those services. 
- -include::partial$port-forward-web-service.adoc[] - -In a separate terminal window, port forward the Starlight for Kafka service: - -[source,shell] ----- -kubectl port-forward -n datastax-pulsar service/pulsar-proxy 9092:9092 ----- - -== Have a look around - -The Luna Streaming Helm Chart automatically creates a tenant named "public" and a namespace within that tenant named "default". - -The Starlight for Kafka extension creates a few namespaces and topics to function correctly. - -List the namespaces in the "public" tenant to see what was created. - -[source,shell] ----- -~/apache-pulsar-2.10.1$ ./bin/pulsar-admin namespaces list public ----- - -The output should be similar to the following. - -[source,console] ----- -public/__kafka -public/__kafka_producerid -public/default ----- - -Notice the namespaces prefixed with "__kafka". -These are used by the service for different functions. -To learn more about Starlight for Kafka operations, see the S4K xref:starlight-for-kafka:ROOT:index.adoc[documentation]. - -== Produce a message with the Kafka CLI - -If you hadn't noticed, we never opened the Pulsar binary port to accept new messages. -Only the admin port and the Kafka port are open. -To further show how native Starlight for Kafka is to Pulsar, we will use the Kafka CLI to produce and consume messages from Pulsar. - -From within the Kafka directory, run the following command to start the shell. - -[source,shell] ----- -~/kafka_2.13-3.3.1$ ./bin/kafka-console-producer.sh --topic quickstart --bootstrap-server localhost:9092 ----- - -Type a message, press Enter to send it, then Ctrl-C to exit the producer shell. - -A `quickstart` topic is created automatically because the default behavior of Starlight for Kafka is to create a new single partition, persistent topic when one is not present. -You can configure this behavior and many other S4K parameters in the https://github.com/datastaxdevs/luna-streaming-examples/blob/main/starlight-for-kafka/values.yaml[Helm chart]. 
-Learn more about the configuration values xref:starlight-for-kafka:configuration:starlight-kafka-configuration.adoc[here]. - -Let's have a look at the topic that was created. From your Pulsar home folder, run the following command. - -[source,shell] ----- -~/apache-pulsar-2.10.1$ ./bin/pulsar-admin topics list public/default ----- - -The output will include the newly created topic. - -[source,console] ----- -persistent://public/default/quickstart-partition-0 ----- - -== Consume the new message with the Kafka CLI - -Let's use the Kafka CLI to consume the message we just produced. Start the consumer shell from the Kafka home folder with the following command. - -[source,shell] ----- -~/kafka_2.13-3.3.1$ ./bin/kafka-console-consumer.sh --topic quickstart --from-beginning --bootstrap-server localhost:9092 ----- - -The data of our new message will be output. - -Enter Ctrl-C to exit the shell. - -== Next steps - -Kafka users and existing applications using Kafka can enjoy the many benefits of a Pulsar cluster, while never having to change tooling or libraries. -Other folks that are more comfortable with Pulsar tooling and clients can also interact with the same topics. Together, new and legacy applications work together to create modern solutions. - -Here are links to other guides and resource you might be interested in. 
- -* xref:streaming-learning:use-cases-architectures:starlight/kafka/index.adoc[Messaging with Starlight for Kafka] -* xref:pulsar-beam.adoc[] -* xref:pulsar-sql.adoc[] -* xref:heartbeat-vm.adoc[] \ No newline at end of file diff --git a/modules/components/pages/starlight-for-rabbitmq.adoc b/modules/components/pages/starlight-for-rabbitmq.adoc deleted file mode 100644 index 57e957a2..00000000 --- a/modules/components/pages/starlight-for-rabbitmq.adoc +++ /dev/null @@ -1,95 +0,0 @@ -= Using Starlight for RabbitMQ with Luna Streaming -:navtitle: Starlight for RabbitMQ -:description: This guide will take you step-by-step through deploying DataStax Luna Streaming helm chart with the Starlight for RabbitMQ protocol handler extension -:helmValuesPath: https://raw.githubusercontent.com/datastaxdevs/luna-streaming-examples/main/starlight-for-rabbitmq/values.yaml - -Starlight for RabbitMQ brings native https://www.rabbitmq.com/[RabbitMQ] protocol support to https://pulsar.apache.org/[Apache Pulsar(TM)] by introducing a RabbitMQ protocol handler on Pulsar brokers or Pulsar proxies. -By adding the Starlight for RabbitMQ protocol handler to your Pulsar cluster, you can migrate your existing RabbitMQ applications and services to Pulsar without modifying the code. - -== Prerequisites - -* https://helm.sh/docs/intro/install/[Helm 3 CLI] (we used version 3.8.0) -* https://kubernetes.io/docs/tasks/tools/[Kubectl CLI] (we used version 1.23.4) -* Python (we used version 3.8.10) -* Enough access to a K8s cluster to create a namespace, deployments, and pods - -== Install Luna Streaming helm chart - -include::partial$install-helm.adoc[] - -== Forward service port - -You'll need to interact with a few of the services in the K8s cluster. -Map a few ports to those services. 
- -include::partial$port-forward-web-service.adoc[] - -In a separate terminal window, port forward the Starlight for RabbitMQ service: - -[source,shell] ----- -kubectl port-forward -n datastax-pulsar service/pulsar-proxy 5672:5672 ----- - -== Produce a message with the RabbitMQ Python client - -If you hadn't noticed, we never opened the Pulsar binary port to accept new messages. -Only the admin port and the RabbitMQ port are open. -To further demonstrate how native Starlight for RabbitMQ is, we will use the Pika RabbitMQ Python library to produce and consume messages from Pulsar. - -Save the following Python script to a safe place as `test-queue.py`. -The script assumes you have opened the localhost:5672 port. - -[source,python] ----- -#!/usr/bin/env python -import pika - -connection = pika.BlockingConnection(pika.ConnectionParameters(port=5672)) -channel = connection.channel() - -try: - channel.queue_declare("test-queue") - print("created test-queue queue") - - channel.basic_publish(exchange="", routing_key="test-queue", body="test".encode('utf-8')) - print("published message test") - - _, _, res = channel.basic_get(queue="test-queue", auto_ack=True) - assert res is not None, "should have received a message" - print("received message: " + res.decode()) - - channel.queue_delete("test-queue") - print("deleted test-queue queue") - -finally: - connection.close() ----- - -Open a terminal and return to the safe place where you saved the Python script. -Run the following command to execute the Python program. - -[source,shell] ----- -python ./test-queue.py ----- - -The output should look like the following. - -[source,console] ----- -created test-queue queue -published message test -received message: test -deleted test-queue queue ----- - -== Next steps - -The Luna Helm chart deployed Starlight for RabbitMQ on the Pulsar proxy and opened the correct port. -Your application will now "talk" to Pulsar as if it were a real RabbitMQ host. 
- -* xref:streaming-learning:use-cases-architectures:starlight/rabbitmq/index.adoc[Messaging with Starlight for RabbitMQ] -* xref:pulsar-beam.adoc[] -* xref:pulsar-sql.adoc[] -* xref:heartbeat-vm.adoc[] \ No newline at end of file diff --git a/modules/components/pages/starlight.adoc b/modules/components/pages/starlight.adoc new file mode 100644 index 00000000..2d16f8c0 --- /dev/null +++ b/modules/components/pages/starlight.adoc @@ -0,0 +1,34 @@ += {company} Starlight suite of {pulsar-reg} extensions +:navtitle: Starlight + +The Starlight suite of extensions is a collection of {pulsar-reg} protocol handlers that extend an existing {pulsar-short} cluster. +The goal of these extensions is to create a native, seamless interaction with a {pulsar-short} cluster using existing tooling and clients. + +Each extension integrates two popular event streaming ecosystems, unlocking new use cases and reducing barriers for users adopting {pulsar-short}. +Leverage advantages from each ecosystem to build a truly unified event streaming platform, accelerating the development of real-time applications and services. + +The Starlight extensions are open source and included in https://www.ibm.com/docs/en/supportforpulsar[IBM Elite Support for {pulsar}]. + +== {starlight-kafka} + +The https://github.com/datastax/starlight-for-kafka[{starlight-kafka} extension] brings native Apache Kafka(R) protocol support to {pulsar} by introducing a Kafka protocol handler on {pulsar-short} brokers. + +For more information, see the xref:starlight-for-kafka:ROOT:index.adoc[{starlight-kafka} documentation]. + +== {starlight-rabbitmq} + +The https://github.com/datastax/starlight-for-rabbitmq[{starlight-rabbitmq} extension] brings native RabbitMQ(R) protocol support to {pulsar-reg}. + +For more information, see the xref:starlight-for-rabbitmq:ROOT:index.adoc[{starlight-rabbitmq} documentation]. 
+ +== Starlight for JMS + +The https://github.com/datastax/pulsar-jms[Starlight for JMS extension] allows enterprises to take advantage of the scalability and resiliency of a modern streaming platform to run their existing JMS applications. + +For more information, see the xref:starlight-for-jms:ROOT:index.adoc[Starlight for JMS documentation]. + +== See also + +* xref:components:pulsar-beam.adoc[] +* xref:components:pulsar-sql.adoc[] +* xref:components:heartbeat-vm.adoc[] \ No newline at end of file diff --git a/modules/components/partials/install-helm.adoc b/modules/components/partials/install-helm.adoc index f6ddb383..11b558cf 100644 --- a/modules/components/partials/install-helm.adoc +++ b/modules/components/partials/install-helm.adoc @@ -1,4 +1,4 @@ -. Add the DataStax Helm chart repo to your Helm store: +. Add the {company} Helm chart repo to your Helm store: + [source,shell] ---- @@ -6,7 +6,7 @@ helm repo add datastax-pulsar https://datastax.github.io/pulsar-helm-chart ---- . Install the Helm chart using a minimalist values file. -This command creates a Helm release named "my-pulsar-cluster" using the DataStax Luna Helm chart, within the K8s namespace "datastax-pulsar". +This command creates a Helm release named "my-pulsar-cluster" using the {company} Luna Helm chart, within the K8s namespace "datastax-pulsar". The minimal cluster creates only the essential components and has no ingress or load balanced services. 
+ [source,shell,subs="attributes+"] diff --git a/modules/components/partials/port-forward-web-service.adoc b/modules/components/partials/port-forward-web-service.adoc deleted file mode 100644 index d90c57d0..00000000 --- a/modules/components/partials/port-forward-web-service.adoc +++ /dev/null @@ -1,6 +0,0 @@ -In a new terminal, port forward Pulsar's admin service: - -[source,shell] ----- -kubectl port-forward -n datastax-pulsar service/pulsar-broker 8080:8080 ----- \ No newline at end of file diff --git a/modules/install-upgrade/pages/cluster-sizing-reference.adoc b/modules/install-upgrade/pages/cluster-sizing-reference.adoc index 2eb6ee0c..c5885129 100644 --- a/modules/install-upgrade/pages/cluster-sizing-reference.adoc +++ b/modules/install-upgrade/pages/cluster-sizing-reference.adoc @@ -1,9 +1,9 @@ = Installation topologies :navtitle: Cluster Sizing Reference -This page describes recommended starting points for Pulsar deployments. +This page describes recommended starting points for {pulsar-short} deployments. -* SANDBOX (or Pulsar Standalone) is an all-in-one single-node deployment that is useful for taking Pulsar for a test drive. +* SANDBOX (or {pulsar-short} Standalone) is an all-in-one single-node deployment that is useful for taking {pulsar-short} for a test drive. * DEVELOPMENT is a 3-node deployment that is not highly available, but able to maintain parity with the TESTING environment. * SINGLE REGION TESTING ENVIRONMENT is a highly-available 3/3/3 deployment. * HIGH-AVAILABILITY PRODUCTION ENVIRONMENT is a highly-available deployment replicated across 3 zones in 1 region. @@ -45,7 +45,7 @@ For example, if there are 3 zones, set a replication factor of 3. |Bookie |3 | -|Pulsar proxy +|{pulsar-short} Proxy |3 | |(Dedicated) Function Worker @@ -66,7 +66,7 @@ The number of function workers depends on the cluster's functions workload. 
|Bookie |6 |2 nodes per AZ^*^ -|Pulsar proxy +|{pulsar-short} Proxy |3 |1 node per AZ^*^ |Autorecovery @@ -78,7 +78,7 @@ The number of function workers depends on the cluster's functions workload. == Hardware sizing -The following table lists the minimum hardware requirements for a Pulsar cluster. +The following table lists the minimum hardware requirements for a {pulsar-short} cluster. [cols=4*,options=header] |=== @@ -110,7 +110,7 @@ a|* CPU: 4 vCPU * Data Disk Journal: 32 GB SSD * Data Disk Ledger: 256 GB SSD | -|Pulsar Proxy, Function Worker +|{pulsar-short} Proxy, Function Worker a|* CPU: 4 vCPU * Memory: 8 GB | @@ -132,7 +132,7 @@ a|* CPU: 8 vCPU * Data Disk Journal: 256 GB SSD * Data Disk Ledger: 1024 GB SSD |Ledger disk capacity can be beyond 1TB. -|Pulsar Proxy, Autorecovery +|{pulsar-short} Proxy, Autorecovery a|* CPU: 4 vCPU * Memory: 16 GB | diff --git a/modules/install-upgrade/pages/production-cluster-sizing.adoc b/modules/install-upgrade/pages/production-cluster-sizing.adoc index 5aacdb78..1a85e512 100644 --- a/modules/install-upgrade/pages/production-cluster-sizing.adoc +++ b/modules/install-upgrade/pages/production-cluster-sizing.adoc @@ -1,9 +1,9 @@ = Production cluster sizing -This document summarizes DataStax's recommendations for the sizing and optimization of an Apache Pulsar cluster on Linux in a production environment. -Remember, a Pulsar *instance* is made of one or many clusters. +This document summarizes {company}'s recommendations for the sizing and optimization of an {pulsar} cluster on Linux in a production environment. +Remember, a {pulsar-short} *instance* is made of one or many clusters. -Of course, the sizing of a cluster depends on factors like use case and expected load, so this document is not intended to be a one-size-fits-all guide. Rather, we'd like to demonstrate how we consider and spec the initial size of a Pulsar cluster, and assist you on your journey to unlocking the scaling power of Pulsar. 
+Of course, the sizing of a cluster depends on factors like use case and expected load, so this document is not intended to be a one-size-fits-all guide. Rather, we'd like to demonstrate how we consider and spec the initial size of a {pulsar-short} cluster, and assist you on your journey to unlocking the scaling power of {pulsar-short}. This page summarizes the requirements, assumptions, definitions, and methodologies that inform our cluster sizing recommendations. If you're looking for specific deployment topology recommendations, see xref:cluster-sizing-reference.adoc[]. @@ -12,27 +12,27 @@ include::operations:partial$operator-scaling.adoc[] == Dedicated VMs or Kubernetes -As you begin your journey to design an Apache Pulsar cluster, one of the first questions to consider is what infrastructure your cluster will run on. +As you begin your journey to design an {pulsar} cluster, one of the first questions to consider is what infrastructure your cluster will run on. Most of this guide will focus on running a cluster with dedicated virtual machines. While Kubernetes is the more popular option, it is easier to express disk calculations, throughput, and secure communications in terms of a VM. -== Pulsar cluster components +== {pulsar-short} cluster components -Pulsar clusters come in many shapes and sizes. There are minimum components for base functionality, and there are recommended components that make message routing, management, and observability easier. For this guide we will focus on the required components and what it takes to make them resilient to outages and highly available in a three-zone cloud. +{pulsar-short} clusters come in many shapes and sizes. There are minimum components for base functionality, and there are recommended components that make message routing, management, and observability easier. For this guide we will focus on the required components and what it takes to make them resilient to outages and highly available in a three-zone cloud. 
=== Required components -* https://pulsar.apache.org/docs/concepts-architecture-overview/#metadata-store[Zookeeper] - This is Pulsar's meta data store. It stores data about a cluster's configuration, helps the proxy direct messages to the correct broker, and holds Bookie configurations. Start with 1 instance of Zookeeper in each availability zone (AZ) to mitigate a single failure point, and scale Zookeeper as cluster traffic increases. You could scale Zookeeper as traffic within the cluster increases, but it shouldn't be very often as it can handle quite a bit of load. +* https://pulsar.apache.org/docs/concepts-architecture-overview/#metadata-store[Zookeeper] - This is {pulsar-short}'s metadata store. It stores data about a cluster's configuration, helps the proxy direct messages to the correct broker, and holds Bookie configurations. Start with 1 instance of Zookeeper in each availability zone (AZ) to mitigate a single failure point. You can scale Zookeeper as traffic within the cluster increases, but this shouldn't be needed often, as Zookeeper can handle quite a bit of load. -* https://pulsar.apache.org/docs/concepts-architecture-overview/#brokers[Broker] - This is Pulsar's message router. +* https://pulsar.apache.org/docs/concepts-architecture-overview/#brokers[Broker] - This is {pulsar-short}'s message router. Ideally, each broker should be fully utilized without becoming a performance bottleneck. -The Pulsar broker is stateless, so it requires considerable computing power but not much storage. +The {pulsar-short} broker is stateless, so it requires considerable computing power but not much storage. Start with 1 broker instance in each zone, and set a scaling rule that watches CPU load. The best way to optimize this is through performance testing based on your cluster's workload characteristics.
-* https://pulsar.apache.org/docs/concepts-architecture-overview/#apache-bookkeeper[Bookkeeper (bookie)] - This is Pulsar's data store. +* https://pulsar.apache.org/docs/concepts-architecture-overview/#apache-bookkeeper[Bookkeeper (bookie)] - This is {pulsar-short}'s data store. Bookkeeper stores message data in a low-latency, resilient way. -Pulsar uses Bookkeeper's quorum math to function, so a loss of 1 Bookkeeper instance won't bring your system down, but will cause some data loss. +{pulsar-short} uses Bookkeeper's quorum math to function, so a loss of 1 Bookkeeper instance won't bring your system down, but will cause some data loss. Start with at least 3 bookies, with 1 in each AZ. At least 2 bookies per AZ are required for high availability, so if one bookie goes down, the other bookie in the AZ can take over. Scale bookies up on disc usage percentage. Scale down manually by making a bookie read-only, offloading its data, then terminating the instance. @@ -40,25 +40,25 @@ Scale bookies up on disc usage percentage. Scale down manually by making a booki [#recommended] === Recommended server components -The DataStax Luna Streaming Helm chart deployment includes optional but highly recommended server components for better Pulsar cluster metrics monitoring and operation visibility. +The {company} Luna Streaming Helm chart deployment includes optional but highly recommended server components for better {pulsar-short} cluster metrics monitoring and operation visibility. -* https://bookkeeper.apache.org/docs/admin/autorecovery[Bookkeeper AutoRecovery] - This is a Pulsar component that recovers Bookkeeper data in the event of a bookie outage. While optional you will want the insurance of autorecovery working on your behalf. +* https://bookkeeper.apache.org/docs/admin/autorecovery[Bookkeeper AutoRecovery] - This is a {pulsar-short} component that recovers Bookkeeper data in the event of a bookie outage. 
While optional, you will want the insurance of AutoRecovery working on your behalf. A single instance of AutoRecovery should be adequate - only in the most heavily-used clusters will you need more. -* https://pulsar.apache.org/docs/concepts-architecture-overview/#pulsar-proxy[Pulsar proxy] - The Pulsar proxy is just that - a proxy. +* https://pulsar.apache.org/docs/concepts-architecture-overview/#pulsar-proxy[{pulsar-short} proxy] - The {pulsar-short} proxy is just that - a proxy. It runs at the edge of the cluster with public facing endpoints. Without it, your brokers would expose those endpoints, which is not an ideal configuration in production. -Pulsar proxy also offers special options for cluster extensions, like our xref:starlight-for-kafka::index.adoc[Starlight Suite of APIs]. +{pulsar-short} proxy also offers special options for cluster extensions, like the xref:components:starlight.adoc[{company} Starlight suite of APIs]. Start with a proxy in each zone. The proxy will be made aware of all the brokers in the same zone and load balance across them. Have your load balancer round-robin to all proxy instances in all zones. Proxy is optional for VM deployments and required for Kubernetes deployments. Scale proxies by their network load or (if running extensions) also scale on CPU usage. -* https://pulsar.apache.org/docs/functions-worker-run-separately/[Dedicated functions worker(s)] - You can optionally run dedicated function workers in a Pulsar cluster. +* https://pulsar.apache.org/docs/functions-worker-run-separately/[Dedicated functions worker(s)] - You can optionally run dedicated function workers in a {pulsar-short} cluster. Without dedicated function workers, functions run as a separate process on the broker. Function worker spec is usually focused on compute and memory. Scale the workers based on overall usage (both CPU and memory).
-* xref:luna-streaming:components:admin-console-tutorial.adoc[Pulsar AdminConsole] - This is an optional web-based admin console for managing Pulsar clusters, and makes management much easier than tons of CLI commands. The sizing and scaling for AdminConsole has nothing to do with the cluster, as it is not a failure point. -* xref:luna-streaming:components:heartbeat-vm.adoc[Pulsar Heartbeat] - This is an optional component that monitors the health of Pulsar cluster and emits metrics about the cluster that are helpful for observing and debugging issues. +* xref:luna-streaming:components:admin-console-tutorial.adoc[{pulsar-short} AdminConsole] - This is an optional web-based admin console for managing {pulsar-short} clusters, and makes management much easier than running many CLI commands. The sizing and scaling for AdminConsole has nothing to do with the cluster, as it is not a failure point. +* xref:luna-streaming:components:heartbeat-vm.adoc[{pulsar-short} Heartbeat] - This is an optional component that monitors the health of a {pulsar-short} cluster and emits metrics about the cluster that are helpful for observing and debugging issues. * Prometheus/Grafana/Alert manager stack - This is the default observability stack for a cluster. The Luna Helm chart includes pre-made dashboards in Grafana and pre-wires all the metrics scraping. image::pulsar-components.png[] @@ -67,37 +67,37 @@ == Message retention The broker ensures messages are received and delivered appropriately, but it is a stateless process so it doesn't use its memory to track this. Instead, the broker uses Bookkeepers (or "bookies") to store message data and the message's acknowledgement state.
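+The bookie-backed persistence described above is governed by the broker's managed-ledger quorum settings. A minimal `broker.conf` sketch (the values shown are illustrative assumptions, not recommendations):
+
+```properties
+# Number of bookies each ledger's entries are striped across (illustrative)
+managedLedgerDefaultEnsembleSize=3
+# Copies of each entry written to bookies
+managedLedgerDefaultWriteQuorum=3
+# Acknowledgements required before the broker confirms the write
+managedLedgerDefaultAckQuorum=2
+```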
For example, if you had 3 bookies in a Bookkeeper cluster with an acknowledgement rule that at least 2 of the 3 bookies must have a copy of the data, then the cluster has a replication factor of 2. A Pulsar broker uses the `managedLedgerDefaultAckQuorum` and `managedLedgerDefaultWriteQuorum` configurations to set the bounds of this rule. For more about Bookkeeper persistence, see https://pulsar.apache.org/docs/administration-zk-bk/#bookkeeper-persistence-policies[here]. +A great benefit of Bookkeeper is its quorum policies. These policies make each bookie aware of the other bookies to form a bookkeeper cluster. With a cluster established, the cluster can have acknowledgement rules that form a data replication factor. For example, if you had 3 bookies in a Bookkeeper cluster with an acknowledgement rule that at least 2 of the 3 bookies must have a copy of the data, then the cluster has a replication factor of 2. A {pulsar-short} broker uses the `managedLedgerDefaultAckQuorum` and `managedLedgerDefaultWriteQuorum` configurations to set the bounds of this rule. For more about Bookkeeper persistence, see https://pulsar.apache.org/docs/administration-zk-bk/#bookkeeper-persistence-policies[here]. -When a client produces a message, the broker will not acknowledge receipt until the replication factor has been achieved. Continuing from the above example, if the replication factor is 2, a broker's acknowledgment means a minimum of 2 bookies have confirmed storage of message data. If the broker times out waiting for at least 2 responses from the bookies, then the broker will not acknowledge receipt with the client. The client will need to handle the exception by attempting to resend or fail. This process forms one of Pulsar's core values - guaranteed message receipt. +When a client produces a message, the broker will not acknowledge receipt until the replication factor has been achieved. 
Continuing from the above example, if the replication factor is 2, a broker's acknowledgment means a minimum of 2 bookies have confirmed storage of message data. If the broker times out waiting for at least 2 responses from the bookies, then the broker will not acknowledge receipt with the client. The client will need to handle the exception by attempting to resend or fail. This process forms one of {pulsar-short}'s core values - guaranteed message receipt. -Now that the broker has a message, it guarantees delivery to the associated topic's subscribers. We refer to this as the broker's backlog. The size of the backlog is sometimes expressed by the number of messages. For example, a Pulsar operator might say, "we try to keep the backlog below 100 messages." The number of brokers available to process messages directly impacts the size of the backlog. However, the number of messages is not a meaningful number on its own without knowing the size of the messages. Message size is essential information because it determines how full a bookie's disk will be. If the backlog has 100 messages that are 4Gb each, then 400Gb is occupied on a bookie's disk. If the backlog has 100 messages that are 1Kb each, then only 100Kb is occupied on the bookie's disk. Quite a difference in storage capacity! +Now that the broker has a message, it guarantees delivery to the associated topic's subscribers. We refer to this as the broker's backlog. The size of the backlog is sometimes expressed by the number of messages. For example, a {pulsar-short} operator might say, "we try to keep the backlog below 100 messages." The number of brokers available to process messages directly impacts the size of the backlog. However, the number of messages is not a meaningful number on its own without knowing the size of the messages. Message size is essential information because it determines how full a bookie's disk will be. 
If the backlog has 100 messages that are 4 GB each, then 400 GB is occupied on a bookie's disk. If the backlog has 100 messages that are 1 KB each, then only 100 KB is occupied on the bookie's disk. Quite a difference in storage capacity! -Until all subscribers have acknowledged receipt of a message, the broker will not mark a message as acknowledged. This is another core feature of Pulsar - guaranteed message delivery. But there are realities around this - we must assume that all functions, sinks, and clients subscribed to the message's topic are healthy and programmed correctly to acknowledge receipt. +Until all subscribers have acknowledged receipt of a message, the broker will not mark a message as acknowledged. This is another core feature of {pulsar-short} - guaranteed message delivery. But there are realities around this - we must assume that all functions, sinks, and clients subscribed to the message's topic are healthy and programmed correctly to acknowledge receipt. -Unfortunately, this isn't realistic. Things happen. Processes lock up. Networks go down. If we tell Pulsar to indefinitely attempt message delivery to all subscribers on all topics, the backlog would grow out of control, with bookie disks continuously filling and never draining. So, guaranteed message delivery must be managed with some rules. +Unfortunately, this isn't realistic. Things happen. Processes lock up. Networks go down. If we tell {pulsar-short} to indefinitely attempt message delivery to all subscribers on all topics, the backlog would grow out of control, with bookie disks continuously filling and never draining. So, guaranteed message delivery must be managed with some rules. -Pulsar has different ways of managing the broker's backlog (ie: guaranteed message delivery). Combining these different settings make up the rules of message retention. The rules of message retention directly impact how full a bookie's disk can be.
We can't cover every possability within Pulsar's message retention system, so we will focus on 3 key areas and let those drive our sizing calculation. For more on message retention and expiration, see https://pulsar.apache.org/docs/concepts-messaging/#message-retention-and-expiry[Pulsar's message retention and expiry documentation]. +{pulsar-short} has different ways of managing the broker's backlog (i.e., guaranteed message delivery). Combining these different settings makes up the rules of message retention. The rules of message retention directly impact how full a bookie's disk can be. We can't cover every possibility within {pulsar-short}'s message retention system, so we will focus on 3 key areas and let those drive our sizing calculation. For more on message retention and expiration, see https://pulsar.apache.org/docs/concepts-messaging/#message-retention-and-expiry[{pulsar-short}'s message retention and expiry documentation]. === Retention policy -The broker's goal is to mark a message for deletion, which means all subscribers have acknowledged message receipt and the message can be removed from the bookie's disk. Don't confuse this with Pulsar's tiered storage, where you can store the broker's backlog for a very long time. This is a different concept than retention. Sometimes you want acknowledged messages to stay on disk for a certain period of time, or until a certain size threshold has been reached. For example, when a client is constantly reading a topic's messages and needs to have the same low latency performance as a consumer of unacknowledged messages, a highly performant reader is required. +The broker's goal is to mark a message for deletion, which means all subscribers have acknowledged message receipt and the message can be removed from the bookie's disk. Don't confuse this with {pulsar-short}'s tiered storage, where you can store the broker's backlog for a very long time. This is a different concept from retention.
Sometimes you want acknowledged messages to stay on disk for a certain period of time, or until a certain size threshold has been reached. For example, when a client is constantly reading a topic's messages and needs to have the same low latency performance as a consumer of unacknowledged messages, a highly performant reader is required. Retention can be expressed in size or time. Expressed as size, when the broker's backlog reaches some size threshold (in Mb), it will begin marking the oldest acknowledged messages for deletion until the size is reduced. Expressed in time, any acknowledged messages older than some time period (like 3 hours) will be marked for deletion. Size and time can also be used together to create a more comprehensive retention rule. -Pulsar's default behavior disables retention policy, so our sizing calculations will assume this configuration. When all subscribers have acknowledged, the message is removed. +{pulsar-short}'s default behavior disables retention policy, so our sizing calculations will assume this configuration. When all subscribers have acknowledged, the message is removed. === Backlog quota size -As mentioned above, the broker's backlog size is directly proportional to how much disk is being consumed on a bookie. Pulsar provides the option to set thresholds of how large the backlog of a certain namespace can get. A policy can also be set to manage behavior for when that backlog threshold is passed. +As mentioned above, the broker's backlog size is directly proportional to how much disk is being consumed on a bookie. {pulsar-short} provides the option to set thresholds of how large the backlog of a certain namespace can get. A policy can also be set to manage behavior for when that backlog threshold is passed. -Pulsar's default is to not set a backlog quote on a namespace, so our sizing calculations will assume this configuration. 
+{pulsar-short}'s default is to not set a backlog quota on a namespace, so our sizing calculations will assume this configuration. === Message time to live (TTL) -TTL determines how long an unacknowledged message will last in the backlog before it is marked for deletion. Pulsar's default behavior disables TTL and stores unacked messages forever, but in a production cluster, there must be limits in place to prevent bookie disks from filling up and crippling a cluster's health. +TTL determines how long an unacknowledged message will last in the backlog before it is marked for deletion. {pulsar-short}'s default behavior disables TTL and stores unacked messages forever, but in a production cluster, there must be limits in place to prevent bookie disks from filling up and crippling a cluster's health. -The TTL parameter is like a stopwatch attached to each message that defines the amount of time a message is allowed to stay unacknowledged. When the TTL expires, Pulsar automatically moves the message to the acknowledged state (and thus makes it ready for deletion). +The TTL parameter is like a stopwatch attached to each message that defines the amount of time a message is allowed to stay unacknowledged. When the TTL expires, {pulsar-short} automatically moves the message to the acknowledged state (and thus makes it ready for deletion). TTL is expressed in terms of time, at the namespace level. A default value for all new namespace can be set with the `ttlDurationDefaultInSeconds` broker configuration value. @@ -108,15 +108,15 @@ Realistically, it's almost impossible to definitively know the exact application * _Average message size (uncompressed)_ - this is the most important number to understand. A message is sized by the number of bytes. A message includes its *message key*, *properties*, and a *message payload*. A *message key* is roughly the same number of characters as a GUID (or hash).
*Message properties* is a key/value collection of metadata, so the number of characters varies. The *message payload* accounts for the bulk of the sizing variability. To start, assume the message is a JSON string with some number of characters. + -For more on message compression, see the https://pulsar.apache.org/docs/concepts-messaging/#compression[Pulsar documentation], or search for "calculate bytes of string" in your favorite search engine - you'll find many free tools where you can type out a sample JSON-formatted string and see the byte count. +For more on message compression, see the https://pulsar.apache.org/docs/concepts-messaging/#compression[{pulsar-short} documentation], or search for "calculate bytes of string" in your favorite search engine - you'll find many free tools where you can type out a sample JSON-formatted string and see the byte count. -* _Incoming message throughput_ - this is the second most important number to understand. Throughput is expressed as a number of messages that the cluster can produce in a second. Think about this number in terms of steady traffic and burst traffic. Pulsar can scale brokers to handle bursts, so you don't need to size for maximum workload, but you do need to consider the time it takes to scale up broker instances. If you were streaming in data every time someone clicked on a web page, and the site received a constant 2000 views per second, then your minimum throughput must be able to handle a load above that requirement, because that stream won't be the only load on the cluster. You likewise wouldn't size the cluster to your existing load, because you hope that load will grow over time. +* _Incoming message throughput_ - this is the second most important number to understand. Throughput is expressed as a number of messages that the cluster can produce in a second. Think about this number in terms of steady traffic and burst traffic. 
{pulsar-short} can scale brokers to handle bursts, so you don't need to size for maximum workload, but you do need to consider the time it takes to scale up broker instances. If you were streaming in data every time someone clicked on a web page, and the site received a constant 2000 views per second, then your minimum throughput must be able to handle a load above that requirement, because that stream won't be the only load on the cluster. You likewise wouldn't size the cluster to your existing load, because you hope that load will grow over time. * _Message retention and TTL period_ - the size or time acknowledged messages are kept on disk. See message retention above for more detail. -* _Tiered storage policies_ - Tiered storage offloads bookkeeper data to cheaper, long-term storage, and can impact cluster sizing if that storage service is included in the cluster. For our calculations we will not be including this feature. For more on tiered storage, see https://pulsar.apache.org/docs/tiered-storage-overview/[Pulsar documentation]. +* _Tiered storage policies_ - Tiered storage offloads bookkeeper data to cheaper, long-term storage, and can impact cluster sizing if that storage service is included in the cluster. For our calculations we will not be including this feature. For more on tiered storage, see https://pulsar.apache.org/docs/tiered-storage-overview/[{pulsar-short} documentation]. -There are other factors that could be a part of the aggregated cluster workload. As you gain familiarity with Pulsar you can further customize this calculation. For now, we will estimate with the above numbers to size a cluster. +There are other factors that could be a part of the aggregated cluster workload. As you gain familiarity with {pulsar-short} you can further customize this calculation. For now, we will estimate with the above numbers to size a cluster. 
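+As a quick sketch of the "calculate bytes of string" step above, the following Python snippet estimates an uncompressed message size from a key, properties, and a JSON payload. The sample values are assumptions for illustration only:
+
+```python
+import json
+import uuid
+
+# Hypothetical message parts; each is counted as UTF-8 bytes.
+key = str(uuid.uuid4())  # message key, roughly GUID-sized (36 characters)
+properties = {"source": "web", "region": "us-east-1"}
+payload = {"user_id": 12345, "event": "page_view", "url": "/pricing"}
+
+key_bytes = len(key.encode("utf-8"))
+props_bytes = len(json.dumps(properties).encode("utf-8"))
+payload_bytes = len(json.dumps(payload).encode("utf-8"))
+
+total_bytes = key_bytes + props_bytes + payload_bytes
+print(key_bytes, props_bytes, payload_bytes, total_bytes)  # → 36 40 59 135
+```
+
+Swap in a payload that matches your real traffic to get a starting number for the worksheet below's average message size.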
[#aggregate-worksheet] == Example workload aggregation worksheet @@ -152,8 +152,8 @@ With the aggregated workload characteristics, we can now apply our methodology t First, we will size the bookkeeper's disk. We size this first because it's the most important component (bookies store message data) and are also the hardest to scale. -By default, Pulsar sets Bookkeeper https://pulsar.apache.org/docs/administration-zk-bk/#bookkeeper-persistence-policies[ack-quorum] size to 2. -That means at least 2 bookies in the ensemble need to acknowledge receipt of message data before Pulsar will acknowledge receipt of the message. +By default, {pulsar-short} sets Bookkeeper https://pulsar.apache.org/docs/administration-zk-bk/#bookkeeper-persistence-policies[ack-quorum] size to 2. +That means at least 2 bookies in the ensemble need to acknowledge receipt of message data before {pulsar-short} will acknowledge receipt of the message. But (very important) we want the message replication factor to be an odd number, so we can tolerate 1 Bookie failure. . Multiply replication factor (3) by average message payload size (1) by average message throughput (100000), then factor in TTL (3) and retention period (3600) (when applicable). @@ -192,11 +192,11 @@ We need 1 broker to serve messages. As with other components, this must account for fault tolerance. To be evenly divisible by the number of zones, we will set brokers to 3. -=== Pulsar component instance counts +=== {pulsar-short} component instance counts Now that we know how many server instances of broker and Bookie are required to support our workload, we include the other components to size the overall cluster. 
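+The disk-sizing multiplication above can be sketched in Python. The units here — payload in KB, throughput in messages per second, TTL in hours — are assumptions read from the worksheet example, not a definitive formula:
+
+```python
+def estimate_bookie_storage_kb(replication_factor, avg_msg_kb, msgs_per_sec, ttl_hours):
+    """Rough upper bound on bookie disk needed to hold the backlog.
+
+    Assumes every message produced during the TTL window stays on disk,
+    replicated replication_factor times across the bookie ensemble.
+    """
+    retention_window_sec = ttl_hours * 3600
+    return replication_factor * avg_msg_kb * msgs_per_sec * retention_window_sec
+
+# Worksheet example: replication 3, 1 KB messages, 100,000 msg/s, 3-hour TTL.
+total_kb = estimate_bookie_storage_kb(3, 1, 100_000, 3)
+print(f"{total_kb / 1024**3:.2f} TB")  # ~3.02 TB before any headroom
+```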
-.Pulsar cluster component count +.{pulsar-short} cluster component count [cols="2,2,2", options=header] |=== |Component diff --git a/modules/install-upgrade/pages/quickstart-helm-installs.adoc b/modules/install-upgrade/pages/quickstart-helm-installs.adoc index a9792fbc..9cca2842 100644 --- a/modules/install-upgrade/pages/quickstart-helm-installs.adoc +++ b/modules/install-upgrade/pages/quickstart-helm-installs.adoc @@ -1,12 +1,12 @@ = Quick Start for Helm Chart installs -You have options for installing *DataStax Luna Streaming*: +You have options for installing *{company} Luna Streaming*: -* With the provided *DataStax Helm chart* for an existing Kubernetes environment locally or with a cloud provider, as covered in this topic. -* With the *DataStax Luna Streaming tarball* for deployment to a single server/VM, or to multiple servers/VMs. See xref:install-upgrade:quickstart-server-installs.adoc[Quick Start for Server/VM installs]. -* With the *DataStax Ansible scripts* provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible]. +* With the provided *{company} Helm chart* for an existing Kubernetes environment locally or with a cloud provider, as covered in this topic. +* With the *{company} Luna Streaming tarball* for deployment to a single server/VM, or to multiple servers/VMs. See xref:install-upgrade:quickstart-server-installs.adoc[Quick Start for Server/VM installs]. +* With the *{company} Ansible scripts* provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible]. -The Helm chart and options described below configure an Apache Pulsar cluster. +The Helm chart and options described below configure an {pulsar} cluster. It is designed for production use, but can also be used in local development environments with the proper settings. 
The resulting configuration includes support for: @@ -15,21 +15,21 @@ The resulting configuration includes support for: * xref:install-upgrade:quickstart-helm-installs.adoc#authentication[Authentication] * WebSocket Proxy * Standalone Functions Workers -* Pulsar IO Connectors +* {pulsar-short} IO Connectors * xref:install-upgrade:quickstart-helm-installs.adoc#_tiered_storage_configuration[Tiered Storage] including Tardigrade distributed cloud storage -* xref:install-upgrade:quickstart-helm-installs.adoc#_pulsar_sql_configuration[Pulsar SQL Workers] -* Pulsar Admin Console for managing the cluster -* Pulsar heartbeat +* xref:install-upgrade:quickstart-helm-installs.adoc#_pulsar_sql_configuration[{pulsar-short} SQL Workers] +* {pulsar-short} Admin Console for managing the cluster +* {pulsar-short} Heartbeat * Burnell for API-based token generation -* Prometheus, Grafana, and Alertmanager stack with default Grafana dashboards and Pulsar-specific alerting rules +* Prometheus, Grafana, and Alertmanager stack with default Grafana dashboards and {pulsar-short}-specific alerting rules * cert-manager with support for self-signed certificates as well as public certificates using ACME, such as Let's Encrypt -* Ingress for all HTTP ports (Pulsar Admin Console, Prometheus, Grafana, others) +* Ingress for all HTTP ports ({pulsar-short} Admin Console, Prometheus, Grafana, others) == Prerequisites -For an example set of production cluster values, see the DataStax production-ready https://github.com/datastax/pulsar-helm-chart[Helm chart]. +For an example set of production cluster values, see the {company} production-ready https://github.com/datastax/pulsar-helm-chart[Helm chart].
-DataStax recommends these hardware resources for running Luna Streaming in a Kubernetes environment: +{company} recommends these hardware resources for running Luna Streaming in a Kubernetes environment: * Helm version 3 @@ -57,9 +57,9 @@ For the local machine running the Helm chart, you will need: [TIP] ==== -Interested in a production benchmark of a Pulsar cluster? +Interested in a production benchmark of a {pulsar-short} cluster? -Check out https://community.intel.com/t5/Blogs/Tech-Innovation/Cloud/Improve-Apache-Pulsar-Performance-on-3rd-Gen-Intel-Xeon-Scalable/post/1547895[Improve Apache Pulsar Performance on 3rd Gen Intel Xeon Scalable Processors], where Intel and DataStax benchmark a Pulsar cluster on 3rd Gen Intel(R) Xeon(R) processors running in AWS VM instances. +Check out https://community.intel.com/t5/Blogs/Tech-Innovation/Cloud/Improve-Apache-Pulsar-Performance-on-3rd-Gen-Intel-Xeon-Scalable/post/1547895[Improve {pulsar} Performance on 3rd Gen Intel Xeon Scalable Processors], where Intel and {company} benchmark a {pulsar-short} cluster on 3rd Gen Intel(R) Xeon(R) processors running in AWS VM instances. ==== === Storage Class Settings @@ -138,7 +138,7 @@ AKS:: -- ==== -* Create a custom storage configuration as a `yaml` file (https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/templates/bookkeeper/bookkeeper-storageclass.yaml[like the DataStax example]) and tell the Helm chart to use that storage configuration when it creates the BookKeeper PVCs. +* Create a custom storage configuration as a `yaml` file (https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/templates/bookkeeper/bookkeeper-storageclass.yaml[like the {company} example]) and tell the Helm chart to use that storage configuration when it creates the BookKeeper PVCs. + [source,yaml] ---- @@ -155,7 +155,7 @@ First, create the namespace; in this example, we use `pulsar`. 
`kubectl create namespace pulsar` -Then run this helm command: +Then run this `helm` command: `helm install pulsar datastax-pulsar/pulsar --namespace pulsar --values storage_values.yaml --create-namespace` @@ -163,15 +163,15 @@ TIP: To avoid having to specify the `pulsar` namespace on each subsequent comman `kubectl config set-context $(kubectl config current-context) --namespace=pulsar` -Once Pulsar is installed, you can now access your Luna Streaming cluster. +Once {pulsar-short} is installed, you can now access your Luna Streaming cluster. === Access the Luna Streaming cluster -The default values will create a ClusterIP for all components. ClusterIPs are only accessible within the Kubernetes cluster. The easiest way to work with Pulsar is to log into the bastion host (assuming it is in the `pulsar` namespace): +The default values will create a ClusterIP for all components. ClusterIPs are only accessible within the Kubernetes cluster. The easiest way to work with {pulsar-short} is to log into the bastion host (assuming it is in the `pulsar` namespace): `kubectl exec $(kubectl get pods -l component=bastion -o jsonpath="{.items[*].metadata.name}" -n pulsar) -it -n pulsar -- /bin/bash` -Once you are logged into the bastion, you can run Pulsar admin commands: +Once you are logged into the bastion, you can run {pulsar-short} admin commands: [source,shell] ---- @@ -197,7 +197,7 @@ proxy: If you are using a load balancer on the proxy, you can find the IP address using `kubectl get service -n pulsar`. 
-=== Manage Luna Streaming with Pulsar Admin Console +=== Manage Luna Streaming with {pulsar-short} Admin Console Or if you would rather go directly to the broker: @@ -205,9 +205,9 @@ Or if you would rather go directly to the broker: `kubectl port-forward -n pulsar $(kubectl get pods -n pulsar -l component=broker -o jsonpath='{.items[0].metadata.name}') 6650:6650` -=== Manage Luna Streaming with Pulsar Admin Console +=== Manage Luna Streaming with {pulsar-short} Admin Console -The Pulsar Admin Console is installed in your cluster by enabling the console with this values setting: +The {pulsar-short} Admin Console is installed in your cluster by enabling the console with this values setting: [source,yaml] ---- @@ -215,9 +215,9 @@ component: pulsarAdminConsole: yes ---- -The Pulsar Admin Console will be automatically configured to connect to the Pulsar cluster. +The {pulsar-short} Admin Console will be automatically configured to connect to the {pulsar-short} cluster. -By default, the Pulsar Admin Console has authentication disabled. You can enable authentication with these settings: +By default, the {pulsar-short} Admin Console has authentication disabled. You can enable authentication with these settings: [source,yaml] ---- @@ -225,7 +225,7 @@ pulsarAdminConsole: authMode: k8s ---- -To learn more about using the Pulsar Admin Console, see xref:components:admin-console-tutorial.adoc[Admin Console Tutorial]. +To learn more about using the {pulsar-short} Admin Console, see xref:components:admin-console-tutorial.adoc[Admin Console Tutorial]. 
== Install Luna Streaming locally @@ -284,7 +284,7 @@ pulsar-zookeeper-0 1/1 Running 0 pulsar-zookeeper-metadata-5l58k 0/1 Completed 0 12m ---- -Once all the pods are running, you can access the Pulsar Admin Console by forwarding to localhost: +Once all the pods are running, you can access the {pulsar-short} Admin Console by forwarding to localhost: [source,shell] ---- @@ -292,11 +292,11 @@ kubectl port-forward $(kubectl get pods -l component=adminconsole -o jsonpath='{ ---- Now open a browser to `\http://localhost:8080`. -In the Pulsar Admin Console, you can test your Pulsar setup using the built-in clients (Test Clients in the left-hand menu). +In the {pulsar-short} Admin Console, you can test your {pulsar-short} setup using the built-in clients (Test Clients in the left-hand menu). -=== Access the Pulsar cluster on localhost +=== Access the {pulsar-short} cluster on localhost -To port forward the proxy admin and Pulsar ports to your local machine: +To port forward the proxy admin and {pulsar-short} ports to your local machine: `kubectl port-forward -n pulsar $(kubectl get pods -n pulsar -l component=proxy -o jsonpath='{.items[0].metadata.name}') 8080:8080` @@ -310,14 +310,14 @@ Or if you would rather go directly to the broker: === Access Admin Console on your local machine -To access Pulsar Admin Console on your local machine, forward port 80: +To access {pulsar-short} Admin Console on your local machine, forward port 80: [source,shell] ---- kubectl port-forward -n pulsar $(kubectl get pods -n pulsar -l component=adminconsole -o jsonpath='{.items[0].metadata.name}') 8888:80 ---- -TIP: While using the Admin Console and Pulsar Monitoring, if the connection to `localhost:3000` is refused, set a port-forward to the Grafana pod. Example: +TIP: While using the Admin Console and {pulsar-short} Monitoring, if the connection to `localhost:3000` is refused, set a port-forward to the Grafana pod. 
Example: [source,shell] ---- @@ -348,9 +348,9 @@ helm install pulsar -f dev-values-auth.yaml datastax-pulsar/pulsar === Enabling the Prometheus stack -You can enable a full Prometheus stack (Prometheus, Alertmanager, Grafana) from [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus). This includes default Prometheus rules and Grafana dashboards for Kubernetes. +You can enable a full Prometheus stack (Prometheus, Alertmanager, Grafana) from https://github.com/prometheus-operator/kube-prometheus[kube-prometheus]. This includes default Prometheus rules and Grafana dashboards for Kubernetes. -In an addition, this chart can deploy Grafana dashboards for Pulsar as well as Pulsar-specific rules for Prometheus. +In addition, this chart can deploy Grafana dashboards for {pulsar-short} as well as {pulsar-short}-specific rules for Prometheus. To enable the Prometheus stack, use the following setting in your values file: @@ -383,9 +383,9 @@ Tiered storage (offload to blob storage) can be configured in the `storageOffloa In addition, you can configure any S3-compatible storage. There is explicit support for https://tardigrade.io[Tardigrade], which is a provider of secure, decentralized storage. You can enable the Tardigrade S3 gateway in the `extra` configuration. The instructions for configuring the gateway are provided in the `tardigrade` section of the `values.yaml` file. -=== Pulsar SQL Configuration +=== {pulsar-short} SQL Configuration -If you enable Pulsar SQL, the cluster provides https://prestodb.io/[Presto] access to the data stored in BookKeeper (and tiered storage, if enabled). Presto is exposed on the service named `-sql`. +If you enable {pulsar-short} SQL, the cluster provides https://prestodb.io/[Presto] access to the data stored in BookKeeper (and tiered storage, if enabled). Presto is exposed on the service named `-sql`.
The easiest way to access the Presto command line is to log into the bastion host and then connect to the Presto service port, like this: @@ -407,9 +407,9 @@ Splits: 17 total, 17 done (100.00%) 0:04 [2 rows, 144B] [0 rows/s, 37B/s] ---- -To access Pulsar SQL from outside the cluster, you can enable the `ingress` option which will expose the Presto port on hostname. We have tested with the Traefik ingress, but any Kubernetes ingress should work. You can then run SQL queries using the Presto CLI and monitoring Presto using the built-in UI (point browser to the ingress hostname). Authentication is not enabled on the UI, so you can log in with any username. +To access {pulsar-short} SQL from outside the cluster, you can enable the `ingress` option, which exposes the Presto port on the hostname. We have tested with the Traefik ingress, but any Kubernetes ingress should work. You can then run SQL queries using the Presto CLI and monitor Presto using the built-in UI (point your browser to the ingress hostname). Authentication is not enabled on the UI, so you can log in with any username. -It is recommended that you match the Presto CLI version to the version running as part of Pulsar SQL. +It is recommended that you match the Presto CLI version to the version running as part of {pulsar-short} SQL. The Presto CLI supports basic authentication, so if you enabled that on the Ingress (using annotations), you can have secure Presto access. Example: @@ -439,8 +439,8 @@ The Helm chart has the following optional dependencies: [#authentication] === Authentication -The chart can enable token-based authentication for your Pulsar cluster. For information on token-based -authentication in Pulsar, see https://pulsar.apache.org/docs/en/security-token-admin/[Pulsar token authentication admin documentation]. +The chart can enable token-based authentication for your {pulsar-short} cluster.
For information on token-based +authentication in {pulsar-short}, see https://pulsar.apache.org/docs/en/security-token-admin/[{pulsar-short} token authentication admin documentation]. For authentication to work, the token-generation keys need to be stored in Kubernetes secrets along with some default tokens (for superuser access). @@ -483,7 +483,7 @@ You can create the certificate like this: `kubectl create secret tls --key --cert ` -The resulting secret will be of type `kubernetes.io/tls`. The key should not be in `PKCS 8` format even though that is the format used by Pulsar. The format will be converted by the chart to `PKCS 8`. +The resulting secret will be of type `kubernetes.io/tls`. The key should not be in `PKCS 8` format even though that is the format used by {pulsar-short}. The format will be converted by the chart to `PKCS 8`. You can also specify the certificate information directly in the values: @@ -499,7 +499,7 @@ This is useful if you are using a self-signed certificate. For automated handling of publicly signed certificates, you can use a tool such as https://cert-manager.io[cert-manager]. -For more information, see https://github.com/datastax/pulsar-helm-chart/blob/master/aws-customer-docs.md[Using Cert-Manager for Pulsar Certificates in AWS]. +For more information, see https://github.com/datastax/pulsar-helm-chart/blob/master/aws-customer-docs.md[Using Cert-Manager for {pulsar-short} Certificates in AWS]. Once you have created the secrets that store the certificate info (or specified it in the values), you can enable TLS in the values: @@ -508,7 +508,7 @@ Once you have created the secrets that store the certificate info (or specified [#video] == Getting started with Kubernetes video -Follow along with this video from our *Five Minutes About Pulsar* series to get started with a Helm installation. +Follow along with this video from our *Five Minutes About {pulsar-short}* series to get started with a Helm installation.
video::hEBP_IVQqQM[youtube, list=PL2g2h-wyI4SqeKH16czlcQ5x4Q_z-X7_m] diff --git a/modules/install-upgrade/pages/quickstart-server-installs.adoc b/modules/install-upgrade/pages/quickstart-server-installs.adoc index 5f266ee7..4420670a 100644 --- a/modules/install-upgrade/pages/quickstart-server-installs.adoc +++ b/modules/install-upgrade/pages/quickstart-server-installs.adoc @@ -1,15 +1,15 @@ = Quick Start for Bare Metal/VM installs -This document explains xref:install-upgrade:quickstart-server-installs.adoc#install[installation] of Luna Streaming for Bare Metal/VM deployments with a Pulsar tarball. +This document explains xref:install-upgrade:quickstart-server-installs.adoc#install[installation] of Luna Streaming for Bare Metal/VM deployments with a {pulsar-short} tarball. The resulting Luna Streaming deployment includes: * *Tiered Storage:* Offload historical messages to more cost-effective object storage such as AWS S3, Azure Blob, Google Cloud Storage, and HDFS. * *Built-in Schema Registry:* Guarantee messaging type safety on a per-topic basis without relying on any external facility. -* *Pulsar I/O connectors:* Enables Pulsar to exchange data with external systems, either as sources or sinks. -* *Pulsar Function:* Lightweight compute extensions of Pulsar brokers which enable real-time simple event processing within Pulsar. -* *Pulsar SQL:* SQL-based interactive query for message data stored in Pulsar. -* *Pulsar Transactions:* enables event streaming applications to consume, process, and produce messages in one atomic operation. +* *{pulsar-short} I/O connectors:* Enable {pulsar-short} to exchange data with external systems, either as sources or sinks. +* *{pulsar-short} Functions:* Lightweight compute extensions of {pulsar-short} brokers which enable real-time simple event processing within {pulsar-short}. +* *{pulsar-short} SQL:* SQL-based interactive query for message data stored in {pulsar-short}.
+* *{pulsar-short} Transactions:* Enable event streaming applications to consume, process, and produce messages in one atomic operation. == Requirements @@ -17,11 +17,11 @@ The resulting Luna Streaming deployment includes: * JDK 11 + -Pulsar can run with JDK8, but DataStax Luna Streaming is designed for Java 11. +{pulsar-short} can run with JDK 8, but {company} Luna Streaming is designed for Java 11. * File System + -DataStax recommends XFS, but ext4 will work. +{company} recommends XFS, but ext4 will work. * For a single node install, a server with at least 8 CPU and 32 GB of memory is required. @@ -33,7 +33,7 @@ The servers must be on the same network so they can communicate with each other. * BookKeeper should use one volume device for the journal, and one volume device for the ledgers. The journal device should be 20 GB. The ledger volume device should be sized to hold the expected amount of stored message data. -* DataStax recommends a separate data disk volume for ZooKeeper. +* {company} recommends a separate data disk volume for ZooKeeper. * Operating System Settings + @@ -43,7 +43,7 @@ Check this setting with `cat /sys/kernel/mm/transparent_hugepage/enabled` and `c [#install] == Installation -. Download the DataStax Luna Streaming tarball from the https://github.com/datastax/pulsar/releases[DataStax GitHub repo]. There are three versions of Luna Streaming currently available: +. Download the {company} Luna Streaming tarball from the https://github.com/datastax/pulsar/releases[{company} GitHub repo].
There are three versions of Luna Streaming currently available: + [cols="1,1"] [%autowidth] |=== |*Included components* |`lunastreaming-core--bin.tar.gz` -|Contains the core Pulsar modules: Zookeeper, Broker, BookKeeper, and function worker +|Contains the core {pulsar-short} modules: ZooKeeper, broker, BookKeeper, and function worker |`lunastreaming--bin.tar.gz` -|Contains all components from `lunastreaming-core` as well as support for Pulsar SQL +|Contains all components from `lunastreaming-core` as well as support for {pulsar-short} SQL |`lunastreaming-all--bin.tar.gz` -|Contains all components from `lunastreaming` as well as the NAR files for all Pulsar I/O connectors and offloaders +|Contains all components from `lunastreaming` as well as the NAR files for all {pulsar-short} I/O connectors and offloaders |=== @@ -89,29 +89,29 @@ drwxr-xr-x@ 277 firstname.lastname staff 8864 May 17 05:58 lib drwxr-xr-x@ 25 firstname.lastname staff 800 Jan 22 2020 licenses ---- -You have successfully installed the DataStax Luna Streaming tarball. +You have successfully installed the {company} Luna Streaming tarball. == Additional tooling -Once the DataStax Luna Streaming tarball is installed, you may want to add additional tooling to your server/VM deployment. +Once the {company} Luna Streaming tarball is installed, you may want to add additional tooling to your server/VM deployment. -* *Pulsar Admin Console:* Web-based UI that administrates Pulsar. -Download the latest version from the https://github.com/datastax/pulsar-admin-console[DataStax GitHub repo] and follow the instructions xref:components:admin-console-vm.adoc[here]. +* *{pulsar-short} Admin Console:* Web-based UI that administers {pulsar-short}.
+Download the latest version from the https://github.com/datastax/pulsar-admin-console[{company} GitHub repo] and follow the instructions xref:components:admin-console-vm.adoc[here]. + [NOTE] ==== Admin Console requires https://nodejs.org/download/release/latest-v14.x/[NodeJS 14 LTS] and Nginx version 1.17.9+. ==== -* *Pulsar Heartbeat:* Monitors Pulsar cluster availability. -Download the latest version from the https://github.com/datastax/pulsar-heartbeat/releases/[DataStax GitHub repo] and follow the instructions xref:components:heartbeat-vm.adoc[here]. +* *{pulsar-short} Heartbeat:* Monitors {pulsar-short} cluster availability. +Download the latest version from the https://github.com/datastax/pulsar-heartbeat/releases/[{company} GitHub repo] and follow the instructions xref:components:heartbeat-vm.adoc[here]. == Next steps -* For initializing Pulsar components like BookKeeper and ZooKeeper, see the https://pulsar.apache.org/docs/deploy-bare-metal[Pulsar documentation]. +* For initializing {pulsar-short} components like BookKeeper and ZooKeeper, see the https://pulsar.apache.org/docs/deploy-bare-metal[{pulsar-short} documentation]. -* For installing optional built-in connectors or tiered storage included in `lunastreaming-all`, see the https://pulsar.apache.org/docs/deploy-bare-metal#install-builtin-connectors-optional[Pulsar documentation]. +* For installing optional built-in connectors or tiered storage included in `lunastreaming-all`, see the https://pulsar.apache.org/docs/deploy-bare-metal#install-builtin-connectors-optional[{pulsar-short} documentation]. * For installation to existing Kubernetes environments or with a cloud provider, see xref:install-upgrade:quickstart-helm-installs.adoc[Quick Start for Helm Chart installs]. -* For Ansible deployment, use the DataStax Ansible scripts provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible]. 
\ No newline at end of file +* For Ansible deployment, use the {company} Ansible scripts provided at https://github.com/datastax/pulsar-ansible[https://github.com/datastax/pulsar-ansible]. \ No newline at end of file diff --git a/modules/install-upgrade/pages/supported-versions.adoc b/modules/install-upgrade/pages/supported-versions.adoc index 2ebd7b0f..e3f98b16 100644 --- a/modules/install-upgrade/pages/supported-versions.adoc +++ b/modules/install-upgrade/pages/supported-versions.adoc @@ -1,22 +1,22 @@ = Supported software -Support covers only the following Software versions for Apache Pulsar: +Support covers only the following Software versions for {pulsar}: [cols="2*"] |=== |Name |Versions -|Apache Pulsar +|{pulsar} |2.10.x |=== -Support covers only the following Software versions for DataStax Luna Streaming Distribution: +Support covers only the following Software versions for {company} Luna Streaming Distribution: [cols="2*"] |=== |Name |Version -|DataStax Luna Streaming Distribution +|{company} Luna Streaming Distribution |v.1.0.0 and above |=== @@ -26,23 +26,23 @@ Support covers only the following Software versions for the Open Source Projects |=== |Name |Version -|Pulsar Admin Console +|{pulsar-short} Admin Console |v.1.0.0 and above -|Pulsar Heartbeat +|{pulsar-short} Heartbeat |v.1.0.0 and above -|DataStax pulsar-sink +|{company} pulsar-sink |v.1.0.0 and above |Starlight for JMS |v.1.0.0 and above -|Starlight for RabbitMQ +|{starlight-rabbitmq} |v.1.0.0 and above |=== -Support covers only the following Apache Pulsar Connectors (as included in the Apache Pulsar download): +Support covers only the following {pulsar} Connectors (as included in the {pulsar} download): [cols="2*"] |=== diff --git a/modules/operations/pages/auth.adoc b/modules/operations/pages/auth.adoc index b98585c3..c602b9ec 100644 --- a/modules/operations/pages/auth.adoc +++ b/modules/operations/pages/auth.adoc @@ -1,12 +1,12 @@ = Luna Streaming Authentication -The Helm chart can enable 
token-based authentication for your Pulsar cluster. For more, see https://pulsar.apache.org/docs/en/security-token-admin/[Pulsar token authentication]. +The Helm chart can enable token-based authentication for your {pulsar-short} cluster. For more, see https://pulsar.apache.org/docs/en/security-token-admin/[{pulsar-short} token authentication]. For authentication to work, the token-generation keys need to be stored in Kubernetes secrets along with superuser default tokens. The Helm chart includes tooling to automatically create the necessary secrets, or you can do this manually. -== Automatically generating secrets for Pulsar token authentication +== Automatically generating secrets for {pulsar-short} token authentication Use the following settings in your `values.yaml` file to enable automatic generation of the secrets and enable token-based authentication: @@ -17,7 +17,7 @@ autoRecovery: enableProvisionContainer: yes ---- -When `enableProvisionContainer` is enabled, Pulsar will check if the required secrets exist. If they don't exist, it will generate new token keys and use those keys to generate the default set of tokens. +When `enableProvisionContainer` is enabled, {pulsar-short} will check if the required secrets exist. If they don't exist, it will generate new token keys and use those keys to generate the default set of tokens. The names of the key secrets are: @@ -31,7 +31,7 @@ Using these keys will generate tokens for each role listed in `superUserRoles` i * `token-proxy` * `token-websocket` -== Manually generating secrets for Pulsar token authentication +== Manually generating secrets for {pulsar-short} token authentication include::operations:partial$manually-create-credentials.adoc[] @@ -52,7 +52,7 @@ Create the certificate: kubectl create secret tls --key --cert ---- -The resulting secret will be of type `kubernetes.io/tls`. The key should *not* be in `PKCS 8` format, even though that is the format used by Pulsar.
The `kubernetes.io/tls` format will be converted by the chart to `PKCS 8`. +The resulting secret will be of type `kubernetes.io/tls`. The key should *not* be in `PKCS 8` format, even though that is the format used by {pulsar-short}. The `kubernetes.io/tls` format will be converted by the chart to `PKCS 8`. If you have a self-signed certificate, manually specify the certificate information directly in https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values-keycloak-auth.yaml[values]: @@ -70,11 +70,11 @@ Once you have created the secrets that store the certificate info (or manually s == Token Authentication via Keycloak Integration -DataStax created the https://github.com/datastax/pulsar-openid-connect-plugin[Pulsar OpenID Connect Authentication Plugin] to provide a more dynamic authentication option for Pulsar. This plugin integrates with any OpenID Connect-compliant identity provider to dynamically retrieve public keys for token validation. This dynamic public key retrieval enables support for key rotation and multiple authentication/identity providers by configuring multiple allowed token issuers. It also means that token secret keys will *not* be stored in Kubernetes secrets. +{company} created the https://github.com/datastax/pulsar-openid-connect-plugin[{pulsar-short} OpenID Connect Authentication Plugin] to provide a more dynamic authentication option for {pulsar-short}. This plugin integrates with any OpenID Connect-compliant identity provider to dynamically retrieve public keys for token validation. This dynamic public key retrieval enables support for key rotation and multiple authentication/identity providers by configuring multiple allowed token issuers. It also means that token secret keys will *not* be stored in Kubernetes secrets. -In order to simplify deployment for Pulsar cluster components, the plugin provides the option to use Keycloak in conjunction with Pulsar's basic token based authentication. 
For more, see https://github.com/datastax/pulsar-openid-connect-plugin[Pulsar OpenID Connect Authentication Plugin]. +In order to simplify deployment for {pulsar-short} cluster components, the plugin provides the option to use Keycloak in conjunction with {pulsar-short}'s basic token-based authentication. For more, see https://github.com/datastax/pulsar-openid-connect-plugin[{pulsar-short} OpenID Connect Authentication Plugin]. -See the example https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values-keycloak-auth.yaml[Keycloak Helm chart] for deploying a working cluster that integrates with Keycloak. By default, the Helm chart creates a Pulsar realm within Keycloak and sets up the client used by the Pulsar Admin Console as well as a sample client and some sample groups. The configuration for the broker side auth plugin should be placed in the `.Values..configData` maps. +See the example https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values-keycloak-auth.yaml[Keycloak Helm chart] for deploying a working cluster that integrates with Keycloak. By default, the Helm chart creates a {pulsar-short} realm within Keycloak and sets up the client used by the {pulsar-short} Admin Console as well as a sample client and some sample groups. The configuration for the broker-side auth plugin should be placed in the `.Values..configData` maps. === Configuring Keycloak for Token Generation @@ -104,20 +104,20 @@ keycloak: adminPassword: "F3...ncK@" ---- -. Navigate to `localhost:8080` in a browser and view the Pulsar realm in the Keycloak UI. Note that the realm name must match the configured realm name (`.Values.keycloak.realm`) for the OpenID Connect plugin to work properly. +. Navigate to `localhost:8080` in a browser and view the {pulsar-short} realm in the Keycloak UI. Note that the realm name must match the configured realm name (`.Values.keycloak.realm`) for the OpenID Connect plugin to work properly.
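When verifying that Keycloak issues tokens with the expected claims, it can help to decode a JWT's payload segment locally. A minimal sketch (the claim values here are fabricated for illustration; a real token's middle segment comes from Keycloak):

```python
import base64
import json

# Fabricated claims for illustration only.
claims_in = {"sub": "superuser", "iss": "http://test-keycloak/auth/realms/pulsar"}
segment = base64.urlsafe_b64encode(json.dumps(claims_in).encode()).rstrip(b"=")

# Decoding any JWT payload segment: restore base64 padding, decode, parse JSON.
padded = segment + b"=" * (-len(segment) % 4)
claims = json.loads(base64.urlsafe_b64decode(padded))
print(claims["sub"])  # prints superuser
```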
-The OpenID Connect plugin uses the `sub` (subject) claim from the JWT as the role used for authorization within Pulsar. To get Keycloak to generate the JWT for a client with the right `sub`, create a special "mapper" that is a "Hardcoded claim" mapping claim name sub to a claim value that is the desired role, like `superuser`. The default config installed by https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values-keycloak-auth.yaml[this helm chart] provides examples of how to add custom mapper protocols to clients. +The OpenID Connect plugin uses the `sub` (subject) claim from the JWT as the role used for authorization within {pulsar-short}. To get Keycloak to generate the JWT for a client with the right `sub`, create a special "mapper" that is a "Hardcoded claim" mapping the claim name `sub` to a claim value that is the desired role, such as `superuser`. The default config installed by https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values-keycloak-auth.yaml[this Helm chart] provides examples of how to add custom mapper protocols to clients. -=== Retrieving and using a token from Keycloak with Pulsar Admin CLI +=== Retrieving and using a token from Keycloak with {pulsar-short} Admin CLI -. After creating your realm and client, retrieve a token with the Pulsar Admin CLI. To generate a token that will have an allowed issuer, you should exec into a bastion pod in the k8s cluster. Exec'ing into a bastion host will give you immediate access to a `pulsar-admin` cli tool that you can use to verify that you have access. +. After creating your realm and client, retrieve a token with the {pulsar-short} Admin CLI. To generate a token that will have an allowed issuer, you should exec into a bastion pod in the Kubernetes cluster. Exec'ing into a bastion host gives you immediate access to the `pulsar-admin` CLI tool, which you can use to verify that you have access.
+ [source,bash] ---- kubectl -n default exec $(kubectl get pods --namespace default -l "app=pulsar,component=bastion" -o jsonpath="{.items[0].metadata.name}") -it -- bash ---- -. Run the following from a bastion pod to generate an allowed issuer token. +. Run the following from a bastion pod to generate an allowed issuer token: + [source,bash] ---- @@ -125,17 +125,29 @@ pulsar@pulsar-bastion-85c9b777f6-gt9ct:/pulsar$ curl -d "client_id=test-client" -d "client_secret=19d9f4a2-65fb-4695-873c-d0c1d6bdadad" \ -d "grant_type=client_credentials" \ "http://test-keycloak/auth/realms/pulsar/protocol/openid-connect/token" -{"access_token":"ey...CE2ug","expires_in":600,"refresh_expires_in":0,"token_type":"Bearer","not-before-policy":0,"scope":"email profile"} ---- ++ +.Results +[%collapsible] +==== +[source,json] +---- +{ + "access_token":"ey...CE2ug", + "expires_in":600, + "refresh_expires_in":0, + "token_type":"Bearer", + "not-before-policy":0, + "scope":"email profile" +} +---- +==== -. Copy the `access_token` contents and use it here: +. Copy the `access_token` contents and use it in the `pulsar-admin` command's `--auth-params` option: + [source,bash] ---- -pulsar@pulsar-bastion-85c9b777f6-gt9ct:/pulsar$ bin/pulsar-admin --auth-params "token:e...CE2ug" \ - tenants list -"public" -"pulsar" +pulsar@pulsar-bastion-85c9b777f6-gt9ct:/pulsar$ bin/pulsar-admin --auth-params "token:e...CE2ug" tenants list ---- You're now using Keycloak tokens with `pulsar-admin` CLI. @@ -148,7 +160,7 @@ An alternative method for retrieving and using a Keycloak token from the bastion . 
Create a `creds.json` file and enter your retrieved credentials in this format: + -[source,bash] +[source,json] ---- { "client_id": "pulsar-admin-example-client", @@ -163,14 +175,12 @@ An alternative method for retrieving and using a Keycloak token from the bastion ---- pulsar@pulsar-broker-79b87f786d-tjvm7:/pulsar$ bin/pulsar-admin \ --auth-plugin "org.apache.pulsar.client.impl.auth.oauth2.AuthenticationOAuth2" ---auth-params '{"privateKey":"file:///pulsar/creds.json","issuerUrl":"http://test-keycloak:8081/auth/realms/pulsar","audience":"I dont matter"}' +--auth-params '{"privateKey":"file:///pulsar/creds.json","issuerUrl":"http://test-keycloak:8081/auth/realms/pulsar","audience":"not used"}' --tenants list -public -pulsar ---- You're now using Keycloak tokens with `pulsar-admin` CLI. == Next steps -To connect with the Pulsar Admin console and start sending and consuming messages, see xref:components:admin-console-tutorial.adoc[Admin Console]. \ No newline at end of file +To connect with the {pulsar-short} Admin Console and start sending and consuming messages, see xref:components:admin-console-tutorial.adoc[Admin Console]. \ No newline at end of file diff --git a/modules/operations/pages/functions.adoc b/modules/operations/pages/functions.adoc index e01040f4..9cd3b7e7 100644 --- a/modules/operations/pages/functions.adoc +++ b/modules/operations/pages/functions.adoc @@ -2,11 +2,11 @@ Functions are lightweight compute processes that enable you to process each message received on a topic or multiple topics. You can apply custom logic to that message, transforming or enriching it, and then output it to a different topic. -Functions run inside Luna Streaming and are therefore serverless. Write the code for your function in Java, Python, or Go, then upload the code to the Pulsar cluster and deploy the function. The function will be automatically run for each message published to the specified input topic. 
See https://pulsar.apache.org/docs/en/functions-overview/[Pulsar Functions overview] for more information about Apache Pulsar(R) functions. +Functions run inside Luna Streaming and are therefore serverless. Write the code for your function in Java, Python, or Go, then upload the code to the {pulsar-short} cluster and deploy the function. The function will be automatically run for each message published to the specified input topic. See https://pulsar.apache.org/docs/en/functions-overview/[{pulsar-short} Functions overview] for more information about {pulsar-reg} functions. -== Manage functions using Pulsar Admin CLI +== Manage functions using {pulsar-short} Admin CLI -Add functions using the Pulsar Admin CLI. Create a new Python function to consume a message from one topic, add an exclamation point, and publish the results to another topic. +Add functions using the {pulsar-short} Admin CLI. Create a new Python function to consume a message from one topic, add an exclamation point, and publish the results to another topic. . Create the following Python function in `function.py`: + @@ -22,7 +22,7 @@ class ExclamationFunction(Function): return input + '!' ---- -. Deploy `function.py` to your Pulsar cluster using the Pulsar Admin CLI: +. Deploy `function.py` to your {pulsar-short} cluster using the {pulsar-short} Admin CLI: + [source,bash] ---- @@ -44,7 +44,7 @@ If the function is set up and ready to accept messages, you should see "Created Triggering a function is a convenient way to test that the function is working. When you trigger a function, you are publishing a message on the function's input topic, which triggers the function to run. -To test a function with the Pulsar CLI, send a test value with Pulsar CLI's `trigger`. +To test a function with the {pulsar-short} CLI, send a test value with the {pulsar-short} CLI's `trigger` command. . 
Listen for messages on the output topic: + @@ -69,9 +69,9 @@ bin/pulsar-client consume persistent:///default/ \ The trigger sends the string `Hello world` to your exclamation function. Your function should output `Hello world!` to your consumed output. -== Add Functions using Pulsar Admin Console +== Add Functions using {pulsar-short} Admin Console -If the Pulsar Admin Console is deployed, you can also add and manage the Pulsar functions in the *Functions* tab of the Admin Console web UI. +If the {pulsar-short} Admin Console is deployed, you can also add and manage {pulsar-short} functions in the *Functions* tab of the Admin Console web UI. . Select *Choose File* to choose a local Function. In this example, we chose `exclamation_function.py`. @@ -98,7 +98,7 @@ Your input topics, output topics, log topics, and processing guarantees will aut . Provide a *Configuration Key* in the dropdown menu. + -For a list of configuration keys, see the https://pulsar.apache.org/functions-rest-api/#operation/registerFunction[Pulsar Functions API Docs]. +For a list of configuration keys, see the https://pulsar.apache.org/functions-rest-api/#operation/registerFunction[{pulsar-short} Functions API Docs]. . Select *Add* to add your function. @@ -140,7 +140,7 @@ A *Function-name Deleted Successfully!* flag will appear to let you know you've === Trigger your function -To trigger a function in the Pulsar Admin Console, select *Trigger* in the *Manage* dashboard. +To trigger a function in the {pulsar-short} Admin Console, select *Trigger* in the *Manage* dashboard. image::admin-console-trigger-function.png[Trigger Function] @@ -150,4 +150,4 @@ If the function has an output topic and the function returns data to the output == Next steps -For more information, see the https://pulsar.apache.org/docs/en/functions-develop/[Pulsar documentation on developing functions]. 
\ No newline at end of file +For more information, see the https://pulsar.apache.org/docs/en/functions-develop/[{pulsar-short} documentation on developing functions]. \ No newline at end of file diff --git a/modules/operations/pages/io-connectors.adoc b/modules/operations/pages/io-connectors.adoc index ba1f3884..958a96cc 100644 --- a/modules/operations/pages/io-connectors.adoc +++ b/modules/operations/pages/io-connectors.adoc @@ -4,7 +4,7 @@ When you have Luna Streaming xref:install-upgrade:quickstart-server-installs.ado [NOTE] ==== -There are three versions of the DataStax Luna Streaming distribution. +There are three versions of the {company} Luna Streaming distribution. The lunastreaming-all version contains all connectors. ==== @@ -14,63 +14,63 @@ The lunastreaming-all version contains all connectors. [#astradb-sink] === AstraDB sink -The AstraDB sink connector reads messages from Apache Pulsar topics and writes them to AstraDB systems. +The AstraDB sink connector reads messages from {pulsar} topics and writes them to AstraDB systems. xref:streaming-learning:pulsar-io:connectors/sinks/astra-db.adoc[AstraDB sink documentation] [#elasticsearch-sink] === ElasticSearch sink -The Elasticsearch sink connector reads messages from Apache Pulsar topics and writes them to Elasticsearch systems. +The Elasticsearch sink connector reads messages from {pulsar} topics and writes them to Elasticsearch systems. xref:streaming-learning:pulsar-io:connectors/sinks/elastic-search.adoc[Elasticsearch sink documentation] [#jdbc-clickhouse-sink] === JDBC-Clickhouse sink -The JDBC-ClickHouse sink connector reads messages from Apache Pulsar topics and writes them to JDBC-ClickHouse systems. +The JDBC-ClickHouse sink connector reads messages from {pulsar} topics and writes them to JDBC-ClickHouse systems. 
xref:streaming-learning:pulsar-io:connectors/sinks/jdbc-clickhouse.adoc[JDBC ClickHouse sink documentation] [#jdbc-mariadb-sink] === JDBC-MariaDB sink -The JDBC-MariaDB sink connector reads messages from Apache Pulsar topics and writes them to JDBC-MariaDB systems. +The JDBC-MariaDB sink connector reads messages from {pulsar} topics and writes them to JDBC-MariaDB systems. xref:streaming-learning:pulsar-io:connectors/sinks/jdbc-mariadb.adoc[JDBC MariaDB sink documentation] [#jdbc-postgres-sink] === JDBC-PostgreSQL sink -The JDBC-PostgreSQL sink connector reads messages from Apache Pulsar topics and writes them to JDBC-PostgreSQL systems. +The JDBC-PostgreSQL sink connector reads messages from {pulsar} topics and writes them to JDBC-PostgreSQL systems. xref:streaming-learning:pulsar-io:connectors/sinks/jdbc-postgres.adoc[JDBC PostgreSQL sink documentation] [#jdbc-sqlite-sink] === JDBC-SQLite -The JDBC-SQLite sink connector reads messages from Apache Pulsar topics and writes them to JDBC-SQLite systems. +The JDBC-SQLite sink connector reads messages from {pulsar} topics and writes them to JDBC-SQLite systems. xref:streaming-learning:pulsar-io:connectors/sinks/jdbc-sqllite.adoc[JDBC SQLite sink documentation] [#kafka-sink] === Kafka -The Kafka sink connector reads messages from Apache Pulsar topics and writes them to Kafka systems. +The Kafka sink connector reads messages from {pulsar} topics and writes them to Kafka systems. xref:streaming-learning:pulsar-io:connectors/sinks/kafka.adoc[Kafka sink documentation] [#kinesis-sink] === Kinesis -The Kinesis sink connector reads messages from Apache Pulsar topics and writes them to Kinesis systems. +The Kinesis sink connector reads messages from {pulsar} topics and writes them to Kinesis systems. 
xref:streaming-learning:pulsar-io:connectors/sinks/kinesis.adoc[Kinesis sink documentation] [#snowflake-sink] === Snowflake -The Snowflake sink connector reads messages from Apache Pulsar topics and writes them to Snowflake systems. +The Snowflake sink connector reads messages from {pulsar} topics and writes them to Snowflake systems. xref:streaming-learning:pulsar-io:connectors/sinks/snowflake.adoc[Snowflake sink documentation] @@ -80,62 +80,62 @@ xref:streaming-learning:pulsar-io:connectors/sinks/snowflake.adoc[Snowflake sink [#datagenerator-source] === Data Generator source -The Data generator source connector produces messages for testing and persists the messages to Pulsar topics. +The Data generator source connector produces messages for testing and persists the messages to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/data-generator.adoc[Data Generator source documentation] [#debezium-mongodb-source] === Debezium MongoDB source -The Debezium MongoDB source connector reads data from Debezium MongoDB systems and produces data to Pulsar topics. +The Debezium MongoDB source connector reads data from Debezium MongoDB systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/debezium-mongodb.adoc[Debezium MongoDB source documentation] [#debezium-mysql-source] === Debezium MySQL source -The Debezium MySQL source connector reads data from Debezium MySQL systems and produces data to Pulsar topics. +The Debezium MySQL source connector reads data from Debezium MySQL systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/debezium-mysql.adoc[Debezium MySQL source documentation] [#debezium-oracle-source] === Debezium Oracle source -The Debezium Oracle source connector reads data from Debezium Oracle systems and produces data to Pulsar topics. 
+The Debezium Oracle source connector reads data from Debezium Oracle systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/debezium-oracle.adoc[Debezium Oracle source documentation] [#debezium-postgres-source] === Debezium Postgres source -The Debezium PostgreSQL source connector reads data from Debezium PostgreSQL systems and produces data to Pulsar topics. +The Debezium PostgreSQL source connector reads data from Debezium PostgreSQL systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/debezium-postgres.adoc[Debezium PostgreSQL source documentation] [#debezium-sql-server-source] === Debezium SQL Server source -The Debezium SQL Server source connector reads data from Debezium SQL Server systems and produces data to Pulsar topics. +The Debezium SQL Server source connector reads data from Debezium SQL Server systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/debezium-sqlserver.adoc[Debezium SQL Server source documentation] [#kafka-source] === Kafka source -The Kafka source connector reads data from Kafka systems and produces data to Pulsar topics. +The Kafka source connector reads data from Kafka systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/kafka.adoc[Kafka source connector documentation] [#kinesis-source] === AWS Kinesis source -The AWS Kinesis source connector reads data from Kinesis systems and produces data to Pulsar topics. +The AWS Kinesis source connector reads data from Kinesis systems and produces data to {pulsar-short} topics. xref:streaming-learning:pulsar-io:connectors/sources/kinesis.adoc[Kinesis source connector documentation] == Experimental Connectors -DataStax is always experimenting with connectors. +{company} is always experimenting with connectors. 
Below are the connectors available in the luna-streaming-all version of *Luna Streaming*. We call these *experimental connectors* because they have not yet been promoted to official support in our *Astra Streaming* clusters, but they will work in your Luna Streaming cluster. @@ -222,4 +222,4 @@ We call these *experimental connectors* because they have not yet been promoted == Next steps -For more on connectors and Pulsar, see xref:streaming-learning:pulsar-io:connectors/index.adoc[] and the https://pulsar.apache.org/docs/2.11.x/io-overview/[Pulsar documentation]. \ No newline at end of file +For more on connectors and {pulsar-short}, see xref:streaming-learning:pulsar-io:connectors/index.adoc[] and the https://pulsar.apache.org/docs/2.11.x/io-overview/[{pulsar-short} documentation]. \ No newline at end of file diff --git a/modules/operations/pages/scale-cluster.adoc b/modules/operations/pages/scale-cluster.adoc index 971dbc31..968f0b5b 100644 --- a/modules/operations/pages/scale-cluster.adoc +++ b/modules/operations/pages/scale-cluster.adoc @@ -4,9 +4,9 @@ You can scale Luna Streaming clusters up for more compute capacity and down for include::operations:partial$operator-scaling.adoc[] -== Install Pulsar cluster +== Install {pulsar-short} cluster -For our Pulsar cluster installation, use this https://github.com/datastax/pulsar-helm-chart[Helm chart]. +For our {pulsar-short} cluster installation, use this https://github.com/datastax/pulsar-helm-chart[Helm chart]. To start the cluster, use the values provided in this https://github.com/datastax/pulsar-helm-chart/blob/master/examples/dev-values.yaml[YAML file]: @@ -34,7 +34,7 @@ diff ~/dev-values.yaml ~/dev-values_large.yaml > defaultWriteQuorum: 3 ---- -. Create the cluster by installing Pulsar with `dev-values_large.yaml`: +. 
Create the cluster by installing {pulsar-short} with `dev-values_large.yaml`: + [source,bash] ---- @@ -96,7 +96,7 @@ kubectl delete pvc pulsar-bookkeeper-ledgers-pulsar-bookkeeper-3 replicaCount: 5 ---- -. Upgrade the Helm chart to use the new value in the Pulsar cluster: +. Upgrade the Helm chart to use the new value in the {pulsar-short} cluster: + [source,bash] ---- diff --git a/modules/operations/pages/troubleshooting.adoc b/modules/operations/pages/troubleshooting.adoc index bb145011..9f6d2cfa 100644 --- a/modules/operations/pages/troubleshooting.adoc +++ b/modules/operations/pages/troubleshooting.adoc @@ -41,7 +41,7 @@ image::gcp-quota-example2.png[GCP Backend Quota] If your pods are stuck in a *Pending* state after installation or your cloud provider is warning you about *Unschedulable Pods*, there are a few ways to work through this: -* If some of your pods start, but others like `pulsar-adminconsole` and `pulsar-grafana` are left in an *Unschedulable* state, you might need to add CPUs to your existing nodes or an additional node pool. Luna Streaming requires more resources than Apache Pulsar. +* If some of your pods start, but others like `pulsar-adminconsole` and `pulsar-grafana` are left in an *Unschedulable* state, you might need to add CPUs to your existing nodes or add another node pool. Luna Streaming requires more resources than {pulsar}. * To examine a specific pod, use `kubectl describe`. For example, if your `pulsar-bookkeeper-0` pod is not scheduling, use `kubectl describe pods/pulsar-bookkeeper-0` to view detailed output on the pod's state, dependencies, and events. 
@@ -88,7 +88,7 @@ To create a public namespace, run `pulsar-admin namespaces create public/default === Publish a message -To test your Pulsar cluster with the bastion pod, produce a message with `pulsar-client` through the bastion pod shell: +To test your {pulsar-short} cluster with the bastion pod, produce a message with `pulsar-client` through the bastion pod shell: [source,bash] ---- @@ -102,7 +102,7 @@ You should receive a confirmation that the message was produced: 00:16:37.970 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1 messages successfully produced ---- -This means your Pulsar cluster is functional. +This means your {pulsar-short} cluster is functional. If the message isn't produced, double-check your message syntax. == Next steps diff --git a/modules/operations/partials/manually-create-credentials.adoc b/modules/operations/partials/manually-create-credentials.adoc index 15b118b3..0bc0b7ef 100644 --- a/modules/operations/partials/manually-create-credentials.adoc +++ b/modules/operations/partials/manually-create-credentials.adoc @@ -1,6 +1,6 @@ A number of values need to be stored in secrets prior to enabling token-based authentication. -. Generate a key-pair for signing the tokens using the Pulsar tokens command: +. Generate a key-pair for signing the tokens using the {pulsar-short} tokens command: + [source,bash] ---- diff --git a/modules/operations/partials/operator-scaling.adoc b/modules/operations/partials/operator-scaling.adoc index 86b317c9..0c4105ea 100644 --- a/modules/operations/partials/operator-scaling.adoc +++ b/modules/operations/partials/operator-scaling.adoc @@ -1,4 +1,4 @@ [TIP] ==== -The xref:kaap-operator::index.adoc[Kubernetes Autoscaling for Apache Pulsar (KAAP)] operator takes care of scaling Pulsar cluster components, deploying new clusters, and even migrating your existing cluster to an operator-managed deployment. 
+The xref:kaap-operator::index.adoc[Kubernetes Autoscaling for {pulsar} (KAAP)] operator takes care of scaling {pulsar-short} cluster components, deploying new clusters, and even migrating your existing cluster to an operator-managed deployment. ==== \ No newline at end of file