Flink Native integration

Since version 1.13, Flink ships with a native Kubernetes integration. The Cloudflow Contrib integration uses this native functionality to run Flink Native streamlets as part of a Cloudflow application.

To build Flink Native streamlets, add the following sbt plugin alongside the Cloudflow one in plugins.sbt:

addSbtPlugin("com.lightbend.cloudflow" % "contrib-sbt-flink" % "0.1.1")

Then enable the Flink Native sbt plugin and apply its settings in your streamlet sbt sub-project:

  .enablePlugins(CloudflowApplicationPlugin, CloudflowFlinkPlugin, CloudflowNativeFlinkPlugin)
  .settings(
    baseDockerInstructions := flinkNativeCloudflowDockerInstructions.value,
    libraryDependencies ~= fixFlinkNativeCloudflowDeps
  )

Now you can develop and use your Flink streamlets as described in the official Cloudflow documentation.

Once you have run buildApp and have the compiled blueprint of your application, deploy it with the kubectl cloudflow plugin, adding the option that skips the checks on the Flink operator:

kubectl cloudflow deploy your-application-cr.json --unmanaged-runtimes=flink

Running the kubectl cloudflow status your-application command now shows the Flink streamlets marked as <external>, with status Unknown:

+------------+--------------------------------+
| Name:      |         taxi-ride-fare         |
| Namespace: |         taxi-ride-fare         |
| Version:   | 0.0.3-8-37d51e0b-20210427-1057 |
| Created:   |      2021-04-27T09:45:17Z      |
| Status:    |            Running             |
+------------+--------------------------------+
+-----------+-------------------------------------------+-------+---------+----------+
| STREAMLET | POD                                       | READY | STATUS  | RESTARTS |
+-----------+-------------------------------------------+-------+---------+----------+
| generator | taxi-ride-fare-generator-6fdb9bf946-ddh2n |  1/1  | Running |    0     |
| logger    | taxi-ride-fare-logger-7c7dcc8699-r7jxb    |  1/1  | Running |    0     |
| processor | <external>                                |  0/0  | Unknown |    0     |
+-----------+-------------------------------------------+-------+---------+----------+

To help you manage Flink streamlets, we have developed some example scripts. They are located in the example-scripts folder at the root of the Cloudflow Contrib public repository.

The scripts expect the following tools to be present on the PATH:

  • bash

  • jq (a version newer than 1.5 is required, because of this issue)

  • kubectl

  • Flink CLI (version 1.13.0)
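
Before running the example scripts, it can help to verify these prerequisites up front. The following is a minimal sketch, not part of the provided scripts: the function names missing_tools and jq_version_ok are hypothetical, and the version check assumes that only the major and minor components of the jq version matter.

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check. Both function names are hypothetical
# and are not part of the Cloudflow Contrib example scripts.

# Print every tool from the argument list that is not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed when a `jq --version` string such as "jq-1.6" is newer than 1.5.
# Assumption: only the major and minor components are compared.
jq_version_ok() {
  local version="${1#jq-}"        # drop the leading "jq-" prefix
  local major="${version%%.*}"    # text before the first dot
  local rest="${version#*.}"      # text after the first dot
  local minor="${rest%%[!0-9]*}"  # leading digits of the remainder
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -gt 5 ]; }
}
```

For example, missing_tools bash jq kubectl flink prints nothing when all four tools are installed, and jq_version_ok "$(jq --version)" fails on jq 1.5.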

The first script in the flink sub-folder is setup-example-rbac.sh. Run it once on every cluster where you want to deploy Flink streamlets; refer to the upstream documentation for further details. The script sets up a service account and a cluster role binding. Take note of the service account name that it prints to stdout, because you will need it to run the subsequent scripts.
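
To give an idea of what such an RBAC setup involves, here is a hedged sketch modeled on the commands shown in the upstream Flink documentation. It is not the content of setup-example-rbac.sh: the function name rbac_commands and the binding name it derives are hypothetical, and the edit cluster role is taken from the Flink example. The function only prints the commands so that you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: prints the kubectl commands that create a service account
# and bind it to the "edit" cluster role, as in the upstream Flink example.
# rbac_commands and the binding name are hypothetical.
rbac_commands() {
  local namespace="$1" account="$2"
  echo "kubectl create serviceaccount ${account} --namespace ${namespace}"
  echo "kubectl create clusterrolebinding ${account}-${namespace}" \
       "--clusterrole=edit --serviceaccount=${namespace}:${account}"
}
```

Calling rbac_commands taxi-ride-fare flink-service-account prints two lines that can be inspected and then executed by hand.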

There are three further folders, one for each use case. Note that the order of the operations matters:

  • deploy a new Cloudflow application to a cluster:

    • deploy the Cloudflow application using the kubectl cloudflow command

    • cd into the deploy folder and run ./deploy-application.sh application-name service-account-name

  • undeploy a deployed Cloudflow application:

    • cd into the undeploy folder and run ./undeploy-application.sh application-name

    • undeploy the Cloudflow application using the kubectl cloudflow command

  • redeploy a pre-existing Cloudflow application:

    • deploy/configure the Cloudflow application using the kubectl cloudflow command

    • cd into the redeploy folder and run ./redeploy-application.sh application-name service-account-name

Note that the undeploy and redeploy operations trigger a savepoint of the Flink job.
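
Because the ordering differs between the use cases, a small dry-run helper can make it explicit. The sketch below only prints the commands in the order listed above and executes nothing. The function name plan is hypothetical, the working directory is assumed to be the flink sub-folder, and reusing the deploy command with --unmanaged-runtimes=flink for the redeploy case is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: print, in order, the commands for one use case. Nothing is run.
# Assumptions: invoked from the flink sub-folder of example-scripts;
# "plan" is a hypothetical helper, not one of the provided scripts.
plan() {
  local use_case="$1" app="$2" service_account="${3:-}" cr_file="${4:-}"
  case "$use_case" in
    deploy)
      # Cloudflow first, then the Flink streamlets.
      echo "kubectl cloudflow deploy ${cr_file} --unmanaged-runtimes=flink"
      echo "(cd deploy && ./deploy-application.sh ${app} ${service_account})"
      ;;
    undeploy)
      # Flink streamlets first, then Cloudflow.
      echo "(cd undeploy && ./undeploy-application.sh ${app})"
      echo "kubectl cloudflow undeploy ${app}"
      ;;
    redeploy)
      # Cloudflow first, then the Flink streamlets.
      echo "kubectl cloudflow deploy ${cr_file} --unmanaged-runtimes=flink"
      echo "(cd redeploy && ./redeploy-application.sh ${app} ${service_account})"
      ;;
    *)
      echo "unknown use case: ${use_case}" >&2
      return 1
      ;;
  esac
}
```

For example, plan undeploy taxi-ride-fare lists the undeploy script before the kubectl cloudflow undeploy command, mirroring the order required above.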

The provided scripts are deliberately simple and are intended as a starting point for you to customize these operations to your needs. They always follow the same three steps:

  • fetch the Cloudflow application information from the cluster

  • generate commands

  • execute the generated commands
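
The three steps above can be sketched as a small pipeline. This is an illustration of the structure rather than a copy of the provided scripts. All three function names are hypothetical. The resource name cloudflowapplication, the JSON fields .spec.deployments[].runtime and .streamlet_name, the use of the application name as the namespace, and the cluster-id naming scheme are assumptions to verify against your cluster. The generated command is a harmless flink list, and execution defaults to a dry run.

```shell
#!/usr/bin/env bash
# Step 1: fetch. Assumption: the application custom resource lists its
# streamlets under .spec.deployments with "runtime" and "streamlet_name".
fetch_flink_streamlets() {
  local app="$1"
  kubectl get cloudflowapplication "$app" --namespace "$app" -o json |
    jq -r '.spec.deployments[] | select(.runtime == "flink") | .streamlet_name'
}

# Step 2: generate. Reads streamlet names on stdin, prints one command each.
# Assumption: the Flink cluster id is "<application>-<streamlet>".
generate_commands() {
  local app="$1" streamlet
  while IFS= read -r streamlet; do
    [ -n "$streamlet" ] || continue
    echo "flink list --target kubernetes-application" \
         "-Dkubernetes.cluster-id=${app}-${streamlet}" \
         "-Dkubernetes.namespace=${app}"
  done
}

# Step 3: execute. Prints the commands unless DRY_RUN=0 is set.
execute_commands() {
  local cmd
  while IFS= read -r cmd; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: ${cmd}"
    else
      # Detach stdin so the command cannot consume the remaining lines.
      sh -c "$cmd" </dev/null
    fi
  done
}
```

A full run would chain the steps, for example fetch_flink_streamlets taxi-ride-fare | generate_commands taxi-ride-fare | execute_commands. Keeping the generate step separate from the execute step makes it easy to inspect the commands before they touch the cluster.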