How to create/remove/enlarge/reduce a JStorm logical cluster on YARN?

Function points

Cluster management is currently supported through a script that invokes the thrift interface. The entry script is bin/JstormYarn.

Create a cluster

  • Create a cluster and limit its total resources (CPU cores and memory)
  • Each cluster can run its own versions of JStorm and the JDK
  • Containers from multiple logical clusters can run on one machine
  • Start: execute the submitJstormOnYarn command to submit a cluster, then start the nimbus and supervisors with the startNimbus and addSupervisors commands

Destroy a cluster

  • Execute killJstormOnYarn to kill the cluster process

Upgrading a cluster binary

  • Update the JStorm deployment files under the deploy directory
  • Execute the upgradeCluster command

Upgrading a cluster configuration

  • Update storm.yaml under deploy/jstorm/conf
  • Execute the upgradeCluster command

Download binary and configuration file of a cluster

  • The configuration item jstorm.yarn.instance.deploy.dir is the storage path of the cluster's current binary and configuration files on Hadoop; the files can be fetched from that path, as in the sketch below
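For illustration only, here is a minimal Java sketch that pulls that directory down with the standard Hadoop FileSystem API. The HDFS path and local directory are placeholders; the real source path is whatever jstorm.yarn.instance.deploy.dir is set to in your configuration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch: download a cluster's deployed binaries/configuration from HDFS.
    public class DownloadDeployDir {
        public static void main(String[] args) throws Exception {
            // Picks up core-site.xml / hdfs-site.xml from the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical paths: use the value of jstorm.yarn.instance.deploy.dir as the source.
            Path deployDir = new Path("/jstorm-yarn/instances/my-cluster/deploy");
            Path localDir  = new Path("/tmp/jstorm-deploy");

            // Copy the whole deploy directory (binaries and conf) to the local machine.
            fs.copyToLocalFile(deployDir, localDir);
            fs.close();
        }
    }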

Restart Cluster

  • Execute the upgradeCluster command

View cluster status

  • Execute the info command

Expand and reduce the capacity of cluster

  • Execute the addSupervisors and removeSupervisors commands

Expand and reduce the capacity of cluster with rack assignment

  • Execute the addSpecSupervisors and removeSpecSupervisors commands

JStorm-on-Yarn Process

JStorm-on-YARN uses a "wholesale" mode, unlike Spark-on-YARN, which uses a "resale" mode where a new AM is generated each time an application is submitted to the cluster. The AM of JStorm-on-YARN is permanent; that is, a JStorm cluster has exactly one AM.

Wholesale mode raises resource utilization by about 50%. For example, in our production environment we found that resale mode (Spark-on-YARN) could only use 60-70% of the physical resources, while wholesale mode reaches nearly 90%; that is a wide gap. Wholesale mode also copes better with shifting resource demands in many cases. For example, suppose application 1 uses more memory in period A and more CPU cores in period B, while application 2 uses more network IO in period A and more memory in period B. In resale mode the total resources required equal the per-resource peak of application 1 plus the per-resource peak of application 2, whereas wholesale mode can share resources between applications over their whole lifecycle and so reduce waste. In addition, containers in resale mode often crash: with the default GC strategy most applications never trigger an old-generation GC, so when heap memory runs out the NM kills the container and the AM starts another container somewhere else, which costs scheduling overhead and state management. Wholesale mode avoids this situation entirely.
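A small worked example of the peak-reservation argument, with invented demand numbers, comparing how much memory must be reserved under the two modes:

    // Illustrative only: the demand numbers are made up to show the reservation math.
    public class ReservationExample {
        public static void main(String[] args) {
            // Memory demand (GB) of two applications in periods A and B.
            int[] app1 = {40, 10};   // app1: heavy in period A
            int[] app2 = {10, 40};   // app2: heavy in period B

            // Resale mode: each application reserves its own peak, and the peaks add up.
            int resale = max(app1) + max(app2);                               // 40 + 40 = 80 GB

            // Wholesale mode: one shared pool only needs the peak of the combined demand.
            int wholesale = Math.max(app1[0] + app2[0], app1[1] + app2[1]);   // max(50, 50) = 50 GB

            System.out.println("resale reservation:    " + resale + " GB");
            System.out.println("wholesale reservation: " + wholesale + " GB");
        }

        private static int max(int[] xs) {
            int m = xs[0];
            for (int x : xs) m = Math.max(m, x);
            return m;
        }
    }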

So currently JStorm-on-YARN runs with a single AM in overall control. It is responsible for the following tasks:

  • Creating the nimbus (and automatically restarting it when it hangs)
  • Receiving requests and creating supervisors
  • Expanding and reducing cluster capacity dynamically

Submitting, killing, and other topology operations interact directly with the nimbus and supervisors in the containers, just as with a standalone JStorm cluster; they do not go through the AM.

Create AM

Starting from start-JstormYarn.sh, the JStormOnYarn class is called to create the AM; it is essentially JStorm's YARN client. This class is quite simple: it mainly sets some parameters of the nimbus container, such as memory, CPU cores, jar, libjars, and the shell script (the actual value is start_jstorm.sh, used to start the nimbus and supervisors), then uploads the jar and shell script to HDFS, and finally calls yarnClient.submitApplication(appContext) to create the AM.
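For orientation, a condensed sketch of what such a YARN client looks like with the standard YARN client API. The application name, resource sizes, and launch command below are placeholders rather than the exact values JStormOnYarn uses, and the HDFS upload / LocalResource wiring is only hinted at in comments.

    import java.util.Collections;

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    // Sketch of a minimal YARN client that submits an AM, in the spirit of JStormOnYarn.
    public class SubmitAmSketch {
        public static void main(String[] args) throws Exception {
            YarnConfiguration conf = new YarnConfiguration();
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(conf);
            yarnClient.start();

            YarnClientApplication app = yarnClient.createApplication();
            ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
            appContext.setApplicationName("jstorm-on-yarn");            // placeholder name

            // Resources for the AM container (memory in MB, vcores) -- placeholder sizes.
            appContext.setResource(Resource.newInstance(4096, 2));

            // In the real client the jar and start_jstorm.sh are uploaded to HDFS first and
            // registered as LocalResources; only the launch command is sketched here.
            ContainerLaunchContext amContainer = ContainerLaunchContext.newInstance(
                    Collections.emptyMap(),                              // local resources (jar, script)
                    Collections.emptyMap(),                              // environment
                    Collections.singletonList("$JAVA_HOME/bin/java ... JStormMaster"), // placeholder command
                    null, null, null);
            appContext.setAMContainerSpec(amContainer);

            yarnClient.submitApplication(appContext);                    // creates the AM
        }
    }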

Creating nimbus and supervisor

This step is implemented through the thrift interface: on the client side, JstormAM.py (automatically generated by thrift); on the server side, the automatically generated RPC interface is JstormAM, while the real processing logic lives in the JstormAMHandler class (the same pattern as thrift usage in JStorm itself). When the AM is created, the thrift server is initialized with this handler and starts listening for client requests.
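The same interface can also be driven from Java instead of the generated JstormAM.py. The sketch below is hypothetical: the generated client class JstormAM.Client, the transport/protocol choice, the port, and the startNimbus/addSupervisors signatures are all assumptions; check the .thrift definition and the generated sources for the real API.

    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    // Hypothetical Java client for the AM's thrift server.
    public class AmThriftClientSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder host/port; the transport and protocol may also differ in practice.
            TTransport transport = new TFramedTransport(new TSocket("am-host", 9090));
            transport.open();
            try {
                JstormAM.Client client = new JstormAM.Client(new TBinaryProtocol(transport));
                client.startNimbus();        // assumed signature
                client.addSupervisors(2);    // assumed signature: add two supervisor containers
            } finally {
                transport.close();
            }
        }
    }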

Look at two important methods: addSupervisors and startNimbus. These two methods have similar implementations: in essence, they specify the number of containers and the CPU core count. Launching the supervisor or nimbus itself is done in JStormMaster, which decides whether to launch a supervisor or the nimbus based on the container's priority property.
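On the AM side, the usual way to express this with the YARN API is to attach a different Priority to nimbus and supervisor container requests and then branch on that priority when containers are allocated. A minimal sketch under that assumption (the priority values, resource sizes, and rack handling are placeholders, not JStormMaster's actual values):

    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

    // Sketch: distinguish nimbus and supervisor containers by request priority.
    public class ContainerPrioritySketch {
        static final Priority NIMBUS_PRIORITY = Priority.newInstance(0);       // placeholder
        static final Priority SUPERVISOR_PRIORITY = Priority.newInstance(1);   // placeholder

        static void requestNimbus(AMRMClientAsync<ContainerRequest> amRmClient) {
            Resource res = Resource.newInstance(4096, 2);                       // placeholder sizes
            amRmClient.addContainerRequest(new ContainerRequest(res, null, null, NIMBUS_PRIORITY));
        }

        static void requestSupervisors(AMRMClientAsync<ContainerRequest> amRmClient,
                                       int count, String[] racks) {
            // One large memory chunk per supervisor container (it hosts all of its workers).
            Resource res = Resource.newInstance(20 * 1024, 4);
            for (int i = 0; i < count; i++) {
                // racks may be null (addSupervisors) or set for addSpecSupervisors-style placement.
                amRmClient.addContainerRequest(new ContainerRequest(res, null, racks, SUPERVISOR_PRIORITY));
            }
        }

        // Inside the AM's allocation callback, the priority tells us what to launch.
        static void onContainerAllocated(Container container) {
            if (container.getPriority().getPriority() == NIMBUS_PRIORITY.getPriority()) {
                // launch the nimbus in this container (e.g. via start_jstorm.sh)
            } else {
                // launch a supervisor in this container
            }
        }
    }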

Note that when JStorm-on-YARN creates a supervisor, it does not only request the memory for the supervisor itself; it requests a large chunk of memory (such as 20 GB or 40 GB), and the container holds all the workers under that supervisor. When the container hangs for some reason, all of its workers are killed, which is more stringent than in a standalone cluster.

Maintenance and configuration instructions

Jstorm Configuration

To facilitate maintenance, please add the following configuration to every storm.yaml in the YARN cluster:

 jstorm.on.yarn: true
 supervisor.childopts:  -Djstorm-on-yarn=true
 jstorm.log.dir: /home/yarn/jstorm_logs/<cluster_name>

If you need to enable nimbus auto restart, please add the following configuration:

 blobstore.dir: /yourhdfsdir
 blobstore.hdfs.hostname: yourhdfshost
 blobstore.hdfs.port: yourhdfsport

The above configuration overrides the default log path; all logs are redirected under /home/yarn, which prevents them from being lost after a container hangs. If more than one supervisor container is launched on the same machine, the value of supervisor.deamon.logview.port (a port YARN dynamically generates for each container from a port range) is appended to the supervisor log file name to keep the logs apart and prevent them from affecting each other. For example, if two supervisors' http ports are 9000 and 9001 respectively, then supervisor-9000.log and supervisor-9001.log are generated. Koala adds the specified port suffix to the log name according to the jstorm.on.yarn configuration.

Yarn Configuration

TO BE ADDED

Yarn introductory series articles

See: http://zh.hortonworks.com/get-started/yarn/