Sysdig Install: Mesos/Marathon/DCOS (CLI method)

Sysdig Monitor is the first and only monitoring, alerting, and troubleshooting solution designed from the ground up to provide unprecedented visibility into containerized infrastructures.

Sysdig Monitor comes with built-in, first class support for Mesos, Marathon, and DC/OS. In order to instrument your Mesos environment with Sysdig Monitor, you simply need to install the Sysdig agent container on each underlying host in your Mesos cluster. Sysdig Monitor will automatically begin monitoring all of your hosts, apps, containers, and frameworks, and will also automatically connect to the Mesos and Marathon APIs to pull relevant metadata about your environment.

Recommended Setup

The recommended install method is to (1) deploy the Sysdig agent on all Mesos Agent (aka "Slave") nodes automatically using a Marathon app, and then (2) manually install the Sysdig agent on the Mesos Master nodes.

Step 1: Deploy the Sysdig agent on your Mesos Agent nodes

If you're using DC/OS, then you can find Sysdig in the Mesosphere Universe marketplace of apps. Installing from the Universe will automatically deploy the Sysdig agent container on your Mesos Agent nodes as a Marathon app.

Alternatively, you can use the following example to deploy the agent yourself:

$ cat <<- EOF > "sysdig.json"
{
  "id": "sysdig-agent",
  "cpus": 1.0,
  "constraints": [["hostname", "UNIQUE"]],
  "mem": 850.0,
  "instances": YOUR-NUMBER-OF-INSTANCES-HERE,
  "labels": { "role": "monitoring", "name": "sdc-agent" },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "sysdig/agent",
      "forcePullImage": true,
      "network": "HOST",
      "privileged": true,
      "parameters": [
        { "key": "pid", "value": "host" },
        { "key": "env", "value": "ACCESS_KEY=YOUR-ACCESS-KEY-HERE" },
        { "key": "env", "value": "COLLECTOR=COLLECTOR-ADDRESS-HERE" },
        { "key": "shm-size", "value": "350m" }
      ]
    },
    "volumes": [
      { "containerPath": "/host/var/run/docker.sock", "hostPath": "/var/run/docker.sock", "mode": "RW" },
      { "containerPath": "/host/dev", "hostPath": "/dev", "mode": "RW" },
      { "containerPath": "/host/proc", "hostPath": "/proc", "mode": "RO" },
      { "containerPath": "/host/boot", "hostPath": "/boot", "mode": "RO" },
      { "containerPath": "/host/lib/modules", "hostPath": "/lib/modules", "mode": "RO" },
      { "containerPath": "/host/usr", "hostPath": "/usr", "mode": "RO" }
    ]
  }
}
EOF

POST the above JSON to the API server of the leading Marathon instance. Before doing so, replace the placeholders for the number of instances (YOUR-NUMBER-OF-INSTANCES-HERE), the access key (YOUR-ACCESS-KEY-HERE), and the collector address (COLLECTOR-ADDRESS-HERE), and adjust the “cpus”, “mem”, and “labels” entries to fit the capacity and requirements of your cluster environment. Note that the collector address is only required for Sysdig Enterprise on-prem deployments.

$ curl -X POST http://$(hostname -i):8080/v2/apps -d @sysdig.json -H "Content-type: application/json"
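
The placeholder substitution and the POST can also be scripted. The following is a minimal sketch, not an official Sysdig tool: the function names render_app_definition and post_app_definition are made up for illustration, and it assumes the values contain no "|" or "&" characters, which would confuse sed. When no collector address is given, the sketch drops the COLLECTOR line entirely, on the assumption that the line is not needed outside on-prem deployments.

```shell
# render_app_definition TEMPLATE OUTPUT ACCESS_KEY INSTANCES [COLLECTOR]
# Copies TEMPLATE to OUTPUT with the documented placeholders filled in.
# (Hypothetical helper; the name is not part of any Sysdig tooling.)
render_app_definition() {
  template=$1; output=$2; access_key=$3; instances=$4; collector=${5:-}

  if [ -n "$collector" ]; then
    # On-prem: substitute the collector address.
    collector_expr="s|COLLECTOR-ADDRESS-HERE|${collector}|g"
  else
    # No collector given: delete the line that carries the placeholder.
    collector_expr="/COLLECTOR-ADDRESS-HERE/d"
  fi

  sed -e "s|YOUR-ACCESS-KEY-HERE|${access_key}|g" \
      -e "s|YOUR-NUMBER-OF-INSTANCES-HERE|${instances}|g" \
      -e "$collector_expr" \
      "$template" > "$output"
}

# post_app_definition FILE MARATHON_URL
# Sends the rendered definition to Marathon, mirroring the curl call above.
post_app_definition() {
  curl -X POST "$2/v2/apps" -d @"$1" -H "Content-type: application/json"
}
```

For example, render_app_definition sysdig.json sysdig.rendered.json "$ACCESS_KEY" 5 writes a filled-in copy, which post_app_definition sysdig.rendered.json "http://$(hostname -i):8080" would then submit.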

Step 2: Deploy the Sysdig agent on your Mesos Master nodes

Follow the standard Sysdig container install instructions to manually install the Sysdig agent on your Mesos Master nodes. 

The Sysdig agent will automatically look for the process named “mesos-master” on the Master node. If the process is found at any time, the Sysdig agent will automatically connect to the local Mesos and Marathon (if available) API servers via http://localhost:5050 and http://localhost:8080 respectively, to collect cluster configuration and current state metadata in addition to host metrics.
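
To check by hand what the agent is described as looking for, something like the following could be run on a Master node. This is a rough sketch only: the function names are made up for illustration, and the /health and /ping paths are assumptions about the Mesos and Marathon HTTP APIs, not endpoints the agent is documented to call.

```shell
# Default local endpoints named in the text above.
MESOS_URL="http://localhost:5050"
MARATHON_URL="http://localhost:8080"

# True if a process named exactly "mesos-master" is running.
master_running() {
  pgrep -x mesos-master >/dev/null 2>&1
}

# api_responds URL [TIMEOUT_SECONDS]
# True if the URL answers with a successful HTTP status.
api_responds() {
  curl -fsS --max-time "${2:-5}" -o /dev/null "$1" 2>/dev/null
}

# Reports whether this node looks like a Master the agent could use.
check_master_node() {
  if ! master_running; then
    echo "mesos-master process not found"
    return 1
  fi
  if api_responds "$MESOS_URL/health"; then      # assumed health path
    echo "Mesos API: ok"
  else
    echo "Mesos API: no response"
  fi
  if api_responds "$MARATHON_URL/ping"; then     # assumed ping path
    echo "Marathon API: ok"
  else
    echo "Marathon API: no response (Marathon may not be installed)"
  fi
}
```

Calling check_master_node on a node without a running mesos-master process prints a single message and returns a non-zero status.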

Note: take special care not to schedule Sysdig agents running as Marathon apps on Mesos Master nodes. If a cluster node has both the Mesos Master and Mesos Agent roles, do not also install the Sysdig agent manually on that node; otherwise two instances of the Sysdig agent would run on it and cause errors.

Additional Configuration

Standard installations may need some additional configuration to monitor Marathon. In addition, Mesos installations require further configuration parameters in the following situations:

  • The Sysdig agent cannot be run directly on the Mesos API server
  • The API server is protected with a username/password

Descriptions of each parameter and examples are shown below.

Monitoring Marathon Instances 

In order to monitor Marathon instances, the app check configuration needs to be updated as well. You can obtain the 'arg' value under 'pattern' by executing "ps -ef | grep marathon" on the host where Marathon is running.

mesos_state_uri: http://<Mesos_HOST_IP>:5050
marathon_uris:
  - http://<Marathon_HOST_IP>:8080
app_checks:
  - name: marathon
    check_module: marathon
    interval: 30
    pattern:
      arg: marathon-assembly-1.4.5.jar
    conf:
      url: http://<Marathon_HOST_IP>:8080
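
Picking the jar name out of the ps output can be done with a small filter. The sketch below assumes the Marathon jar follows the marathon-assembly-<version>.jar naming seen in the example above; the function name marathon_jar_arg is made up for illustration.

```shell
# marathon_jar_arg
# Reads process listings on stdin and prints the first Marathon jar file
# name found (without its directory), suitable as the 'arg' value.
# Assumes the jar is named marathon-assembly-<version>.jar.
marathon_jar_arg() {
  grep -oE 'marathon-assembly-[^/[:space:]]*\.jar' | head -n 1
}
```

On the Marathon host it could be used as ps -ef | marathon_jar_arg. If nothing is printed, the jar is probably named differently and the 'arg' value has to be read from the ps output by hand.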

Agent Can Not Run On Mesos API Server

If the API server cannot be instrumented with a Sysdig agent, choose another node with an agent installed to receive infrastructure information from the API server remotely. Assign this role to only one agent-installed node. Please note that when a static configuration entry is present for either Mesos or Marathon, the Sysdig agent will not automatically detect Mesos Master migrations; automatic detection of leader changes is only enabled when the configuration file contains no static entries.

Add the following Mesos parameter to the delegated agent's user settings configuration file /opt/draios/etc/dragent.yaml to allow it to connect to the remote API server and authenticate. Specify the API server's connection method, address and port. Also specify credentials if necessary:

mesos_state_uri: http://[acct:passwd@][hostname][:port]
marathon_uris:
  - http://[acct:passwd@][hostname][:port]

Note: Although `marathon_uris:` is an array, only a single "root" Marathon framework per cluster is currently supported. Multiple side-by-side "root" Marathon frameworks on the same cluster should not be configured, or the agent will not function properly. The only supported multiple-Marathon configuration is one "root" Marathon with other Marathon frameworks running as its apps.

For more information on adding parameters to a container agent's configuration file, see the FAQ: How can I edit the agent's configuration file?
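
The URI pattern above can be assembled mechanically. The following sketch prints the two settings for a given host, port, and optional credentials; the function names api_uri and mesos_settings are made up for illustration, and the sketch does no escaping, so credentials containing characters such as "@" or ":" would likely need URL-encoding first.

```shell
# api_uri SCHEME HOST PORT [USER] [PASSWORD]
# Prints scheme://[user:password@]host:port
api_uri() {
  scheme=$1; host=$2; port=$3; user=${4:-}; pass=${5:-}
  if [ -n "$user" ]; then
    printf '%s://%s:%s@%s:%s\n' "$scheme" "$user" "$pass" "$host" "$port"
  else
    printf '%s://%s:%s\n' "$scheme" "$host" "$port"
  fi
}

# mesos_settings MESOS_URI MARATHON_URI
# Prints the two dragent.yaml entries, with marathon_uris as a one-item list.
mesos_settings() {
  printf 'mesos_state_uri: %s\n' "$1"
  printf 'marathon_uris:\n  - %s\n' "$2"
}
```

For instance, mesos_settings "$(api_uri http mesos.example.com 5050 acct passwd)" "$(api_uri http mesos.example.com 8080 acct passwd)" prints a block that can be reviewed and then pasted into the delegated agent's dragent.yaml. The host name here is a placeholder.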


The Mesos API Server Requires Authentication

If the agent is installed on the API server but the API server uses a different port or requires authentication, these settings must be specified explicitly. Add the following Mesos parameters to the agent configuration file on the API server, /opt/draios/etc/dragent.yaml, so that the agent connects to the API server and authenticates with the required account and password. Specify the API server's protocol, user credentials, and port:

mesos_state_uri: http://[username:password@][hostname][:port]
marathon_uris:
  - http://[username:password@][hostname][:port]

Example Configuration File

This example shows all the parameters that would be set in the user settings file /opt/draios/etc/dragent.yaml for an agent running on the API server, where the API uses a non-standard port and requires user authentication.

customerid: 831g28-Your-Key-Here-a0fpc9d
tags: local:nyc1,linux:ubuntu,acct:dev
mesos_state_uri: http://myacct:mypass@localhost:5050
marathon_uris:
  - http://myacct:mypass@localhost:8088

HTTPS Protocol

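The HTTPS protocol is supported. Since the protocol is part of each URI, specify https:// in place of http:// in the mesos_state_uri and marathon_uris values.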

Turning Off Metadata Reception

In troubleshooting cases where auto-detection and reporting of your Mesos infrastructure needs to be temporarily turned off in a designated agent, comment out or remove any Mesos parameter entries and restart the agent. If the agent is running on the API server and auto-detecting a default configuration, you can add the entry below to the agent's configuration file to disable detection and reporting:

mesos_autodetect: false
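
If this setting has to be toggled on several nodes, the edit can be scripted. The sketch below sets the entry without duplicating it on repeated runs; the function name disable_mesos_autodetect is made up for illustration, and the sketch assumes the file ends with a newline and that the entry, if present, starts at the beginning of a line. The agent still needs to be restarted afterwards, as described above.

```shell
# disable_mesos_autodetect CONFIG_FILE
# Ensures CONFIG_FILE contains exactly "mesos_autodetect: false":
# rewrites an existing mesos_autodetect entry, or appends one if absent.
disable_mesos_autodetect() {
  conf=$1
  tmp="${conf}.tmp.$$"
  if grep -q '^mesos_autodetect:' "$conf"; then
    sed 's/^mesos_autodetect:.*/mesos_autodetect: false/' "$conf" > "$tmp"
  else
    { cat "$conf"; echo 'mesos_autodetect: false'; } > "$tmp"
  fi
  # Replace the original only after the new content was written.
  mv "$tmp" "$conf"
}
```

A cautious way to use it is to run it against a copy of /opt/draios/etc/dragent.yaml first and compare the result with the original before replacing the live file.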
