This guide is intended to help you troubleshoot and proactively collect information for a technical support request. When troubleshooting issues with the Sysdig Monitor service, or cases where the agent will not install or run, we may ask for several pieces of information. Supplying as much information as possible with your request helps us resolve your issue quickly.
In general, please detail the problem you are having, listing any errors you see (screenshots preferred) and supply the most recent log file covering the time period of the error or suspected anomaly. Since the agent creates up to eleven 10MB log files in rotation, each with a date and time-stamp, we are able to troubleshoot some agent reporting issues for a substantial time after any event.
Confirm Agent Connectivity
When you notice a host not showing up in the user interface or see "Error, connection_manager: Lost connection" messages in the agent's log file (/opt/draios/logs/draios.log), suspect agent connectivity problems. Here are a few quick things to check:
1. Confirm that you have not used all available agent licenses. The agent license count is available in the Settings > Subscription tab. You can purchase additional agent licenses from that tab if needed.
2. Confirm basic connectivity through your firewall:
3. Confirm the correct port is open through your firewall:
telnet collector.sysdigcloud.com 6666
See the FAQ to change the port number to 80 if 6666 is not available.
4. Check for duplicate MAC addresses among your hosts. When the agent starts, an entry in the /opt/draios/logs/draios.log file reports the host's MAC address:
2016-09-26 10:20:25.982, 2363, Information, machine id: a2:11:0b:84:11:21
Compare the logged MAC address to any existing reporting agents in the Explore tab using the 'Hosts & Containers' hierarchy (Show feature).
5. Confirm your access key is correct. Your access key is available in the Settings > User Profile tab of your account. You can see the Sysdig agent key as configured in /opt/draios/etc/dragent.yaml.
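The log-based checks in the list above can be sketched as a small shell script. This is only an illustration: the function names are hypothetical helpers, not part of the Sysdig agent, and the patterns assume the log lines look like the examples quoted in this guide.

```shell
#!/bin/sh
# Sketch of the log checks described above. The log path comes from this
# guide; count_lost_connections and last_machine_id are hypothetical helpers.

LOG_FILE="${LOG_FILE:-/opt/draios/logs/draios.log}"

# Print how many "Lost connection" entries connection_manager has logged.
count_lost_connections() {
    grep -c 'connection_manager: Lost connection' "$1"
}

# Print the machine id (MAC address) from the most recent agent start.
# Assumes the MAC is the last whitespace-separated field of the log line.
last_machine_id() {
    grep 'machine id:' "$1" | tail -n 1 | awk '{print $NF}'
}

# Report only when the log file is readable on this host.
if [ -r "$LOG_FILE" ]; then
    echo "Lost connection entries: $(count_lost_connections "$LOG_FILE")"
    echo "Last logged machine id:  $(last_machine_id "$LOG_FILE")"
fi
```

Set LOG_FILE to point at a copy of the log if you pulled it out of a container agent first.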
The agent is routinely updated to include new features and resolve bugs. Many problems can be resolved simply by updating your Sysdig agent. Please update your agent to make sure you are not troubleshooting a known issue.
For Your Support Request
If you need to contact us about a problem, please review the list below and feel free to supply anything else you think may be useful in helping us to understand the issue you are having and speed its resolution:
1 - Your Sysdig Monitor Account
If the Sysdig Monitor account you are using is not the same as the email address on your support ticket or signature, please be sure to list it so we troubleshoot the proper account. Your full name and company name can also be useful in finding you in our databases.
2 - The Operating System
Please be sure to check that your operating system is supported. Submitting the output of uname -a and lsb_release -a will help us determine if the kernel is supported. If you have a custom kernel and the kernel development headers are not available, you will not be able to install our agent.
Be sure to also note if the Linux distribution used inside your app's container is not the same as the host.
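One way to gather the operating system details into a file you can attach is sketched below. The function name and the output file name are hypothetical, and the fallback to /etc/os-release when lsb_release is not installed is an addition of this sketch rather than something the guide asks for.

```shell
#!/bin/sh
# Sketch: collect OS details for a support request.
# collect_os_info and the default os-info.txt name are hypothetical.

collect_os_info() {
    echo "== uname -a =="
    uname -a
    echo "== lsb_release -a =="
    if command -v lsb_release >/dev/null 2>&1; then
        lsb_release -a 2>/dev/null
    elif [ -r /etc/os-release ]; then
        # Fallback (assumption of this sketch): most modern distributions
        # describe themselves in /etc/os-release.
        cat /etc/os-release
    else
        echo "lsb_release is not installed on this host"
    fi
}

# Write the report next to where the script is run.
collect_os_info > "${OUT_FILE:-os-info.txt}"
```

If your containers use a different distribution than the host, running the same function inside the container gives a second report to attach.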
3 - Your Infrastructure and Agent Version
When using our agent in an orchestration infrastructure (Kubernetes, ECS, OpenShift, Mesos, etc.), please let us know the type and version of the orchestration environment your containers are running in. This is especially useful if you are using the very latest environments.
Similarly, it's important to verify the version of the Sysdig agent since an older agent may not support collecting metrics from newer orchestration tools. You can check the agent version with the command:
Compare your installed version to the latest release shown on our agent build list.
Upgrade any older agents to the latest version to make sure you are not encountering a known issue: Sysdig-Agent-Update-Uninstall
4 - Agent Start Command or Manifest File
Many agent connection problems are due to transcription errors in the agent start command or manifest files. This is especially true with truncated access keys and when using the Additional_Conf parameter in a container agent installation. Always copy and paste our example commands or manifest files, then modify them.
Try running the agent from the command line using the 'docker start' or curl command and send in the command used and initial output. If using docker start, remove the '-d' option so output will display on the console. Your docker start and native agent run commands are available in Settings > Agent Installation tab of the user interface.
5 - Sysdig Agent Configuration File and Logs
The Sysdig agent reads the user-settings configuration file /opt/draios/etc/dragent.yaml and generates log entries in /opt/draios/logs/draios.log. The agent rotates the log file when it reaches 10MB in size, keeping the 10 most recent log files archived with a date-stamp appended to the filename.
It's always helpful to attach the config file and latest log to your support request if you see metrics not reporting or have agent connection issues. Since the agent logs critical startup information when initializing, restarting the agent and then collecting the logs is desirable.
Whenever possible, attach files rather than pasting log file or config file text inline in the email body. Attaching the files preserves important formatting.
To copy the configuration file and most recent log file out of an agent running in a container use these Docker commands:
docker cp sysdig-agent:/opt/draios/logs/draios.log ./draios.log
docker cp sysdig-agent:/opt/draios/etc/dragent.yaml ./dragent.yaml
Please compress large files before attaching them to a support ticket. Files over 7MB will require us to supply you with a download link.
Also be sure to let us know the host name that the files came from.
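For a natively installed agent, the collection steps in this section could be combined roughly as follows. This is a sketch under stated assumptions: the function names and the archive name are hypothetical, and the 7MB limit is interpreted here as 7 * 1024 * 1024 bytes, which this guide does not specify.

```shell
#!/bin/sh
# Sketch: bundle the agent config and latest log into one compressed file
# named after the host. Both helper names are hypothetical.

# bundle_agent_files [INSTALL_ROOT] [OUT_DIR]
# Prints the path of the archive it creates.
bundle_agent_files() {
    root="${1:-/opt/draios}"
    out_dir="${2:-.}"
    # uname -n prints the host name, so the archive records its origin.
    bundle="$out_dir/sysdig-agent-$(uname -n).tar.gz"

    # -C keeps the paths inside the archive relative to the install root.
    tar -czf "$bundle" -C "$root" etc/dragent.yaml logs/draios.log || return 1
    echo "$bundle"
}

# over_attachment_limit FILE
# Succeeds when FILE is larger than 7MB (assumed to mean 7 * 1024 * 1024 bytes).
over_attachment_limit() {
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt $((7 * 1024 * 1024)) ]
}
```

A typical use would be to call bundle_agent_files /opt/draios . after restarting the agent, then run over_attachment_limit on the resulting archive to see whether you should ask for a download link instead of attaching it.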
6 - Sysdig Monitor On-Prem Logs
In addition to the above, if you are running the Sysdig Monitor on-premises version, you can generate a complete support bundle from the web console's Support tab. The support bundle is invaluable when troubleshooting suspected 'backend' issues such as problems with component startup or when all of your agents are having problems connecting, etc.
Go to the Support tab and click on "Download Support Bundle". It can take a minute for larger installations or those with more history. You will be prompted to save a file "replicated-support<#####>.tar.gz".
Files over 7MB will require us to supply you with a download link.
Generating A Sysdig Agent Coredump File
We typically ask for this only when the agent log files do not supply enough details. Coredump files are useful for troubleshooting when the Sysdig agent crashes. The ability to create a coredump is available starting in agent version 0.21.0. To allow the agent to create a coredump file upon a crash, add the coredump entry to the agent's user settings configuration file:
echo coredump: true >> /opt/draios/etc/dragent.yaml
After restarting the agent ('service dragent restart' or 'docker restart sysdig-agent'), when a crash next occurs, a coredump file will be generated which can be sent to email@example.com for troubleshooting.
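Because the echo command appends a new line every time it is run, repeating it leaves duplicate coredump entries in the configuration file. A minimal sketch of a guarded version follows; enable_coredump is a hypothetical helper, not an agent command, and it only checks for a top-level coredump key at the start of a line.

```shell
#!/bin/sh
# Sketch: add "coredump: true" to the agent config only once.
# enable_coredump is a hypothetical helper.

# enable_coredump [CONFIG_FILE]
enable_coredump() {
    conf="${1:-/opt/draios/etc/dragent.yaml}"

    if grep -q '^coredump:' "$conf" 2>/dev/null; then
        # An entry already exists; its value is left untouched, so check
        # by hand that it is set to true.
        echo "coredump entry already present in $conf"
        return 0
    fi

    # If the file does not end with a newline, add one first so the new
    # entry does not get glued onto the last existing line.
    if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
        echo >> "$conf"
    fi

    echo 'coredump: true' >> "$conf"
}
```

Restart the agent afterwards as described above so the setting takes effect.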
Coredumps can be found in the location configured in /proc/sys/kernel/core_pattern. Usually the location is /tmp or the process's current working directory. However, note that some operating systems (Ubuntu) have a hook that does custom logic with the core file. For easier troubleshooting in those cases, you can temporarily override the hook by putting 'core' inside /proc/sys/kernel/core_pattern:
echo core | sudo tee /proc/sys/kernel/core_pattern
The coredump will be called core and will be found in /opt/draios if the Sysdig agent is installed natively, otherwise in / within the agent container. For container agent installations, retrieve the core file with:
docker cp sysdig-agent:/core .
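Since the core_pattern override is meant to be temporary, it can help to save the original value first and put it back once the core file has been collected. The sketch below shows one way to do that; both function names and the default backup path are hypothetical, and the functions need to run as root when pointed at the real /proc file.

```shell
#!/bin/sh
# Sketch: temporarily override core_pattern and restore it afterwards.
# override_core_pattern and restore_core_pattern are hypothetical helpers.

# override_core_pattern [PATTERN_FILE] [BACKUP_FILE]
override_core_pattern() {
    pattern_file="${1:-/proc/sys/kernel/core_pattern}"
    backup_file="${2:-/tmp/core_pattern.orig}"

    # Save the current pattern (for example, a distribution's crash hook).
    cat "$pattern_file" > "$backup_file" || return 1
    # Write plain core files named "core" instead.
    echo core > "$pattern_file"
}

# restore_core_pattern [PATTERN_FILE] [BACKUP_FILE]
restore_core_pattern() {
    pattern_file="${1:-/proc/sys/kernel/core_pattern}"
    backup_file="${2:-/tmp/core_pattern.orig}"

    cat "$backup_file" > "$pattern_file"
}
```

Call override_core_pattern before reproducing the crash and restore_core_pattern after you have copied the core file.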
For more information on adding parameters to a container agent's configuration file, see the FAQ: How-can-I-edit-the-agent-s-configuration-file?