Administration
This administration guide explains how to manage NodeSniff after the initial setup. It is intended for administrators, DevOps teams, infrastructure owners, and technical users responsible for keeping monitored systems visible, secure, and reliable.
Overview
Administration covers the operational side of NodeSniff. It includes user access, monitored systems, agents, security settings, updates, backups, logs, diagnostics, and troubleshooting.
NodeSniff can be used to monitor servers, virtual machines, cloud instances, Raspberry Pi devices, edge devices, remote assets, and other Linux-based systems running the NodeSniff agent.
A good administration process keeps the platform predictable. Administrators should know which systems are monitored, which agents are active, who has access, where logs are stored, and how to recover the platform when something fails.
Main administration areas
- User and access management
- Server and device management
- Agent lifecycle management
- Security and token handling
- Updates and release management
- Backup and recovery
- Logs, diagnostics, and troubleshooting
User Management
User management controls who can access the NodeSniff dashboard and what actions they can perform. Access should be assigned only to users who need it.
In production environments, avoid shared administrator accounts. Each person should use an individual account so access can be reviewed and revoked when needed.
Typical tasks
- Create new users
- Disable users who no longer need access
- Assign roles and permissions
- Reset passwords
- Review administrator access regularly
Recommended access review
Access should be reviewed periodically, especially after team changes, project changes, or handovers between administrators.
# Example access review checklist
- Review active users
- Remove accounts no longer needed
- Confirm administrator roles
- Check recently created accounts
- Rotate shared credentials if they exist
Server Management
In NodeSniff, a monitored server is any system registered in the platform and reporting metrics through the NodeSniff agent.
This may include a physical server, virtual machine, cloud instance, edge device, industrial computer, Raspberry Pi, or other Linux-based system.
Common server management tasks
- Add a new monitored system
- Remove a system that is no longer used
- Check the last reporting timestamp
- Review hostname, IP address, operating system, and agent version
- Regenerate an API token when required
Checking basic system information
When investigating a monitored system, start with basic system information.
hostnamectl
uname -a
ip addr
df -h
Checking uptime and load
uptime
cat /proc/loadavg
top
Checking disk usage
df -h
du -sh /var/log/* 2>/dev/null
du -sh /opt/nodesniff/* 2>/dev/null
If a server stops reporting metrics, first check whether the system is online, whether the agent service is running, and whether the system can reach the NodeSniff API.
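The checks above can be run in order from the monitored system itself. The following is a minimal sketch, not an official NodeSniff tool: the function name `triage_agent` and the `HEALTH_URL` variable are hypothetical, and the health endpoint is the placeholder URL used elsewhere in this guide. Being able to run it at all answers the first question (the system is online).

```shell
# Sketch: first-pass triage for a monitored system that stopped reporting.
# HEALTH_URL is an assumed placeholder; point it at your own platform.
HEALTH_URL="${HEALTH_URL:-https://monitor.example.com/api/health}"

triage_agent() {
    # Step 1: is the agent service running?
    if ! systemctl is-active --quiet nodesniff-agent; then
        echo "agent service is not running"
        return 1
    fi
    # Step 2: can this system reach the NodeSniff API?
    if ! curl -fsS -o /dev/null --max-time 10 "$HEALTH_URL"; then
        echo "API is not reachable from this system"
        return 2
    fi
    echo "service running and API reachable; check the token and agent logs"
    return 0
}
```

Calling `triage_agent` prints the first failing step and returns a distinct exit code for it, so the function can also be used from a larger script.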
Agent Management
The NodeSniff agent collects operational metrics from monitored systems and sends them to the platform API. Keeping the agent healthy is one of the main administration tasks.
Checking agent status
Verify that the agent service is running.
sudo systemctl status nodesniff-agent
Starting the agent
sudo systemctl start nodesniff-agent
Stopping the agent
sudo systemctl stop nodesniff-agent
Restarting the agent
Restart the service after configuration changes or package updates.
sudo systemctl restart nodesniff-agent
Enabling the agent on boot
sudo systemctl enable nodesniff-agent
Viewing recent logs
sudo journalctl -u nodesniff-agent -n 100 --no-pager
Following live logs
sudo journalctl -u nodesniff-agent -f
Example agent configuration
# /etc/nodesniff/config.yml
server:
  url: https://monitor.example.com
authentication:
  token: YOUR_API_TOKEN
agent:
  interval: 60
  hostname: auto
  include_extra_metrics: true
Reloading service configuration
Use this after changing a systemd unit file.
sudo systemctl daemon-reload
sudo systemctl restart nodesniff-agent
Security
NodeSniff should be operated with secure communication, restricted access, and protected credentials. API tokens should be treated like passwords.
Recommended practices
- Use HTTPS in production
- Keep API tokens private
- Limit dashboard access to trusted users
- Use firewall rules where possible
- Rotate credentials when access changes
- Keep the platform and agents updated
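Because the agent configuration file holds the API token, one way to apply "keep API tokens private" is to confirm that the file is not readable by group or other users. This is a sketch under assumptions: `config_is_private` is a hypothetical helper, and it relies on GNU `stat -c`, which is standard on Linux.

```shell
# Sketch: succeed only when a file grants no permissions to group or others.
config_is_private() {
    local mode
    # GNU stat prints the octal mode, for example 600 or 644.
    mode=$(stat -c '%a' "$1") || return 2
    # The last two octal digits are the group and other permissions.
    case "$mode" in
        0|*00) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `config_is_private /etc/nodesniff/config.yml || echo "config is readable by other users"`. Which owner and mode are correct depends on the account the agent runs as, which this guide does not specify.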
Testing API connectivity
curl -I https://monitor.example.com/api/health
Testing API response
curl https://monitor.example.com/api/health
Testing DNS resolution
dig monitor.example.com
nslookup monitor.example.com
Checking firewall status
sudo ufw status
# or
sudo iptables -L
Checking open ports
ss -tulpn
sudo lsof -i -P -n
Example Nginx reverse proxy check
sudo nginx -t
sudo systemctl reload nginx
Updates
Updates may include dashboard improvements, API changes, database changes, security fixes, and new agent versions.
Production environments should not be updated blindly. Review the release notes before applying updates, especially when database migrations or agent changes are included.
Recommended update workflow
- Review the release notes
- Create a database backup
- Update the platform
- Update agents if required
- Verify that metrics are still being received
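Part of the final verification step can be scripted by polling the health endpoint until the platform answers again. The sketch below assumes the `/api/health` URL used in this guide; `wait_for_health` and its arguments (URL, number of attempts, delay in seconds) are hypothetical names. A healthy response only shows that the API is up, so the last reporting timestamp in the dashboard is still the check that metrics are arriving.

```shell
# Sketch: poll a health URL after an update until it responds or attempts run out.
wait_for_health() {
    local url attempts delay i
    url=$1
    attempts=${2:-10}
    delay=${3:-5}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl fail on HTTP error statuses such as 502.
        if curl -fsS -o /dev/null --max-time 10 "$url"; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        # Do not sleep after the final attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    echo "no healthy response after $attempts attempts" >&2
    return 1
}
```

Example: `wait_for_health https://monitor.example.com/api/health 12 5` waits up to about a minute before reporting a failure.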
Updating an agent package
sudo apt update
sudo apt install nodesniff-agent
sudo systemctl restart nodesniff-agent
Checking the installed agent version
nodesniff-agent --version
dpkg -l | grep nodesniff
Checking platform services after update
systemctl status nginx
systemctl status postgresql
systemctl status nodesniff-api
Backup & Recovery
Backups protect configuration, monitored system records, user data, and historical metrics. At minimum, administrators should back up the database and deployment configuration.
Minimum backup scope
- PostgreSQL database
- Environment configuration
- Reverse proxy configuration
- Deployment scripts or service files
- Agent installation notes and token handling procedures
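For the database item in the scope above, one possible routine is a timestamped, compressed dump that is checked after it is written. This is a sketch, not a prescribed procedure: `BACKUP_DIR` and `backup_database` are hypothetical names, while the `postgres` user and `nodesniff` database follow the commands in this guide.

```shell
# Sketch: timestamped, compressed database backup with an integrity check.
# BACKUP_DIR is an assumed location; adjust it to the deployment.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/nodesniff}"

backup_database() {
    local sql
    sql="$BACKUP_DIR/nodesniff-$(date +%Y%m%d-%H%M%S).sql"
    mkdir -p "$BACKUP_DIR" || return 1
    # Dump to a file first so a pg_dump failure is not hidden behind a pipe.
    pg_dump -U postgres -f "$sql" nodesniff || { rm -f "$sql"; return 1; }
    gzip "$sql" || return 1
    # gzip -t reads the whole archive and fails if it is damaged.
    gzip -t "$sql.gz" || return 1
    # Print the path of the finished backup for logging or copying off-host.
    echo "$sql.gz"
}
```

`gzip -t` only confirms that the archive is readable. Restoring it into a scratch database remains the real test of a backup.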
Creating a PostgreSQL backup
pg_dump -U postgres nodesniff > nodesniff-backup.sql
Creating a compressed backup
pg_dump -U postgres nodesniff | gzip > nodesniff-backup.sql.gz
Restoring a PostgreSQL backup
psql -U postgres nodesniff < nodesniff-backup.sql
Restoring a compressed backup
gunzip -c nodesniff-backup.sql.gz | psql -U postgres nodesniff
Backing up configuration files
sudo tar -czf nodesniff-config-backup.tar.gz /etc/nginx/sites-available /etc/systemd/system /opt/nodesniff /etc/nodesniff
Logs & Diagnostics
Logs are the first place to check when something does not work as expected. They help identify authentication problems, connectivity issues, service failures, and database errors.
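When a log is long, a quick count of error lines can show where to look first. The sketch below assumes the `[LEVEL] message` layout and the "Report rejected" wording of the example logs in this guide. The real agent output may differ, and `summarise_agent_log` is a hypothetical helper.

```shell
# Sketch: summarise agent log lines read from standard input.
# Assumes log lines contain "[ERROR]" and "Report rejected" as in the examples.
summarise_agent_log() {
    awk '
        /\[ERROR\]/       { errors++ }
        /Report rejected/ { rejected++ }
        END { printf "errors=%d rejected=%d\n", errors, rejected }
    '
}
```

Example: `sudo journalctl -u nodesniff-agent -n 100 --no-pager | summarise_agent_log`. A non-zero `rejected` count points towards authentication, while errors without rejections are more likely connectivity or service problems.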
Agent logs
sudo journalctl -u nodesniff-agent -n 100 --no-pager
sudo journalctl -u nodesniff-agent -f
Platform service logs
sudo journalctl -u nodesniff-api -n 100 --no-pager
sudo journalctl -u nginx -n 100 --no-pager
sudo journalctl -u postgresql -n 100 --no-pager
Basic diagnostics
hostnamectl
uptime
df -h
free -m
top
Network diagnostics
ping monitor.example.com
traceroute monitor.example.com
curl -v https://monitor.example.com/api/health
Example successful agent log
[INFO] Agent started
[INFO] Collecting metrics
[INFO] Sending report to https://monitor.example.com/api/report
[INFO] Report accepted: HTTP 200
Example failed token log
[INFO] Collecting metrics
[INFO] Sending report to https://monitor.example.com/api/report
[ERROR] Report rejected: HTTP 403
[ERROR] Invalid API token
Troubleshooting
Most administration problems are related to service status, network connectivity, authentication, TLS, or database availability.
Agent is offline
- Check whether the agent service is running
- Review recent agent logs
- Verify the platform URL
- Check outbound network access
- Confirm that the API token is valid
sudo systemctl status nodesniff-agent
sudo journalctl -u nodesniff-agent -n 100 --no-pager
curl -I https://monitor.example.com/api/health
Metrics are missing
- Check the last reporting timestamp in the dashboard
- Restart the agent
- Verify that the monitored system has network access
- Review API logs on the platform side
sudo systemctl restart nodesniff-agent
sudo journalctl -u nodesniff-agent -f
Invalid API token
If the platform rejects an agent request, regenerate the token in the dashboard and update the agent configuration on the monitored system.
# /etc/nodesniff/config.yml
authentication:
  token: YOUR_NEW_API_TOKEN
Dashboard unavailable
- Check whether the web application is running
- Check reverse proxy configuration
- Verify DNS
- Verify TLS certificates
- Review application logs
sudo systemctl status nginx
sudo nginx -t
curl -I https://monitor.example.com
Database connection failed
- Check whether PostgreSQL is running
- Verify database credentials
- Check environment variables
- Review database logs
sudo systemctl status postgresql
psql -U postgres -l
sudo journalctl -u postgresql -n 100 --no-pager
TLS certificate problem
openssl s_client -connect monitor.example.com:443 -servername monitor.example.com
curl -Iv https://monitor.example.com
Docker deployment issues
docker ps
docker compose logs -f
docker compose restart
Maintenance
Regular maintenance helps keep NodeSniff stable and predictable over time.
Recommended maintenance tasks
- Review users and permissions
- Check agent versions
- Rotate old credentials
- Review storage usage
- Clean or archive old metrics if required
- Test backups regularly
- Review release notes before applying updates
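For the "check agent versions" task, a small comparison helper can flag systems whose agent is older than a chosen minimum. This is a sketch: `version_lt` is a hypothetical name, the minimum version `1.4.0` in the usage comment is invented for illustration, and the helper relies on GNU `sort -V`.

```shell
# Sketch: succeed when version $1 is strictly older than version $2.
# Uses GNU sort -V (version sort), available on typical Linux systems.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage on a Debian-based system (the minimum version is an example value):
#   installed=$(dpkg-query -W -f='${Version}' nodesniff-agent)
#   version_lt "$installed" "1.4.0" && echo "agent update needed"
```

On Debian-based systems, `dpkg --compare-versions "$installed" lt "1.4.0"` is an alternative that also understands package epochs and revisions.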
Checking storage usage
df -h
du -sh /var/lib/postgresql/* 2>/dev/null
du -sh /var/log/* 2>/dev/null
Checking service health
systemctl --failed
systemctl status nginx
systemctl status postgresql
systemctl status nodesniff-api
Example monthly checklist
# Monthly NodeSniff maintenance
- Review users
- Review inactive servers
- Check failed services
- Check database size
- Check disk usage
- Verify latest backup
- Test restore procedure
- Review release notes
- Plan agent updates
Operational Checklist
The following checklist can be used when onboarding a new monitored system.
New system checklist
- Confirm system owner
- Confirm hostname
- Confirm environment
- Install the agent
- Configure the API token
- Verify first successful report
- Confirm dashboard visibility
# Quick validation
hostname
systemctl status nodesniff-agent
journalctl -u nodesniff-agent -n 50 --no-pager
curl -I https://monitor.example.com/api/health
Incident Response
During incidents, focus first on restoring visibility. After the system is stable, investigate root cause and document the findings.
Initial checks
date
hostname
uptime
systemctl --failed
df -h
free -m
Connectivity checks
ping monitor.example.com
curl -v https://monitor.example.com/api/health
ss -tulpn
Service checks
systemctl status nodesniff-agent
systemctl status nginx
systemctl status postgresql
Token Rotation
Token rotation should be performed when a token is exposed, when a system owner changes, or during planned security maintenance.
Rotation process
- Generate a new token in the dashboard
- Update the agent configuration
- Restart the agent
- Verify that new reports are accepted
- Revoke or invalidate the old token
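When many systems need a new token, the configuration update step can be scripted instead of edited by hand. The sketch below is an assumption-laden example: `set_agent_token` is a hypothetical helper, and it assumes the file contains a single `token:` key and that tokens consist only of letters, digits, `_`, and `-`. It relies on GNU `sed -i`.

```shell
# Sketch: replace the token value in the agent configuration in place.
set_agent_token() {
    local file token
    file=$1
    token=$2
    # Reject empty tokens and characters that would need escaping in sed.
    case "$token" in
        ''|*[!A-Za-z0-9_-]*) echo "unsupported token format" >&2; return 1 ;;
    esac
    grep -Eq '^[[:space:]]*token:' "$file" || { echo "no token key in $file" >&2; return 1; }
    # Keep a copy of the previous file, then rewrite only the token value.
    cp -p "$file" "$file.bak" || return 1
    sed -i -E "s|^([[:space:]]*token:[[:space:]]*).*|\\1${token}|" "$file"
}
```

Example: `sudo sh -c '. ./token-helper.sh; set_agent_token /etc/nodesniff/config.yml NEW_TOKEN'`, where `token-helper.sh` is a hypothetical file containing the function. The `.bak` copy still holds the old token, so remove it once the new reports are accepted.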
sudo nano /etc/nodesniff/config.yml
sudo systemctl restart nodesniff-agent
sudo journalctl -u nodesniff-agent -n 50 --no-pager
Environment Variables
Platform deployments usually depend on environment variables for database connections, API configuration, and runtime settings.
Example environment file
# /opt/nodesniff/.env
DATABASE_URL=postgresql://nodesniff:password@localhost:5432/nodesniff
APP_ENV=production
LOG_LEVEL=info
PUBLIC_URL=https://monitor.example.com
Checking environment variables
systemctl show nodesniff-api --property=Environment
cat /opt/nodesniff/.env
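Printing the environment file with `cat` also prints the database password, which is a risk in shared terminals or when output is pasted into tickets. A minimal sketch of a masked view follows. `show_env_masked` is a hypothetical helper, and it only hides passwords embedded in URLs of the form `scheme://user:password@host`, as in the `DATABASE_URL` example above.

```shell
# Sketch: print an environment file with URL-embedded passwords masked.
show_env_masked() {
    # Keep "://user:" and replace everything up to the "@" with asterisks.
    sed -E 's#(://[^:/@]+:)[^@]*@#\1****@#' "$1"
}
```

Example: `show_env_masked /opt/nodesniff/.env`. Secrets stored in other variables are not masked by this pattern and would need their own rules.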