Administration

This administration guide explains how to manage NodeSniff after the initial setup. It is intended for administrators, DevOps teams, infrastructure owners, and technical users responsible for keeping monitored systems visible, secure, and reliable.

Note: This page focuses on platform administration. For daily dashboard usage, see the User Guide.

Overview

Administration covers the operational side of NodeSniff. It includes user access, monitored systems, agents, security settings, updates, backups, logs, diagnostics, and troubleshooting.

NodeSniff can be used to monitor servers, virtual machines, cloud instances, Raspberry Pi devices, edge devices, remote assets, and other Linux-based systems running the NodeSniff agent.

A good administration process keeps the platform predictable. Administrators should know which systems are monitored, which agents are active, who has access, where logs are stored, and how to recover the platform when something fails.

Main administration areas

User and access management
Server and device management
Agent lifecycle management
Security and token handling
Updates and release management
Backup and recovery
Logs, diagnostics, and troubleshooting

User Management

User management controls who can access the NodeSniff dashboard and what actions they can perform. Access should be assigned only to users who need it.

In production environments, avoid shared administrator accounts. Each person should use an individual account so access can be reviewed and revoked when needed.

Typical tasks

Create new users
Disable users who no longer need access
Assign roles and permissions
Reset passwords
Review administrator access regularly

Recommended access review

Review access on a regular schedule, and again after any team change, project change, or handover between administrators.

# Example access review checklist

- Review active users
- Remove accounts no longer needed
- Confirm administrator roles
- Check recently created accounts
- Rotate shared credentials if they exist

Warning: Do not use shared administrator accounts in production environments.

Server Management

In NodeSniff, a monitored server is any system registered in the platform and reporting metrics through the NodeSniff agent.

This may include a physical server, virtual machine, cloud instance, edge device, industrial computer, Raspberry Pi, or other Linux-based system.

Common server management tasks

Add a new monitored system
Remove a system that is no longer used
Check the last reporting timestamp
Review hostname, IP address, operating system, and agent version
Regenerate an API token when required

Checking basic system information

When investigating a monitored system, start with basic system information.

hostnamectl

uname -a

ip addr

df -h

Checking uptime and load

uptime

cat /proc/loadavg

top

Checking disk usage

df -h

du -sh /var/log/* 2>/dev/null

du -sh /opt/nodesniff/* 2>/dev/null

If a server stops reporting metrics, first check whether the system is online, whether the agent service is running, and whether the system can reach the NodeSniff API.
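The last two checks above can be sketched as a small triage function to run on the monitored system itself. This is a minimal sketch, not an official NodeSniff tool: the function name triage_agent is hypothetical, and the unit name and health URL are the example values used elsewhere in this guide.

```shell
# Hypothetical helper: run on the monitored system.
# Usage: triage_agent [health_url]
triage_agent() {
    triage_url="${1:-https://monitor.example.com/api/health}"

    # 1. Is the agent service running?
    if systemctl is-active --quiet nodesniff-agent; then
        echo "agent service: active"
    else
        echo "agent service: NOT active"
        return 1
    fi

    # 2. Can this system reach the API?
    #    --max-time keeps the check from hanging on a dead route.
    if curl -fsS --max-time 10 -o /dev/null "$triage_url"; then
        echo "api: reachable"
    else
        echo "api: NOT reachable"
        return 2
    fi
}
```

The return code tells the two failures apart: 1 points at the local service, 2 at the network path or the platform.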

Agent Management

The NodeSniff agent collects operational metrics from monitored systems and sends them to the platform API. Keeping the agent healthy is one of the main administration tasks.

Checking agent status

Verify that the agent service is running.

sudo systemctl status nodesniff-agent

Starting the agent

sudo systemctl start nodesniff-agent

Stopping the agent

sudo systemctl stop nodesniff-agent

Restarting the agent

Restart the service after configuration changes or package updates.

sudo systemctl restart nodesniff-agent

Enabling the agent on boot

sudo systemctl enable nodesniff-agent

Viewing recent logs

sudo journalctl -u nodesniff-agent -n 100 --no-pager

Following live logs

sudo journalctl -u nodesniff-agent -f

Example agent configuration

# /etc/nodesniff/config.yml

server:
  url: https://monitor.example.com

authentication:
  token: YOUR_API_TOKEN

agent:
  interval: 60
  hostname: auto
  include_extra_metrics: true
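Before restarting the agent with a new configuration, a quick pre-flight check can catch the most common mistakes. The sketch below is an assumption-based helper, not part of the agent: the function name check_agent_config is hypothetical, and it reads keys with sed instead of a YAML parser, so it only works for the simple key and value layout shown above.

```shell
# Hypothetical pre-flight check for the example agent configuration.
# Never prints the token itself.
check_agent_config() {
    cfg_file="${1:-/etc/nodesniff/config.yml}"
    cfg_problems=0

    [ -r "$cfg_file" ] || { echo "cannot read $cfg_file"; return 2; }

    # Take the first "url:" and "token:" values found in the file.
    cfg_url=$(sed -n 's/^[[:space:]]*url:[[:space:]]*//p' "$cfg_file" | head -n 1)
    cfg_token=$(sed -n 's/^[[:space:]]*token:[[:space:]]*//p' "$cfg_file" | head -n 1)

    case "$cfg_url" in
        https://*) ;;
        *) echo "server.url is missing or not HTTPS"; cfg_problems=1 ;;
    esac

    case "$cfg_token" in
        ""|YOUR_API_TOKEN|YOUR_NEW_API_TOKEN)
            echo "authentication.token is empty or still a placeholder"
            cfg_problems=1 ;;
    esac

    [ "$cfg_problems" -eq 0 ] && echo "config looks plausible"
    return "$cfg_problems"
}
```

Called without arguments, it checks /etc/nodesniff/config.yml, which usually requires root to read.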

Reloading service configuration

Run daemon-reload after changing a systemd unit file so that systemd rereads it, then restart the agent.

sudo systemctl daemon-reload
sudo systemctl restart nodesniff-agent

Warning: If the API token is changed in the dashboard, the agent configuration must be updated on the monitored system.

Security

NodeSniff should be operated with secure communication, restricted access, and protected credentials. API tokens should be treated like passwords.

Recommended practices

Use HTTPS in production
Keep API tokens private
Limit dashboard access to trusted users
Use firewall rules where possible
Rotate credentials when access changes
Keep the platform and agents updated

Testing API connectivity

curl -I https://monitor.example.com/api/health

Testing API response

curl https://monitor.example.com/api/health

Testing DNS resolution

dig monitor.example.com

nslookup monitor.example.com

Checking firewall status

sudo ufw status

# or

sudo iptables -L

Checking open ports

ss -tulpn

sudo lsof -i -P -n

Example Nginx reverse proxy check

sudo nginx -t

sudo systemctl reload nginx

Tip: If an API token is exposed, regenerate it immediately and update the agent configuration on the affected system.

Updates

Updates may include dashboard improvements, API changes, database changes, security fixes, and new agent versions.

Review the release notes before applying any update to a production environment, especially when the release includes database migrations or agent changes.

Recommended update workflow

Review the release notes
Create a database backup
Update the platform
Update agents if required
Verify that metrics are still being received

Updating an agent package

sudo apt update
sudo apt install nodesniff-agent
sudo systemctl restart nodesniff-agent

Checking the installed agent version

nodesniff-agent --version

dpkg -l | grep nodesniff
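When many systems are involved, it helps to compare the installed version against a minimum instead of reading it by eye. The following is a minimal sketch that relies on version sort (sort -V, available in GNU coreutils). The function name version_lt and the 1.4.0 minimum are made up for illustration.

```shell
# Succeeds (exit 0) when version $1 is strictly older than version $2.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage sketch on a Debian-based system (1.4.0 is a hypothetical minimum):
#   installed=$(dpkg-query -W -f='${Version}' nodesniff-agent)
#   if version_lt "$installed" "1.4.0"; then
#       echo "agent is older than 1.4.0"
#   fi
```

On Debian-based systems, dpkg --compare-versions is an alternative that follows the package manager's own ordering rules.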

Checking platform services after update

systemctl status nginx
systemctl status postgresql
systemctl status nodesniff-api

Backup & Recovery

Backups protect configuration, monitored system records, user data, and historical metrics. At minimum, administrators should back up the database and deployment configuration.

Minimum backup scope

PostgreSQL database
Environment configuration
Reverse proxy configuration
Deployment scripts or service files
Agent installation notes and token handling procedures

Creating a PostgreSQL backup

pg_dump -U postgres nodesniff > nodesniff-backup.sql

Creating a compressed backup

pg_dump -U postgres nodesniff | gzip > nodesniff-backup.sql.gz

Restoring a PostgreSQL backup

psql -U postgres nodesniff < nodesniff-backup.sql

Restoring a compressed backup

gunzip -c nodesniff-backup.sql.gz | psql -U postgres nodesniff

Backing up configuration files

sudo tar -czf nodesniff-config-backup.tar.gz \
  /etc/nginx/sites-available \
  /etc/systemd/system \
  /opt/nodesniff \
  /etc/nodesniff

Warning: A backup that was never restored is only an assumption. Test restore procedures regularly.
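A cheap first step before a full restore test is to confirm that the archive is intact and looks like a pg_dump file. The sketch below assumes a gzip-compressed, plain-format dump as created above. The function name verify_backup_archive and the scratch database name are hypothetical.

```shell
# Hypothetical sanity check for a compressed plain-format pg_dump backup.
verify_backup_archive() {
    bk_file="$1"

    [ -s "$bk_file" ] || { echo "missing or empty: $bk_file"; return 1; }

    # gzip -t verifies the compressed stream without writing any output.
    gzip -t "$bk_file" 2>/dev/null || { echo "corrupt gzip stream: $bk_file"; return 1; }

    # Plain-format pg_dump output starts with a "PostgreSQL database dump" comment.
    if gunzip -c "$bk_file" | head -n 5 | grep -q 'PostgreSQL database dump'; then
        echo "ok: $bk_file"
    else
        echo "not a plain pg_dump file: $bk_file"
        return 1
    fi
}

# A real restore test still means loading the dump somewhere, for example
# into a scratch database (the name is an assumption):
#   createdb -U postgres nodesniff_restore_test
#   gunzip -c nodesniff-backup.sql.gz | psql -U postgres nodesniff_restore_test
#   dropdb -U postgres nodesniff_restore_test
```

Passing this check only shows that the file is readable. It does not prove that the data restores cleanly.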

Logs & Diagnostics

Logs are the first place to check when something does not work as expected. They help identify authentication problems, connectivity issues, service failures, and database errors.

Agent logs

sudo journalctl -u nodesniff-agent -n 100 --no-pager

sudo journalctl -u nodesniff-agent -f

Platform service logs

sudo journalctl -u nodesniff-api -n 100 --no-pager

sudo journalctl -u nginx -n 100 --no-pager

sudo journalctl -u postgresql -n 100 --no-pager

Basic diagnostics

hostnamectl
uptime
df -h
free -m
top -b -n 1 | head -n 20

Network diagnostics

ping monitor.example.com

traceroute monitor.example.com

curl -v https://monitor.example.com/api/health

Example successful agent log

[INFO] Agent started
[INFO] Collecting metrics
[INFO] Sending report to https://monitor.example.com/api/report
[INFO] Report accepted: HTTP 200

Example failed token log

[INFO] Collecting metrics
[INFO] Sending report to https://monitor.example.com/api/report
[ERROR] Report rejected: HTTP 403
[ERROR] Invalid API token
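To get a quick picture from a longer log, the accepted and rejected reports can be counted. This is a minimal sketch: the function name summarize_agent_log is hypothetical, and the patterns match the example log lines above, so real agent output may need different patterns.

```shell
# Reads agent log lines on stdin and counts accepted vs rejected reports.
summarize_agent_log() {
    awk '
        /Report accepted/   { ok++ }
        /Report rejected/   { bad++ }
        /Invalid API token/ { token_error = 1 }
        END {
            printf "accepted=%d rejected=%d\n", ok, bad
            if (token_error) print "hint: token rejected, check authentication.token"
        }
    '
}

# Usage:
#   sudo journalctl -u nodesniff-agent -n 200 --no-pager | summarize_agent_log
```

A log with only rejected reports and the token hint points at authentication rather than connectivity.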

Troubleshooting

Most administration problems are related to service status, network connectivity, authentication, TLS, or database availability.

Agent is offline

Check whether the agent service is running
Review recent agent logs
Verify the platform URL
Check outbound network access
Confirm that the API token is valid

sudo systemctl status nodesniff-agent

sudo journalctl -u nodesniff-agent -n 100 --no-pager

curl -I https://monitor.example.com/api/health

Metrics are missing

Check the last reporting timestamp in the dashboard
Restart the agent
Verify that the monitored system has network access
Review API logs on the platform side

sudo systemctl restart nodesniff-agent

sudo journalctl -u nodesniff-agent -f

Invalid API token

If the agent log shows that the platform rejected a report because of an invalid API token, regenerate the token in the dashboard, update the agent configuration on the monitored system, and restart the agent.

# /etc/nodesniff/config.yml

authentication:
  token: YOUR_NEW_API_TOKEN

Dashboard unavailable

Check whether the web application is running
Check reverse proxy configuration
Verify DNS
Verify TLS certificates
Review application logs

sudo systemctl status nginx

sudo nginx -t

curl -I https://monitor.example.com

Database connection failed

Check whether PostgreSQL is running
Verify database credentials
Check environment variables
Review database logs

sudo systemctl status postgresql

psql -U postgres -l

sudo journalctl -u postgresql -n 100 --no-pager

TLS certificate problem

openssl s_client -connect monitor.example.com:443 -servername monitor.example.com </dev/null

curl -Iv https://monitor.example.com

Docker deployment issues

docker ps

docker compose logs -f

docker compose restart

Maintenance

Regular maintenance helps keep NodeSniff stable and predictable over time.

Recommended maintenance tasks

Review users and permissions
Check agent versions
Rotate old credentials
Review storage usage
Clean or archive old metrics if required
Test backups regularly
Review release notes before applying updates

Checking storage usage

df -h

du -sh /var/lib/postgresql/* 2>/dev/null

du -sh /var/log/* 2>/dev/null
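For routine checks, the df output can be filtered down to the filesystems that are close to full. The sketch below reads POSIX-format df output (df -P) on stdin. The function name disks_over_threshold and the default of 85 percent are assumptions, and mount points that contain spaces are not handled.

```shell
# Prints mount points whose usage is at or above a threshold (default 85).
# Reads "df -P" output on stdin, so it also works on saved output.
disks_over_threshold() {
    awk -v limit="${1:-85}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # "91%" becomes "91"
            if (use + 0 >= limit + 0) print $6 " " $5
        }
    '
}

# Usage:
#   df -P | disks_over_threshold 90
```

Empty output means no filesystem reached the threshold.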

Checking service health

systemctl --failed

systemctl status nginx
systemctl status postgresql
systemctl status nodesniff-api
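The three status commands above can be folded into one pass that prints a single line per service. This is a sketch under the assumption that the platform runs as the systemd units named in this guide. The function name check_platform_services is hypothetical.

```shell
# Reports the state of each platform service named in this guide.
# Returns non-zero if any of them is not active.
check_platform_services() {
    svc_failed=0
    for svc in nginx postgresql nodesniff-api; do
        # "is-active" prints a one-word state such as active, inactive, or failed,
        # and exits non-zero when the unit is not active.
        svc_state=$(systemctl is-active "$svc" 2>/dev/null) || svc_failed=1
        echo "$svc: ${svc_state:-unknown}"
    done
    return "$svc_failed"
}
```

The non-zero return makes the function usable in a cron job or a simple alerting script.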

Example monthly checklist

# Monthly NodeSniff maintenance

- Review users
- Review inactive servers
- Check failed services
- Check database size
- Check disk usage
- Verify latest backup
- Test restore procedure
- Review release notes
- Plan agent updates

Best Practice: For critical environments, maintenance tasks should be planned and documented. This makes upgrades, incident response, and recovery easier.

Operational Checklist

The following checklist can be used when onboarding a new monitored system.

New system checklist

Confirm system owner
Confirm hostname
Confirm environment
Install the agent
Configure the API token
Verify first successful report
Confirm dashboard visibility

# Quick validation

hostname
systemctl status nodesniff-agent
journalctl -u nodesniff-agent -n 50 --no-pager
curl -I https://monitor.example.com/api/health

Incident Response

During incidents, focus first on restoring visibility. After the system is stable, investigate root cause and document the findings.

Initial checks

date
hostname
uptime
systemctl --failed
df -h
free -m

Connectivity checks

ping monitor.example.com
curl -v https://monitor.example.com/api/health
ss -tulpn

Service checks

systemctl status nodesniff-agent
systemctl status nginx
systemctl status postgresql
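During an incident it is useful to capture the check output in one place so it can be attached to the incident record later. The sketch below runs a subset of the checks above and tolerates missing tools. The function name incident_snapshot and the output path in the usage line are hypothetical.

```shell
# Captures several of the checks above as one text block.
# Each command is allowed to fail so a missing tool does not stop the capture.
incident_snapshot() {
    for snap_cmd in "date" "hostname" "uptime" "systemctl --failed --no-pager" \
                    "df -h" "free -m" "ss -tulpn"; do
        echo "== $snap_cmd =="
        # Unquoted on purpose: the string is split into command and arguments.
        $snap_cmd 2>&1 || echo "(command failed or not available)"
        echo
    done
}

# Usage:
#   incident_snapshot > "/tmp/incident-$(date +%Y%m%d-%H%M%S).txt"
```

Taking one snapshot before and one after a fix makes the later root cause review easier.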

Token Rotation

Token rotation should be performed when a token is exposed, when a system owner changes, or during planned security maintenance.

Rotation process

Generate a new token in the dashboard
Update the agent configuration
Restart the agent
Verify that new reports are accepted
Revoke or invalidate the old token

sudo nano /etc/nodesniff/config.yml

sudo systemctl restart nodesniff-agent

sudo journalctl -u nodesniff-agent -n 50 --no-pager
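When tokens are rotated on many systems, editing each file by hand is error-prone. The sketch below replaces the token value in place and keeps a backup copy. It is an illustration only: the function name set_agent_token is hypothetical, and it assumes the simple token line shown in the example configuration.

```shell
# Hypothetical helper: replace the token value in the agent configuration.
# Keeps a .bak copy next to the file. Run as root for files under /etc.
set_agent_token() {
    tok_file="$1"
    tok_new="$2"

    [ -n "$tok_new" ] || { echo "usage: set_agent_token FILE TOKEN"; return 2; }
    grep -q '^[[:space:]]*token:' "$tok_file" || { echo "no token key in $tok_file"; return 1; }

    cp -p "$tok_file" "$tok_file.bak" || return 1

    # The token is passed through the environment so awk does not
    # interpret special characters in it. Indentation is preserved.
    NEW_TOKEN="$tok_new" awk '
        /^[ \t]*token:/ {
            match($0, /^[ \t]*/)
            print substr($0, 1, RLENGTH) "token: " ENVIRON["NEW_TOKEN"]
            next
        }
        { print }
    ' "$tok_file.bak" > "$tok_file"
}

# Usage:
#   sudo sh -c '. ./token-helper.sh; set_agent_token /etc/nodesniff/config.yml "NEW_VALUE"'
#   sudo systemctl restart nodesniff-agent
```

The .bak file still contains the old token, so remove it once new reports are accepted.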

Environment Variables

Platform deployments usually depend on environment variables for database connections, API configuration, and runtime settings.

Example environment file

# /opt/nodesniff/.env

DATABASE_URL=postgresql://nodesniff:password@localhost:5432/nodesniff
APP_ENV=production
LOG_LEVEL=info
PUBLIC_URL=https://monitor.example.com

Checking environment variables

systemctl show nodesniff-api --property=Environment

cat /opt/nodesniff/.env
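Printing the whole environment file also prints the database password. A safer check is to confirm that the expected keys are present without showing their values. The sketch below is a minimal example: the function name missing_env_keys is hypothetical, and treating the four keys from the example file as required is an assumption.

```shell
# Lists required keys that are missing or empty in an env file.
# Prints key names only, never values.
missing_env_keys() {
    env_file="$1"; shift
    env_missing=0
    for env_key in "$@"; do
        if ! grep -Eq "^${env_key}=.+" "$env_file"; then
            echo "missing: $env_key"
            env_missing=1
        fi
    done
    return "$env_missing"
}

# Usage:
#   missing_env_keys /opt/nodesniff/.env DATABASE_URL APP_ENV LOG_LEVEL PUBLIC_URL
```

The function returns non-zero when at least one key is missing or empty, so it can gate a deployment step.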