When a Nominode is in an unhealthy state, there are several commands you can run to gather information, investigate the source of the issue, and recover from it. Run all of the commands described below from the folder where your Nominode software is installed.
The Nominode software is a multi-container Docker application. The docker ps, docker-compose ps, and docker-compose images commands all produce information about your currently running containers. docker ps includes the Container ID, Image, and Creation Time, which docker-compose ps does not, but it also lists any running containers that are not part of the Nominode application. You can use the --format parameter with docker ps to show just the information not exposed by docker-compose ps and docker-compose images. Below are typical output examples from a healthy Nominode.
docker ps --format "table {{.ID}}\t{{.Names}}\t{{.RunningFor}}\t{{.Status}}"
CONTAINER ID NAMES CREATED STATUS
b2b9a4d678c2 nominode_workers_5 45 minutes ago Up 45 minutes
4c5775c5d364 nominode_scheduler_1 45 minutes ago Up 45 minutes
e5e2af0db190 traefik 45 minutes ago Up 45 minutes
200a9f0cc4db nominode_mysql_1 45 minutes ago Up 45 minutes
708e6fb1ed71 nominode_redis_1 45 minutes ago Up 45 minutes
269f627c7687 nominode_dockerhost_1 45 minutes ago Up 45 minutes
8140b4af383d nominode_api_5 46 minutes ago Up 46 minutes (healthy)
9ddd0d3fac92 nominode_ui_1 49 minutes ago Up 49 minutes (healthy)
ed04ac2d1221 prometheus 17 hours ago Up 17 hours
b723f9f19b3d nominode_minio_1 17 hours ago Up 17 hours
acf82673a77d cadvisor 17 hours ago Up 17 hours (healthy)
e7c2454ae05a nominode_loki_1 17 hours ago Up 17 hours
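If other containers run on the same host, one way to narrow docker ps to the Nominode application is to filter on the project label that docker-compose attaches to the containers it creates. The sketch below assumes the Compose project is named nominode (inferred from the nominode_ container name prefix shown above, not confirmed). nominode_ps is a hypothetical helper name.

```shell
# Hypothetical helper: list only the containers created by one Compose project.
# Assumption: the project name is "nominode"; pass another name as $1 if yours differs.
nominode_ps() {
  project="${1:-nominode}"
  docker ps \
    --filter "label=com.docker.compose.project=${project}" \
    --format 'table {{.ID}}\t{{.Names}}\t{{.RunningFor}}\t{{.Status}}'
}
```

Calling nominode_ps with no argument uses the assumed project name. Calling it with an argument, for example nominode_ps myproject, filters on that project instead.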
./docker-compose ps
Name Command State Ports
--------------------------------------------------------------------------------------------------------------------------
cadvisor /usr/bin/cadvisor -logtost ... Up (healthy) 8080/tcp
nominode_api_5 ./run.sh Up (healthy) 4101/tcp
nominode_dockerhost_1 /entrypoint.sh Up
nominode_loki_1 /usr/bin/loki -config.file ... Up 3100/tcp, 80/tcp
nominode_minio_1 /usr/bin/docker-entrypoint ... Up 9000/tcp
nominode_mysql_1 docker-entrypoint.sh mysqld Up 3306/tcp, 33060/tcp
nominode_redis_1 docker-entrypoint.sh redis ... Up 6379/tcp
nominode_scheduler_1 ./main scheduler Up
nominode_ui_1 docker-entrypoint.sh yarn ... Up (healthy) 7076/tcp
nominode_vault_1 docker-entrypoint.sh serve ... Exit 1
nominode_workers_5 ./main worker Up
prometheus /bin/prometheus --config.f ... Up 9090/tcp
traefik /entrypoint.sh traefik Up 3306/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp
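To spot stopped services quickly, the State column can be scanned for anything other than Up. The following is a minimal sketch that assumes the table layout shown above (two header lines, container name in the first column). not_up is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of containers whose State is not "Up".
# Reads `docker-compose ps` output on stdin.
# Assumption: the first two lines are the column header and the dashed rule.
not_up() {
  awk 'NR > 2 && NF > 0 && $0 !~ /[ \t]Up([ \t]|$)/ { print $1 }'
}
```

For example, ./docker-compose ps | not_up prints one container name per line and prints nothing when every container is Up. Given the sample table above, it would print nominode_vault_1, the one row whose State is Exit 1.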
./docker-compose images
Container Repository Tag Image Id Size
----------------------------------------------------------------------------------------------------------------------------------------------------
cadvisor google/cadvisor v0.33.0 752d61707eac 68.61 MB
nominode_api_5 445607516549.dkr.ecr.us-east-1.amazonaws.com/nomnom/nominode/api1.0 staging-20-02-19-da80a427 ebf4f7786a40 341.7 MB
nominode_dockerhost_1 qoomon/docker-host latest 9d4938b708c5 7.879 MB
nominode_loki_1 grafana/loki v1.2.0 7c166fdca7fe 44.26 MB
nominode_minio_1 minio/minio RELEASE.2019-10-12T01-39-57Z 67fec10c1e88 50.96 MB
nominode_mysql_1 mysql 8 791b6e40940c 465.2 MB
nominode_redis_1 redis 5 44d36d2c2374 98.21 MB
nominode_scheduler_1 445607516549.dkr.ecr.us-east-1.amazonaws.com/nomnom/nominode/api1.0 staging-20-02-19-da80a427 ebf4f7786a40 341.7 MB
nominode_ui_1 445607516549.dkr.ecr.us-east-1.amazonaws.com/nomnom/nominode/ui1.0 staging-20-02-18-dc03fbad c1930afe7e84 734.2 MB
nominode_vault_1 vault 1.3.1 aa7801420b95 139.9 MB
nominode_workers_5 445607516549.dkr.ecr.us-east-1.amazonaws.com/nomnom/nominode/api1.0 staging-20-02-19-da80a427 ebf4f7786a40 341.7 MB
prometheus prom/prometheus v2.15.2 b715301fa5eb 132.7 MB
traefik traefik v2.1 651438efd845 70.34 MB
./docker-compose top
The docker-compose top command displays detailed process information for each container.
cadvisor
UID PID PPID C STIME TTY TIME CMD
--------------------------------------------------------------------------------------------------------------
root 17222 17179 11 Feb18 ? 01:56:34 /usr/bin/cadvisor -logtostderr --housekeeping_interval=5s
nominode_api_5
UID PID PPID C STIME TTY TIME CMD
--------------------------------------------------------------------
root 12813 12784 0 15:16 ? 00:00:00 /bin/sh ./run.sh
root 13071 12813 0 15:16 ? 00:00:01 ./main webserver
root 13166 13071 0 15:16 ? 00:00:04 ./main webserver
root 13172 13166 0 15:16 ? 00:00:28 ./main webserver
nominode_dockerhost_1
UID PID PPID C STIME TTY TIME CMD
--------------------------------------------------------------------------
root 14495 14446 0 15:17 ? 00:00:00 /bin/sh /entrypoint.sh
root 14855 14495 0 15:17 ? 00:00:00 /bin/sh /entrypoint.sh
nominode_loki_1
UID PID PPID C STIME TTY TIME CMD
----------------------------------------------------------------------------------------------------------
root 17168 17087 0 Feb18 ? 00:00:40 /usr/bin/loki -config.file=/etc/loki/local-config.yaml
nominode_minio_1
UID PID PPID C STIME TTY TIME CMD
--------------------------------------------------------------------------
root 17274 17129 0 Feb18 ? 00:00:12 minio server /mnt/data
nominode_mysql_1
UID PID PPID C STIME TTY TIME CMD
---------------------------------------------------------
999 14783 14740 0 15:17 ? 00:00:20 mysqld
nominode_redis_1
UID PID PPID C STIME TTY TIME CMD
----------------------------------------------------------------------
999 14496 14455 0 15:17 ? 00:00:07 redis-server *:6379
nominode_scheduler_1
UID PID PPID C STIME TTY TIME CMD
--------------------------------------------------------------------
root 15918 15891 0 15:18 ? 00:00:02 ./main scheduler
root 16010 15918 0 15:18 ? 00:00:06 ./main scheduler
nominode_ui_1
UID PID PPID C STIME TTY TIME CMD
-----------------------------------------------------------------------------------------------
root 9734 9708 0 15:13 ? 00:00:01 node /opt/yarn-v1.22.0/bin/yarn.js start:prod
root 9836 9734 0 15:13 ? 00:00:06 /usr/local/bin/node build/server.js
nominode_workers_5
UID PID PPID C STIME TTY TIME CMD
-----------------------------------------------------------------
root 16132 16099 0 15:18 ? 00:00:02 ./main worker
root 16350 16132 0 15:18 ? 00:00:03 ./main worker
root 16535 16350 0 15:18 ? 00:00:00 ./main worker
root 16536 16350 0 15:18 ? 00:00:00 ./main worker
root 16537 16350 0 15:18 ? 00:00:00 ./main worker
root 16538 16350 0 15:18 ? 00:00:00 ./main worker
root 16544 16350 0 15:18 ? 00:00:00 ./main worker
root 16545 16350 0 15:18 ? 00:00:00 ./main worker
prometheus
UID PID PPID C STIME TTY TIME CMD
------------------------------------------------------------------------------------------------------------------
nobody 18242 18212 1 Feb18 ? 00:16:52 /bin/prometheus --config.file=/etc/prometheus/prometheus.yml
traefik
UID PID PPID C STIME TTY TIME CMD
-------------------------------------------------------------------
root 15428 15395 0 15:17 ? 00:00:04 traefik traefik
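When only the number of processes per container matters (for instance, to check that a container still has its usual set of processes), the listing can be condensed. This is a rough sketch that assumes the layout shown above: a line with just the container name, then a UID header, a dashed rule, and one line per process. count_procs is a hypothetical helper name.

```shell
# Hypothetical helper: count the processes listed for each container.
# Reads `docker-compose top` output on stdin.
count_procs() {
  awk '
    NF == 0 || /^-+$/ || $1 == "UID" { next }   # skip blanks, rules, headers
    NF == 1 { name = $1; next }                 # a lone word is a container name
    { count[name]++ }                           # anything else is a process row
    END { for (n in count) print n, count[n] }
  ' | sort
}
```

It would be used as ./docker-compose top | count_procs, producing one "name count" pair per line.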
Producing Container Logs
The docker-compose logs command displays the logs from the current run of all of the Nominode containers. It is strongly recommended that you redirect the output to a text file, as it will contain 200 MB or more of data. You can limit the log output to a particular service by including the service name at the end of the command, such as docker-compose logs api or docker-compose logs ui.
./docker-compose logs > nominode-logs.txt
This command will generate all of the logs with output redirected to a text file.
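Because the full log set is large, it can help to capture only the most recent lines of a single service. The sketch below combines the service-name argument described above with the --tail, --timestamps and --no-color options of docker-compose logs. save_service_logs, the COMPOSE variable, the default of 1000 lines and the output file name are all illustrative assumptions.

```shell
# Hypothetical helper: save the most recent log lines of one service to a file.
# Assumptions: run from the Nominode install folder; set COMPOSE to use a
# different docker-compose binary; the output file name is illustrative.
save_service_logs() {
  service="$1"
  lines="${2:-1000}"
  "${COMPOSE:-./docker-compose}" logs --no-color --timestamps \
    --tail="${lines}" "${service}" > "nominode-${service}-logs.txt"
}
```

For example, save_service_logs api 500 would write the last 500 log lines of the api service to nominode-api-logs.txt.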
Recreating and Restarting Containers and Services
In the course of bringing a Nominode back to a healthy state, it may be necessary to restart some of its containers or services.
./nnode restart
Gracefully restarts all running services, waiting for any Tasks running on the Nominode to complete prior to restarting a service. Useful for picking up changes made to a Nominode's config.ini file.
sudo systemctl restart docker or sudo service docker restart
Restarts just the Docker service. Useful if the Docker service has entered an erratic state due to memory or other resource exhaustion.
./docker-compose restart
Forcefully restarts all stopped and running services. Useful if a Nominode service dies, disappears or becomes unresponsive.
./docker-compose up -d
Builds, (re)creates and starts containers for all services. If there are existing containers for a service, and the service’s configuration or image was changed after the container’s creation, docker-compose up picks up the changes by stopping and recreating the containers (preserving mounted volumes). Useful if changes are made to certain Nominode configuration files, like the .env file.
./docker-compose down
Stops the services and removes the containers and networks created by docker-compose up. Volumes and images are left in place unless you add the --volumes or --rmi option. Useful when you want to completely stop the Nominode software.
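After a restart or docker-compose up -d, containers that define a health check take a moment to report (healthy). One way to wait for that is to poll docker inspect, as in the sketch below. It only applies to containers with a health check (in the sample output earlier, those whose status includes "(healthy)"). wait_healthy, the default of 30 attempts and the WAIT_SECONDS variable are illustrative assumptions.

```shell
# Hypothetical helper: poll a container until its health check reports "healthy".
# Usage: wait_healthy CONTAINER [ATTEMPTS]
# Returns 0 once healthy, 1 if the attempts run out.
wait_healthy() {
  name="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # Prints e.g. "starting", "healthy" or "unhealthy"; errors out (and is
    # ignored here) for containers that define no health check.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${WAIT_SECONDS:-2}"
  done
  return 1
}
```

For example, ./docker-compose up -d followed by wait_healthy nominode_api_5 would block until the api container reports healthy or roughly a minute has passed.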