All systems, even those with high uptime and reliability, need to be restarted on occasion. When working with complex media types, schemas, etc., unexpected cases, load, or configuration issues can require a restart. Where there is a known bug, restarting also provides a workaround while a support fix is being completed and validated.
The Insertions system is currently implemented with processing queues. Insertions include ingesting new media elements, adding new predictions, updating existing predictions, copying internal files, etc.
The following cases may be good reasons to restart:
- A Workload that previously worked well appears not to be working for new inputs, or is taking substantially longer than normal to process.
- No new transactions are working.
- Queue Status analytics are showing a need to restart.
The processing service is different from the online service. End users will still be able to query data and annotate existing elements while the processing queue is blocked and while it's being restarted.
To restart the processing service, contact your Infra Admin.
The Infra Admin should:
- In the K8s context, delete all Walrus pods.
- Verify that the Walrus pods automatically respawn.
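The two Infra Admin steps above can be sketched as a small Python wrapper around `kubectl`. This is an illustration only, not an official Diffgram script: the namespace `default` and the label selector `app=walrus` are placeholder assumptions, so check your own deployment for the values that actually match the Walrus pods. It relies only on the standard `kubectl delete pods` and `kubectl get pods` commands with the `-n` and `-l` flags.

```python
import subprocess

# Assumptions (not from the Diffgram docs): both values below are placeholders.
# Look up the real namespace and pod labels in your own cluster.
NAMESPACE = "default"
WALRUS_SELECTOR = "app=walrus"


def delete_command(namespace=NAMESPACE, selector=WALRUS_SELECTOR):
    """Build the kubectl command that deletes every pod matching the selector."""
    return ["kubectl", "delete", "pods", "-n", namespace, "-l", selector]


def list_command(namespace=NAMESPACE, selector=WALRUS_SELECTOR):
    """Build the kubectl command that lists pods matching the selector."""
    return ["kubectl", "get", "pods", "-n", namespace, "-l", selector]


def restart_walrus(namespace=NAMESPACE, selector=WALRUS_SELECTOR,
                   run=subprocess.run):
    """Delete the Walrus pods, then list the pods so respawn can be checked.

    `run` is injectable so the function can be exercised without a cluster.
    """
    for command in (delete_command(namespace, selector),
                    list_command(namespace, selector)):
        # check=True raises CalledProcessError if kubectl exits non-zero.
        run(command, check=True)
```

Calling `restart_walrus()` from a shell that has the right K8s context selected runs both commands in order. The listing at the end should show fresh pods with a low age; if none appear, the pods are probably not managed by a controller that recreates them, and that needs investigating before going further.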
In some rare cases you may also need to restart the database to clear existing connections.
If this is a recurring issue, or an issue in scope for engineering or support SLAs, be sure to communicate every restart event to your Diffgram support team.