Version: 0.9.12

Redeploy a service

platform v0.9.11, verified 2026-05-14

This recipe gets a service back to a clean, fully-running state after a failed or interrupted update. It applies to any service under /opt/services/<service> (web, api, voice, telpro, database, media, ops, squid).

When to use this

  • A previous update.sh run exited non-zero and left the containers mixed (some on the new image, some on the old).
  • You changed vars.yaml and want the running containers to pick the changes up.
  • You need to roll back to a previous ECR_TAG.

When not to use this

  • The host itself is unreachable or the disk is full — fix the host first.
  • A Postgres migration partly succeeded and the schema is in an unknown state. Follow Partial migration recovery below instead; blindly re-running update.sh is not safe in that case.

Step 1 — Confirm the current state

cd /opt/services/<service>
docker compose ps --all
docker compose images

Note the image tag of every container and whether any are missing or unhealthy.
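Spotting a mixed deployment by eye gets harder as the container count grows. The helper below is a minimal sketch, not part of the platform tooling: find_mixed_tags is a hypothetical name, and it assumes you can produce one "container<TAB>image" line per container (for example from a Compose release whose ps --format accepts Go templates). It groups images by repository and reports any repository running under more than one tag.

```shell
# Reads "container<TAB>image" lines on stdin and prints every repository
# that is running under more than one tag. Exit status is 1 if any are mixed.
# An image without an explicit tag is counted as "latest".
find_mixed_tags() {
  awk -F'\t' '
    {
      image = $2
      repo = image
      tag = "latest"
      # Last colon that is not followed by a slash, so a registry port
      # such as localhost:5000/web is not mistaken for a tag.
      n = match(image, ":[^:/]*$")
      if (n > 0) {
        repo = substr(image, 1, n - 1)
        tag = substr(image, n + 1)
      }
      key = repo SUBSEP tag
      if (!(key in seen)) {
        seen[key] = 1
        count[repo]++
        tags[repo] = tags[repo] " " tag
      }
    }
    END {
      status = 0
      for (r in count) {
        if (count[r] > 1) {
          print r ":" tags[r]
          status = 1
        }
      }
      exit status
    }'
}

# Possible usage (assumes Go-template support in "docker compose ps --format"):
#   tab="$(printf '\t')"
#   docker compose ps --all --format "{{.Name}}${tab}{{.Image}}" | find_mixed_tags
```

No output and a zero exit status mean each repository is on a single tag. Missing containers still need the manual check against docker compose ps --all.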

Step 2 — Resync config

./init.sh

init.sh re-pulls service config from S3, refreshes environment variables from SSM and Secrets Manager, and logs in to ECR. It does not restart healthy containers on its own.

Step 3 — Apply the desired tag

Choose the tag you want to converge to. To re-apply the current release:

./update.sh --ecr-tag "$(cat .ecr-tag)"

To roll back:

./update.sh --ecr-tag <previous-tag>

To restart on the current images without changing anything:

./update.sh --restart-only
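If .ecr-tag is missing or empty, the re-apply command passes an empty tag to update.sh. A small guard can catch that first. This is a sketch under assumptions: read_ecr_tag is a hypothetical helper, and it assumes the file holds a single tag with no internal whitespace.

```shell
# Print the tag recorded in a tag file (default: .ecr-tag in the current
# directory). Fails with a message on stderr if the file is unreadable or
# contains no tag. The body runs in a subshell so its variables stay local.
read_ecr_tag() (
  file="${1:-.ecr-tag}"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    exit 1
  fi
  # Strip the trailing newline and any stray whitespace around the tag.
  tag="$(tr -d '[:space:]' < "$file")"
  if [ -z "$tag" ]; then
    echo "$file is empty" >&2
    exit 1
  fi
  printf '%s\n' "$tag"
)

# Possible usage: only call update.sh when a tag was actually read.
#   tag="$(read_ecr_tag)" && ./update.sh --ecr-tag "$tag"
```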

Step 4 — Verify

docker compose ps

Every container should be Up (healthy). From a remote terminal, confirm the service responds on its public endpoint (Web/API: HTTPS health; Voice/TelPro: a brief SIP/RTP test if you have one ready).

Then check SigNoz to confirm the error rate is back at its baseline.
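The "Up (healthy)" check can also be scripted so that a single unhealthy container is not missed. The sketch below is illustrative only: list_not_healthy is a hypothetical name, and it assumes one "container<TAB>state<TAB>health" line per container, with state and health values as Docker reports them (running, exited, healthy, starting, and so on).

```shell
# Reads "container<TAB>state<TAB>health" lines on stdin and prints every
# container that is not running or not healthy. Exit status is 1 if any
# are found. An empty health column (no healthcheck defined) is reported
# as "none", because the recipe expects every container to be healthy.
list_not_healthy() {
  awk -F'\t' '
    $2 != "running" || $3 != "healthy" {
      printf "%s\tstate=%s\thealth=%s\n", $1, $2, ($3 == "" ? "none" : $3)
      bad = 1
    }
    END { exit bad + 0 }'
}

# Possible usage (assumes Go-template support in "docker compose ps --format"):
#   tab="$(printf '\t')"
#   docker compose ps --all --format "{{.Name}}${tab}{{.State}}${tab}{{.Health}}" \
#     | list_not_healthy
```

A container that is still starting is reported as well, so allow the healthchecks time to settle before treating the output as a failure.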

Partial migration recovery

If update.sh failed during a database migration on the Web service, the schema may be partially advanced. Do not just retry.

STOP

Confirm with whoever owns the release whether the new tag's migrations are designed to be re-runnable after partial failure. If you don't know, treat the database as degraded and consult support before proceeding.

If you have confirmed the migrations are safe to re-attempt:

  1. Stop the Web containers: docker compose stop voiceai-telweb.

  2. Capture a Postgres snapshot before anything else (see Restore Postgres for the snapshot mechanism your deployment uses).

  3. Re-run the migration step explicitly. The Web container runs prisma migrate deploy at start; bring it back up and tail logs:

    docker compose up -d voiceai-telweb
    docker compose logs -f voiceai-telweb

  4. If the migration fails again with the same error, stop here. Do not loop. File a ticket as described in Getting help, and include the failing migration name and the error.
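For the ticket in item 4, the failing migration name and error can be pulled out of the container logs rather than copied by hand. This is a rough sketch: extract_migration_failure is a hypothetical helper, and it assumes Prisma reports failures with a P3xxx error code and a "Migration name:" line, which is what the Prisma CLI typically prints. Check it against your actual logs before relying on it.

```shell
# Reads log text on stdin and prints each distinct Prisma migration error
# code (P3 followed by three digits) and each distinct "Migration name: ..."
# entry, in the order they first appear.
extract_migration_failure() {
  awk '
    {
      line = $0
      while (match(line, /P3[0-9][0-9][0-9]|Migration name: [^ \t]+/)) {
        hit = substr(line, RSTART, RLENGTH)
        if (!seen[hit]++) print hit
        line = substr(line, RSTART + RLENGTH)
      }
    }'
}

# Possible usage:
#   docker compose logs --no-color voiceai-telweb | extract_migration_failure
```

Empty output means neither pattern appeared in the logs. It does not by itself prove the migration succeeded.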

Once the schema converges, finish the standard Step 4 — Verify.

See also