CloudDock DeepSpeed 1.0.3

DeepSpeed 1.0.3 is a visibility + reliability upgrade. The Console gets clearer “what is running right now?” feedback, Launcher gains a DS Console status indicator, and the base is rebuilt on Universal Usagi 4.1.5 for stronger safety defaults (high-level only). Plus: JupyterLab environment separation so TensorFlow and PyTorch stop fighting each other.

Console tabs show running-job steps · Launcher: DS Console status indicator · Safer baseline (Universal Usagi 4.1.5) · JupyterLab: PyTorch venv + TensorFlow kernel split
Who is this for? If you run DeepSpeed jobs and want a clean “glance → know” experience (without babysitting logs), 1.0.3 makes status and progress more obvious. If you use both PyTorch and TensorFlow notebooks, the new kernel split avoids dependency collisions.
CloudDock DeepSpeed 1.0.3 overview
Figure 1 — Overview: steps shown directly in Console tabs, Launcher status indicator, safer baseline via Usagi 4.1.5, and cleaner JupyterLab environments.

What’s new in 1.0.3 (vs 1.0.2)

  • Console tabs now display running-job steps: the active job’s step counter is surfaced directly in the tab UI (so you can track progress without switching panels or scrolling logs).
  • Launcher: DS Console status indicator: the Launcher DeepSpeed Console page now shows a clear status light / state indicator (busy/idle + quick health signal at a glance).
  • Universal Usagi 4.1.5 base (security uplift): 1.0.3 inherits stronger safety defaults from the latest Universal Usagi baseline. This is intentionally documented at a high level.
  • JupyterLab environment split: TensorFlow (DS/ML/Kaggle) is separated into its own kernel, while PyTorch stays in its own venv. Result: fewer “it worked yesterday” dependency conflicts.
Note on security details: 1.0.3 includes a significant safety uplift, but specific mechanisms are not listed here by design. Your workflow does not change — you just get a safer baseline.

DeepSpeed Console upgrade: Steps shown in tabs

1.0.3 adds a small but high-impact UI improvement: the Console’s job tabs can now display the current running job’s step count. This is meant to answer one question instantly: “Is it actually moving?”

  • When a job is running: the active tab shows live step updates.
  • When no job is running: tabs remain clean (no noisy placeholders).
  • Fallback behavior: if step is temporarily unavailable, the UI stays stable and does not “blink” aggressively.
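The fallback behavior above can be sketched as a small label function (the names here are illustrative assumptions; the actual Console implementation is not published in these notes):

```python
def tab_label(job_name, running, step=None, last_label=None):
    """Return the text shown on a Console job tab.

    - Running job with a known step: show live progress.
    - Running job with step temporarily unavailable: keep the last
      stable label instead of "blinking" back and forth.
    - No job running: keep the tab clean (name only, no placeholder).
    """
    if not running:
        return job_name
    if step is not None:
        return f"{job_name} · step {step}"
    # Fallback: step unknown right now -> stay on the last stable label.
    return last_label or job_name

print(tab_label("train-bert", running=True, step=120))   # train-bert · step 120
print(tab_label("train-bert", running=True, step=None,
                last_label="train-bert · step 120"))     # train-bert · step 120
print(tab_label("train-bert", running=False))            # train-bert
```

The key design choice is the last line: when the step counter momentarily drops out, the tab keeps showing the previous value rather than flickering to a placeholder.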
DeepSpeed Console 1.0.3: steps shown in tabs
Figure 2 — DeepSpeed Console 1.0.3: job tabs show the running job’s steps for quick progress checks.

Launcher integration: DS Console status indicator

The Launcher DeepSpeed Console page now includes a dedicated status indicator, so you can see whether the Console is busy or idle (and healthy) without opening the full Console UI. This makes instance navigation easier, especially when you’re juggling multiple tools.

Launcher: DS Console status indicator
Figure 3 — Launcher: DS Console status indicator (quick glance health + busy/idle signal).
Workflow tip: Use Launcher as your “control tower.” If the status shows busy, jump into Console to view logs/steps. If it’s idle, you can start a new run with confidence (and avoid accidentally starting a duplicate run).
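The “control tower” habit boils down to a simple state mapping. A hedged sketch (the state names are assumptions for illustration, not the Launcher’s actual API):

```python
def indicator(busy: bool, healthy: bool) -> str:
    """Map the two signals the Launcher surfaces into one glanceable state."""
    if not healthy:
        return "attention"   # check Console logs before starting anything
    return "busy" if busy else "idle"

state = indicator(busy=False, healthy=True)
if state == "idle":
    print("safe to start a new run")           # no duplicate-run risk
elif state == "busy":
    print("open Console to watch steps/logs")  # don't start a second job
```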

JupyterLab: TensorFlow kernel split (no more dependency fights)

DeepSpeed users often overlap with DS/ML/Kaggle workflows. In previous setups, installing or upgrading one stack could break the other. In 1.0.3, JupyterLab is structured so environments remain predictable:

  • PyTorch: stays in a dedicated venv (your PyTorch “daily driver”).
  • TensorFlow (DS/ML/Kaggle): moved into a separate Jupyter kernel.
Why this matters: You can now use both stacks on the same instance without the classic “pip install X → torch/tf breaks” cascade. This also reduces support time because the baseline is more deterministic.
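You can verify the split yourself from any notebook cell. This stdlib-only check reports which stacks the current kernel can import (it assumes nothing about CloudDock internals):

```python
from importlib.util import find_spec

def available_stacks():
    """Report which ML stacks are importable in the *current* kernel."""
    return {name: find_spec(name) is not None
            for name in ("torch", "deepspeed", "tensorflow")}

print(available_stacks())
```

In the PyTorch venv you would expect torch/deepspeed to be True and tensorflow False; in the TensorFlow (DS/ML/Kaggle) kernel, the reverse.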

How to select the right kernel

  1. Open JupyterLab.
  2. Create a new notebook.
  3. In the kernel picker, choose:
    • PyTorch kernel/venv for torch + deepspeed workflows
    • TensorFlow (DS/ML/Kaggle) kernel for tf workflows
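If you’re ever unsure which environment a kernel actually runs in, print its interpreter paths from a notebook cell (stdlib only):

```python
import sys

# The interpreter path reveals the environment backing this kernel,
# e.g. a path under the PyTorch venv vs the TensorFlow kernel's prefix.
print(sys.executable)
print(sys.prefix)
```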
JupyterLab kernel selection: PyTorch venv vs TensorFlow kernel
Figure 4 — JupyterLab: clean environment split (PyTorch venv vs TensorFlow kernel).

Two ways to run training

1) CLI mode (terminal)

CLI remains the highest-control path. Your scripts, your flags, your configs. 1.0.3 does not remove power-user freedom — it improves visibility around what is running.

# Example (single GPU)
deepspeed --num_gpus=1 train.py \
  --dataset /workspace/data \
  --output_dir /workspace/output \
  --lr 3e-5 --epochs 1 --batch_size 8


# Recommended: keep datasets + outputs in /workspace for clean transfers.
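For completeness, here is a minimal sketch of a train.py entrypoint that accepts the flags used above. This is an illustration, not the actual script: a real DeepSpeed script would also build a model and call deepspeed.initialize(...).

```python
import argparse

def parse_args(argv=None):
    p = argparse.ArgumentParser(description="DeepSpeed training entrypoint (sketch)")
    p.add_argument("--dataset", required=True)
    p.add_argument("--output_dir", required=True)
    p.add_argument("--lr", type=float, default=3e-5)
    p.add_argument("--epochs", type=int, default=1)
    p.add_argument("--batch_size", type=int, default=8)
    # The deepspeed launcher passes --local_rank to each worker process.
    p.add_argument("--local_rank", type=int, default=-1)
    return p.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    print(f"training: lr={args.lr} epochs={args.epochs} bs={args.batch_size}")
```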

2) GUI mode (CloudDock DeepSpeed Console)

GUI remains the fastest “blank instance → first run” path. In 1.0.3, job tracking is easier because you can see step progress directly in tabs.

DeepSpeed Console 1.0.3: job view with steps and logs
Figure 5 — Console: steps surfaced for the running job, plus the usual logs and health signals.

Recommended folder convention

/workspace/
  train.py
  data/
  output/
  configs/
  notebooks/
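You can set this layout up in one shot. A small sketch (the base path is a parameter so you can try it anywhere; on CloudDock it would be /workspace):

```python
from pathlib import Path

def make_workspace(base="/workspace"):
    """Create the recommended folder convention under `base`."""
    base = Path(base)
    for sub in ("data", "output", "configs", "notebooks"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    (base / "train.py").touch(exist_ok=True)  # placeholder entrypoint
    return sorted(p.name for p in base.iterdir())
```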

Upgrade notes (from 1.0.2)

  • Console tabs show steps automatically: no action needed — just start a run and you’ll see step updates where it matters.
  • Launcher status indicator: if you rely on Launcher as your entry point, you’ll notice DS Console state immediately.
  • Jupyter changes: if you previously installed TensorFlow into your PyTorch environment manually, stop doing that — use the dedicated TensorFlow (DS/ML/Kaggle) kernel instead.
  • Compatibility expectation: existing DeepSpeed scripts should run the same — 1.0.3 focuses on safer baseline + better visibility, not changing how your jobs are launched.
Less guessing. More training. (And fewer broken venvs.)