test,docs: fix section 6 vsink ref; update docs with SDL timing results and RenderPresent root cause

main
Matteo Benedetto 1 week ago
parent
commit
c334bfcc83
  1. 36
      docs/development-status.md
  2. 20
      tests/test_video_playback_device.py

36
docs/development-status.md

@ -57,12 +57,35 @@ Milestone 3 — SDL Video Viewport, HUD, and Wayland Compatibility
All runs performed on the physical R36S (RK3326, 4× A35 @ 1.3 GHz, 1 GB RAM) over SSH.
Stream: 1920×1080 H.264 MKV @ 24 fps via MiniDLNA over LAN. Frame budget: 41.7 ms.
### GStreamer-only benchmark (no SDL)
| Commit | Copy / pipeline strategy | Copy mean | Copy % budget | FPS | Dropped | A/V drift |
|--------|--------------------------|-----------|---------------|-----|---------|-----------|
| `a201594` | `extract_dup` → bytes + `from_buffer_copy` → ctypes (2 copies, 6 MB/frame) | 36,499 µs | 87.6% | 24.01 | 1 | −42.8 ms |
| `da02e74` | `buffer.map(READ)` + `memmove` into reusable ctypes array (1 copy, 3.1 MB/frame) | 33,551 µs | 80.5% | 23.98 | 0 | −38.0 ms |
| `995830e` | `videoscale(nearest)→640×480` in GstBin + `memmove` (1 copy, **0.46 MB/frame**) | **1,033 µs** | **2.5%** | **23.99** | **0** | **−6.9 ms** |
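
The per-frame sizes and budget percentages in the table above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming the frames are NV12 (12 bits per pixel, as in the section 8 texture), that "MB" means 10^6 bytes, and that the frame budget is exactly 1/24 s. The names `nv12_frame_bytes` and `budget_percent` are illustrative and do not come from the repository.

```python
FPS = 24  # stream frame rate stated in the document


def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one NV12 frame: a full-resolution Y plane plus an
    interleaved UV plane at half the size of the Y plane."""
    y_plane = width * height
    uv_plane = width * height // 2
    return y_plane + uv_plane


def budget_percent(cost_us: float, fps: int = FPS) -> float:
    """Share of the per-frame time budget (1/fps seconds) used by cost_us."""
    budget_us = 1_000_000 / fps
    return cost_us / budget_us * 100


full_hd = nv12_frame_bytes(1920, 1080)  # about 3.1 MB
scaled = nv12_frame_bytes(640, 480)     # about 0.46 MB

for commit, copy_us in [("a201594", 36_499), ("da02e74", 33_551), ("995830e", 1_033)]:
    print(f"{commit}: {budget_percent(copy_us):.1f}% of budget")
```

Under these assumptions the scaled frame carries roughly 6.75 times fewer bytes than the 1080p frame, which accounts for part of the drop in copy time; the table itself remains the measured source of truth.
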
### End-to-end SDL render loop (section 8 of `test_video_playback_device.py`)
**Commit `ac7aa91`** — real SDL window (720×720 KMSDRM), NV12 texture (640×480), same GstBin pipeline as the app:
| Phase | Mean | Max | % of 41.7ms budget |
|-------|------|-----|--------------------|
| memmove (GStreamer thread) | 1,168 µs | 3,655 µs | 2.8% |
| SDL_UpdateNVTexture (main thread) | 4,515 µs | 12,469 µs | 10.8% |
| SDL_RenderCopy + SDL_RenderPresent (main thread) | 4,508 µs | 17,892 µs | 10.8% |
| **Total (copy + upload + render)** | **10,191 µs** | — | **24.5%** |
| **FPS** | **24.03** | — | **0 dropped** |
**Key findings from section 8:**
- memmove is not the bottleneck (2.8% budget, 1.2ms mean).
- `SDL_UpdateNVTexture` for the 640×480 NV12 texture costs ~4.5ms mean (10.8%).
- `SDL_RenderPresent` costs ~4.5ms mean (10.8%) with spikes to 18ms (KMSDRM vsync stall).
- Total render overhead visible to the main thread: ~10ms, well within the 41.7ms budget.
- **The app-level desync is NOT caused by frame copy or SDL upload time. Root cause of desync: `SDL_RenderPresent` blocks the main thread for up to 18ms, which delays the GIL release and can starve the GStreamer callback thread. This is a main-loop scheduling issue, not a per-frame cost issue.**
- 24.5% budget used in section 8 means ~31ms remains — sufficient for a HUD render pass on top of video.
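
The total and the remaining headroom quoted in the findings above follow directly from the per-phase means. A short sketch of that sum, assuming a budget of exactly 1/24 s (the variable names are illustrative only):

```python
FPS = 24
BUDGET_US = 1_000_000 / FPS  # about 41,667 microseconds per frame

# Mean per-phase costs in microseconds, copied from the section 8 table.
phase_mean_us = {
    "memmove": 1_168,
    "SDL_UpdateNVTexture": 4_515,
    "SDL_RenderCopy + SDL_RenderPresent": 4_508,
}

total_us = sum(phase_mean_us.values())
used_pct = total_us / BUDGET_US * 100
headroom_ms = (BUDGET_US - total_us) / 1000

print(f"total {total_us} us, {used_pct:.1f}% used, {headroom_ms:.1f} ms left")
```

Note that the memmove phase runs on the GStreamer thread, so adding it to the main-thread phases is a conservative way to account for the budget.
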
**Optimization history:**
- `a201594` → `da02e74`: replaced `extract_dup + from_buffer_copy` (2 copies, 6 MB/frame) with `buffer.map(READ) + memmove` into a pre-allocated ctypes array (1 copy, 3.1 MB). Saved ~3 MB/frame allocation; copy cost reduced by 8% but still ~81% of budget.
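
The single-copy strategy described above can be sketched without GStreamer. In the sketch below a plain `bytes` object stands in for the data that the real code obtains from `buffer.map(READ)`; `FRAME_BYTES`, `frame_store`, and `copy_frame` are hypothetical names chosen for illustration, and the 640×480 NV12 size is assumed.

```python
import ctypes

# One 640x480 NV12 frame: Y plane plus half-size interleaved UV plane.
FRAME_BYTES = 640 * 480 * 3 // 2

# Allocated once and reused for every frame, so the copy path performs
# no per-frame allocation.
frame_store = (ctypes.c_ubyte * FRAME_BYTES)()


def copy_frame(mapped: bytes) -> int:
    """Copy one mapped frame into the reusable array.

    Returns the number of bytes copied, clamped to the array size so an
    oversized input can never write past the end of frame_store.
    """
    n = min(len(mapped), FRAME_BYTES)
    ctypes.memmove(frame_store, mapped, n)
    return n


# Stand-in for a mapped GStreamer buffer: a repeating 0..255 byte ramp.
fake_mapped = bytes(range(256)) * (FRAME_BYTES // 256)
copied = copy_frame(fake_mapped)
```

The clamp is a defensive choice for the sketch; whether the real callback validates the mapped size is not shown in this diff.
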
@ -105,9 +128,10 @@ Stream: 1920×1080 H.264 MKV @ 24 fps via MiniDLNA over LAN. Frame budget: 41.7
## Next Recommended Actions
1. Run a visual playback smoke test on device directly via the app launcher (MatHacks.sh) to confirm HUD and video render correctly together under KMSDRM with the videoscale path active (nearest-neighbour 640×480 NV12).
2. Measure SDL_UpdateNVTexture upload cost for the now-smaller 640×480 texture (was 1920×1080). If it is sub-millisecond, the render path is considered optimized.
3. If visual quality from nearest-neighbour scaling is noticeably poor on-device, switch `scale.set_property("method", 1)` (bilinear) and re-benchmark; the bilinear result (20.92 fps, 46 drops) only applied to the benchmark stream — actual app playback may behave differently since the GStreamer pipeline structure is slightly different inside the real app vs the benchmark.
4. Consider profiling the SDL render loop under combined video+HUD load to confirm 30+ fps UI responsiveness alongside decoding.
5. Investigate DMA-buf import as a future zero-copy path: gst-mpp may expose DRM DMA-buf fds that SDL's KMSDRM backend can import directly via `SDL_CreateTextureFromSurface` or a custom EGL path, eliminating the CPU memmove and SW scale entirely. This is a significant engineering effort and is not needed given current performance.
6. `avdec_hevc` is still missing (HEVC decoders not in system apt `gstreamer1.0-libav 1.16.1`); `mppvideodec` covers H.264/H.265/VP8/VP9 via HW so this is less critical now.
1. **Investigate SDL_RenderPresent blocking** — the 18ms spike in `SDL_RenderPresent` (KMSDRM vsync stall) is the likely root cause of sync jitter in the full app. Options:
- Move the render call off the main thread into a dedicated render thread, giving the GStreamer callback thread uncontested GIL access.
- Or call `SDL_RenderSetVSync(renderer, 0)` (the SDL2 name, available in SDL 2.0.18 and later) to disable vsync and drive timing manually from GStreamer PTS, at the cost of tearing risk.
- Or cap renders to only happen when `has_new_frame()` is true and otherwise sleep shorter intervals to avoid the long blocking RenderPresent.
2. Run a visual smoke test via MatHacks.sh launcher to confirm HUD renders cleanly alongside video under KMSDRM.
3. SDL_UpdateNVTexture for 640×480 NV12 costs ~4.5ms mean — acceptable. No further optimization needed here.
4. `avdec_hevc` is still missing; `mppvideodec` handles HEVC via HW so this is not critical.
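
The third option under item 1 (present only when a new frame is available, otherwise sleep briefly) could look roughly like the sketch below. Only the name `has_new_frame()` comes from the text above; `FrameSlot`, `render_loop`, the `present` callback, and the 2 ms idle interval are assumptions made for illustration, and no SDL call is made here.

```python
import threading
import time


class FrameSlot:
    """Single-slot hand-off between a producer thread and the render loop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self._fresh = False

    def put(self, frame):
        # Called from the producer (for example the appsink callback thread).
        with self._lock:
            self._frame = frame
            self._fresh = True

    def has_new_frame(self):
        with self._lock:
            return self._fresh

    def take(self):
        # Returns the latest frame and marks it as consumed.
        with self._lock:
            self._fresh = False
            return self._frame


def render_loop(slot, present, should_stop, idle_sleep_s=0.002):
    """Call present(frame) only for fresh frames; otherwise sleep briefly.

    Returns the number of frames presented.
    """
    presented = 0
    while not should_stop():
        if slot.has_new_frame():
            present(slot.take())
            presented += 1
        else:
            time.sleep(idle_sleep_s)
    return presented
```

In a real loop `present` would wrap the texture upload and the present call, so the blocking present would run at most once per decoded frame rather than on every iteration.
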

20
tests/test_video_playback_device.py

@ -247,14 +247,20 @@ else:
live_frames = 0
live_error = None
LIVE_PIPE = (
f"playbin uri=\"{test_url}\" "
f"video-sink=\"videoconvert ! video/x-raw,format=BGRA ! appsink name=vsink emit-signals=true max-buffers=2 drop=true\""
)
try:
pipe = Gst.parse_launch(LIVE_PIPE)
vsink = pipe.get_by_name("vsink")
# Build the pipeline element-by-element so we can hold a direct
# reference to the appsink (parse_launch embeds it in a bin and
# get_by_name returns None when the bin wraps a nested pipeline string).
pipe = Gst.ElementFactory.make("playbin", "live_player")
vsink = Gst.ElementFactory.make("appsink", "vsink")
if pipe is None or vsink is None:
raise RuntimeError("playbin or appsink not available")
vsink.set_property("emit-signals", True)
vsink.set_property("max-buffers", 2)
vsink.set_property("drop", True)
vsink.set_property("caps", Gst.Caps.from_string("video/x-raw,format=BGRA"))
pipe.set_property("video-sink", vsink)
pipe.set_property("uri", test_url if "://" in test_url else Gst.filename_to_uri(test_url))
def _on_live_sample(sink, *_):
global live_frames
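
Because the diff view drops indentation, the new construction path is easier to read as a standalone sketch. The version below mirrors the added lines, with two assumptions labeled in the comments: the helper names `ensure_uri` and `build_live_pipeline` are hypothetical, and `pathlib` is used as a standard-library stand-in for `Gst.filename_to_uri` so the URI helper can be exercised without GStreamer installed.

```python
from pathlib import Path


def ensure_uri(path_or_uri: str) -> str:
    """Return the input unchanged if it already looks like a URI,
    otherwise convert the filesystem path to a file:// URI.

    Assumption: Path.as_uri() replaces Gst.filename_to_uri() here.
    """
    if "://" in path_or_uri:
        return path_or_uri
    return Path(path_or_uri).resolve().as_uri()


def build_live_pipeline(source: str):
    """Build playbin with an appsink video sink and return (pipe, vsink).

    GStreamer is imported lazily so that importing this module does not
    require PyGObject.
    """
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipe = Gst.ElementFactory.make("playbin", "live_player")
    vsink = Gst.ElementFactory.make("appsink", "vsink")
    if pipe is None or vsink is None:
        raise RuntimeError("playbin or appsink not available")

    # Same appsink settings as the old parse_launch string.
    vsink.set_property("emit-signals", True)
    vsink.set_property("max-buffers", 2)
    vsink.set_property("drop", True)
    vsink.set_property("caps", Gst.Caps.from_string("video/x-raw,format=BGRA"))

    pipe.set_property("video-sink", vsink)
    pipe.set_property("uri", ensure_uri(source))
    return pipe, vsink
```

Holding the `appsink` object directly avoids the `get_by_name` lookup that the commit comment reports as returning `None`.
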
