SDL2/GStreamer DLNA browser for R36S by Matteo Benedetto

Development Status

Current Milestone

Milestone 3 — SDL Video Viewport, HUD, and Wayland Compatibility

Current Architecture Decisions

  • Language / UI: Python 3.9+ with PySDL2 (ctypes wrapper around system SDL2), with UI layout metrics now computed from the runtime window/display size instead of being fixed to 640x480
  • DLNA Discovery: Custom SSDP M-SEARCH implementation using asyncio datagrams + aiohttp for device description XML
  • Content Browsing: Direct SOAP/XML ContentDirectory client with DIDL-Lite parser (no dependency on async-upnp-client browsing at runtime — only aiohttp)
  • Playback: integrated GStreamer backend via PyGObject / GstPlayBin, decoding video into GstAppSink frames that are uploaded to SDL textures and rendered in the main SDL renderer
  • Playback viewport: SDL scales decoded video into a dedicated playback viewport in the same render pass as the HUD, with full-width video bounds and a black playback backdrop outside the viewport
  • Concurrency: Dedicated asyncio event loop in a daemon thread; thread-safe queues bridge it to the SDL2 main loop
  • Input: Keyboard mapping for desktop testing + SDL2 GameController for R36S D-pad/buttons
  • Font: Bundled package font preferred first; system font fallback kept only for development hosts
  • UI icons: prefer bundled monochrome glyphs or a bundled icon font subset instead of depending on OS emoji fonts
  • Playback HUD: SDL-rendered overlay is now simplified and compact, uses bundled playback icons, uses a smaller dedicated playback font with title ellipsis for 640x480 readability, supports auto-hide or fixed visibility, stays visible while playback is paused, and remains in the same SDL render pass as video
  • Wayland / DRM strategy: playback no longer depends on native overlay sinks or X11 window handles; R36S-class targets continue to prefer kmsdrm when no display server is present
  • Deploy packaging: a conda-forge-oriented environment.yml now defines a reproducible Miniforge/Miniconda environment for local development and release preparation
  • Python packaging: direct runtime dependency declarations now explicitly include aiohttp instead of relying on transitive installation through other packages
  • ArkOS deploy layout: on-device installs should place Miniforge and the git checkout under /home/ark to avoid the full /roms partition, while EmulationStation integration should stay lightweight under /roms/ports
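
The concurrency decision above (an asyncio loop in a daemon thread, bridged to the SDL2 main loop through thread-safe queues) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `AsyncWorker` class, its method names, and the `(tag, value, error)` tuple shape are all hypothetical.

```python
import asyncio
import queue
import threading


class AsyncWorker:
    """Hypothetical helper: owns an asyncio loop running in a daemon thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.results = queue.Queue()  # thread-safe bridge to the UI thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Runs in the worker thread until stop() is called.
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def submit(self, coro, tag):
        """Schedule a coroutine from the UI thread; its outcome is queued."""

        async def runner():
            try:
                self.results.put((tag, await coro, None))
            except Exception as exc:  # report failures instead of losing them
                self.results.put((tag, None, exc))

        asyncio.run_coroutine_threadsafe(runner(), self.loop)

    def poll(self):
        """Drain finished results without blocking (call once per UI frame)."""
        items = []
        while True:
            try:
                items.append(self.results.get_nowait())
            except queue.Empty:
                return items

    def stop(self):
        # Ask the loop to stop from outside its thread, then clean up.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()
```

In a layout like this, the SDL main loop would call `poll()` once per iteration and never await anything itself, so network latency from discovery or browsing does not stall rendering.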

Completed Tasks

  • Phase 1: Project bootstrap (pyproject.toml, requirements.txt, README.md, package layout under src/)
  • Phase 2: DLNA discovery (dlna/discovery.py — SSDP M-SEARCH, friendly-name fetch) and browsing (dlna/client.py — SOAP Browse, DIDL-Lite parser with relative-URL resolution + dlna/models.py domain models + dlna/browser_state.py navigation stack/cache)
  • Phase 3: SDL2 UI (ui/sdl_app.py — window, event loop, input dispatch; ui/screens.py — server list, browse list, playback, error screens; ui/theme.py — runtime-scaled layout helpers for 640x480 and 720x720-class displays)
  • Phase 4: Playback (player/backend.py abstract interface + player/gstreamer_backend.py integrated GStreamer backend)
  • Phase 5: Device integration (platform/controls.py — keyboard + gamecontroller mapping; platform/runtime.py — logging, R36S heuristic, SDL env hints)
  • Phase 7: Tests — 75 tests across 7 test files all passing (DIDL mapping, SOAP/XML parser, navigation state, playback backend, SDL redraw policy, input controls, runtime environment setup)
  • Desktop runtime verification completed: fixed SSDP discovery socket setup for IPv4 and removed pending-task shutdown noise from the async worker thread
  • Packaging hardening: bundled a local UI font asset and configured setuptools to ship it with the package
  • Real LAN regression fixed: Browse SOAP parser now handles Result elements both with and without a namespace, matching responses from the discovered MiniDLNA/Jellyfin servers
  • Playback backend pivoted to GStreamer because libmpv continued to create a separate native window on the desktop host instead of remaining embedded in the SDL UI
  • Milestone 2 is now implemented with GStreamer: playback uses GstPlayBin plus GstAppSink instead of an external player, libmpv, or native overlay sinks
  • SDL playback flow updated: decoded GStreamer frames are uploaded into SDL textures, playback end-of-stream returns automatically to the browser, and playback controls support pause/resume, relative seek, and volume
  • Milestone 3 implemented in code: SDL scales video into a dedicated viewport inside the SDL window, with reserved HUD margins instead of using the whole window area for video
  • Playback HUD expanded: progress bar, elapsed/duration, volume, buffer, resolution, and control legends are rendered around the video area and updated from GStreamer bus/pipeline queries
  • Playback-page flashing root cause addressed by removing native overlay composition entirely: video and HUD are now rendered together by SDL in one pass, with redraws driven by decoded frame availability and HUD state changes
  • Playback HUD simplified: the border around the video area was removed, playback control/status icons were added as bundled SVG+PNG assets, the title/timer top bar no longer overlaps, and playback now supports auto / fixed / hidden HUD modes through a dedicated command while staying visible when paused
  • UI scaling hardened for mixed small-display targets: list rows, HUD bands, icon sizes, viewport margins, and font sizes are now derived from the actual SDL window/display size so the app remains readable on both 640x480 and 720x720 screens
  • Deployment assets added: .gitignore, environment.yml, and a real LICENSE file so the project can be initialized and published as a clean git repository
  • Conda environment refreshed for current playback needs: runtime now includes GStreamer codec/plugin packages plus explicit Python build/test tooling, while editable install keeps the package code sourced from the repo checkout
  • Packaging fix: pyproject.toml now uses a valid TOML [project.urls] table so editable installs work with modern pip / tomllib
  • Copilot instructions and this status file
  • Device deployment reconnaissance completed on a real ArkOS-derived R36S over SSH: /roms is full, /home/ark has free space, required download tools are present, and /roms/ports plus gamelist.xml are the least invasive integration points for launchers
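
The Browse SOAP fix listed above (accepting Result elements with or without a namespace) can be illustrated with a small standard-library sketch. The function name `find_result_text` is hypothetical, and the real parser in dlna/client.py may be structured differently.

```python
import xml.etree.ElementTree as ET


def find_result_text(soap_xml):
    """Return the text of the first Result element, ignoring any namespace.

    ElementTree reports namespaced tags as "{uri}Result" and plain tags as
    "Result", so comparing only the local name covers both server styles.
    """
    root = ET.fromstring(soap_xml)
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "Result":
            return elem.text
    return None
```

The returned string is the escaped DIDL-Lite document, which would then go through a second XML parse to produce the browse entries.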

Tasks In Progress

  • NV12 frame path optimization complete: videoscale(nearest-neighbour)→640×480 GstBin reduces Python memmove from 32 ms (77% budget) to 1 ms (2.5%) with no FPS or drop regression. Awaiting visual smoke test on device via MatHacks.sh launcher.
  • Verify that the SDL-texture playback path is smooth enough on real host playback and on R36S hardware
  • Device deployment on the physical R36S is now wired through ArkOS Ports -> MatHacks, with the heavy runtime under /home/ark and only a lightweight stub launcher under /roms/ports

NV12 Render Path Benchmark Log

All runs performed on the physical R36S (RK3326, 4× A35 @ 1.3 GHz, 1 GB RAM) over SSH. Stream: 1920×1080 H.264 MKV @ 24 fps via MiniDLNA over LAN. Frame budget: 41.7 ms.

| Commit | Copy / pipeline strategy | Copy mean | Copy % budget | FPS | Dropped | A/V drift |
|---|---|---|---|---|---|---|
| a201594 | extract_dup → bytes + from_buffer_copy → ctypes (2 copies, 6 MB/frame) | 36,499 µs | 87.6% | 24.01 | 1 | −42.8 ms |
| da02e74 | buffer.map(READ) + memmove into reusable ctypes array (1 copy, 3.1 MB/frame) | 33,551 µs | 80.5% | 23.98 | 0 | −38.0 ms |
| 995830e | videoscale(nearest)→640×480 in GstBin + memmove (1 copy, 0.46 MB/frame) | 1,033 µs | 2.5% | 23.99 | 0 | −6.9 ms |
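
The budget percentages and per-frame sizes in the table follow from simple arithmetic, reproduced here as a short sketch. The helper names are hypothetical, and the size calculation assumes tightly packed NV12 (12 bits per pixel, no row padding), which the decoder on the device may not guarantee.

```python
def frame_budget_us(fps):
    """Time available per frame in microseconds."""
    return 1_000_000 / fps


def budget_share_percent(copy_us, fps):
    """Share of the frame budget consumed by the copy step."""
    return 100.0 * copy_us / frame_budget_us(fps)


def nv12_frame_bytes(width, height):
    """NV12 stores a full-resolution Y plane plus a half-size interleaved UV plane."""
    return width * height * 3 // 2


if __name__ == "__main__":
    for commit, copy_us in (("a201594", 36_499), ("da02e74", 33_551), ("995830e", 1_033)):
        print(commit, f"{budget_share_percent(copy_us, 24):.1f}% of budget")
    print(nv12_frame_bytes(1920, 1080), "bytes at 1920x1080")
    print(nv12_frame_bytes(640, 480), "bytes at 640x480")
```

At 24 fps the budget is about 41.7 ms, and the two NV12 sizes come out at 3,110,400 and 460,800 bytes, matching the 3.1 MB and 0.46 MB figures in the table.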

Optimization history:

  • a201594 → da02e74: replaced extract_dup + from_buffer_copy (2 copies, 6 MB/frame) with buffer.map(READ) + memmove into a pre-allocated ctypes array (1 copy, 3.1 MB/frame). This saved a ~3 MB/frame allocation; copy cost fell by 8% but still consumed ~81% of the frame budget.

  • da02e74 → 995830e: identified that the 3.1 MB memmove is necessary only because the appsink receives full 1920×1080 frames while the display is 640×480. Inserted a GstBin containing videoscale(method=nearest-neighbour) → capsfilter(NV12,640×480) → appsink as the playbin video-sink. The GStreamer pipeline thread now performs the software scale before Python sees the frame, so Python receives only 460 KB per frame (6.7× smaller). Memmove drops from 32 ms to 1 ms (31× improvement, 2.5% of budget). FPS and drop count are unchanged (23.99, 0). A/V drift improved from −38 ms to −7 ms.
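
One way to build a scaling sink bin like the one described above is through a pipeline description string. The sketch below is an assumption about how that could look and not the project's code: the function names, the `frame_sink` element name, and the `emit-signals` setting are illustrative. The PyGObject import is kept inside the builder so the description helper can be used without GStreamer installed.

```python
def sink_description(width=640, height=480, method="nearest-neighbour"):
    """Describe videoscale -> capsfilter(NV12, WxH) -> appsink as a launch string."""
    return (
        f"videoscale method={method} ! "
        f"video/x-raw,format=NV12,width={width},height={height} ! "
        "appsink name=frame_sink emit-signals=true"
    )


def build_video_sink(width=640, height=480):
    """Create the bin; requires system GStreamer and the Python GI bindings."""
    import gi

    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    # ghost_unlinked_pads=True exposes videoscale's sink pad on the bin,
    # which lets playbin link its decoded video stream to it.
    return Gst.parse_bin_from_description(sink_description(width, height), True)
```

The resulting bin would be attached with `playbin.set_property("video-sink", build_video_sink())`, and the appsink can be retrieved from it with `get_by_name("frame_sink")`. No queue is inserted, in line with the benchmark result that back-pressure from the appsink keeps the decoder rate-limited.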

Alternatives tested and rejected during 995830e:

| Variant | Result | Root cause |
|---|---|---|
| Bilinear videoscale (no queue) | 20.92 fps, 46 drops | Bilinear reads adjacent rows → loads ~89% of source cache lines, similar cost to memmove; scheduling pressure causes drops |
| Nearest-neighbour + leaky=2 queue | 1.86 fps, 30 drops | leaky=2 allows mppvideodec to race ahead; queue fills and drops ~93% of frames as stale |
| Nearest-neighbour, no queue | 23.99 fps, 0 drops | Nearest reads ~44% of source cache lines; back-pressure from appsink naturally rate-limits mppvideodec |

Key observations (995830e):

  • Memmove reduced from 32 ms (3.1 MB) to ~1 ms (460 KB) — 31× improvement
  • No FPS or drop regression vs unscaled path
  • A/V drift improved significantly (−7 ms vs −38 ms)
  • SW nearest-neighbour scale on A35 costs ~14 ms per frame (estimated from cache line count), but this happens synchronously in the GStreamer pipeline thread BEFORE the appsink callback, not in the Python memmove measurement
  • Remaining 97.5% of frame budget is available for SDL upload, HUD rendering, and other pipeline work
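
The single-copy step measured above (memmove into a pre-allocated ctypes array) can be sketched in isolation. The `FrameStaging` class is hypothetical, and a plain `bytes` object stands in for the mapped GStreamer buffer data that the real appsink callback would supply.

```python
import ctypes


class FrameStaging:
    """Hypothetical reusable staging buffer for one NV12 frame."""

    def __init__(self, width, height):
        self.size = width * height * 3 // 2  # tightly packed NV12
        # Allocated once and reused, so no per-frame allocation occurs.
        self.buffer = (ctypes.c_uint8 * self.size)()

    def copy_from(self, data):
        """Copy one frame into the staging buffer with a single memmove."""
        if len(data) != self.size:
            raise ValueError(f"expected {self.size} bytes, got {len(data)}")
        ctypes.memmove(self.buffer, data, self.size)
        return self.buffer
```

Because the same ctypes array is returned every time, its contents are only valid until the next frame arrives; the consumer has to upload it to the SDL texture before the next copy.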

Blockers Or Open Questions

  • SDL2_ttf system library needed for text rendering (sudo dnf install SDL2_ttf on Fedora, sudo apt install libsdl2-ttf-2.0-0 on Debian/Ubuntu). The app handles its absence gracefully but will show no text.
  • Integrated playback requires system GStreamer plus Python GI bindings (for Fedora: python3-gobject gstreamer1 gstreamer1-plugins-base gstreamer1-plugins-good; add codec/plugin packages as needed for target media).
  • Root browse verified against two real DLNA servers on the LAN.
  • On-device testing on R36S hardware is pending.
  • The current SDL-texture path avoids window-manager dependencies but may still need optimization on low-end hardware if BGRA upload cost is too high.
  • The first Miniforge install attempt on the physical R36S failed because the downloaded installer was corrupt and crashed during extraction.
  • The physical R36S now has Miniconda installed at /home/ark/miniconda3; the dedicated app env exists at /home/ark/miniconda3/envs/r36s-dlna-browser, but package solves can hang on-device and are being handled incrementally.
  • The dedicated R36S conda env requires LD_LIBRARY_PATH=/home/ark/miniconda3/envs/r36s-dlna-browser/lib for GI and GStreamer shared libraries to resolve correctly.
  • GStreamer imports now succeed in the dedicated env (GLib, GObject, Gst, GstApp, GstVideo), and Application imports cleanly.
  • ArkOS menu launch works on the physical device, and DLNA browsing reaches real MiniDLNA content.
  • Real playback is currently blocked by missing decoder elements in the device env: direct probing of a MiniDLNA .mkv URL showed missing H.264 High Profile and MPEG-4 AAC decoders, while the user-facing "can't play a text file" message is a misleading fallback caused by an additional text stream in the container.
  • RESOLVED: gst-libav conda package on linux-aarch64 has an unfixable ABI mismatch: libavcodec.so links libdav1d.so.6 (from dav1d <1.3) but only dav1d 1.4.x (.so.7) is available, and via libxml2-16 it also pulls libicuuc.so.78 which is not packaged for linux-aarch64 on conda-forge. Solution: install system gstreamer1.0-libav (v1.16.1) via apt and use GST_PLUGIN_PATH + LD_PRELOAD to expose its plugins to the conda Python runtime.
  • On the physical R36S, avdec_h264 and avdec_aac now register and resolve when the app is launched with LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1 and GST_PLUGIN_PATH=/usr/lib/aarch64-linux-gnu/gstreamer-1.0. The LD_PRELOAD is required to avoid the "cannot allocate memory in static TLS block" error that occurs when the conda libgomp.so is loaded late by dlopen.
  • These variables are now persisted in /home/ark/miniconda3/envs/r36s-dlna-browser/etc/conda/activate.d/gst-env.sh and explicitly set in deploy/run.sh.
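
For reference, the three environment variables mentioned in the items above can be assembled in one place. The sketch below is illustrative only and does not reproduce deploy/run.sh or gst-env.sh: the function names are hypothetical, and prepending to existing values instead of overwriting them is an assumption. The paths are the ones quoted in this file.

```python
import os

ENV_PREFIX = "/home/ark/miniconda3/envs/r36s-dlna-browser"
SYSTEM_LIBGOMP = "/usr/lib/aarch64-linux-gnu/libgomp.so.1"
SYSTEM_GST_PLUGINS = "/usr/lib/aarch64-linux-gnu/gstreamer-1.0"


def _prepend(env, key, value):
    """Put value first in a colon-separated variable, keeping any old entries."""
    old = env.get(key)
    env[key] = f"{value}:{old}" if old else value


def playback_env(base=None):
    """Return a copy of the environment with the playback variables applied."""
    env = dict(os.environ if base is None else base)
    _prepend(env, "LD_LIBRARY_PATH", f"{ENV_PREFIX}/lib")
    _prepend(env, "LD_PRELOAD", SYSTEM_LIBGOMP)
    _prepend(env, "GST_PLUGIN_PATH", SYSTEM_GST_PLUGINS)
    return env
```

LD_PRELOAD and LD_LIBRARY_PATH are read by the dynamic loader at process start, so a dictionary like this only helps when passed to a child process, for example `subprocess.run(cmd, env=playback_env())`; setting the values inside an already running interpreter would be too late.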
  1. Run a visual playback smoke test on device directly via the app launcher (MatHacks.sh) to confirm HUD and video render correctly together under KMSDRM with the videoscale path active (nearest-neighbour 640×480 NV12).
  2. Measure SDL_UpdateNVTexture upload cost for the now-smaller 640×480 texture (was 1920×1080). If it is sub-millisecond, the render path is considered optimized.
  3. If nearest-neighbour scaling looks noticeably poor on-device, switch to scale.set_property("method", 1) (bilinear) and re-benchmark. The bilinear result (20.92 fps, 46 drops) was measured only on the benchmark stream; actual app playback may behave differently because the GStreamer pipeline structure inside the real app differs slightly from the benchmark.
  4. Consider profiling the SDL render loop under combined video+HUD load to confirm 30+ fps UI responsiveness alongside decoding.
  5. Investigate DMA-buf import as a future zero-copy path: gst-mpp may expose DRM DMA-buf fds that SDL's KMSDRM backend can import directly via SDL_CreateTextureFromSurface or a custom EGL path, eliminating the CPU memmove and SW scale entirely. This is a significant engineering effort and is not needed given current performance.
  6. avdec_hevc is still missing (HEVC decoders not in system apt gstreamer1.0-libav 1.16.1); mppvideodec covers H.264/H.265/VP8/VP9 via HW so this is less critical now.