- Section 6 _on_live_sample: buf.map() returns an (ok, map_info) tuple;
  accessing .size on the tuple raised an AttributeError on every frame
  (22 tracebacks in the log). Fixed to unpack the tuple properly.
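The failure mode can be reproduced without GStreamer using a stub whose map() mirrors the PyGObject signature (in PyGObject, Gst.Buffer.map(Gst.MapFlags.READ) returns a (bool, Gst.MapInfo) tuple); the stub and names here are illustrative, not the app's code:

```python
# Minimal reproduction of the tuple-unpack bug using a stub; the StubBuffer
# stands in for Gst.Buffer, whose PyGObject map() returns (ok, map_info).
from collections import namedtuple

MapInfo = namedtuple("MapInfo", ["data", "size"])

class StubBuffer:
    """Stand-in for Gst.Buffer; map() mirrors the PyGObject return shape."""
    def __init__(self, data: bytes):
        self._data = data
    def map(self, flags=None):
        return True, MapInfo(self._data, len(self._data))  # (ok, map_info)
    def unmap(self, info):
        pass

buf = StubBuffer(b"\x00" * 16)

# Buggy: treating the return value as the MapInfo itself
result = buf.map()
try:
    result.size  # AttributeError: 'tuple' object has no attribute 'size'
except AttributeError:
    pass

# Fixed: unpack the tuple and check the ok flag before using map_info
ok, info = buf.map()
assert ok
print(info.size)  # → 16
buf.unmap(info)
```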
- Section 8 WARMUP raised from 5 to 30 frames (~1.25 s) so cold-start DRM
  DMA setup, network buffer fill, and lazy texture init all complete
  before stats are recorded. Eliminates the 36 ms first-upload spike
  from post-warmup measurements.
- _stat() now shows mean / p95 / max so isolated spikes are visible
without inflating the headline figure.
GStreamer caps fixation keeps the source's value for any unconstrained
dimension (width-only caps leave the source height unchanged, giving 640x1080
instead of 640x360 for a 1920x1080 source).
Fix: compute a 16:9 output box that fits inside the video area, use both
width and height in the capsfilter, and set add-borders=True so GStreamer
letterboxes or pillarboxes any non-16:9 source without distortion.
For the test device (720x720 KMSDRM, ~120px HUD):
video area: ~720x600 → scale target: 720x404 (16:9)
For default viewport (640x480):
video area: 640x480 → scale target: 640x360 (16:9)
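The fit computation behind both examples can be sketched as the largest 16:9 box inside the available video area (function name and even-rounding are illustrative; the resulting width/height go into the capsfilter):

```python
# Sketch of the 16:9 fit: largest 16:9 box that fits inside the available
# video area, rounded down to even dimensions as YUV formats prefer.
def fit_16x9(area_w: int, area_h: int) -> tuple:
    """Return (w, h) of the largest 16:9 box inside area_w x area_h."""
    if area_w * 9 <= area_h * 16:
        # Area is taller than 16:9: width is the limit.
        w, h = area_w, area_w * 9 // 16
    else:
        # Area is wider than 16:9: height is the limit.
        w, h = area_h * 16 // 9, area_h
    return w - (w % 2), h - (h % 2)  # keep both dimensions even

print(fit_16x9(720, 600))  # → (720, 404)  test device, 720x720 minus HUD
print(fit_16x9(640, 480))  # → (640, 360)  default viewport
```

With both dimensions fixed in the caps, videoscale's add-borders=True handles any non-16:9 source by letterboxing or pillarboxing rather than distorting.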
Section 8 test updated to mirror the same 16:9+add-borders strategy.
- Remove fixed SDL8_SCALE_H; capsfilter now uses width-only (same as app)
so GStreamer derives height from source DAR.
- Texture created lazily on first frame with correct dimensions instead of
a fixed 640x480 that would mismatch an AR-preserving 640x360 frame.
- SDL_RenderCopy now letterboxes the frame into the window (preserves AR)
instead of stretching to fill, matching what _fit_frame_to_viewport does.
- [texture] log line reports actual w x h and AR ratio for verification.
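The letterboxed destination rect for the render copy can be sketched as below; the helper name is illustrative, mirroring what _fit_frame_to_viewport is described as doing:

```python
# Sketch of the letterbox/pillarbox destination rect used when copying the
# texture into the window; the function name is illustrative.
def letterbox_rect(frame_w: int, frame_h: int, win_w: int, win_h: int) -> tuple:
    """Largest rect with the frame's aspect ratio, centered in the window."""
    scale = min(win_w / frame_w, win_h / frame_h)
    dst_w, dst_h = int(frame_w * scale), int(frame_h * scale)
    x = (win_w - dst_w) // 2
    y = (win_h - dst_h) // 2
    return x, y, dst_w, dst_h

# A 640x360 AR-preserving frame in a 640x480 window gets 60px bars top/bottom
print(letterbox_rect(640, 360, 640, 480))  # → (0, 60, 640, 360)
```

The returned rect is what would be passed as the destination to SDL_RenderCopy instead of NULL (which stretches to fill).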
When hardware decode (mppvideodec/NV12) is active, wrap the appsink in a
GstBin with a videoscale element so the VPU decodes at full stream
resolution but Python only receives a frame pre-scaled to the SDL display
size (default 640x480).
Effect:
NV12 buffer per frame: 3,133,440 B (1080p) → 460,800 B (640x480)
memmove per frame: ~33 ms (80.5% of budget) → ~5 ms (expected ~12%)
The videoscale bilinear step runs entirely in software on the A35 cores
but scales down 6.7×, so its cost is far lower than the avoided memmove.
SDL still handles final aspect-ratio fitting inside the viewport, so
visual quality is unchanged relative to what the 640x480 display can show.
Fallback: if videoscale is not available, unscaled NV12 is used as before.
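The buffer-size figures above follow from NV12's layout (a full-resolution Y plane plus a half-resolution interleaved UV plane, 1.5 bytes per pixel); the 1080p number matches a VPU that pads height to a multiple of 16, which is an assumption about this decoder:

```python
# NV12 frame size: Y plane (w*h) + interleaved UV plane (w*h/2).
# The align parameter models the VPU padding decode height to a
# multiple of 16 (1080 → 1088 rows), an assumption that reproduces
# the 3,133,440 B figure above.
def nv12_size(width: int, height: int, align: int = 1) -> int:
    padded_h = (height + align - 1) // align * align
    return width * padded_h * 3 // 2

print(nv12_size(1920, 1080, align=16))  # → 3133440
print(nv12_size(640, 480))              # → 460800
print(nv12_size(1920, 1080, align=16) / nv12_size(640, 480))  # → 6.8
```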
Instead of extract_dup (GLib alloc+memcpy → Python bytes) followed by
from_buffer_copy (Python bytes → ctypes array) — two 3MB copies per frame —
use Gst.Buffer.map(READ) to get a zero-allocation pointer to the decoded
frame memory, then memmove directly into a pre-allocated reusable ctypes
array (_raw_arr).
This reduces the per-frame copy path from 2 copies (6MB) to 1 memmove
(3MB), with no Python bytes object allocation at all. The memmove happens
under _frame_lock so render() on the main thread never reads a partial frame.
_raw_arr is allocated once on the first frame (or on resolution change) and
reused for every subsequent frame.
_Frame no longer carries a pixels field. Tests updated accordingly.
Benchmark updated to use the same buffer.map+memmove path as the app.