Compare commits

...

27 Commits

Author SHA1 Message Date
Cédric Verstraeten
2c02e0aeb1 Merge pull request #250 from kerberos-io/fix/add-avc-description-fallback
fix/add-avc-description-fallback
2026-02-27 11:48:34 +01:00
cedricve
d5464362bb Add AVC descriptor fallback for SPS parse errors
When setting the AVC descriptor fails in MP4.Close(), attempt a fallback that constructs an AvcC/avc1 sample entry from available SPS/PPS NALUs. Adds github.com/Eyevinn/mp4ff/avc import and two helpers: addAVCDescriptorFallback (builds a visual sample entry, sets tkhd width/height if available, and inserts it into stsd) and buildAVCDecConfRecFromSPS (creates an avc.DecConfRec from SPS/PPS bytes by extracting profile/compat/level and filling defaults). Logs errors and warns when the fallback is used. This provides resilience against SPS parsing errors when writing the MP4 track descriptor.
2026-02-27 11:35:22 +01:00
Cédric Verstraeten
5bcefd0015 Merge pull request #249 from kerberos-io/feature/enhance-avc-hevc-ssp-nalus
feature/enhance-avc-hevc-ssp-nalus
2026-02-27 11:12:03 +01:00
cedricve
5bb9def42d Normalize and debug H264/H265 parameter sets
Replace direct sanitizeParameterSets usage with normalizeH264ParameterSets and normalizeH265ParameterSets in mp4.Close. The new functions split Annex-B blobs, strip start codes, detect NALU types (SPS/PPS for AVC; VPS/SPS/PPS for HEVC), aggregate distinct parameter sets and fall back to sanitizeParameterSets if none are found. Added splitParamSetNALUs and formatNaluDebug helpers and debug logging to output concise parameter-set summaries before setting AVC/HEVC descriptors. These changes improve handling of concatenated Annex-B parameter set blobs and make debugging parameter extraction easier.
2026-02-27 11:09:28 +01:00
Cédric Verstraeten
ff38ccbadf Merge pull request #248 from kerberos-io/fix/sanitize-parameter-sets
fix/sanitize-parameter-sets
2026-02-26 20:43:53 +01:00
cedricve
f64e899de9 Populate/sanitize NALUs and avoid empty MP4
Fill missing SPS/PPS/VPS from camera config before closing recordings and warn when parameter sets are incomplete (for both continuous and motion-detection flows). Sanitize parameter sets (remove Annex-B start codes and drop empty NALUs) before writing AVC/HEVC descriptors. Prevent creation of empty MP4 files by flushing/closing and removing files when no audio/video samples were added, and only add an audio track when audio samples exist.
2026-02-26 20:37:10 +01:00
Cédric Verstraeten
b8a81d18af Merge pull request #247 from kerberos-io/fix/ensure-stsd
fix/ensure-stsd
2026-02-26 17:13:45 +01:00
cedricve
8c2e3e4cdd Recover video parameter sets from Annex B NALUs
Add updateVideoParameterSetsFromAnnexB to parse Annex B NALUs and populate missing SPS/PPS/VPS for H.264/H.265 streams. Call this helper when adding video samples so in-band parameter sets can be recovered early. Also add error logging in Close() when setting AVC/HEVC descriptors fails. These changes improve robustness for streams that carry SPS/PPS/VPS inline.
2026-02-26 17:05:09 +01:00
Cédric Verstraeten
11c4ee518d Merge pull request #246 from kerberos-io/fix/handle-sps-pps-unknown-state
fix/handle-sps-pps-unknown-state
2026-02-26 16:24:54 +01:00
cedricve
51b9d76973 Improve SPS/PPS handling: add warnings for missing SPS/PPS during recording start 2026-02-26 15:24:34 +00:00
cedricve
f3c1cb9b82 Enhance SPS/PPS handling for main stream in gortsplib: add fallback for missing SDP 2026-02-26 15:21:54 +00:00
Cédric Verstraeten
a1368361e4 Merge pull request #242 from kerberos-io/fix/update-workflows-for-nightly-build
fix/update-workflows-for-nightly-build
2026-02-16 12:44:40 +01:00
Cédric Verstraeten
abfdea0179 Update issue-userstory-create.yml 2026-02-16 12:37:49 +01:00
Cédric Verstraeten
8aaeb62fa3 Merge pull request #241 from kerberos-io/fix/update-workflows-for-nightly-build
fix/update-workflows-for-nightly-build
2026-02-16 12:21:06 +01:00
Cédric Verstraeten
e30dd7d4a0 Add nightly build workflow for Docker images 2026-02-16 12:16:39 +01:00
Cédric Verstraeten
ac3f9aa4e8 Merge pull request #240 from kerberos-io/feature/add-issue-generator-workflow
feature/add-issue-generator-workflow
2026-02-16 11:58:06 +01:00
Cédric Verstraeten
04c568f488 Add workflow to create user story issues with customizable inputs 2026-02-16 11:54:07 +01:00
Cédric Verstraeten
e270223968 Merge pull request #238 from kerberos-io/fix/docker-build-release-action
fix/docker-build-release-action
2026-02-13 22:17:33 +01:00
cedricve
01ab1a9218 Disable build provenance in Docker builds
Add --provenance=false to docker build invocations in .github/workflows/release-create.yml (both default and arm64 steps) to suppress Docker provenance metadata during CI builds.
2026-02-13 22:16:23 +01:00
Cédric Verstraeten
6f0794b09c Merge pull request #237 from kerberos-io/feature/fix-quicktime-duration
feature/fix-quicktime-duration
2026-02-13 21:55:41 +01:00
cedricve
1ae6a46d88 Embed build version into binaries
Pass VERSION from CI into Docker builds and embed it into the Go binary via ldflags. Updated .github workflow to supply --build-arg VERSION for both architectures. Added ARG VERSION and logic in Dockerfile and Dockerfile.arm64 to derive the version from git (git describe --tags) or fall back to the provided build-arg, then set it with -X during go build. Changed VERSION in machinery/src/utils/main.go from a const to a var defaulting to "0.0.0" and documented that it is overridden at build time. This ensures released images contain the correct agent version while local/dev builds keep a sensible default.
2026-02-13 21:50:09 +01:00
cedricve
9d83cab5cc Set mdhd.Duration to 0 for fragmented MP4
Uncomment and explicitly set mdhd.Duration = 0 in machinery/src/video/mp4.go for relevant tracks (video H264/H265 and audio track). This ensures mdhd.Duration is zero for fragmented MP4 so players derive duration from fragments (avoiding QuickTime adding fragment durations and doubling the reported duration).
2026-02-13 21:46:32 +01:00
cedricve
6f559c2f00 Align MP4 headers to fragment durations
Compute actual video duration from SegmentDurations and ensure container headers reflect fragment durations. Set mvhd.Duration and mvex/mehd.FragmentDuration to the maximum of video (sum of segments) and audio durations so the overall mvhd matches the longest track. Use the summed segment duration for track tkhd.Duration and keep mdhd.Duration at 0 for fragmented MP4s (to avoid double-counting). Add a warning log when accumulated video duration differs from the recorded VideoTotalDuration. Harden fingerprint generation and private key handling with nil checks.

Add mp4_duration_test.go: unit test that creates a simulated H.264 fragmented MP4 (150 frames at 40ms), closes it, parses the output and verifies that mvhd/mehd and trun sample durations are consistent and that mdhd.Duration is zero.
2026-02-13 21:35:57 +01:00
cedricve
c147944f5a Convert MP4 timestamps to Mac HFS epoch
Add MacEpochOffset constant and convert mp4.StartTime to Mac HFS time for QuickTime compatibility. Compute macTime = mp4.StartTime + MacEpochOffset and use it for mvhd CreationTime/ModificationTime, as well as track tkhd and mdhd creation/modification timestamps for video and audio tracks. Also set mvhd Rate, Volume and NextTrackID. These changes ensure generated MP4s use QuickTime-compatible epoch and include proper mvhd metadata.
2026-02-13 21:01:45 +01:00
Cédric Verstraeten
e8ca776e4e Merge pull request #236 from kerberos-io/fix/debugging-lost-keyframes
fix/debugging-lost-keyframes
2026-02-11 16:51:07 +01:00
Cédric Verstraeten
de5c4b6e0a Merge branch 'master' into fix/debugging-lost-keyframes 2026-02-11 16:48:08 +01:00
Cédric Verstraeten
9ba64de090 add additional logging 2026-02-11 16:48:01 +01:00
12 changed files with 692 additions and 130 deletions

View File

@@ -1,58 +0,0 @@
name: Docker development build
on:
push:
branches: [develop]
jobs:
build-amd64:
runs-on: ubuntu-latest
strategy:
matrix:
architecture: [amd64]
steps:
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: docker buildx imagetools create -t kerberos/agent-dev:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
- name: Create new and append to latest manifest
run: docker buildx imagetools create -t kerberos/agent-dev:latest kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
build-other:
runs-on: ubuntu-latest
strategy:
matrix:
#architecture: [arm64, arm/v7, arm/v6]
architecture: [arm64, arm/v7]
steps:
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: docker buildx imagetools create --append -t kerberos/agent-dev:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
- name: Create new and append to manifest latest
run: docker buildx imagetools create --append -t kerberos/agent-dev:latest kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)

View File

@@ -0,0 +1,51 @@
name: Create User Story Issue
on:
workflow_dispatch:
inputs:
issue_title:
description: 'Title for the issue'
required: true
issue_description:
description: 'Brief description of the feature'
required: true
complexity:
description: 'Complexity of the feature'
required: true
type: choice
options:
- 'Low'
- 'Medium'
- 'High'
default: 'Medium'
duration:
description: 'Estimated duration'
required: true
type: choice
options:
- '1 day'
- '3 days'
- '1 week'
- '2 weeks'
- '1 month'
default: '1 week'
jobs:
create-issue:
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- name: Create Issue with User Story
uses: cedricve/llm-create-issue-user-story@main
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
azure_openai_api_key: ${{ secrets.AZURE_OPENAI_API_KEY }}
azure_openai_endpoint: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
azure_openai_version: ${{ secrets.AZURE_OPENAI_VERSION }}
openai_model: ${{ secrets.OPENAI_MODEL }}
issue_title: ${{ github.event.inputs.issue_title }}
issue_description: ${{ github.event.inputs.issue_description }}
complexity: ${{ github.event.inputs.complexity }}
duration: ${{ github.event.inputs.duration }}
labels: 'user-story,feature'
assignees: ${{ github.actor }}

View File

@@ -1,12 +1,14 @@
name: Docker nightly build
name: Nightly build
on:
# Triggers the workflow every day at 9PM (CET).
schedule:
- cron: "0 22 * * *"
# Allows manual triggering from the Actions tab.
workflow_dispatch:
jobs:
build-amd64:
nightly-build-amd64:
runs-on: ubuntu-latest
strategy:
matrix:
@@ -18,7 +20,9 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
run: git clone https://github.com/kerberos-io/agent && cd agent
uses: actions/checkout@v4
with:
ref: master
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
@@ -26,10 +30,10 @@ jobs:
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: cd agent && docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: cd agent && docker buildx imagetools create -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
build-other:
run: docker buildx imagetools create -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
nightly-build-other:
runs-on: ubuntu-latest
strategy:
matrix:
@@ -41,7 +45,9 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
run: git clone https://github.com/kerberos-io/agent && cd agent
uses: actions/checkout@v4
with:
ref: master
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
@@ -49,6 +55,6 @@ jobs:
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: cd agent && docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: cd agent && docker buildx imagetools create --append -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
run: docker buildx imagetools create --append -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)

View File

@@ -34,7 +34,7 @@ jobs:
length: 7
- name: Run Build
run: |
docker build -t ${{matrix.architecture}} .
docker build --provenance=false --build-arg VERSION=${{github.event.inputs.tag || github.ref_name}} -t ${{matrix.architecture}} .
CID=$(docker create ${{matrix.architecture}})
docker cp ${CID}:/home/agent ./output-${{matrix.architecture}}
docker rm ${CID}
@@ -71,7 +71,7 @@ jobs:
length: 7
- name: Run Build
run: |
docker build -t ${{matrix.architecture}} -f Dockerfile.arm64 .
docker build --provenance=false --build-arg VERSION=${{github.event.inputs.tag || github.ref_name}} -t ${{matrix.architecture}} -f Dockerfile.arm64 .
CID=$(docker create ${{matrix.architecture}})
docker cp ${CID}:/home/agent ./output-${{matrix.architecture}}
docker rm ${CID}

3
.gitignore vendored
View File

@@ -14,4 +14,5 @@ machinery/test*
machinery/init-dev.sh
machinery/.env.local
machinery/vendor
deployments/docker/private-docker-compose.yaml
deployments/docker/private-docker-compose.yaml
video.mp4

View File

@@ -1,5 +1,6 @@
ARG BASE_IMAGE_VERSION=amd64-ddbe40e
ARG VERSION=0.0.0
FROM kerberos/base:${BASE_IMAGE_VERSION} AS build-machinery
LABEL AUTHOR=uug.ai
@@ -34,7 +35,8 @@ RUN cat /go/src/github.com/kerberos-io/agent/machinery/version
RUN cd /go/src/github.com/kerberos-io/agent/machinery && \
go mod download && \
go build -tags timetzdata,netgo,osusergo --ldflags '-s -w -extldflags "-static -latomic"' main.go && \
VERSION=$(cd /go/src/github.com/kerberos-io/agent && git describe --tags --always 2>/dev/null || echo "${VERSION}") && \
go build -tags timetzdata,netgo,osusergo --ldflags "-s -w -X github.com/kerberos-io/agent/machinery/src/utils.VERSION=${VERSION} -extldflags '-static -latomic'" main.go && \
mkdir -p /agent && \
mv main /agent && \
mv version /agent && \

View File

@@ -1,5 +1,6 @@
ARG BASE_IMAGE_VERSION=arm64-ddbe40e
ARG VERSION=0.0.0
FROM kerberos/base:${BASE_IMAGE_VERSION} AS build-machinery
LABEL AUTHOR=uug.ai
@@ -34,7 +35,8 @@ RUN cat /go/src/github.com/kerberos-io/agent/machinery/version
RUN cd /go/src/github.com/kerberos-io/agent/machinery && \
go mod download && \
go build -tags timetzdata,netgo,osusergo --ldflags '-s -w -extldflags "-static -latomic"' main.go && \
VERSION=$(cd /go/src/github.com/kerberos-io/agent && git describe --tags --always 2>/dev/null || echo "${VERSION}") && \
go build -tags timetzdata,netgo,osusergo --ldflags "-s -w -X github.com/kerberos-io/agent/machinery/src/utils.VERSION=${VERSION} -extldflags '-static -latomic'" main.go && \
mkdir -p /agent && \
mv main /agent && \
mv version /agent && \

View File

@@ -695,14 +695,37 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
g.Streams[g.VideoH264Index].FPS = fps
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.Start(%s): Final FPS=%.2f", streamType, fps))
g.VideoH264Forma.SPS = nalu
if streamType == "main" && len(nalu) > 0 {
// Fallback: store SPS from in-band NALUs when SDP was missing it.
configuration.Config.Capture.IPCamera.SPSNALUs = [][]byte{nalu}
}
}
case h264.NALUTypePPS:
g.VideoH264Forma.PPS = nalu
if streamType == "main" && len(nalu) > 0 {
// Fallback: store PPS from in-band NALUs when SDP was missing it.
configuration.Config.Capture.IPCamera.PPSNALUs = [][]byte{nalu}
}
}
filteredAU = append(filteredAU, nalu)
}
if idrPresent && streamType == "main" {
// Ensure config has parameter sets before recordings start.
if len(configuration.Config.Capture.IPCamera.SPSNALUs) == 0 && len(g.VideoH264Forma.SPS) > 0 {
configuration.Config.Capture.IPCamera.SPSNALUs = [][]byte{g.VideoH264Forma.SPS}
log.Log.Warning("capture.golibrtsp.Start(main): fallback SPS set from keyframe")
}
if len(configuration.Config.Capture.IPCamera.PPSNALUs) == 0 && len(g.VideoH264Forma.PPS) > 0 {
configuration.Config.Capture.IPCamera.PPSNALUs = [][]byte{g.VideoH264Forma.PPS}
log.Log.Warning("capture.golibrtsp.Start(main): fallback PPS set from keyframe")
}
if len(configuration.Config.Capture.IPCamera.SPSNALUs) == 0 || len(configuration.Config.Capture.IPCamera.PPSNALUs) == 0 {
log.Log.Warning("capture.golibrtsp.Start(main): SPS/PPS still missing after IDR keyframe")
}
}
if len(filteredAU) <= 1 || (!nonIDRPresent && !idrPresent) {
return
}

View File

@@ -159,6 +159,19 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
}
// Close mp4
if len(mp4Video.SPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.SPSNALUs) > 0 {
mp4Video.SPSNALUs = configuration.Config.Capture.IPCamera.SPSNALUs
}
if len(mp4Video.PPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.PPSNALUs) > 0 {
mp4Video.PPSNALUs = configuration.Config.Capture.IPCamera.PPSNALUs
}
if len(mp4Video.VPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.VPSNALUs) > 0 {
mp4Video.VPSNALUs = configuration.Config.Capture.IPCamera.VPSNALUs
}
if (videoCodec == "H264" && (len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) ||
(videoCodec == "H265" && (len(mp4Video.VPSNALUs) == 0 || len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) {
log.Log.Warning("capture.main.HandleRecordStream(continuous): closing MP4 without full parameter sets, moov may be incomplete")
}
mp4Video.Close(&config)
log.Log.Info("capture.main.HandleRecordStream(continuous): recording finished: file save: " + name)
@@ -279,6 +292,9 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
ppsNALUS := configuration.Config.Capture.IPCamera.PPSNALUs
vpsNALUS := configuration.Config.Capture.IPCamera.VPSNALUs
if len(spsNALUS) == 0 || len(ppsNALUS) == 0 {
log.Log.Warning("capture.main.HandleRecordStream(continuous): missing SPS/PPS at recording start")
}
// Create a video file, and set the dimensions.
mp4Video = video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS, configuration.Config.Capture.MaxLengthRecording)
mp4Video.SetWidth(width)
@@ -499,6 +515,9 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
ppsNALUS := configuration.Config.Capture.IPCamera.PPSNALUs
vpsNALUS := configuration.Config.Capture.IPCamera.VPSNALUs
if len(spsNALUS) == 0 || len(ppsNALUS) == 0 {
log.Log.Warning("capture.main.HandleRecordStream(motiondetection): missing SPS/PPS at recording start")
}
// Create a video file, and set the dimensions.
mp4Video := video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS, configuration.Config.Capture.MaxLengthRecording)
mp4Video.SetWidth(width)
@@ -574,6 +593,19 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
lastRecordingTime = pkt.CurrentTime
// This will close the recording and write the last packet.
if len(mp4Video.SPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.SPSNALUs) > 0 {
mp4Video.SPSNALUs = configuration.Config.Capture.IPCamera.SPSNALUs
}
if len(mp4Video.PPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.PPSNALUs) > 0 {
mp4Video.PPSNALUs = configuration.Config.Capture.IPCamera.PPSNALUs
}
if len(mp4Video.VPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.VPSNALUs) > 0 {
mp4Video.VPSNALUs = configuration.Config.Capture.IPCamera.VPSNALUs
}
if (videoCodec == "H264" && (len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) ||
(videoCodec == "H265" && (len(mp4Video.VPSNALUs) == 0 || len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) {
log.Log.Warning("capture.main.HandleRecordStream(motiondetection): closing MP4 without full parameter sets, moov may be incomplete")
}
mp4Video.Close(&config)
log.Log.Info("capture.main.HandleRecordStream(motiondetection): file save: " + name)

View File

@@ -25,7 +25,10 @@ import (
"github.com/nfnt/resize"
)
const VERSION = "3.5.0"
// VERSION is the agent version. It defaults to "0.0.0" for local dev builds
// and is overridden at build time via:
// go build -ldflags "-X github.com/kerberos-io/agent/machinery/src/utils.VERSION=v1.2.3"
var VERSION = "0.0.0"
const letterBytes = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

View File

@@ -13,6 +13,7 @@ import (
"strings"
"time"
"github.com/Eyevinn/mp4ff/avc"
mp4ff "github.com/Eyevinn/mp4ff/mp4"
"github.com/kerberos-io/agent/machinery/src/encryption"
"github.com/kerberos-io/agent/machinery/src/log"
@@ -22,6 +23,10 @@ import (
var LastPTS uint64 = 0 // Last PTS for the current segment
// MacEpochOffset is the number of seconds between Mac HFS epoch (1904-01-01)
// and Unix epoch (1970-01-01). QuickTime requires timestamps in Mac HFS format.
const MacEpochOffset uint64 = 2082844800
// FragmentDurationMs is the target duration for each fragment in milliseconds.
// Fragments will be flushed at the first keyframe after this duration has elapsed,
// resulting in ~3 second fragments (assuming a typical GOP interval).
@@ -29,42 +34,46 @@ const FragmentDurationMs = 3000
type MP4 struct {
// FileName is the name of the file
FileName string
width int
height int
Segments []*mp4ff.MediaSegment // List of media segments
Segment *mp4ff.MediaSegment
MultiTrackFragment *mp4ff.Fragment
TrackIDs []uint32
FileWriter *os.File
Writer *bufio.Writer
SegmentCount int
SampleCount int
StartPTS uint64
VideoTotalDuration uint64
AudioTotalDuration uint64
AudioPTS uint64
Start bool
SPSNALUs [][]byte // SPS NALUs for H264
PPSNALUs [][]byte // PPS NALUs for H264
VPSNALUs [][]byte // VPS NALUs for H264
FreeBoxSize int64
FragmentStartRawPTS uint64 // Raw PTS for timing when to flush fragments
FragmentStartDTS uint64 // Accumulated VideoTotalDuration at fragment start (matches tfdt)
MoofBoxes int64 // Number of moof boxes in the file
MoofBoxSizes []int64 // Sizes of each moof box
SegmentDurations []uint64 // Duration of each segment in timescale units
SegmentBaseDecTimes []uint64 // Base decode time of each segment
StartTime uint64 // Start time of the MP4 file
VideoTrackName string // Name of the video track
VideoTrack int // Track ID for the video track
AudioTrackName string // Name of the audio track
AudioTrack int // Track ID for the audio track
VideoFullSample *mp4ff.FullSample // Full sample for video track
AudioFullSample *mp4ff.FullSample // Full sample for audio track
LastAudioSampleDTS uint64 // Last PTS for audio sample
LastVideoSampleDTS uint64 // Last PTS for video sample
SampleType string // Type of the sample (e.g., "video", "audio", "subtitle")
FileName string
width int
height int
Segments []*mp4ff.MediaSegment // List of media segments
Segment *mp4ff.MediaSegment
MultiTrackFragment *mp4ff.Fragment
TrackIDs []uint32
FileWriter *os.File
Writer *bufio.Writer
SegmentCount int
SampleCount int
StartPTS uint64
VideoTotalDuration uint64
AudioTotalDuration uint64
AudioPTS uint64
Start bool
SPSNALUs [][]byte // SPS NALUs for H264
PPSNALUs [][]byte // PPS NALUs for H264
VPSNALUs [][]byte // VPS NALUs for H264
FreeBoxSize int64
FragmentStartRawPTS uint64 // Raw PTS for timing when to flush fragments
FragmentStartDTS uint64 // Accumulated VideoTotalDuration at fragment start (matches tfdt)
MoofBoxes int64 // Number of moof boxes in the file
MoofBoxSizes []int64 // Sizes of each moof box
SegmentDurations []uint64 // Duration of each segment in timescale units
SegmentBaseDecTimes []uint64 // Base decode time of each segment
StartTime uint64 // Start time of the MP4 file
VideoTrackName string // Name of the video track
VideoTrack int // Track ID for the video track
AudioTrackName string // Name of the audio track
AudioTrack int // Track ID for the audio track
VideoFullSample *mp4ff.FullSample // Full sample for video track
AudioFullSample *mp4ff.FullSample // Full sample for audio track
LastAudioSampleDTS uint64 // Last PTS for audio sample
LastVideoSampleDTS uint64 // Last PTS for video sample
SampleType string // Type of the sample (e.g., "video", "audio", "subtitle")
TotalKeyframesReceived int // Total keyframes received by AddSampleToTrack
TotalKeyframesWritten int // Total keyframes written to trun boxes
FragmentKeyframeCount int // Keyframes in the current fragment
PendingSampleIsKeyframe bool // Whether the pending video sample is a keyframe
}
// NewMP4 creates a new MP4 object.
@@ -150,6 +159,68 @@ func (mp4 *MP4) AddAudioTrack(codec string) uint32 {
func (mp4 *MP4) AddMediaSegment(segNr int) {
}
// updateVideoParameterSetsFromAnnexB inspects Annex B data to fill missing SPS/PPS/VPS.
func (mp4 *MP4) updateVideoParameterSetsFromAnnexB(data []byte) {
if len(data) == 0 {
return
}
needSPS := len(mp4.SPSNALUs) == 0
needPPS := len(mp4.PPSNALUs) == 0
needVPS := len(mp4.VPSNALUs) == 0
if !(needSPS || needPPS || needVPS) {
return
}
for _, nalu := range splitNALUs(data) {
nalu = removeAnnexBStartCode(nalu)
if len(nalu) == 0 {
continue
}
switch mp4.VideoTrackName {
case "H264", "AVC1":
nalType := nalu[0] & 0x1F
switch nalType {
case 7: // SPS
if needSPS {
mp4.SPSNALUs = [][]byte{nalu}
needSPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): SPS recovered from in-band NALU")
}
case 8: // PPS
if needPPS {
mp4.PPSNALUs = [][]byte{nalu}
needPPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): PPS recovered from in-band NALU")
}
}
case "H265", "HVC1":
nalType := (nalu[0] >> 1) & 0x3F
switch nalType {
case 32: // VPS
if needVPS {
mp4.VPSNALUs = [][]byte{nalu}
needVPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): VPS recovered from in-band NALU")
}
case 33: // SPS
if needSPS {
mp4.SPSNALUs = [][]byte{nalu}
needSPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): SPS recovered from in-band NALU")
}
case 34: // PPS
if needPPS {
mp4.PPSNALUs = [][]byte{nalu}
needPPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): PPS recovered from in-band NALU")
}
}
}
}
}
// flushPendingVideoSample writes the pending video sample to the current fragment.
// If nextPTS is provided (non-zero), it calculates duration from the PTS difference.
// If nextPTS is 0 (e.g., at Close time), it uses the last known duration.
@@ -178,20 +249,33 @@ func (mp4 *MP4) flushPendingVideoSample(nextPTS uint64) bool {
mp4.VideoFullSample.DecodeTime = mp4.VideoTotalDuration - duration
mp4.VideoFullSample.Sample.Dur = uint32(duration)
isKF := mp4.PendingSampleIsKeyframe
err := mp4.MultiTrackFragment.AddFullSampleToTrack(*mp4.VideoFullSample, uint32(mp4.VideoTrack))
if err != nil {
log.Log.Error("mp4.flushPendingVideoSample(): error adding sample: " + err.Error())
}
if isKF {
mp4.TotalKeyframesWritten++
mp4.FragmentKeyframeCount++
log.Log.Debug(fmt.Sprintf("mp4.flushPendingVideoSample(): KEYFRAME WRITTEN to trun - totalWritten=%d, fragmentKF=%d, flags=0x%08x, dur=%d, DTS=%d",
mp4.TotalKeyframesWritten, mp4.FragmentKeyframeCount, mp4.VideoFullSample.Sample.Flags, duration, mp4.VideoFullSample.DecodeTime))
}
mp4.VideoFullSample = nil
mp4.PendingSampleIsKeyframe = false
return true
}
func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, pts uint64) error {
if isKeyframe && trackID == uint32(mp4.VideoTrack) {
log.Log.Debug(fmt.Sprintf("mp4.AddSampleToTrack(): KEYFRAME received - track=%d, PTS=%d, size=%d, sampleCount=%d",
trackID, pts, len(data), mp4.SampleCount))
mp4.TotalKeyframesReceived++
elapsedDbg := uint64(0)
if mp4.Start {
elapsedDbg = pts - mp4.FragmentStartRawPTS
}
log.Log.Debug(fmt.Sprintf("mp4.AddSampleToTrack(): KEYFRAME #%d received - PTS=%d, size=%d, elapsed=%dms, started=%t, segment=%d, fragKF=%d",
mp4.TotalKeyframesReceived, pts, len(data), elapsedDbg, mp4.Start, mp4.SegmentCount, mp4.FragmentKeyframeCount))
}
if isKeyframe {
@@ -215,6 +299,8 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
mp4.flushPendingVideoSample(pts)
}
log.Log.Debug(fmt.Sprintf("mp4.AddSampleToTrack(): FLUSHING segment #%d - keyframes_in_fragment=%d, totalKF_received=%d, totalKF_written=%d",
mp4.SegmentCount, mp4.FragmentKeyframeCount, mp4.TotalKeyframesReceived, mp4.TotalKeyframesWritten))
mp4.MoofBoxes = mp4.MoofBoxes + 1
mp4.MoofBoxSizes = append(mp4.MoofBoxSizes, int64(mp4.Segment.Size()))
// Track the segment's duration and base decode time for sidx.
@@ -253,12 +339,14 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
mp4.StartPTS = pts
mp4.FragmentStartRawPTS = pts
mp4.FragmentStartDTS = mp4.VideoTotalDuration
mp4.FragmentKeyframeCount = 0 // Reset keyframe counter for new fragment
}
}
if mp4.Start {
if trackID == uint32(mp4.VideoTrack) {
mp4.updateVideoParameterSetsFromAnnexB(data)
var lengthPrefixed []byte
var err error
@@ -290,6 +378,7 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
CompositionTimeOffset: 0, // No composition time offset for video
}
mp4.VideoFullSample = &fullSample
mp4.PendingSampleIsKeyframe = isKeyframe
mp4.SampleType = "video"
}
} else if trackID == uint32(mp4.AudioTrack) {
@@ -339,8 +428,16 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
func (mp4 *MP4) Close(config *models.Config) {
log.Log.Info(fmt.Sprintf("mp4.Close(): KEYFRAME SUMMARY - totalReceived=%d, totalWritten=%d, segments=%d, lastFragmentKF=%d",
mp4.TotalKeyframesReceived, mp4.TotalKeyframesWritten, mp4.SegmentCount, mp4.FragmentKeyframeCount))
if mp4.VideoTotalDuration == 0 && mp4.AudioTotalDuration == 0 {
log.Log.Error("mp4.Close(): no video or audio samples added, cannot create MP4 file")
log.Log.Error("mp4.Close(): no video or audio samples added, removing empty MP4 file")
mp4.Writer.Flush()
_ = mp4.FileWriter.Sync()
_ = mp4.FileWriter.Close()
_ = os.Remove(mp4.FileName)
return
}
// Add final pending samples before closing
@@ -412,22 +509,50 @@ func (mp4 *MP4) Close(config *models.Config) {
moov := mp4ff.NewMoovBox()
init.AddChild(moov)
// Set the creation time and modification time for the moov box
// Compute the actual video duration by summing segment durations.
// This must exactly match the sum of sample durations in the trun boxes
// that were written to the file, ensuring QuickTime (which strictly trusts
// header durations) displays the correct value.
var actualVideoDuration uint64
for _, d := range mp4.SegmentDurations {
actualVideoDuration += d
}
if actualVideoDuration != mp4.VideoTotalDuration {
log.Log.Warning(fmt.Sprintf("mp4.Close(): duration mismatch: accumulated VideoTotalDuration=%d, sum of segment durations=%d (diff=%d ms)",
mp4.VideoTotalDuration, actualVideoDuration, int64(mp4.VideoTotalDuration)-int64(actualVideoDuration)))
}
// Set the creation time and modification time for the moov box.
// QuickTime requires timestamps in Mac HFS format (seconds since 1904-01-01),
// so we convert from Unix epoch by adding MacEpochOffset.
videoTimescale := uint32(1000)
audioTimescale := uint32(1000)
macTime := mp4.StartTime + MacEpochOffset
nextTrackID := uint32(len(mp4.TrackIDs) + 1)
// mvhd.Duration must be the duration of the longest track.
// Start with video; if audio is longer, we update below.
movDuration := actualVideoDuration
if mp4.AudioTotalDuration > movDuration {
movDuration = mp4.AudioTotalDuration
}
mvhd := &mp4ff.MvhdBox{
Version: 0,
Flags: 0,
CreationTime: mp4.StartTime,
ModificationTime: mp4.StartTime,
CreationTime: macTime,
ModificationTime: macTime,
Timescale: videoTimescale,
Duration: mp4.VideoTotalDuration,
Duration: movDuration,
Rate: 0x00010000, // 1.0 playback speed (16.16 fixed point)
Volume: 0x0100, // 1.0 full volume (8.8 fixed point)
NextTrackID: nextTrackID,
}
init.Moov.AddChild(mvhd)
// Set the total duration in the moov box
mvex := mp4ff.NewMvexBox()
mvex.AddChild(&mp4ff.MehdBox{FragmentDuration: int64(mp4.VideoTotalDuration)})
mvex.AddChild(&mp4ff.MehdBox{FragmentDuration: int64(movDuration)})
init.Moov.AddChild(mvex)
// Add a track for the video
@@ -435,29 +560,52 @@ func (mp4 *MP4) Close(config *models.Config) {
case "H264", "AVC1":
init.AddEmptyTrack(videoTimescale, "video", "und")
includePS := true
err := init.Moov.Traks[0].SetAVCDescriptor("avc1", mp4.SPSNALUs, mp4.PPSNALUs, includePS)
spsNALUs, ppsNALUs := normalizeH264ParameterSets(mp4.SPSNALUs, mp4.PPSNALUs)
log.Log.Debug("mp4.Close(): AVC parameter sets: SPS=" + formatNaluDebug(spsNALUs) + ", PPS=" + formatNaluDebug(ppsNALUs))
err := init.Moov.Traks[0].SetAVCDescriptor("avc1", spsNALUs, ppsNALUs, includePS)
if err != nil {
log.Log.Error("mp4.Close(): error setting AVC descriptor: " + err.Error())
if fallbackErr := addAVCDescriptorFallback(init.Moov.Traks[0], spsNALUs, ppsNALUs, uint16(mp4.width), uint16(mp4.height)); fallbackErr != nil {
log.Log.Error("mp4.Close(): error setting AVC descriptor fallback: " + fallbackErr.Error())
} else {
log.Log.Warning("mp4.Close(): AVC descriptor fallback used due to SPS parse error")
}
}
init.Moov.Traks[0].Tkhd.Duration = mp4.VideoTotalDuration
init.Moov.Traks[0].Tkhd.Duration = actualVideoDuration
init.Moov.Traks[0].Tkhd.Width = mp4ff.Fixed32(uint32(mp4.width) << 16)
init.Moov.Traks[0].Tkhd.Height = mp4ff.Fixed32(uint32(mp4.height) << 16)
init.Moov.Traks[0].Tkhd.CreationTime = macTime
init.Moov.Traks[0].Tkhd.ModificationTime = macTime
init.Moov.Traks[0].Mdia.Hdlr.Name = "agent " + utils.VERSION
init.Moov.Traks[0].Mdia.Mdhd.Duration = mp4.VideoTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4. QuickTime adds mdhd.Duration
// to the fragment durations (mehd/sidx), so setting it non-zero doubles the
// reported duration. Leave it at 0 so the player derives duration from fragments.
init.Moov.Traks[0].Mdia.Mdhd.Duration = 0
init.Moov.Traks[0].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[0].Mdia.Mdhd.ModificationTime = macTime
case "H265", "HVC1":
init.AddEmptyTrack(videoTimescale, "video", "und")
includePS := true
err := init.Moov.Traks[0].SetHEVCDescriptor("hvc1", mp4.VPSNALUs, mp4.SPSNALUs, mp4.PPSNALUs, [][]byte{}, includePS)
vpsNALUs, spsNALUs, ppsNALUs := normalizeH265ParameterSets(mp4.VPSNALUs, mp4.SPSNALUs, mp4.PPSNALUs)
log.Log.Debug("mp4.Close(): HEVC parameter sets: VPS=" + formatNaluDebug(vpsNALUs) + ", SPS=" + formatNaluDebug(spsNALUs) + ", PPS=" + formatNaluDebug(ppsNALUs))
err := init.Moov.Traks[0].SetHEVCDescriptor("hvc1", vpsNALUs, spsNALUs, ppsNALUs, [][]byte{}, includePS)
if err != nil {
log.Log.Error("mp4.Close(): error setting HEVC descriptor: " + err.Error())
}
init.Moov.Traks[0].Tkhd.Duration = mp4.VideoTotalDuration
init.Moov.Traks[0].Tkhd.Duration = actualVideoDuration
init.Moov.Traks[0].Tkhd.Width = mp4ff.Fixed32(uint32(mp4.width) << 16)
init.Moov.Traks[0].Tkhd.Height = mp4ff.Fixed32(uint32(mp4.height) << 16)
init.Moov.Traks[0].Tkhd.CreationTime = macTime
init.Moov.Traks[0].Tkhd.ModificationTime = macTime
init.Moov.Traks[0].Mdia.Hdlr.Name = "agent " + utils.VERSION
init.Moov.Traks[0].Mdia.Mdhd.Duration = mp4.VideoTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4 (see H264 case above).
init.Moov.Traks[0].Mdia.Mdhd.Duration = 0
init.Moov.Traks[0].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[0].Mdia.Mdhd.ModificationTime = macTime
}
// Try adding audio track if available
if mp4.AudioTrackName == "AAC" || mp4.AudioTrackName == "MP4A" {
// Try adding audio track if available and samples were recorded.
if (mp4.AudioTrackName == "AAC" || mp4.AudioTrackName == "MP4A") && mp4.AudioTotalDuration > 0 {
// Add an audio track to the moov box
init.AddEmptyTrack(audioTimescale, "audio", "und")
@@ -471,8 +619,13 @@ func (mp4 *MP4) Close(config *models.Config) {
if err != nil {
}
init.Moov.Traks[1].Tkhd.Duration = mp4.AudioTotalDuration
init.Moov.Traks[1].Tkhd.CreationTime = macTime
init.Moov.Traks[1].Tkhd.ModificationTime = macTime
init.Moov.Traks[1].Mdia.Hdlr.Name = "agent " + utils.VERSION
init.Moov.Traks[1].Mdia.Mdhd.Duration = mp4.AudioTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4 (see video track comment).
init.Moov.Traks[1].Mdia.Mdhd.Duration = 0
init.Moov.Traks[1].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[1].Mdia.Mdhd.ModificationTime = macTime
}
// Try adding subtitle track if available
@@ -503,9 +656,11 @@ func (mp4 *MP4) Close(config *models.Config) {
// and encrypted with the public key.
fingerprint := fmt.Sprintf("%d", init.Moov.Mvhd.CreationTime) + "_" +
fmt.Sprintf("%d", init.Moov.Mvhd.Duration) + "_" +
init.Moov.Trak.Mdia.Hdlr.Name + "_" +
fmt.Sprintf("%d", mp4.MoofBoxes) + "_" // Number of moof boxes
fmt.Sprintf("%d", init.Moov.Mvhd.Duration) + "_"
if init.Moov.Trak != nil {
fingerprint += init.Moov.Trak.Mdia.Hdlr.Name + "_"
}
fingerprint += fmt.Sprintf("%d", mp4.MoofBoxes) + "_" // Number of moof boxes
for i, size := range mp4.MoofBoxSizes {
fingerprint += fmt.Sprintf("%d", size)
@@ -519,7 +674,10 @@ func (mp4 *MP4) Close(config *models.Config) {
}
// Load the private key from the configuration
privateKey := config.Signing.PrivateKey
var privateKey string
if config.Signing != nil {
privateKey = config.Signing.PrivateKey
}
r := strings.NewReader(privateKey)
pemBytes, _ := ioutil.ReadAll(r)
block, _ := pem.Decode(pemBytes)
@@ -685,6 +843,172 @@ func removeAnnexBStartCode(nalu []byte) []byte {
return nalu
}
// sanitizeParameterSets removes Annex B start codes and drops empty NALUs.
func sanitizeParameterSets(nalus [][]byte) [][]byte {
if len(nalus) == 0 {
return nalus
}
clean := make([][]byte, 0, len(nalus))
for _, nalu := range nalus {
trimmed := removeAnnexBStartCode(nalu)
if len(trimmed) == 0 {
continue
}
clean = append(clean, trimmed)
}
return clean
}
// normalizeH264ParameterSets splits Annex B blobs and extracts SPS/PPS NALUs.
func normalizeH264ParameterSets(spsIn [][]byte, ppsIn [][]byte) ([][]byte, [][]byte) {
all := make([][]byte, 0, len(spsIn)+len(ppsIn))
all = append(all, spsIn...)
all = append(all, ppsIn...)
var spsOut [][]byte
var ppsOut [][]byte
for _, blob := range all {
for _, nalu := range splitParamSetNALUs(blob) {
nalu = removeAnnexBStartCode(nalu)
if len(nalu) == 0 {
continue
}
typ := nalu[0] & 0x1F
switch typ {
case 7:
spsOut = append(spsOut, nalu)
case 8:
ppsOut = append(ppsOut, nalu)
}
}
}
if len(spsOut) == 0 {
spsOut = sanitizeParameterSets(spsIn)
}
if len(ppsOut) == 0 {
ppsOut = sanitizeParameterSets(ppsIn)
}
return spsOut, ppsOut
}
// normalizeH265ParameterSets splits Annex B blobs and extracts VPS/SPS/PPS NALUs.
func normalizeH265ParameterSets(vpsIn [][]byte, spsIn [][]byte, ppsIn [][]byte) ([][]byte, [][]byte, [][]byte) {
all := make([][]byte, 0, len(vpsIn)+len(spsIn)+len(ppsIn))
all = append(all, vpsIn...)
all = append(all, spsIn...)
all = append(all, ppsIn...)
var vpsOut [][]byte
var spsOut [][]byte
var ppsOut [][]byte
for _, blob := range all {
for _, nalu := range splitParamSetNALUs(blob) {
nalu = removeAnnexBStartCode(nalu)
if len(nalu) == 0 {
continue
}
typ := (nalu[0] >> 1) & 0x3F
switch typ {
case 32:
vpsOut = append(vpsOut, nalu)
case 33:
spsOut = append(spsOut, nalu)
case 34:
ppsOut = append(ppsOut, nalu)
}
}
}
if len(vpsOut) == 0 {
vpsOut = sanitizeParameterSets(vpsIn)
}
if len(spsOut) == 0 {
spsOut = sanitizeParameterSets(spsIn)
}
if len(ppsOut) == 0 {
ppsOut = sanitizeParameterSets(ppsIn)
}
return vpsOut, spsOut, ppsOut
}
// splitParamSetNALUs splits Annex B parameter set blobs; raw NALUs are returned as-is.
func splitParamSetNALUs(blob []byte) [][]byte {
if len(blob) == 0 {
return nil
}
if findStartCode(blob, 0) >= 0 {
return splitNALUs(blob)
}
return [][]byte{blob}
}
func formatNaluDebug(nalus [][]byte) string {
if len(nalus) == 0 {
return "none"
}
parts := make([]string, 0, len(nalus))
for _, nalu := range nalus {
if len(nalu) == 0 {
parts = append(parts, "len=0")
continue
}
max := 8
if len(nalu) < max {
max = len(nalu)
}
parts = append(parts, fmt.Sprintf("len=%d head=%x", len(nalu), nalu[:max]))
}
return strings.Join(parts, "; ")
}
func addAVCDescriptorFallback(trak *mp4ff.TrakBox, spsNALUs, ppsNALUs [][]byte, width, height uint16) error {
if trak == nil || trak.Mdia == nil || trak.Mdia.Minf == nil || trak.Mdia.Minf.Stbl == nil || trak.Mdia.Minf.Stbl.Stsd == nil {
return fmt.Errorf("missing trak stsd")
}
if len(spsNALUs) == 0 {
return fmt.Errorf("no SPS NALU available")
}
decConfRec, err := buildAVCDecConfRecFromSPS(spsNALUs, ppsNALUs)
if err != nil {
return err
}
if width == 0 && trak.Tkhd != nil {
width = uint16(uint32(trak.Tkhd.Width) >> 16)
}
if height == 0 && trak.Tkhd != nil {
height = uint16(uint32(trak.Tkhd.Height) >> 16)
}
if width > 0 && height > 0 && trak.Tkhd != nil {
trak.Tkhd.Width = mp4ff.Fixed32(uint32(width) << 16)
trak.Tkhd.Height = mp4ff.Fixed32(uint32(height) << 16)
}
avcC := &mp4ff.AvcCBox{DecConfRec: *decConfRec}
avcx := mp4ff.CreateVisualSampleEntryBox("avc1", width, height, avcC)
trak.Mdia.Minf.Stbl.Stsd.AddChild(avcx)
return nil
}
func buildAVCDecConfRecFromSPS(spsNALUs, ppsNALUs [][]byte) (*avc.DecConfRec, error) {
if len(spsNALUs) == 0 {
return nil, fmt.Errorf("no SPS NALU available")
}
sps := spsNALUs[0]
if len(sps) < 4 {
return nil, fmt.Errorf("SPS too short: len=%d", len(sps))
}
// SPS NALU: byte 0 is NAL header, next 3 bytes are profile/compat/level.
dec := &avc.DecConfRec{
AVCProfileIndication: sps[1],
ProfileCompatibility: sps[2],
AVCLevelIndication: sps[3],
SPSnalus: spsNALUs,
PPSnalus: ppsNALUs,
ChromaFormat: 1,
BitDepthLumaMinus1: 0,
BitDepthChromaMinus1: 0,
NumSPSExt: 0,
NoTrailingInfo: true,
}
return dec, nil
}
// splitNALUs splits Annex B data into raw NAL units without start codes.
func splitNALUs(data []byte) [][]byte {
var nalus [][]byte

View File

@@ -0,0 +1,176 @@
package video
import (
"fmt"
"os"
"testing"
mp4ff "github.com/Eyevinn/mp4ff/mp4"
"github.com/kerberos-io/agent/machinery/src/models"
)
// TestMP4Duration creates an MP4 file simulating a 5-second video recording
// and verifies that the durations in all boxes match the sum of sample durations.
func TestMP4Duration(t *testing.T) {
tmpFile := "/tmp/test_duration.mp4"
defer os.Remove(tmpFile)
// Minimal SPS for H.264 (baseline, 640x480) - proper Annex B format with start code
sps := []byte{0x67, 0x42, 0xc0, 0x1e, 0xd9, 0x00, 0xa0, 0x47, 0xfe, 0xc8}
pps := []byte{0x68, 0xce, 0x38, 0x80}
mp4Video := NewMP4(tmpFile, [][]byte{sps}, [][]byte{pps}, nil, 10)
mp4Video.SetWidth(640)
mp4Video.SetHeight(480)
videoTrack := mp4Video.AddVideoTrack("H264")
// Simulate 5 seconds at 25fps (200 frames, keyframe every 50 frames = 2s)
// PTS in milliseconds (timescale=1000)
frameDuration := uint64(40) // 40ms per frame = 25fps
numFrames := 150
gopSize := 50
// Create a fake Annex B NAL unit (keyframe IDR = type 5, non-keyframe = type 1)
makeFrame := func(isKey bool) []byte {
nalType := byte(0x01) // non-IDR slice
if isKey {
nalType = 0x65 // IDR slice
}
// Start code (4 bytes) + NAL header + some data
frame := []byte{0x00, 0x00, 0x00, 0x01, nalType}
// Add some padding data
for i := 0; i < 100; i++ {
frame = append(frame, byte(i))
}
return frame
}
var expectedDuration uint64
for i := 0; i < numFrames; i++ {
pts := uint64(i) * frameDuration
isKeyframe := i%gopSize == 0
err := mp4Video.AddSampleToTrack(videoTrack, isKeyframe, makeFrame(isKeyframe), pts)
if err != nil {
t.Fatalf("AddSampleToTrack failed at frame %d: %v", i, err)
}
}
expectedDuration = uint64(numFrames) * frameDuration // Should be 6000ms (150 * 40)
// Close with config that has signing key to avoid nil panics
config := &models.Config{
Signing: &models.Signing{
PrivateKey: "",
},
}
mp4Video.Close(config)
// Log what the code computed
t.Logf("VideoTotalDuration: %d ms", mp4Video.VideoTotalDuration)
t.Logf("Expected duration: %d ms", expectedDuration)
t.Logf("Segments: %d", len(mp4Video.SegmentDurations))
var sumSegDur uint64
for i, d := range mp4Video.SegmentDurations {
t.Logf(" Segment %d: duration=%d ms", i, d)
sumSegDur += d
}
t.Logf("Sum of segment durations: %d ms", sumSegDur)
// Now read back the file and inspect the boxes
f, err := os.Open(tmpFile)
if err != nil {
t.Fatalf("Failed to open output file: %v", err)
}
defer f.Close()
fi, err := f.Stat()
if err != nil {
t.Fatalf("Failed to stat output file: %v", err)
}
parsedFile, err := mp4ff.DecodeFile(f)
if err != nil {
t.Fatalf("Failed to decode MP4: %v", err)
}
t.Logf("File size: %d bytes", fi.Size())
// Check moov box
if parsedFile.Moov == nil {
t.Fatal("No moov box found")
}
// Check mvhd duration
mvhd := parsedFile.Moov.Mvhd
t.Logf("mvhd.Duration: %d (timescale=%d) = %.2f seconds", mvhd.Duration, mvhd.Timescale, float64(mvhd.Duration)/float64(mvhd.Timescale))
t.Logf("mvhd.Rate: 0x%08x", mvhd.Rate)
t.Logf("mvhd.Volume: 0x%04x", mvhd.Volume)
// Check each trak
for i, trak := range parsedFile.Moov.Traks {
t.Logf("Track %d:", i)
t.Logf(" tkhd.Duration: %d", trak.Tkhd.Duration)
t.Logf(" mdhd.Duration: %d (timescale=%d) = %.2f seconds", trak.Mdia.Mdhd.Duration, trak.Mdia.Mdhd.Timescale, float64(trak.Mdia.Mdhd.Duration)/float64(trak.Mdia.Mdhd.Timescale))
}
// Check mvex/mehd
if parsedFile.Moov.Mvex != nil && parsedFile.Moov.Mvex.Mehd != nil {
t.Logf("mehd.FragmentDuration: %d", parsedFile.Moov.Mvex.Mehd.FragmentDuration)
}
// Sum up actual sample durations from trun boxes in all segments
var actualTrunDuration uint64
var sampleCount int
for _, seg := range parsedFile.Segments {
for _, frag := range seg.Fragments {
for _, traf := range frag.Moof.Trafs {
// Only count video track (track 1)
if traf.Tfhd.TrackID == 1 {
for _, trun := range traf.Truns {
for _, s := range trun.Samples {
actualTrunDuration += uint64(s.Dur)
sampleCount++
}
}
}
}
}
}
t.Logf("Actual trun sample count: %d", sampleCount)
t.Logf("Actual trun total duration: %d ms", actualTrunDuration)
// Check sidx
if parsedFile.Sidx != nil {
var sidxDuration uint64
for _, ref := range parsedFile.Sidx.SidxRefs {
sidxDuration += uint64(ref.SubSegmentDuration)
}
t.Logf("sidx total duration: %d ms", sidxDuration)
}
// VERIFY: All duration values should be consistent
// The expected duration for 150 frames at 40ms each:
// - The sample-buffering pattern means the LAST sample uses LastVideoSampleDTS as duration
// - So all 150 samples should produce 150 * 40ms = 6000ms total
// But due to the pending sample pattern, the actual trun durations might differ
fmt.Println()
fmt.Println("=== DURATION CONSISTENCY CHECK ===")
fmt.Printf("Expected (150 * 40ms): %d ms\n", expectedDuration)
fmt.Printf("mvhd.Duration: %d ms\n", mvhd.Duration)
fmt.Printf("tkhd.Duration: %d ms\n", parsedFile.Moov.Traks[0].Tkhd.Duration)
fmt.Printf("mdhd.Duration: %d ms\n", parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration)
fmt.Printf("Actual trun durations sum: %d ms\n", actualTrunDuration)
fmt.Printf("VideoTotalDuration: %d ms\n", mp4Video.VideoTotalDuration)
fmt.Printf("Sum of SegmentDurations: %d ms\n", sumSegDur)
fmt.Println()
// The key assertion: header duration must equal trun sum
if mvhd.Duration != actualTrunDuration {
t.Errorf("MISMATCH: mvhd.Duration (%d) != actual trun sum (%d), diff = %d ms",
mvhd.Duration, actualTrunDuration, int64(mvhd.Duration)-int64(actualTrunDuration))
}
if parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration != 0 {
t.Errorf("MISMATCH: mdhd.Duration should be 0 for fragmented MP4, got %d",
parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration)
}
}