Compare commits

...

53 Commits

Author SHA1 Message Date
Cédric Verstraeten
ff38ccbadf Merge pull request #248 from kerberos-io/fix/sanitize-parameter-sets
fix/sanitize-parameter-sets
2026-02-26 20:43:53 +01:00
cedricve
f64e899de9 Populate/sanitize NALUs and avoid empty MP4
Fill missing SPS/PPS/VPS from camera config before closing recordings and warn when parameter sets are incomplete (for both continuous and motion-detection flows). Sanitize parameter sets (remove Annex-B start codes and drop empty NALUs) before writing AVC/HEVC descriptors. Prevent creation of empty MP4 files by flushing/closing and removing files when no audio/video samples were added, and only add an audio track when audio samples exist.
2026-02-26 20:37:10 +01:00
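The sanitization this commit describes (stripping Annex-B start codes and dropping empty NALUs before writing AVC/HEVC descriptors) can be sketched as follows; `sanitizeNALUs` is an illustrative helper, not the exact function from the commit:

```go
package main

import (
	"bytes"
	"fmt"
)

// sanitizeNALUs strips Annex-B start codes (00 00 01 / 00 00 00 01)
// and drops empty NALUs, so the result is safe to place in an
// AVC/HEVC sample description, which expects raw NAL units.
func sanitizeNALUs(nalus [][]byte) [][]byte {
	out := make([][]byte, 0, len(nalus))
	for _, nalu := range nalus {
		if bytes.HasPrefix(nalu, []byte{0, 0, 0, 1}) {
			nalu = nalu[4:]
		} else if bytes.HasPrefix(nalu, []byte{0, 0, 1}) {
			nalu = nalu[3:]
		}
		if len(nalu) == 0 {
			continue // drop empty NALUs
		}
		out = append(out, nalu)
	}
	return out
}

func main() {
	nalus := [][]byte{
		{0, 0, 0, 1, 0x67, 0x42}, // SPS with 4-byte start code
		{0, 0, 1, 0x68, 0xCE},    // PPS with 3-byte start code
		{},                       // empty NALU, should be dropped
	}
	fmt.Println(sanitizeNALUs(nalus))
}
```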
Cédric Verstraeten
b8a81d18af Merge pull request #247 from kerberos-io/fix/ensure-stsd
fix/ensure-stsd
2026-02-26 17:13:45 +01:00
cedricve
8c2e3e4cdd Recover video parameter sets from Annex B NALUs
Add updateVideoParameterSetsFromAnnexB to parse Annex B NALUs and populate missing SPS/PPS/VPS for H.264/H.265 streams. Call this helper when adding video samples so in-band parameter sets can be recovered early. Also add error logging in Close() when setting AVC/HEVC descriptors fails. These changes improve robustness for streams that carry SPS/PPS/VPS inline.
2026-02-26 17:05:09 +01:00
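Recovering parameter sets from in-band Annex B data boils down to splitting the byte stream on start codes and inspecting each NALU's type. A minimal H.264 sketch (the real `updateVideoParameterSetsFromAnnexB` also handles H.265/VPS; these function names are illustrative):

```go
package main

import (
	"bytes"
	"fmt"
)

// splitAnnexB splits an Annex-B byte stream into raw NAL units,
// handling both 3-byte and 4-byte start codes.
func splitAnnexB(stream []byte) [][]byte {
	var nalus [][]byte
	sc := []byte{0, 0, 1}
	idx := bytes.Index(stream, sc)
	if idx < 0 {
		return nalus
	}
	stream = stream[idx+3:]
	for {
		next := bytes.Index(stream, sc)
		if next < 0 {
			if len(stream) > 0 {
				nalus = append(nalus, stream)
			}
			return nalus
		}
		end := next
		if end > 0 && stream[end-1] == 0 { // 4-byte start code variant
			end--
		}
		if end > 0 {
			nalus = append(nalus, stream[:end])
		}
		stream = stream[next+3:]
	}
}

// recoverH264ParameterSets scans NALUs and returns any SPS/PPS found.
func recoverH264ParameterSets(nalus [][]byte) (sps, pps []byte) {
	for _, n := range nalus {
		if len(n) == 0 {
			continue
		}
		switch n[0] & 0x1F { // H.264 NALU type is the low 5 bits
		case 7:
			sps = n
		case 8:
			pps = n
		}
	}
	return
}

func main() {
	stream := []byte{0, 0, 0, 1, 0x67, 0xAA, 0, 0, 1, 0x68, 0xBB}
	sps, pps := recoverH264ParameterSets(splitAnnexB(stream))
	fmt.Printf("SPS=%x PPS=%x\n", sps, pps) // SPS=67aa PPS=68bb
}
```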
Cédric Verstraeten
11c4ee518d Merge pull request #246 from kerberos-io/fix/handle-sps-pps-unknown-state
fix/handle-sps-pps-unknown-state
2026-02-26 16:24:54 +01:00
cedricve
51b9d76973 Improve SPS/PPS handling: add warnings for missing SPS/PPS during recording start 2026-02-26 15:24:34 +00:00
cedricve
f3c1cb9b82 Enhance SPS/PPS handling for main stream in gortsplib: add fallback for missing SDP 2026-02-26 15:21:54 +00:00
Cédric Verstraeten
a1368361e4 Merge pull request #242 from kerberos-io/fix/update-workflows-for-nightly-build
fix/update-workflows-for-nightly-build
2026-02-16 12:44:40 +01:00
Cédric Verstraeten
abfdea0179 Update issue-userstory-create.yml 2026-02-16 12:37:49 +01:00
Cédric Verstraeten
8aaeb62fa3 Merge pull request #241 from kerberos-io/fix/update-workflows-for-nightly-build
fix/update-workflows-for-nightly-build
2026-02-16 12:21:06 +01:00
Cédric Verstraeten
e30dd7d4a0 Add nightly build workflow for Docker images 2026-02-16 12:16:39 +01:00
Cédric Verstraeten
ac3f9aa4e8 Merge pull request #240 from kerberos-io/feature/add-issue-generator-workflow
feature/add-issue-generator-workflow
2026-02-16 11:58:06 +01:00
Cédric Verstraeten
04c568f488 Add workflow to create user story issues with customizable inputs 2026-02-16 11:54:07 +01:00
Cédric Verstraeten
e270223968 Merge pull request #238 from kerberos-io/fix/docker-build-release-action
fix/docker-build-release-action
2026-02-13 22:17:33 +01:00
cedricve
01ab1a9218 Disable build provenance in Docker builds
Add --provenance=false to docker build invocations in .github/workflows/release-create.yml (both default and arm64 steps) to suppress Docker provenance metadata during CI builds.
2026-02-13 22:16:23 +01:00
Cédric Verstraeten
6f0794b09c Merge pull request #237 from kerberos-io/feature/fix-quicktime-duration
feature/fix-quicktime-duration
2026-02-13 21:55:41 +01:00
cedricve
1ae6a46d88 Embed build version into binaries
Pass VERSION from CI into Docker builds and embed it into the Go binary via ldflags. Updated .github workflow to supply --build-arg VERSION for both architectures. Added ARG VERSION and logic in Dockerfile and Dockerfile.arm64 to derive the version from git (git describe --tags) or fall back to the provided build-arg, then set it with -X during go build. Changed VERSION in machinery/src/utils/main.go from a const to a var defaulting to "0.0.0" and documented that it is overridden at build time. This ensures released images contain the correct agent version while local/dev builds keep a sensible default.
2026-02-13 21:50:09 +01:00
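The const-to-var change matters because the linker's `-X` flag can only overwrite package-level string variables, not constants. A minimal sketch of the pattern (package path simplified to `main` for illustration):

```go
package main

import "fmt"

// VERSION defaults to "0.0.0" for local/dev builds; CI overrides it
// at link time, e.g.:
//   go build -ldflags "-X main.VERSION=$(git describe --tags --always)"
// It must be a var (not a const) for -X to take effect.
var VERSION = "0.0.0"

func main() {
	fmt.Println("agent version:", VERSION)
}
```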
cedricve
9d83cab5cc Set mdhd.Duration to 0 for fragmented MP4
Uncomment and explicitly set mdhd.Duration = 0 in machinery/src/video/mp4.go for relevant tracks (video H264/H265 and audio track). This ensures mdhd.Duration is zero for fragmented MP4 so players derive duration from fragments (avoiding QuickTime adding fragment durations and doubling the reported duration).
2026-02-13 21:46:32 +01:00
cedricve
6f559c2f00 Align MP4 headers to fragment durations
Compute actual video duration from SegmentDurations and ensure container headers reflect fragment durations. Set mvhd.Duration and mvex/mehd.FragmentDuration to the maximum of video (sum of segments) and audio durations so the overall mvhd matches the longest track. Use the summed segment duration for track tkhd.Duration and keep mdhd.Duration at 0 for fragmented MP4s (to avoid double-counting). Add a warning log when accumulated video duration differs from the recorded VideoTotalDuration. Harden fingerprint generation and private key handling with nil checks.

Add mp4_duration_test.go: unit test that creates a simulated H.264 fragmented MP4 (150 frames at 40ms), closes it, parses the output and verifies that mvhd/mehd and trun sample durations are consistent and that mdhd.Duration is zero.
2026-02-13 21:35:57 +01:00
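The header alignment described above can be condensed to a few lines: mvhd (and mehd's fragment duration) take the longest track, each track's tkhd uses its own summed segment durations, and mdhd stays 0 for fragmented MP4s. A sketch with illustrative names, not the agent's actual types:

```go
package main

import "fmt"

// alignDurations returns the movie-header duration (max of tracks),
// the video track's tkhd duration (sum of segments), and the mdhd
// duration, which is left at 0 so players derive it from fragments.
func alignDurations(segmentDurations []uint64, audioDuration uint64) (mvhd, videoTkhd, mdhd uint64) {
	var videoDuration uint64
	for _, d := range segmentDurations {
		videoDuration += d
	}
	mvhd = videoDuration
	if audioDuration > mvhd {
		mvhd = audioDuration // overall header matches the longest track
	}
	return mvhd, videoDuration, 0
}

func main() {
	mvhd, tkhd, mdhd := alignDurations([]uint64{1000, 1000, 500}, 2000)
	fmt.Println(mvhd, tkhd, mdhd) // 2500 2500 0
}
```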
cedricve
c147944f5a Convert MP4 timestamps to Mac HFS epoch
Add MacEpochOffset constant and convert mp4.StartTime to Mac HFS time for QuickTime compatibility. Compute macTime = mp4.StartTime + MacEpochOffset and use it for mvhd CreationTime/ModificationTime, as well as track tkhd and mdhd creation/modification timestamps for video and audio tracks. Also set mvhd Rate, Volume and NextTrackID. These changes ensure generated MP4s use QuickTime-compatible epoch and include proper mvhd metadata.
2026-02-13 21:01:45 +01:00
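MP4 creation/modification times count seconds since 1904-01-01 UTC (the classic Mac HFS epoch), not the Unix epoch, so Unix timestamps need a fixed offset. A sketch of the conversion, with the offset verified against the standard library:

```go
package main

import (
	"fmt"
	"time"
)

// MacEpochOffset is the number of seconds between the QuickTime/HFS
// epoch (1904-01-01 UTC) and the Unix epoch (1970-01-01 UTC):
// 66 years plus 17 leap days = 24107 days = 2082844800 seconds.
const MacEpochOffset = 2082844800

func unixToMacTime(unix uint64) uint64 {
	return unix + MacEpochOffset
}

func main() {
	mac := time.Date(1904, 1, 1, 0, 0, 0, 0, time.UTC)
	unix := time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(int64(unix.Sub(mac).Seconds()) == MacEpochOffset) // true
}
```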
Cédric Verstraeten
e8ca776e4e Merge pull request #236 from kerberos-io/fix/debugging-lost-keyframes
fix/debugging-lost-keyframes
2026-02-11 16:51:07 +01:00
Cédric Verstraeten
de5c4b6e0a Merge branch 'master' into fix/debugging-lost-keyframes 2026-02-11 16:48:08 +01:00
Cédric Verstraeten
9ba64de090 add additional logging 2026-02-11 16:48:01 +01:00
Cédric Verstraeten
7ceeebe76e Merge pull request #235 from kerberos-io/fix/debugging-lost-keyframes
fix/debugging-lost-keyframes
2026-02-11 16:15:57 +01:00
Cédric Verstraeten
bd7dbcfcf2 Enhance FPS tracking and logging for keyframes in gortsplib and mp4 modules 2026-02-11 15:11:52 +00:00
Cédric Verstraeten
8c7a46e3ae Merge pull request #234 from kerberos-io/fix/fps-gop-size
fix/fps-gop-size
2026-02-11 15:05:31 +01:00
Cédric Verstraeten
57ccfaabf5 Merge branch 'fix/fps-gop-size' of github.com:kerberos-io/agent into fix/fps-gop-size 2026-02-11 14:59:34 +01:00
Cédric Verstraeten
4a9cb51e95 Update machinery/src/capture/gortsplib.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-11 14:59:15 +01:00
Cédric Verstraeten
ab6f621e76 Merge branch 'fix/fps-gop-size' of github.com:kerberos-io/agent into fix/fps-gop-size 2026-02-11 14:58:44 +01:00
Cédric Verstraeten
c365ae5af2 Ensure thread-safe closure of peer connections in InitializeWebRTCConnection 2026-02-11 13:58:29 +00:00
Cédric Verstraeten
b05c3d1baa Update machinery/src/capture/gortsplib.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-11 14:52:40 +01:00
Cédric Verstraeten
c7c7203fad Merge branch 'master' into fix/fps-gop-size 2026-02-11 14:48:05 +01:00
Cédric Verstraeten
d93f85b4f3 Refactor FPS calculation to use per-stream trackers for improved accuracy 2026-02-11 13:45:07 +00:00
Cédric Verstraeten
031212b98c Merge pull request #232 from kerberos-io/fix/fps-gop-size
fix/fps-gop-size
2026-02-11 14:27:18 +01:00
Cédric Verstraeten
a4837b3cb3 Implement PTS-based FPS calculation and GOP size adjustments 2026-02-11 13:14:29 +00:00
Cédric Verstraeten
77629ac9b8 Merge pull request #231 from kerberos-io/feature/improve-keyframe-interval
feature/improve-keyframe-interval
2026-02-11 12:28:33 +01:00
cedricve
59608394af Use Warning instead of Warn in mp4.go
Replace call to log.Log.Warn with log.Log.Warning in MP4.flushPendingVideoSample to match the logger API. This is a non-functional change that preserves the original message and behavior while using the correct logging method name.
2026-02-11 12:26:18 +01:00
cedricve
9dfcaa466f Refactor video sample flushing logic into a dedicated function 2026-02-11 11:48:15 +01:00
cedricve
88442e4525 Add pending video sample to segment before flush
Before flushing a segment when mp4.Start is true, add any pending VideoFullSample for the current video track to the current fragment. The change computes and updates LastVideoSampleDTS and VideoTotalDuration, adjusts the sample DecodeTime and Dur, calls AddFullSampleToTrack, logs errors, and clears VideoFullSample so the pending sample is included in the segment before starting a new one. This ensures segments contain all frames up to (but not including) the keyframe that triggered the flush.
2026-02-11 11:38:51 +01:00
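The duration bookkeeping this commit describes can be shown in miniature: the pending sample's duration is the gap between its DTS and the DTS of the keyframe that triggered the flush, so the closing segment contains every frame up to, but not including, that keyframe. Field and function names here are illustrative, not the agent's actual types:

```go
package main

import "fmt"

type pendingSample struct {
	DecodeTime uint64 // DTS in track timescale units
	Dur        uint32
}

// finalizePending computes the buffered sample's duration from the
// DTS of the next (keyframe) sample before it is added to the track.
func finalizePending(p *pendingSample, nextDTS uint64) {
	if nextDTS > p.DecodeTime {
		p.Dur = uint32(nextDTS - p.DecodeTime)
	}
}

func main() {
	p := &pendingSample{DecodeTime: 36000}
	finalizePending(p, 39600) // keyframe 3600 ticks later (40 ms @ 90 kHz)
	fmt.Println(p.Dur)        // 3600
}
```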
Cédric Verstraeten
891ae2e5d5 Merge pull request #230 from kerberos-io/feature/improve-video-format
feature/improve-video-format
2026-02-10 17:25:23 +01:00
Cédric Verstraeten
32b471f570 Update machinery/src/video/mp4.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-10 17:20:40 +01:00
Cédric Verstraeten
5d745fc989 Update machinery/src/video/mp4.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-10 17:20:29 +01:00
Cédric Verstraeten
edfa6ec4c6 Update machinery/src/video/mp4.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-10 17:20:16 +01:00
Cédric Verstraeten
0c460efea6 Refactor PR description workflow to include organization variable and correct pull request URL format 2026-02-10 16:17:10 +00:00
Cédric Verstraeten
96df049e59 Enhance MP4 initialization by adding a max recording duration parameter and improving placeholder size calculation for segments 2026-02-10 15:59:59 +00:00
Cédric Verstraeten
2cb454e618 Merge branch 'master' into feature/improve-video-format 2026-02-10 16:57:47 +01:00
Cédric Verstraeten
7f2ebb655e Fix sidx.FirstOffset calculation and re-encode init segment for accurate MP4 structure 2026-02-10 15:56:10 +00:00
Cédric Verstraeten
63857fb5cc Merge pull request #229 from kerberos-io/feature/improve-video-format
feature/improve-video-format
2026-02-10 16:53:34 +01:00
Cédric Verstraeten
f4c75f9aa9 Add environment variables for PR number and project name in workflow 2026-02-10 15:31:37 +00:00
Cédric Verstraeten
c3936dc884 Enhance MP4 segment handling by adding segment durations and base decode times, improving fragment management and data integrity 2026-02-10 14:47:47 +00:00
Cédric Verstraeten
2868ddc499 Add fragment duration handling and improve MP4 segment management 2026-02-10 13:52:58 +00:00
Cédric Verstraeten
176610a694 Update mp4.go 2026-02-10 13:39:55 +01:00
Cédric Verstraeten
f60aff4fd6 Enhance MP4 closing process by adding final video and audio samples, ensuring data integrity and updating track metadata 2026-02-10 12:45:46 +01:00
14 changed files with 956 additions and 375 deletions

View File

@@ -1,58 +0,0 @@
name: Docker development build
on:
push:
branches: [develop]
jobs:
build-amd64:
runs-on: ubuntu-latest
strategy:
matrix:
architecture: [amd64]
steps:
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: docker buildx imagetools create -t kerberos/agent-dev:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
- name: Create new and append to latest manifest
run: docker buildx imagetools create -t kerberos/agent-dev:latest kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
build-other:
runs-on: ubuntu-latest
strategy:
matrix:
#architecture: [arm64, arm/v7, arm/v6]
architecture: [arm64, arm/v7]
steps:
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: docker buildx imagetools create --append -t kerberos/agent-dev:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
- name: Create new and append to manifest latest
run: docker buildx imagetools create --append -t kerberos/agent-dev:latest kerberos/agent-dev:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)

View File

@@ -0,0 +1,51 @@
name: Create User Story Issue
on:
workflow_dispatch:
inputs:
issue_title:
description: 'Title for the issue'
required: true
issue_description:
description: 'Brief description of the feature'
required: true
complexity:
description: 'Complexity of the feature'
required: true
type: choice
options:
- 'Low'
- 'Medium'
- 'High'
default: 'Medium'
duration:
description: 'Estimated duration'
required: true
type: choice
options:
- '1 day'
- '3 days'
- '1 week'
- '2 weeks'
- '1 month'
default: '1 week'
jobs:
create-issue:
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- name: Create Issue with User Story
uses: cedricve/llm-create-issue-user-story@main
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
azure_openai_api_key: ${{ secrets.AZURE_OPENAI_API_KEY }}
azure_openai_endpoint: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
azure_openai_version: ${{ secrets.AZURE_OPENAI_VERSION }}
openai_model: ${{ secrets.OPENAI_MODEL }}
issue_title: ${{ github.event.inputs.issue_title }}
issue_description: ${{ github.event.inputs.issue_description }}
complexity: ${{ github.event.inputs.complexity }}
duration: ${{ github.event.inputs.duration }}
labels: 'user-story,feature'
assignees: ${{ github.actor }}

View File

@@ -1,12 +1,14 @@
name: Docker nightly build
name: Nightly build
on:
# Triggers the workflow every day at 9PM (CET).
schedule:
- cron: "0 22 * * *"
# Allows manual triggering from the Actions tab.
workflow_dispatch:
jobs:
build-amd64:
nightly-build-amd64:
runs-on: ubuntu-latest
strategy:
matrix:
@@ -18,7 +20,9 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
run: git clone https://github.com/kerberos-io/agent && cd agent
uses: actions/checkout@v4
with:
ref: master
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
@@ -26,10 +30,10 @@ jobs:
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: cd agent && docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: cd agent && docker buildx imagetools create -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
build-other:
run: docker buildx imagetools create -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
nightly-build-other:
runs-on: ubuntu-latest
strategy:
matrix:
@@ -41,7 +45,9 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Checkout
run: git clone https://github.com/kerberos-io/agent && cd agent
uses: actions/checkout@v4
with:
ref: master
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
@@ -49,6 +55,6 @@ jobs:
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Run Buildx
run: cd agent && docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
run: docker buildx build --platform linux/${{matrix.architecture}} -t kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7) --push .
- name: Create new and append to manifest
run: cd agent && docker buildx imagetools create --append -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)
run: docker buildx imagetools create --append -t kerberos/agent-nightly:$(echo $GITHUB_SHA | cut -c1-7) kerberos/agent-nightly:arch-$(echo ${{matrix.architecture}} | tr / -)-$(echo $GITHUB_SHA | cut -c1-7)

View File

@@ -2,6 +2,11 @@ name: Autofill PR description
on: pull_request
env:
ORGANIZATION: uugai
PROJECT: ${{ github.event.repository.name }}
PR_NUMBER: ${{ github.event.number }}
jobs:
openai-pr-description:
runs-on: ubuntu-22.04
@@ -16,4 +21,6 @@ jobs:
azure_openai_api_key: ${{ secrets.AZURE_OPENAI_API_KEY }}
azure_openai_endpoint: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
azure_openai_version: ${{ secrets.AZURE_OPENAI_VERSION }}
openai_model: ${{ secrets.OPENAI_MODEL }}
pull_request_url: https://pr${{ env.PR_NUMBER }}.api.kerberos.lol
overwrite_description: true

View File

@@ -34,7 +34,7 @@ jobs:
length: 7
- name: Run Build
run: |
docker build -t ${{matrix.architecture}} .
docker build --provenance=false --build-arg VERSION=${{github.event.inputs.tag || github.ref_name}} -t ${{matrix.architecture}} .
CID=$(docker create ${{matrix.architecture}})
docker cp ${CID}:/home/agent ./output-${{matrix.architecture}}
docker rm ${CID}
@@ -71,7 +71,7 @@ jobs:
length: 7
- name: Run Build
run: |
docker build -t ${{matrix.architecture}} -f Dockerfile.arm64 .
docker build --provenance=false --build-arg VERSION=${{github.event.inputs.tag || github.ref_name}} -t ${{matrix.architecture}} -f Dockerfile.arm64 .
CID=$(docker create ${{matrix.architecture}})
docker cp ${CID}:/home/agent ./output-${{matrix.architecture}}
docker rm ${CID}

.gitignore
View File

@@ -14,4 +14,5 @@ machinery/test*
machinery/init-dev.sh
machinery/.env.local
machinery/vendor
deployments/docker/private-docker-compose.yaml
deployments/docker/private-docker-compose.yaml
video.mp4

View File

@@ -1,5 +1,6 @@
ARG BASE_IMAGE_VERSION=amd64-ddbe40e
ARG VERSION=0.0.0
FROM kerberos/base:${BASE_IMAGE_VERSION} AS build-machinery
LABEL AUTHOR=uug.ai
@@ -34,7 +35,8 @@ RUN cat /go/src/github.com/kerberos-io/agent/machinery/version
RUN cd /go/src/github.com/kerberos-io/agent/machinery && \
go mod download && \
go build -tags timetzdata,netgo,osusergo --ldflags '-s -w -extldflags "-static -latomic"' main.go && \
VERSION=$(cd /go/src/github.com/kerberos-io/agent && git describe --tags --always 2>/dev/null || echo "${VERSION}") && \
go build -tags timetzdata,netgo,osusergo --ldflags "-s -w -X github.com/kerberos-io/agent/machinery/src/utils.VERSION=${VERSION} -extldflags '-static -latomic'" main.go && \
mkdir -p /agent && \
mv main /agent && \
mv version /agent && \

View File

@@ -1,5 +1,6 @@
ARG BASE_IMAGE_VERSION=arm64-ddbe40e
ARG VERSION=0.0.0
FROM kerberos/base:${BASE_IMAGE_VERSION} AS build-machinery
LABEL AUTHOR=uug.ai
@@ -34,7 +35,8 @@ RUN cat /go/src/github.com/kerberos-io/agent/machinery/version
RUN cd /go/src/github.com/kerberos-io/agent/machinery && \
go mod download && \
go build -tags timetzdata,netgo,osusergo --ldflags '-s -w -extldflags "-static -latomic"' main.go && \
VERSION=$(cd /go/src/github.com/kerberos-io/agent && git describe --tags --always 2>/dev/null || echo "${VERSION}") && \
go build -tags timetzdata,netgo,osusergo --ldflags "-s -w -X github.com/kerberos-io/agent/machinery/src/utils.VERSION=${VERSION} -extldflags '-static -latomic'" main.go && \
mkdir -p /agent && \
mv main /agent && \
mv version /agent && \

View File

@@ -85,12 +85,8 @@ type Golibrtsp struct {
Streams []packets.Stream
// FPS calculation fields
lastFrameTime time.Time
frameTimeBuffer []time.Duration
frameBufferSize int
frameBufferIndex int
fpsMutex sync.Mutex
// Per-stream FPS calculation (keyed by stream index)
fpsTrackers map[int8]*fpsTracker
// I-frame interval tracking fields
packetsSinceLastKeyframe int
@@ -101,6 +97,78 @@ type Golibrtsp struct {
keyframeMutex sync.Mutex
}
// fpsTracker holds per-stream state for PTS-based FPS calculation.
// Each video stream (H264 / H265) gets its own tracker so PTS
// samples from different codecs never interleave.
type fpsTracker struct {
mu sync.Mutex
lastPTS time.Duration
hasPTS bool
frameTimeBuffer []time.Duration
bufferSize int
bufferIndex int
cachedFPS float64 // latest computed FPS
}
func newFPSTracker(bufferSize int) *fpsTracker {
return &fpsTracker{
frameTimeBuffer: make([]time.Duration, bufferSize),
bufferSize: bufferSize,
}
}
// update records a new PTS sample and returns the latest FPS estimate.
// It must be called once per complete decoded frame (after Decode()
// succeeds), not on every RTP packet fragment.
func (ft *fpsTracker) update(pts time.Duration) float64 {
ft.mu.Lock()
defer ft.mu.Unlock()
if !ft.hasPTS {
ft.lastPTS = pts
ft.hasPTS = true
return 0
}
interval := pts - ft.lastPTS
ft.lastPTS = pts
// Skip invalid intervals (zero, negative, or very large which
// indicate a PTS discontinuity or wrap).
if interval <= 0 || interval > 5*time.Second {
return ft.cachedFPS
}
ft.frameTimeBuffer[ft.bufferIndex] = interval
ft.bufferIndex = (ft.bufferIndex + 1) % ft.bufferSize
var totalInterval time.Duration
validSamples := 0
for _, iv := range ft.frameTimeBuffer {
if iv > 0 {
totalInterval += iv
validSamples++
}
}
if validSamples == 0 {
return ft.cachedFPS
}
avgInterval := totalInterval / time.Duration(validSamples)
if avgInterval == 0 {
return ft.cachedFPS
}
ft.cachedFPS = float64(time.Second) / float64(avgInterval)
return ft.cachedFPS
}
// fps returns the most recent FPS estimate without recording a new sample.
func (ft *fpsTracker) fps() float64 {
ft.mu.Lock()
defer ft.mu.Unlock()
return ft.cachedFPS
}
// Init function
var H264FrameDecoder *Decoder
var H265FrameDecoder *Decoder
@@ -548,18 +616,17 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
if len(rtppkt.Payload) > 0 {
// decode timestamp
pts, ok := g.Client.PacketPTS(g.VideoH264Media, rtppkt)
pts2, ok := g.Client.PacketPTS2(g.VideoH264Media, rtppkt)
if !ok {
log.Log.Debug("capture.golibrtsp.Start(): " + "unable to get PTS")
// decode timestamps — validate each call separately
pts, okPTS := g.Client.PacketPTS(g.VideoH264Media, rtppkt)
pts2, okPTS2 := g.Client.PacketPTS2(g.VideoH264Media, rtppkt)
if !okPTS2 {
log.Log.Debug("capture.golibrtsp.Start(): unable to get PTS2 from PacketPTS2")
return
}
// Extract access units from RTP packets
// We need to do this, because the decoder expects a full
// access unit. Once we have a full access unit, we can
// decode it, and know if it's a keyframe or not.
// Extract access units from RTP packets.
// We need a complete access unit to determine whether
// this is a keyframe.
au, errDecode := g.VideoH264Decoder.Decode(rtppkt)
if errDecode != nil {
if errDecode != rtph264.ErrNonStartingPacketAndNoPrevious && errDecode != rtph264.ErrMorePacketsNeeded {
@@ -568,6 +635,18 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
return
}
// Frame is complete — update per-stream FPS from PTS.
if okPTS {
ft := g.fpsTrackers[g.VideoH264Index]
if ft == nil {
ft = newFPSTracker(30)
g.fpsTrackers[g.VideoH264Index] = ft
}
if ptsFPS := ft.update(pts); ptsFPS > 0 && ptsFPS <= 120 {
g.Streams[g.VideoH264Index].FPS = ptsFPS
}
}
// We'll need to read out a few things.
// prepend an AUD. This is required by some players
filteredAU = [][]byte{
@@ -578,8 +657,10 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
nonIDRPresent := false
idrPresent := false
var naluTypes []string
for _, nalu := range au {
typ := h264.NALUType(nalu[0] & 0x1F)
naluTypes = append(naluTypes, fmt.Sprintf("%s(%d,sz=%d)", typ.String(), int(typ), len(nalu)))
switch typ {
case h264.NALUTypeAccessUnitDelimiter:
continue
@@ -614,18 +695,46 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
g.Streams[g.VideoH264Index].FPS = fps
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.Start(%s): Final FPS=%.2f", streamType, fps))
g.VideoH264Forma.SPS = nalu
if streamType == "main" && len(nalu) > 0 {
// Fallback: store SPS from in-band NALUs when SDP was missing it.
configuration.Config.Capture.IPCamera.SPSNALUs = [][]byte{nalu}
}
}
case h264.NALUTypePPS:
g.VideoH264Forma.PPS = nalu
if streamType == "main" && len(nalu) > 0 {
// Fallback: store PPS from in-band NALUs when SDP was missing it.
configuration.Config.Capture.IPCamera.PPSNALUs = [][]byte{nalu}
}
}
filteredAU = append(filteredAU, nalu)
}
if idrPresent && streamType == "main" {
// Ensure config has parameter sets before recordings start.
if len(configuration.Config.Capture.IPCamera.SPSNALUs) == 0 && len(g.VideoH264Forma.SPS) > 0 {
configuration.Config.Capture.IPCamera.SPSNALUs = [][]byte{g.VideoH264Forma.SPS}
log.Log.Warning("capture.golibrtsp.Start(main): fallback SPS set from keyframe")
}
if len(configuration.Config.Capture.IPCamera.PPSNALUs) == 0 && len(g.VideoH264Forma.PPS) > 0 {
configuration.Config.Capture.IPCamera.PPSNALUs = [][]byte{g.VideoH264Forma.PPS}
log.Log.Warning("capture.golibrtsp.Start(main): fallback PPS set from keyframe")
}
if len(configuration.Config.Capture.IPCamera.SPSNALUs) == 0 || len(configuration.Config.Capture.IPCamera.PPSNALUs) == 0 {
log.Log.Warning("capture.golibrtsp.Start(main): SPS/PPS still missing after IDR keyframe")
}
}
if len(filteredAU) <= 1 || (!nonIDRPresent && !idrPresent) {
return
}
if idrPresent {
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.Start(%s): IDR frame NALUs: [%s]",
streamType, fmt.Sprintf("%v", naluTypes)))
}
// Convert to packet.
enc, err := h264.AnnexBMarshal(filteredAU)
if err != nil {
@@ -651,7 +760,11 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
keyframeInterval := g.trackKeyframeInterval(idrPresent)
if idrPresent && keyframeInterval > 0 {
avgInterval := g.getAverageKeyframeInterval()
gopDuration := float64(keyframeInterval) / g.Streams[g.VideoH265Index].FPS
fps := g.Streams[g.VideoH264Index].FPS
if fps <= 0 {
fps = 25.0 // Default fallback FPS
}
gopDuration := float64(keyframeInterval) / fps
gopSize := int(avgInterval) // Store GOP size in a separate variable
g.Streams[g.VideoH264Index].GopSize = gopSize
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.Start(%s): Keyframe interval=%d packets, Avg=%.1f, GOP=%.1fs, GOPSize=%d",
@@ -716,18 +829,17 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
if len(rtppkt.Payload) > 0 {
// decode timestamp
pts, ok := g.Client.PacketPTS(g.VideoH265Media, rtppkt)
pts2, ok := g.Client.PacketPTS2(g.VideoH265Media, rtppkt)
if !ok {
log.Log.Debug("capture.golibrtsp.Start(): " + "unable to get PTS")
// decode timestamps — validate each call separately
pts, okPTS := g.Client.PacketPTS(g.VideoH265Media, rtppkt)
pts2, okPTS2 := g.Client.PacketPTS2(g.VideoH265Media, rtppkt)
if !okPTS2 {
log.Log.Debug("capture.golibrtsp.Start(): unable to get PTS")
return
}
// Extract access units from RTP packets
// We need to do this, because the decoder expects a full
// access unit. Once we have a full access unit, we can
// decode it, and know if it's a keyframe or not.
// Extract access units from RTP packets.
// We need a complete access unit to determine whether
// this is a keyframe.
au, errDecode := g.VideoH265Decoder.Decode(rtppkt)
if errDecode != nil {
if errDecode != rtph265.ErrNonStartingPacketAndNoPrevious && errDecode != rtph265.ErrMorePacketsNeeded {
@@ -736,6 +848,18 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
return
}
// Frame is complete — update per-stream FPS from PTS.
if okPTS {
ft := g.fpsTrackers[g.VideoH265Index]
if ft == nil {
ft = newFPSTracker(30)
g.fpsTrackers[g.VideoH265Index] = ft
}
if ptsFPS := ft.update(pts); ptsFPS > 0 && ptsFPS <= 120 {
g.Streams[g.VideoH265Index].FPS = ptsFPS
}
}
filteredAU = [][]byte{
{byte(h265.NALUType_AUD_NUT) << 1, 1, 0x50},
}
@@ -796,7 +920,11 @@ func (g *Golibrtsp) Start(ctx context.Context, streamType string, queue *packets
keyframeInterval := g.trackKeyframeInterval(isRandomAccess)
if isRandomAccess && keyframeInterval > 0 {
avgInterval := g.getAverageKeyframeInterval()
gopDuration := float64(keyframeInterval) / g.Streams[g.VideoH265Index].FPS
fps := g.Streams[g.VideoH265Index].FPS
if fps <= 0 {
fps = 25.0 // Default fallback FPS
}
gopDuration := float64(keyframeInterval) / fps
gopSize := int(avgInterval) // Store GOP size in a separate variable
g.Streams[g.VideoH265Index].GopSize = gopSize
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.Start(%s): Keyframe interval=%d packets, Avg=%.1f, GOP=%.1fs, GOPSize=%d",
@@ -1179,10 +1307,11 @@ func WriteMPEG4Audio(forma *format.MPEG4Audio, aus [][]byte) ([]byte, error) {
// Initialize FPS calculation buffers
func (g *Golibrtsp) initFPSCalculation() {
g.frameBufferSize = 30 // Store last 30 frame intervals
g.frameTimeBuffer = make([]time.Duration, g.frameBufferSize)
g.frameBufferIndex = 0
g.lastFrameTime = time.Time{}
// Ensure the per-stream FPS trackers map exists. Individual trackers
// can be created lazily when a given stream index is first used.
if g.fpsTrackers == nil {
g.fpsTrackers = make(map[int8]*fpsTracker)
}
// Initialize I-frame interval tracking
g.keyframeBufferSize = 10 // Store last 10 keyframe intervals
@@ -1192,50 +1321,11 @@ func (g *Golibrtsp) initFPSCalculation() {
g.lastKeyframePacketCount = 0
}
// Calculate FPS from frame timestamps
func (g *Golibrtsp) calculateFPSFromTimestamps() float64 {
g.fpsMutex.Lock()
defer g.fpsMutex.Unlock()
if g.lastFrameTime.IsZero() {
g.lastFrameTime = time.Now()
return 0
}
now := time.Now()
interval := now.Sub(g.lastFrameTime)
g.lastFrameTime = now
// Store the interval
g.frameTimeBuffer[g.frameBufferIndex] = interval
g.frameBufferIndex = (g.frameBufferIndex + 1) % g.frameBufferSize
// Calculate average FPS from stored intervals
var totalInterval time.Duration
validSamples := 0
for _, interval := range g.frameTimeBuffer {
if interval > 0 {
totalInterval += interval
validSamples++
}
}
if validSamples == 0 {
return 0
}
avgInterval := totalInterval / time.Duration(validSamples)
if avgInterval == 0 {
return 0
}
return float64(time.Second) / float64(avgInterval)
}
// Get enhanced FPS information from SPS with fallback
// Get enhanced FPS information from SPS with fallback to PTS-based calculation.
// The PTS-based FPS is computed per completed frame via fpsTracker.update(),
// so by the time this is called we already have a good estimate.
func (g *Golibrtsp) getEnhancedFPS(sps *h264.SPS, streamIndex int8) float64 {
// First try to get FPS from SPS
// First try to get FPS from SPS VUI parameters
spsFPS := sps.FPS()
// Check if SPS FPS is reasonable (between 1 and 120 fps)
@@ -1244,11 +1334,13 @@ func (g *Golibrtsp) getEnhancedFPS(sps *h264.SPS, streamIndex int8) float64 {
return spsFPS
}
// Fallback to timestamp-based calculation
timestampFPS := g.calculateFPSFromTimestamps()
if timestampFPS > 0 && timestampFPS <= 120 {
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.getEnhancedFPS(): Timestamp FPS: %.2f", timestampFPS))
return timestampFPS
// Fallback to PTS-based FPS (already calculated per-frame)
if ft := g.fpsTrackers[streamIndex]; ft != nil {
ptsFPS := ft.fps()
if ptsFPS > 0 && ptsFPS <= 120 {
log.Log.Debug(fmt.Sprintf("capture.golibrtsp.getEnhancedFPS(): PTS FPS: %.2f", ptsFPS))
return ptsFPS
}
}
// Return SPS FPS even if it seems unreasonable, or default

View File

@@ -159,6 +159,19 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
}
// Close mp4
if len(mp4Video.SPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.SPSNALUs) > 0 {
mp4Video.SPSNALUs = configuration.Config.Capture.IPCamera.SPSNALUs
}
if len(mp4Video.PPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.PPSNALUs) > 0 {
mp4Video.PPSNALUs = configuration.Config.Capture.IPCamera.PPSNALUs
}
if len(mp4Video.VPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.VPSNALUs) > 0 {
mp4Video.VPSNALUs = configuration.Config.Capture.IPCamera.VPSNALUs
}
if (videoCodec == "H264" && (len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) ||
(videoCodec == "H265" && (len(mp4Video.VPSNALUs) == 0 || len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) {
log.Log.Warning("capture.main.HandleRecordStream(continuous): closing MP4 without full parameter sets, moov may be incomplete")
}
mp4Video.Close(&config)
log.Log.Info("capture.main.HandleRecordStream(continuous): recording finished: file save: " + name)
@@ -279,8 +292,11 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
ppsNALUS := configuration.Config.Capture.IPCamera.PPSNALUs
vpsNALUS := configuration.Config.Capture.IPCamera.VPSNALUs
if len(spsNALUS) == 0 || len(ppsNALUS) == 0 {
log.Log.Warning("capture.main.HandleRecordStream(continuous): missing SPS/PPS at recording start")
}
// Create a video file, and set the dimensions.
mp4Video = video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS)
mp4Video = video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS, configuration.Config.Capture.MaxLengthRecording)
mp4Video.SetWidth(width)
mp4Video.SetHeight(height)
@@ -499,8 +515,11 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
ppsNALUS := configuration.Config.Capture.IPCamera.PPSNALUs
vpsNALUS := configuration.Config.Capture.IPCamera.VPSNALUs
if len(spsNALUS) == 0 || len(ppsNALUS) == 0 {
log.Log.Warning("capture.main.HandleRecordStream(motiondetection): missing SPS/PPS at recording start")
}
// Create a video file, and set the dimensions.
mp4Video := video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS)
mp4Video := video.NewMP4(fullName, spsNALUS, ppsNALUS, vpsNALUS, configuration.Config.Capture.MaxLengthRecording)
mp4Video.SetWidth(width)
mp4Video.SetHeight(height)
@@ -574,6 +593,19 @@ func HandleRecordStream(queue *packets.Queue, configDirectory string, configurat
lastRecordingTime = pkt.CurrentTime
// This will close the recording and write the last packet.
if len(mp4Video.SPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.SPSNALUs) > 0 {
mp4Video.SPSNALUs = configuration.Config.Capture.IPCamera.SPSNALUs
}
if len(mp4Video.PPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.PPSNALUs) > 0 {
mp4Video.PPSNALUs = configuration.Config.Capture.IPCamera.PPSNALUs
}
if len(mp4Video.VPSNALUs) == 0 && len(configuration.Config.Capture.IPCamera.VPSNALUs) > 0 {
mp4Video.VPSNALUs = configuration.Config.Capture.IPCamera.VPSNALUs
}
if (videoCodec == "H264" && (len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) ||
(videoCodec == "H265" && (len(mp4Video.VPSNALUs) == 0 || len(mp4Video.SPSNALUs) == 0 || len(mp4Video.PPSNALUs) == 0)) {
log.Log.Warning("capture.main.HandleRecordStream(motiondetection): closing MP4 without full parameter sets, moov may be incomplete")
}
mp4Video.Close(&config)
log.Log.Info("capture.main.HandleRecordStream(motiondetection): file save: " + name)

View File

@@ -25,7 +25,10 @@ import (
"github.com/nfnt/resize"
)
const VERSION = "3.5.0"
// VERSION is the agent version. It defaults to "0.0.0" for local dev builds
// and is overridden at build time via:
// go build -ldflags "-X github.com/kerberos-io/agent/machinery/src/utils.VERSION=v1.2.3"
var VERSION = "0.0.0"
const letterBytes = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

View File

@@ -22,52 +22,81 @@ import (
var LastPTS uint64 = 0 // Last PTS for the current segment
// MacEpochOffset is the number of seconds between Mac HFS epoch (1904-01-01)
// and Unix epoch (1970-01-01). QuickTime requires timestamps in Mac HFS format.
const MacEpochOffset uint64 = 2082844800
// FragmentDurationMs is the target duration for each fragment in milliseconds.
// Fragments will be flushed at the first keyframe after this duration has elapsed,
// resulting in ~3 second fragments (assuming a typical GOP interval).
const FragmentDurationMs = 3000
type MP4 struct {
// FileName is the name of the file
FileName string
width int
height int
Segments []*mp4ff.MediaSegment // List of media segments
Segment *mp4ff.MediaSegment
MultiTrackFragment *mp4ff.Fragment
TrackIDs []uint32
FileWriter *os.File
Writer *bufio.Writer
SegmentCount int
SampleCount int
StartPTS uint64
VideoTotalDuration uint64
AudioTotalDuration uint64
AudioPTS uint64
Start bool
SPSNALUs [][]byte // SPS NALUs for H264
PPSNALUs [][]byte // PPS NALUs for H264
VPSNALUs [][]byte // VPS NALUs for H265
FreeBoxSize int64
MoofBoxes int64 // Number of moof boxes in the file
MoofBoxSizes []int64 // Sizes of each moof box
StartTime uint64 // Start time of the MP4 file
VideoTrackName string // Name of the video track
VideoTrack int // Track ID for the video track
AudioTrackName string // Name of the audio track
AudioTrack int // Track ID for the audio track
VideoFullSample *mp4ff.FullSample // Full sample for video track
AudioFullSample *mp4ff.FullSample // Full sample for audio track
LastAudioSampleDTS uint64 // Last PTS for audio sample
LastVideoSampleDTS uint64 // Last PTS for video sample
SampleType string // Type of the sample (e.g., "video", "audio", "subtitle")
FileName string
width int
height int
Segments []*mp4ff.MediaSegment // List of media segments
Segment *mp4ff.MediaSegment
MultiTrackFragment *mp4ff.Fragment
TrackIDs []uint32
FileWriter *os.File
Writer *bufio.Writer
SegmentCount int
SampleCount int
StartPTS uint64
VideoTotalDuration uint64
AudioTotalDuration uint64
AudioPTS uint64
Start bool
SPSNALUs [][]byte // SPS NALUs for H264
PPSNALUs [][]byte // PPS NALUs for H264
VPSNALUs [][]byte // VPS NALUs for H265
FreeBoxSize int64
FragmentStartRawPTS uint64 // Raw PTS for timing when to flush fragments
FragmentStartDTS uint64 // Accumulated VideoTotalDuration at fragment start (matches tfdt)
MoofBoxes int64 // Number of moof boxes in the file
MoofBoxSizes []int64 // Sizes of each moof box
SegmentDurations []uint64 // Duration of each segment in timescale units
SegmentBaseDecTimes []uint64 // Base decode time of each segment
StartTime uint64 // Start time of the MP4 file
VideoTrackName string // Name of the video track
VideoTrack int // Track ID for the video track
AudioTrackName string // Name of the audio track
AudioTrack int // Track ID for the audio track
VideoFullSample *mp4ff.FullSample // Full sample for video track
AudioFullSample *mp4ff.FullSample // Full sample for audio track
LastAudioSampleDTS uint64 // Duration of the last audio sample, reused as a fallback
LastVideoSampleDTS uint64 // Duration of the last video sample, reused as a fallback
SampleType string // Type of the sample (e.g., "video", "audio", "subtitle")
TotalKeyframesReceived int // Total keyframes received by AddSampleToTrack
TotalKeyframesWritten int // Total keyframes written to trun boxes
FragmentKeyframeCount int // Keyframes in the current fragment
PendingSampleIsKeyframe bool // Whether the pending video sample is a keyframe
}
// NewMP4 creates a new MP4 object
func NewMP4(fileName string, spsNALUs [][]byte, ppsNALUs [][]byte, vpsNALUs [][]byte) *MP4 {
// NewMP4 creates a new MP4 object.
// maxDurationSec is the maximum expected recording duration in seconds,
// used to calculate the free-box placeholder size for ftyp+moov+sidx.
func NewMP4(fileName string, spsNALUs [][]byte, ppsNALUs [][]byte, vpsNALUs [][]byte, maxDurationSec int64) *MP4 {
init := mp4ff.NewMP4Init()
// Add a free box to the init segment
// Prepend a free box to the init segment with a size of 4096 bytes, so we can overwrite it later with the actual init segment.
freeBoxSize := 4096
// Calculate the placeholder size needed at the start of the file.
// Components:
// ftyp: ~32 bytes
// moov: ~1500 bytes (mvhd + mvex + video trak + audio trak + UUID)
// sidx: 24 bytes fixed + 12 bytes per segment reference
// Segments are ~FragmentDurationMs each, so:
// numSegments = ceil(maxDurationSec * 1000 / FragmentDurationMs) + 1 (safety margin)
// sidxSize = 24 + 12 * numSegments
baseSize := int64(2560) // ftyp + moov + extra headroom for large UUID signatures
numSegments := int64(0)
if maxDurationSec > 0 {
// Use integer ceiling division to avoid underestimating the number of segments.
numSegments = ((maxDurationSec*1000)+FragmentDurationMs-1)/FragmentDurationMs + 1
}
sidxSize := int64(24 + 12*numSegments)
freeBoxSize := int(baseSize + sidxSize)
free := mp4ff.NewFreeBox(make([]byte, freeBoxSize))
init.AddChild(free)
// Create a writer
ofd, err := os.Create(fileName)
@@ -77,16 +106,15 @@ func NewMP4(fileName string, spsNALUs [][]byte, ppsNALUs [][]byte, vpsNALUs [][]
// Create a buffered writer
bufferedWriter := bufio.NewWriterSize(ofd, 64*1024) // 64KB buffer
// We will write the empty init segment to the file
// so we can overwrite it later with the actual init segment.
err = init.Encode(bufferedWriter)
// Write the free box placeholder at the start of the file
err = free.Encode(bufferedWriter)
if err != nil {
log.Log.Error("mp4.NewMP4(): error writing free box placeholder: " + err.Error())
}
return &MP4{
FileName: fileName,
StartTime: uint64(time.Now().Unix()),
FreeBoxSize: int64(freeBoxSize),
FreeBoxSize: int64(freeBoxSize) + 8, // payload + 8 byte box header
FileWriter: ofd,
Writer: bufferedWriter,
SPSNALUs: spsNALUs,
@@ -130,47 +158,194 @@ func (mp4 *MP4) AddAudioTrack(codec string) uint32 {
func (mp4 *MP4) AddMediaSegment(segNr int) {
}
// updateVideoParameterSetsFromAnnexB inspects Annex B data to fill missing SPS/PPS/VPS.
func (mp4 *MP4) updateVideoParameterSetsFromAnnexB(data []byte) {
if len(data) == 0 {
return
}
needSPS := len(mp4.SPSNALUs) == 0
needPPS := len(mp4.PPSNALUs) == 0
needVPS := len(mp4.VPSNALUs) == 0
if !(needSPS || needPPS || needVPS) {
return
}
for _, nalu := range splitNALUs(data) {
nalu = removeAnnexBStartCode(nalu)
if len(nalu) == 0 {
continue
}
switch mp4.VideoTrackName {
case "H264", "AVC1":
nalType := nalu[0] & 0x1F
switch nalType {
case 7: // SPS
if needSPS {
mp4.SPSNALUs = [][]byte{nalu}
needSPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): SPS recovered from in-band NALU")
}
case 8: // PPS
if needPPS {
mp4.PPSNALUs = [][]byte{nalu}
needPPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): PPS recovered from in-band NALU")
}
}
case "H265", "HVC1":
nalType := (nalu[0] >> 1) & 0x3F
switch nalType {
case 32: // VPS
if needVPS {
mp4.VPSNALUs = [][]byte{nalu}
needVPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): VPS recovered from in-band NALU")
}
case 33: // SPS
if needSPS {
mp4.SPSNALUs = [][]byte{nalu}
needSPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): SPS recovered from in-band NALU")
}
case 34: // PPS
if needPPS {
mp4.PPSNALUs = [][]byte{nalu}
needPPS = false
log.Log.Warning("mp4.updateVideoParameterSetsFromAnnexB(): PPS recovered from in-band NALU")
}
}
}
}
}
// flushPendingVideoSample writes the pending video sample to the current fragment.
// If nextPTS is provided (non-zero), it calculates duration from the PTS difference.
// If nextPTS is 0 (e.g., at Close time), it uses the last known duration.
// Returns true if a sample was flushed, false if there was no pending sample.
func (mp4 *MP4) flushPendingVideoSample(nextPTS uint64) bool {
if mp4.VideoFullSample == nil || mp4.MultiTrackFragment == nil {
return false
}
var duration uint64
if nextPTS > 0 && nextPTS > mp4.VideoFullSample.DecodeTime {
duration = nextPTS - mp4.VideoFullSample.DecodeTime
} else {
// No valid nextPTS (Close case) or PTS went backwards (jitter/discontinuity)
if nextPTS > 0 {
log.Log.Warning(fmt.Sprintf("mp4.flushPendingVideoSample(): video PTS went backwards or zero duration (nextPTS=%d, prevDTS=%d), using last known duration", nextPTS, mp4.VideoFullSample.DecodeTime))
}
duration = mp4.LastVideoSampleDTS
if duration == 0 {
duration = 33 // Default frame duration (~30 fps in the 1 ms timescale)
}
}
mp4.LastVideoSampleDTS = duration
mp4.VideoTotalDuration += duration
mp4.VideoFullSample.DecodeTime = mp4.VideoTotalDuration - duration
mp4.VideoFullSample.Sample.Dur = uint32(duration)
isKF := mp4.PendingSampleIsKeyframe
err := mp4.MultiTrackFragment.AddFullSampleToTrack(*mp4.VideoFullSample, uint32(mp4.VideoTrack))
if err != nil {
log.Log.Error("mp4.flushPendingVideoSample(): error adding sample: " + err.Error())
}
if isKF {
mp4.TotalKeyframesWritten++
mp4.FragmentKeyframeCount++
log.Log.Debug(fmt.Sprintf("mp4.flushPendingVideoSample(): KEYFRAME WRITTEN to trun - totalWritten=%d, fragmentKF=%d, flags=0x%08x, dur=%d, DTS=%d",
mp4.TotalKeyframesWritten, mp4.FragmentKeyframeCount, mp4.VideoFullSample.Sample.Flags, duration, mp4.VideoFullSample.DecodeTime))
}
mp4.VideoFullSample = nil
mp4.PendingSampleIsKeyframe = false
return true
}
func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, pts uint64) error {
if isKeyframe && trackID == uint32(mp4.VideoTrack) {
mp4.TotalKeyframesReceived++
elapsedDbg := uint64(0)
if mp4.Start {
elapsedDbg = pts - mp4.FragmentStartRawPTS
}
log.Log.Debug(fmt.Sprintf("mp4.AddSampleToTrack(): KEYFRAME #%d received - PTS=%d, size=%d, elapsed=%dms, started=%t, segment=%d, fragKF=%d",
mp4.TotalKeyframesReceived, pts, len(data), elapsedDbg, mp4.Start, mp4.SegmentCount, mp4.FragmentKeyframeCount))
}
if isKeyframe {
// Write the segment to the file
// Determine whether to start a new fragment.
// We only flush at a keyframe boundary once at least FragmentDurationMs
// of content has been accumulated, resulting in ~3 second fragments.
elapsed := uint64(0)
if mp4.Start {
mp4.MoofBoxes = mp4.MoofBoxes + 1
mp4.MoofBoxSizes = append(mp4.MoofBoxSizes, int64(mp4.Segment.Size()))
err := mp4.Segment.Encode(mp4.Writer)
if err != nil {
log.Log.Error("mp4.AddSampleToTrack(): error encoding segment: " + err.Error())
elapsed = pts - mp4.FragmentStartRawPTS
}
shouldFlush := !mp4.Start || elapsed >= FragmentDurationMs
if shouldFlush {
// Write the previous segment to the file
if mp4.Start {
// IMPORTANT: Add any pending video sample to the current segment BEFORE flushing.
// This ensures the segment contains all frames up to (but not including) this keyframe,
// and the new segment will start cleanly with this keyframe.
if trackID == uint32(mp4.VideoTrack) {
mp4.flushPendingVideoSample(pts)
}
log.Log.Debug(fmt.Sprintf("mp4.AddSampleToTrack(): FLUSHING segment #%d - keyframes_in_fragment=%d, totalKF_received=%d, totalKF_written=%d",
mp4.SegmentCount, mp4.FragmentKeyframeCount, mp4.TotalKeyframesReceived, mp4.TotalKeyframesWritten))
mp4.MoofBoxes = mp4.MoofBoxes + 1
mp4.MoofBoxSizes = append(mp4.MoofBoxSizes, int64(mp4.Segment.Size()))
// Track the segment's duration and base decode time for sidx.
// Use the accumulated VideoTotalDuration, which matches the tfdt
// base decode times, NOT raw PTS from the camera.
segDuration := mp4.VideoTotalDuration - mp4.FragmentStartDTS
mp4.SegmentDurations = append(mp4.SegmentDurations, segDuration)
mp4.SegmentBaseDecTimes = append(mp4.SegmentBaseDecTimes, mp4.FragmentStartDTS)
err := mp4.Segment.Encode(mp4.Writer)
if err != nil {
log.Log.Error("mp4.AddSampleToTrack(): error encoding segment: " + err.Error())
}
mp4.Segments = append(mp4.Segments, mp4.Segment)
}
mp4.Segments = append(mp4.Segments, mp4.Segment)
mp4.Start = true
// Increment the segment count
mp4.SegmentCount = mp4.SegmentCount + 1
// Create a new media segment
seg := mp4ff.NewMediaSegment()
// Create a video fragment
multiTrackFragment, err := mp4ff.CreateMultiTrackFragment(uint32(mp4.SegmentCount), mp4.TrackIDs)
if err != nil {
log.Log.Error("mp4.AddSampleToTrack(): error creating multi track fragment: " + err.Error())
}
mp4.MultiTrackFragment = multiTrackFragment
seg.AddFragment(multiTrackFragment)
// Set to MP4 struct
mp4.Segment = seg
// Set the start PTS for the next segment
mp4.StartPTS = pts
mp4.FragmentStartRawPTS = pts
mp4.FragmentStartDTS = mp4.VideoTotalDuration
mp4.FragmentKeyframeCount = 0 // Reset keyframe counter for new fragment
}
mp4.Start = true
// Increment the segment count
mp4.SegmentCount = mp4.SegmentCount + 1
// Create a new media segment
seg := mp4ff.NewMediaSegment()
// Create a video fragment
multiTrackFragment, err := mp4ff.CreateMultiTrackFragment(uint32(mp4.SegmentCount), mp4.TrackIDs) // Assuming 1 for video track and 2 for audio track
if err != nil {
log.Log.Error("mp4.AddSampleToTrack(): error creating multi track fragment: " + err.Error())
}
mp4.MultiTrackFragment = multiTrackFragment
seg.AddFragment(multiTrackFragment)
// Set to MP4 struct
mp4.Segment = seg
// Set the start PTS for the next segment
mp4.StartPTS = pts
}
if mp4.Start {
if trackID == uint32(mp4.VideoTrack) {
mp4.updateVideoParameterSetsFromAnnexB(data)
var lengthPrefixed []byte
var err error
@@ -182,18 +357,10 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
}
if err == nil {
// Flush previous pending sample before storing the new one
if mp4.VideoFullSample != nil {
duration := pts - mp4.VideoFullSample.DecodeTime
log.Log.Debug("Adding sample to track " + fmt.Sprintf("%d, PTS: %d, Duration: %d, size: %d, Keyframe: %t", trackID, pts, duration, len(lengthPrefixed), isKeyframe))
mp4.LastVideoSampleDTS = duration
mp4.VideoTotalDuration += duration
mp4.VideoFullSample.DecodeTime = mp4.VideoTotalDuration - duration
mp4.VideoFullSample.Sample.Dur = uint32(duration)
err := mp4.MultiTrackFragment.AddFullSampleToTrack(*mp4.VideoFullSample, trackID)
if err != nil {
log.Log.Error("mp4.AddSampleToTrack(): error adding sample to track " + fmt.Sprintf("%d: %v", trackID, err))
}
log.Log.Debug("Adding sample to track " + fmt.Sprintf("%d, PTS: %d, size: %d, Keyframe: %t", trackID, pts, len(lengthPrefixed), isKeyframe))
mp4.flushPendingVideoSample(pts)
}
// Set the sample data
@@ -210,6 +377,7 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
CompositionTimeOffset: 0, // No composition time offset for video
}
mp4.VideoFullSample = &fullSample
mp4.PendingSampleIsKeyframe = isKeyframe
mp4.SampleType = "video"
}
} else if trackID == uint32(mp4.AudioTrack) {
@@ -259,12 +427,59 @@ func (mp4 *MP4) AddSampleToTrack(trackID uint32, isKeyframe bool, data []byte, p
func (mp4 *MP4) Close(config *models.Config) {
log.Log.Info(fmt.Sprintf("mp4.Close(): KEYFRAME SUMMARY - totalReceived=%d, totalWritten=%d, segments=%d, lastFragmentKF=%d",
mp4.TotalKeyframesReceived, mp4.TotalKeyframesWritten, mp4.SegmentCount, mp4.FragmentKeyframeCount))
if mp4.VideoTotalDuration == 0 && mp4.AudioTotalDuration == 0 {
log.Log.Error("mp4.Close(): no video or audio samples added, cannot create MP4 file")
log.Log.Error("mp4.Close(): no video or audio samples added, removing empty MP4 file")
mp4.Writer.Flush()
_ = mp4.FileWriter.Sync()
_ = mp4.FileWriter.Close()
_ = os.Remove(mp4.FileName)
return
}
// Add final pending samples before closing
if mp4.Segment != nil {
// Add final video sample if pending (pass 0 as nextPTS to use last known duration)
mp4.flushPendingVideoSample(0)
// Add final audio sample if pending
if mp4.AudioFullSample != nil && mp4.AudioTrack > 0 {
SplitAACFrame(mp4.AudioFullSample.Data, func(started bool, aac []byte) {
sampleToAdd := *mp4.AudioFullSample
dts := mp4.LastAudioSampleDTS
if dts == 0 {
dts = 1024 // Default AAC frame duration
}
mp4.AudioTotalDuration += dts
mp4.AudioPTS += dts
sampleToAdd.Data = aac[7:]
sampleToAdd.DecodeTime = mp4.AudioPTS - dts
sampleToAdd.Sample.Dur = uint32(dts)
sampleToAdd.Sample.Size = uint32(len(aac[7:]))
err := mp4.MultiTrackFragment.AddFullSampleToTrack(sampleToAdd, uint32(mp4.AudioTrack))
if err != nil {
log.Log.Error("mp4.Close(): error adding final audio sample: " + err.Error())
}
})
mp4.AudioFullSample = nil
}
}
// Encode the last segment
if mp4.Segment != nil {
// Track the last segment's size, duration and base decode time.
// Use accumulated VideoTotalDuration which matches tfdt values.
mp4.MoofBoxes = mp4.MoofBoxes + 1
mp4.MoofBoxSizes = append(mp4.MoofBoxSizes, int64(mp4.Segment.Size()))
lastSegDuration := mp4.VideoTotalDuration - mp4.FragmentStartDTS
if lastSegDuration == 0 {
lastSegDuration = mp4.LastVideoSampleDTS
}
mp4.SegmentDurations = append(mp4.SegmentDurations, lastSegDuration)
mp4.SegmentBaseDecTimes = append(mp4.SegmentBaseDecTimes, mp4.FragmentStartDTS)
err := mp4.Segment.Encode(mp4.Writer)
if err != nil {
log.Log.Error("mp4.Close(): error encoding last segment: " + err.Error())
@@ -272,10 +487,14 @@ func (mp4 *MP4) Close(config *models.Config) {
}
mp4.Writer.Flush()
defer mp4.FileWriter.Close()
// Ensure all segment data is on disk before we overwrite the placeholder at offset 0.
if err := mp4.FileWriter.Sync(); err != nil {
log.Log.Error("mp4.Close(): error syncing file: " + err.Error())
}
// Now we have all the moof and mdat boxes written to the file.
// We can now generate the ftyp and moov boxes, and replace it with the free box we added earlier (size of 2048 bytes).
// We build the ftyp + moov init segment and write it at the start,
// overwriting the free box placeholder we reserved in NewMP4.
init := mp4ff.NewMP4Init()
// Create a new ftyp box
@@ -289,22 +508,50 @@ func (mp4 *MP4) Close(config *models.Config) {
moov := mp4ff.NewMoovBox()
init.AddChild(moov)
// Set the creation time and modification time for the moov box
// Compute the actual video duration by summing segment durations.
// This must exactly match the sum of sample durations in the trun boxes
// that were written to the file, ensuring QuickTime (which strictly trusts
// header durations) displays the correct value.
var actualVideoDuration uint64
for _, d := range mp4.SegmentDurations {
actualVideoDuration += d
}
if actualVideoDuration != mp4.VideoTotalDuration {
log.Log.Warning(fmt.Sprintf("mp4.Close(): duration mismatch: accumulated VideoTotalDuration=%d, sum of segment durations=%d (diff=%d ms)",
mp4.VideoTotalDuration, actualVideoDuration, int64(mp4.VideoTotalDuration)-int64(actualVideoDuration)))
}
// Set the creation time and modification time for the moov box.
// QuickTime requires timestamps in Mac HFS format (seconds since 1904-01-01),
// so we convert from Unix epoch by adding MacEpochOffset.
videoTimescale := uint32(1000)
audioTimescale := uint32(1000)
macTime := mp4.StartTime + MacEpochOffset
nextTrackID := uint32(len(mp4.TrackIDs) + 1)
// mvhd.Duration must be the duration of the longest track.
// Start with video; if audio is longer, we update below.
movDuration := actualVideoDuration
if mp4.AudioTotalDuration > movDuration {
movDuration = mp4.AudioTotalDuration
}
mvhd := &mp4ff.MvhdBox{
Version: 0,
Flags: 0,
CreationTime: mp4.StartTime,
ModificationTime: mp4.StartTime,
CreationTime: macTime,
ModificationTime: macTime,
Timescale: videoTimescale,
Duration: mp4.VideoTotalDuration,
Duration: movDuration,
Rate: 0x00010000, // 1.0 playback speed (16.16 fixed point)
Volume: 0x0100, // 1.0 full volume (8.8 fixed point)
NextTrackID: nextTrackID,
}
init.Moov.AddChild(mvhd)
// Set the total duration in the moov box
mvex := mp4ff.NewMvexBox()
mvex.AddChild(&mp4ff.MehdBox{FragmentDuration: int64(mp4.VideoTotalDuration)})
mvex.AddChild(&mp4ff.MehdBox{FragmentDuration: int64(movDuration)})
init.Moov.AddChild(mvex)
// Add a track for the video
@@ -312,25 +559,48 @@ func (mp4 *MP4) Close(config *models.Config) {
case "H264", "AVC1":
init.AddEmptyTrack(videoTimescale, "video", "und")
includePS := true
err := init.Moov.Traks[0].SetAVCDescriptor("avc1", mp4.SPSNALUs, mp4.PPSNALUs, includePS)
spsNALUs := sanitizeParameterSets(mp4.SPSNALUs)
ppsNALUs := sanitizeParameterSets(mp4.PPSNALUs)
err := init.Moov.Traks[0].SetAVCDescriptor("avc1", spsNALUs, ppsNALUs, includePS)
if err != nil {
log.Log.Error("mp4.Close(): error setting AVC descriptor: " + err.Error())
}
init.Moov.Traks[0].Tkhd.Duration = mp4.VideoTotalDuration
init.Moov.Traks[0].Tkhd.Duration = actualVideoDuration
init.Moov.Traks[0].Tkhd.Width = mp4ff.Fixed32(uint32(mp4.width) << 16)
init.Moov.Traks[0].Tkhd.Height = mp4ff.Fixed32(uint32(mp4.height) << 16)
init.Moov.Traks[0].Tkhd.CreationTime = macTime
init.Moov.Traks[0].Tkhd.ModificationTime = macTime
init.Moov.Traks[0].Mdia.Hdlr.Name = "agent " + utils.VERSION
//init.Moov.Traks[0].Mdia.Mdhd.Duration = mp4.VideoTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4. QuickTime adds mdhd.Duration
// to the fragment durations (mehd/sidx), so setting it non-zero doubles the
// reported duration. Leave it at 0 so the player derives duration from fragments.
init.Moov.Traks[0].Mdia.Mdhd.Duration = 0
init.Moov.Traks[0].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[0].Mdia.Mdhd.ModificationTime = macTime
case "H265", "HVC1":
init.AddEmptyTrack(videoTimescale, "video", "und")
includePS := true
err := init.Moov.Traks[0].SetHEVCDescriptor("hvc1", mp4.VPSNALUs, mp4.SPSNALUs, mp4.PPSNALUs, [][]byte{}, includePS)
vpsNALUs := sanitizeParameterSets(mp4.VPSNALUs)
spsNALUs := sanitizeParameterSets(mp4.SPSNALUs)
ppsNALUs := sanitizeParameterSets(mp4.PPSNALUs)
err := init.Moov.Traks[0].SetHEVCDescriptor("hvc1", vpsNALUs, spsNALUs, ppsNALUs, [][]byte{}, includePS)
if err != nil {
log.Log.Error("mp4.Close(): error setting HEVC descriptor: " + err.Error())
}
init.Moov.Traks[0].Tkhd.Duration = mp4.VideoTotalDuration
init.Moov.Traks[0].Tkhd.Duration = actualVideoDuration
init.Moov.Traks[0].Tkhd.Width = mp4ff.Fixed32(uint32(mp4.width) << 16)
init.Moov.Traks[0].Tkhd.Height = mp4ff.Fixed32(uint32(mp4.height) << 16)
init.Moov.Traks[0].Tkhd.CreationTime = macTime
init.Moov.Traks[0].Tkhd.ModificationTime = macTime
init.Moov.Traks[0].Mdia.Hdlr.Name = "agent " + utils.VERSION
//init.Moov.Traks[0].Mdia.Mdhd.Duration = mp4.VideoTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4 (see H264 case above).
init.Moov.Traks[0].Mdia.Mdhd.Duration = 0
init.Moov.Traks[0].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[0].Mdia.Mdhd.ModificationTime = macTime
}
// Try adding audio track if available
if mp4.AudioTrackName == "AAC" || mp4.AudioTrackName == "MP4A" {
// Try adding audio track if available and samples were recorded.
if (mp4.AudioTrackName == "AAC" || mp4.AudioTrackName == "MP4A") && mp4.AudioTotalDuration > 0 {
// Add an audio track to the moov box
init.AddEmptyTrack(audioTimescale, "audio", "und")
@@ -344,8 +614,13 @@ func (mp4 *MP4) Close(config *models.Config) {
if err != nil {
log.Log.Error("mp4.Close(): error setting audio track descriptor: " + err.Error())
}
init.Moov.Traks[1].Tkhd.Duration = mp4.AudioTotalDuration
init.Moov.Traks[1].Tkhd.CreationTime = macTime
init.Moov.Traks[1].Tkhd.ModificationTime = macTime
init.Moov.Traks[1].Mdia.Hdlr.Name = "agent " + utils.VERSION
//init.Moov.Traks[1].Mdia.Mdhd.Duration = mp4.AudioTotalDuration
// mdhd.Duration MUST be 0 for fragmented MP4 (see video track comment).
init.Moov.Traks[1].Mdia.Mdhd.Duration = 0
init.Moov.Traks[1].Mdia.Mdhd.CreationTime = macTime
init.Moov.Traks[1].Mdia.Mdhd.ModificationTime = macTime
}
// Try adding subtitle track if available
@@ -376,9 +651,11 @@ func (mp4 *MP4) Close(config *models.Config) {
// and encrypted with the public key.
fingerprint := fmt.Sprintf("%d", init.Moov.Mvhd.CreationTime) + "_" +
fmt.Sprintf("%d", init.Moov.Mvhd.Duration) + "_" +
init.Moov.Trak.Mdia.Hdlr.Name + "_" +
fmt.Sprintf("%d", mp4.MoofBoxes) + "_" // Number of moof boxes
fmt.Sprintf("%d", init.Moov.Mvhd.Duration) + "_"
if init.Moov.Trak != nil {
fingerprint += init.Moov.Trak.Mdia.Hdlr.Name + "_"
}
fingerprint += fmt.Sprintf("%d", mp4.MoofBoxes) + "_" // Number of moof boxes
for i, size := range mp4.MoofBoxSizes {
fingerprint += fmt.Sprintf("%d", size)
@@ -392,7 +669,10 @@ func (mp4 *MP4) Close(config *models.Config) {
}
// Load the private key from the configuration
privateKey := config.Signing.PrivateKey
var privateKey string
if config.Signing != nil {
privateKey = config.Signing.PrivateKey
}
r := strings.NewReader(privateKey)
pemBytes, _ := ioutil.ReadAll(r)
block, _ := pem.Decode(pemBytes)
@@ -423,126 +703,94 @@ func (mp4 *MP4) Close(config *models.Config) {
}
}
// We will also calculate the SIDX box, which is a segment index box that contains information about the segments in the file.
// This is useful for seeking in the file, and for streaming the file.
/*sidx := &mp4ff.SidxBox{
Version: 0,
Flags: 0,
ReferenceID: 0,
Timescale: videoTimescale,
EarliestPresentationTime: 0,
FirstOffset: 0,
SidxRefs: make([]mp4ff.SidxRef, 0),
}
referenceTrak := init.Moov.Trak
trex, ok := init.Moov.Mvex.GetTrex(referenceTrak.Tkhd.TrackID)
if !ok {
// We have an issue.
// Build a Segment Index (sidx) box so players can seek directly to any
// fragment without scanning the entire file.
if len(mp4.SegmentDurations) > 0 {
sidx := &mp4ff.SidxBox{
Version: 1,
Flags: 0,
ReferenceID: uint32(mp4.VideoTrack),
Timescale: videoTimescale,
EarliestPresentationTime: 0,
FirstOffset: 0,
SidxRefs: make([]mp4ff.SidxRef, 0, len(mp4.SegmentDurations)),
}
for i, dur := range mp4.SegmentDurations {
sidx.SidxRefs = append(sidx.SidxRefs, mp4ff.SidxRef{
ReferenceType: 0, // media reference
ReferencedSize: uint32(mp4.MoofBoxSizes[i]),
SubSegmentDuration: uint32(dur),
StartsWithSAP: 1,
SAPType: 1,
})
}
init.AddChild(sidx)
}
segDatas, err := findSegmentData(mp4.Segments, referenceTrak, trex)
if err != nil {
// We have an issue.
}
fillSidx(sidx, referenceTrak, segDatas, true)
// Add the SIDX box to the moov box
init.AddChild(sidx)*/
// Get a bit slice writer for the init segment
// Encode the ftyp + moov + sidx into a buffer to measure the total size.
// Then compute the correct sidx.FirstOffset (the gap between the end of
// the sidx box and the first moof, occupied by the trailing free box)
// and re-encode with the corrected value.
var initBuf bytes.Buffer
if err := init.Encode(&initBuf); err != nil {
log.Log.Error("mp4.Close(): error encoding init segment: " + err.Error())
}
initSize := int64(initBuf.Len())
// The sidx.FirstOffset is defined as the distance (in bytes) from the
// anchor point (first byte after the sidx box) to the first byte of
// the first referenced moof/mdat. Since sidx is the last box in init,
// the anchor point is at initSize, and the first moof is at FreeBoxSize.
if len(mp4.SegmentDurations) > 0 {
if mp4.FreeBoxSize < initSize {
// Avoid computing a negative offset and wrapping it to uint64.
log.Log.Error("mp4.Close(): FreeBoxSize is smaller than initSize; skipping sidx FirstOffset adjustment")
} else {
firstOffset := uint64(mp4.FreeBoxSize - initSize)
// Find the sidx we added and update its FirstOffset
for _, child := range init.Children {
if sidxBox, ok := child.(*mp4ff.SidxBox); ok {
sidxBox.FirstOffset = firstOffset
break
}
}
// Re-encode with the corrected FirstOffset (same size, no layout change)
initBuf.Reset()
if err := init.Encode(&initBuf); err != nil {
log.Log.Error("mp4.Close(): error re-encoding init segment: " + err.Error())
}
initSize = int64(initBuf.Len())
}
}
if initSize > mp4.FreeBoxSize {
log.Log.Error(fmt.Sprintf("mp4.Close(): init segment (%d bytes) exceeds reserved space (%d bytes), file may be corrupt", initSize, mp4.FreeBoxSize))
}
// Write the init segment at the beginning of the file, overwriting the free box placeholder.
if _, err := mp4.FileWriter.WriteAt(initBuf.Bytes(), 0); err != nil {
log.Log.Error("mp4.Close(): error writing init segment: " + err.Error())
}
// Fill any remaining reserved space with a new (smaller) free box so
// the byte offsets of the moof/mdat boxes that follow are preserved.
remainingSize := mp4.FreeBoxSize - initSize
if remainingSize >= 8 { // minimum box size is 8 bytes (header only)
newFree := mp4ff.NewFreeBox(make([]byte, remainingSize-8))
var freeBuf bytes.Buffer
if err := newFree.Encode(&freeBuf); err != nil {
log.Log.Error("mp4.Close(): error encoding free box: " + err.Error())
}
if _, err := mp4.FileWriter.WriteAt(freeBuf.Bytes(), initSize); err != nil {
log.Log.Error("mp4.Close(): error writing free box: " + err.Error())
}
}
if err := mp4.FileWriter.Sync(); err != nil {
log.Log.Error("mp4.Close(): error syncing file: " + err.Error())
}
mp4.FileWriter.Close()
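The reserved-space bookkeeping above can be sketched in isolation. The following standalone program (hypothetical sizes, not the agent's real values, and no mp4ff dependency) shows the arithmetic: a placeholder free box of `FreeBoxSize` bytes is reserved at the start of the file, the encoded ftyp+moov+sidx later overwrites its head, `sidx.FirstOffset` spans the gap up to the first moof, and a smaller trailing free box pads the remainder so the moof/mdat byte offsets stay valid.

```go
package main

import "fmt"

// layout mirrors the offset computation in Close(): given the reserved
// placeholder size and the encoded init-segment size, it returns the
// sidx FirstOffset and the payload size of the trailing free box.
func layout(freeBoxSize, initSize int64) (firstOffset, freePayload int64, ok bool) {
	if initSize > freeBoxSize {
		return 0, 0, false // init segment does not fit the reserved space
	}
	// The sidx anchor point is the first byte after the sidx box (initSize);
	// the first moof starts at freeBoxSize, so the gap is the difference.
	firstOffset = freeBoxSize - initSize
	remaining := freeBoxSize - initSize
	if remaining >= 8 { // an MP4 box needs at least an 8-byte header
		freePayload = remaining - 8
	}
	return firstOffset, freePayload, true
}

func main() {
	off, pad, ok := layout(4096, 1500)
	fmt.Println(off, pad, ok) // 2596 2588 true
}
```

With a 4096-byte placeholder and a 1500-byte init segment, the trailing free box occupies the remaining 2596 bytes (8-byte header plus 2588 bytes of padding), which is exactly the `FirstOffset` distance from the sidx anchor to the first moof.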
type segData struct {
startPos uint64
presentationTime uint64
baseDecodeTime uint64
dur uint32
size uint32
}
func fillSidx(sidx *mp4ff.SidxBox, refTrak *mp4ff.TrakBox, segDatas []segData, nonZeroEPT bool) {
ept := uint64(0)
if nonZeroEPT {
ept = segDatas[0].presentationTime
}
sidx.Version = 1
sidx.Timescale = refTrak.Mdia.Mdhd.Timescale
sidx.ReferenceID = 1
sidx.EarliestPresentationTime = ept
sidx.FirstOffset = 0
sidx.SidxRefs = make([]mp4ff.SidxRef, 0, len(segDatas))
for _, segData := range segDatas {
size := segData.size
sidx.SidxRefs = append(sidx.SidxRefs, mp4ff.SidxRef{
ReferencedSize: size,
SubSegmentDuration: segData.dur,
StartsWithSAP: 1,
SAPType: 1,
})
}
}
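Each `SidxRef` written by fillSidx maps one media segment to its byte size and duration, which is what lets a player seek without parsing every moof. A hypothetical sketch of the consumer side (simplified field names, not the pion/mp4ff API): byte offsets accumulate from the sidx `FirstOffset`, times accumulate from the earliest presentation time.

```go
package main

import "fmt"

// ref is a simplified stand-in for mp4ff.SidxRef.
type ref struct {
	size uint32 // ReferencedSize: bytes of the moof+mdat pair
	dur  uint32 // SubSegmentDuration in the sidx timescale
}

// seek walks the sidx references and returns the byte offset and start
// time of the segment containing the target presentation time.
func seek(refs []ref, firstOffset uint64, target uint32) (byteOff uint64, timeOff uint32) {
	byteOff = firstOffset
	for _, r := range refs {
		if timeOff+r.dur > target {
			return byteOff, timeOff // segment containing the target time
		}
		byteOff += uint64(r.size)
		timeOff += r.dur
	}
	return byteOff, timeOff
}

func main() {
	refs := []ref{{size: 9000, dur: 2000}, {size: 8500, dur: 2000}, {size: 7000, dur: 2000}}
	off, ts := seek(refs, 2596, 4100)
	fmt.Println(off, ts) // third segment: 2596+9000+8500 = 20096, starts at time 4000
}
```

This is also why fillSidx marks every reference with `StartsWithSAP: 1` / `SAPType: 1`: a seek that lands on a segment boundary is guaranteed to land on a sync sample.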
// findSegmentData returns a slice of segment media data using a reference track.
func findSegmentData(segs []*mp4ff.MediaSegment, refTrak *mp4ff.TrakBox, trex *mp4ff.TrexBox) ([]segData, error) {
segDatas := make([]segData, 0, len(segs))
for _, seg := range segs {
var firstCompositionTimeOffset int64
dur := uint32(0)
var baseTime uint64
for fIdx, frag := range seg.Fragments {
for _, traf := range frag.Moof.Trafs {
tfhd := traf.Tfhd
if tfhd.TrackID == refTrak.Tkhd.TrackID { // Find track that gives sidx time values
if fIdx == 0 {
baseTime = traf.Tfdt.BaseMediaDecodeTime()
}
for i, trun := range traf.Truns {
trun.AddSampleDefaultValues(tfhd, trex)
samples := trun.GetSamples()
for j, sample := range samples {
if fIdx == 0 && i == 0 && j == 0 {
firstCompositionTimeOffset = int64(sample.CompositionTimeOffset)
}
dur += sample.Dur
}
}
}
}
}
sd := segData{
startPos: seg.StartPos,
presentationTime: uint64(int64(baseTime) + firstCompositionTimeOffset),
baseDecodeTime: baseTime,
dur: dur,
size: uint32(seg.Size()),
}
segDatas = append(segDatas, sd)
}
return segDatas, nil
}
// annexBToLengthPrefixed converts Annex B formatted H264 data (with start codes)
@@ -590,6 +838,22 @@ func removeAnnexBStartCode(nalu []byte) []byte {
return nalu
}
// sanitizeParameterSets removes Annex B start codes and drops empty NALUs.
func sanitizeParameterSets(nalus [][]byte) [][]byte {
if len(nalus) == 0 {
return nalus
}
clean := make([][]byte, 0, len(nalus))
for _, nalu := range nalus {
trimmed := removeAnnexBStartCode(nalu)
if len(trimmed) == 0 {
continue
}
clean = append(clean, trimmed)
}
return clean
}
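The sanitize step above can be exercised standalone. This sketch reimplements the same logic (with a local `stripStartCode` standing in for the repo's `removeAnnexBStartCode`): strip a 3- or 4-byte Annex B start code from each parameter-set NALU and drop empty entries, since AVC/HEVC sample descriptions expect raw NAL units.

```go
package main

import (
	"bytes"
	"fmt"
)

// stripStartCode removes a leading 00 00 00 01 or 00 00 01 Annex B
// start code, if present.
func stripStartCode(nalu []byte) []byte {
	if bytes.HasPrefix(nalu, []byte{0, 0, 0, 1}) {
		return nalu[4:]
	}
	if bytes.HasPrefix(nalu, []byte{0, 0, 1}) {
		return nalu[3:]
	}
	return nalu
}

// sanitize drops empty NALUs and strips start codes, mirroring
// sanitizeParameterSets above.
func sanitize(nalus [][]byte) [][]byte {
	clean := make([][]byte, 0, len(nalus))
	for _, nalu := range nalus {
		if trimmed := stripStartCode(nalu); len(trimmed) > 0 {
			clean = append(clean, trimmed)
		}
	}
	return clean
}

func main() {
	in := [][]byte{
		{0, 0, 0, 1, 0x67, 0x42}, // SPS with 4-byte start code
		{0, 0, 1, 0x68, 0xce},    // PPS with 3-byte start code
		{},                       // empty NALU, dropped
	}
	for _, n := range sanitize(in) {
		fmt.Printf("% x\n", n)
	}
	// 67 42
	// 68 ce
}
```

Feeding camera-supplied SPS/PPS through a step like this before writing the avcC/hvcC descriptors avoids the malformed sample descriptions that motivated this fix.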
// splitNALUs splits Annex B data into raw NAL units without start codes.
func splitNALUs(data []byte) [][]byte {
var nalus [][]byte


@@ -0,0 +1,176 @@
package video
import (
"fmt"
"os"
"testing"
mp4ff "github.com/Eyevinn/mp4ff/mp4"
"github.com/kerberos-io/agent/machinery/src/models"
)
// TestMP4Duration creates an MP4 file simulating a 6-second video recording
// and verifies that the durations in all boxes match the sum of sample durations.
func TestMP4Duration(t *testing.T) {
tmpFile := "/tmp/test_duration.mp4"
defer os.Remove(tmpFile)
// Minimal SPS for H.264 (baseline, 640x480) - raw NAL units without Annex B start codes
sps := []byte{0x67, 0x42, 0xc0, 0x1e, 0xd9, 0x00, 0xa0, 0x47, 0xfe, 0xc8}
pps := []byte{0x68, 0xce, 0x38, 0x80}
mp4Video := NewMP4(tmpFile, [][]byte{sps}, [][]byte{pps}, nil, 10)
mp4Video.SetWidth(640)
mp4Video.SetHeight(480)
videoTrack := mp4Video.AddVideoTrack("H264")
// Simulate 6 seconds at 25fps (150 frames, keyframe every 50 frames = 2s)
// PTS in milliseconds (timescale=1000)
frameDuration := uint64(40) // 40ms per frame = 25fps
numFrames := 150
gopSize := 50
// Create a fake Annex B NAL unit (keyframe IDR = type 5, non-keyframe = type 1)
makeFrame := func(isKey bool) []byte {
nalType := byte(0x01) // non-IDR slice
if isKey {
nalType = 0x65 // IDR slice
}
// Start code (4 bytes) + NAL header + some data
frame := []byte{0x00, 0x00, 0x00, 0x01, nalType}
// Add some padding data
for i := 0; i < 100; i++ {
frame = append(frame, byte(i))
}
return frame
}
var expectedDuration uint64
for i := 0; i < numFrames; i++ {
pts := uint64(i) * frameDuration
isKeyframe := i%gopSize == 0
err := mp4Video.AddSampleToTrack(videoTrack, isKeyframe, makeFrame(isKeyframe), pts)
if err != nil {
t.Fatalf("AddSampleToTrack failed at frame %d: %v", i, err)
}
}
expectedDuration = uint64(numFrames) * frameDuration // Should be 6000ms (150 * 40)
// Close with config that has signing key to avoid nil panics
config := &models.Config{
Signing: &models.Signing{
PrivateKey: "",
},
}
mp4Video.Close(config)
// Log what the code computed
t.Logf("VideoTotalDuration: %d ms", mp4Video.VideoTotalDuration)
t.Logf("Expected duration: %d ms", expectedDuration)
t.Logf("Segments: %d", len(mp4Video.SegmentDurations))
var sumSegDur uint64
for i, d := range mp4Video.SegmentDurations {
t.Logf(" Segment %d: duration=%d ms", i, d)
sumSegDur += d
}
t.Logf("Sum of segment durations: %d ms", sumSegDur)
// Now read back the file and inspect the boxes
f, err := os.Open(tmpFile)
if err != nil {
t.Fatalf("Failed to open output file: %v", err)
}
defer f.Close()
fi, err := f.Stat()
if err != nil {
t.Fatalf("Failed to stat output file: %v", err)
}
parsedFile, err := mp4ff.DecodeFile(f)
if err != nil {
t.Fatalf("Failed to decode MP4: %v", err)
}
t.Logf("File size: %d bytes", fi.Size())
// Check moov box
if parsedFile.Moov == nil {
t.Fatal("No moov box found")
}
// Check mvhd duration
mvhd := parsedFile.Moov.Mvhd
t.Logf("mvhd.Duration: %d (timescale=%d) = %.2f seconds", mvhd.Duration, mvhd.Timescale, float64(mvhd.Duration)/float64(mvhd.Timescale))
t.Logf("mvhd.Rate: 0x%08x", mvhd.Rate)
t.Logf("mvhd.Volume: 0x%04x", mvhd.Volume)
// Check each trak
for i, trak := range parsedFile.Moov.Traks {
t.Logf("Track %d:", i)
t.Logf(" tkhd.Duration: %d", trak.Tkhd.Duration)
t.Logf(" mdhd.Duration: %d (timescale=%d) = %.2f seconds", trak.Mdia.Mdhd.Duration, trak.Mdia.Mdhd.Timescale, float64(trak.Mdia.Mdhd.Duration)/float64(trak.Mdia.Mdhd.Timescale))
}
// Check mvex/mehd
if parsedFile.Moov.Mvex != nil && parsedFile.Moov.Mvex.Mehd != nil {
t.Logf("mehd.FragmentDuration: %d", parsedFile.Moov.Mvex.Mehd.FragmentDuration)
}
// Sum up actual sample durations from trun boxes in all segments
var actualTrunDuration uint64
var sampleCount int
for _, seg := range parsedFile.Segments {
for _, frag := range seg.Fragments {
for _, traf := range frag.Moof.Trafs {
// Only count video track (track 1)
if traf.Tfhd.TrackID == 1 {
for _, trun := range traf.Truns {
for _, s := range trun.Samples {
actualTrunDuration += uint64(s.Dur)
sampleCount++
}
}
}
}
}
}
t.Logf("Actual trun sample count: %d", sampleCount)
t.Logf("Actual trun total duration: %d ms", actualTrunDuration)
// Check sidx
if parsedFile.Sidx != nil {
var sidxDuration uint64
for _, ref := range parsedFile.Sidx.SidxRefs {
sidxDuration += uint64(ref.SubSegmentDuration)
}
t.Logf("sidx total duration: %d ms", sidxDuration)
}
// VERIFY: All duration values should be consistent
// The expected duration for 150 frames at 40ms each:
// - The sample-buffering pattern means the LAST sample uses LastVideoSampleDTS as duration
// - So all 150 samples should produce 150 * 40ms = 6000ms total
// But due to the pending sample pattern, the actual trun durations might differ
fmt.Println()
fmt.Println("=== DURATION CONSISTENCY CHECK ===")
fmt.Printf("Expected (150 * 40ms): %d ms\n", expectedDuration)
fmt.Printf("mvhd.Duration: %d ms\n", mvhd.Duration)
fmt.Printf("tkhd.Duration: %d ms\n", parsedFile.Moov.Traks[0].Tkhd.Duration)
fmt.Printf("mdhd.Duration: %d ms\n", parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration)
fmt.Printf("Actual trun durations sum: %d ms\n", actualTrunDuration)
fmt.Printf("VideoTotalDuration: %d ms\n", mp4Video.VideoTotalDuration)
fmt.Printf("Sum of SegmentDurations: %d ms\n", sumSegDur)
fmt.Println()
// The key assertion: header duration must equal trun sum
if mvhd.Duration != actualTrunDuration {
t.Errorf("MISMATCH: mvhd.Duration (%d) != actual trun sum (%d), diff = %d ms",
mvhd.Duration, actualTrunDuration, int64(mvhd.Duration)-int64(actualTrunDuration))
}
if parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration != 0 {
t.Errorf("MISMATCH: mdhd.Duration should be 0 for fragmented MP4, got %d",
parsedFile.Moov.Traks[0].Mdia.Mdhd.Duration)
}
}


@@ -49,6 +49,7 @@ type peerConnectionWrapper struct {
conn *pionWebRTC.PeerConnection
cancelCtx context.CancelFunc
done chan struct{}
closeOnce sync.Once
}
var globalConnectionManager = NewConnectionManager()
@@ -339,18 +340,20 @@ func InitializeWebRTCConnection(configuration *models.Configuration, communicati
switch connectionState {
case pionWebRTC.PeerConnectionStateDisconnected, pionWebRTC.PeerConnectionStateClosed:
wrapper.closeOnce.Do(func() {
count := globalConnectionManager.DecrementPeerCount()
log.Log.Info(fmt.Sprintf("webrtc.main.InitializeWebRTCConnection(): Peer disconnected. Active peers: %d", count))
// Clean up resources
globalConnectionManager.CloseCandidateChannel(sessionKey)
if err := peerConnection.Close(); err != nil {
log.Log.Error("webrtc.main.InitializeWebRTCConnection(): error closing peer connection: " + err.Error())
}
globalConnectionManager.RemovePeerConnection(handshake.SessionID)
close(wrapper.done)
})
})
case pionWebRTC.PeerConnectionStateConnected:
count := globalConnectionManager.IncrementPeerCount()