leaking file descriptors #18

Closed
opened 2025-12-23 10:30:31 +01:00 by backuprepo · 11 comments

Originally created by @mcerveny on GitHub (Feb 26, 2024).

Hello.
I am running this code in an alloc/free loop.

When I use the "scale_rkrga" filter in a cycle (avfilter_graph_alloc() ... avfilter_graph_free()), each cycle usually leaks 1 FD "anon_inode:sync_file" in /proc/PID/fd (maybe some sort of sync primitive leak).
When I use the "h264_rkmpp" encoder in a cycle (avcodec_alloc_context3() ... avcodec_free_context()), each cycle usually leaks 1 FD "/dmabuf:276910-main" in /proc/PID/fd.
(The "hevc_rkmpp" decoder does not need to be restarted, because it supports avcodec_flush_buffers().)

I cannot determine whether the problem is in the rk libraries or in the ffmpeg integration code.
Does anyone have a hint or a solution?

Thanks, Martin


@nyanmisaka commented on GitHub (Feb 26, 2024):

@mcerveny Can you reproduce the same issue when using FFmpeg via CLI?

https://github.com/nyanmisaka/ffmpeg-rockchip/wiki/Video-Transcode#mpp-decode--mpp-encode-fastest


@mcerveny commented on GitHub (Feb 26, 2024):

The leak is probably visible after all cleanup but before exit, so it may be reproducible with a modified ffmpeg that calls sleep(100000) before exit. For now I must finish the application with a workaround (restarting it after ~500 cycles). I will try next week to prepare minimal code that demonstrates this behavior.
For reference, the usual error once the process runs out of FDs:

 RgaBlit(1485) RGA_BLIT fail: Too many open files
 RgaBlit(1486) RGA_BLIT fail: Too many open files
handl-fd-vir-phy-hnd-format[0, 53, (nil), (nil), 0, 2560]
rect[0, 0, 2304, 1296, 2304, 1296, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
handl-fd-vir-phy-hnd-format[0, 1023, (nil), (nil), 0, 2560]
rect[0, 0, 720, 406, 768, 406, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
This output the user parameters when rga call blit fail
[hwscale @ 0x55822b8080] RGA blit failed: -24

And the output from /proc/PID/fd before the failure:

/proc/.../fd# ls -l | awk '{ print $NF; }' | sort | uniq -c
      1 0
    698 anon_inode:sync_file
      4 /dev/dma_heap/cma
      4 /dev/dma_heap/system
      4 /dev/dri/card0
      1 /dev/mpp_service
      3 /dev/pts/0
      1 /dev/rga
    207 /dmabuf:350251-main
      1 /share/cam/0A210205
      1 /share/cam/0A210205/186abd710ca.ts
      1 socket:[780242]

@nyanmisaka commented on GitHub (Feb 26, 2024):

 RgaBlit(1485) RGA_BLIT fail: Too many open files
 RgaBlit(1486) RGA_BLIT fail: Too many open files

This error only occurs when async RGA is used but the returned `out_fence_fd` is not invalidated by the user, so the process runs out of FDs quickly. I encountered it during earlier debugging and development, but it shouldn't be present in `scale_rkrga` now.

BTW what's your SoC model, Linux kernel version and MPP/RGA libs commit date?


@mcerveny commented on GitHub (Feb 26, 2024):

- MPP+RGA+FFMPEG as described https://github.com/nyanmisaka/ffmpeg-rockchip/wiki/Compilation (pulled 23.2.2024)
- kernel 6.1.43-rockchip-rk3588 - //github.com/orangepi-xunlong/orangepi-build.git (pulled 23.2.2024)
- hw rk3588s OrangePI 5

@nyanmisaka commented on GitHub (Feb 26, 2024):

Then it should all work fine. I may need to wait for a demo from you to see what's going on.

@mcerveny commented on GitHub (Feb 26, 2024):

https://github.com/nyanmisaka/rk-mirrors/blob/jellyfin-rga/core/NormalRga.cpp#L1485

@nyanmisaka commented on GitHub (Feb 29, 2024):

> Probably it is visible after all cleanup but before exit, so it can be possible in crafted ffmpeg code with sleep(100000) before exit. Now I must finish the code/application with workaround (restart application after ~500 cycles). I will try next week to prepare some minimum code to demonstrate this behavior. For reference (out of FD descriptor), usual error:
>
> RgaBlit(1485) RGA_BLIT fail: Too many open files
> RgaBlit(1486) RGA_BLIT fail: Too many open files
> handl-fd-vir-phy-hnd-format[0, 53, (nil), (nil), 0, 2560]
> rect[0, 0, 2304, 1296, 2304, 1296, 2560, 0]
> f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
> handl-fd-vir-phy-hnd-format[0, 1023, (nil), (nil), 0, 2560]
> rect[0, 0, 720, 406, 768, 406, 2560, 0]
> f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
> This output the user parameters when rga call blit fail
> [hwscale @ 0x55822b8080] RGA blit failed: -24
>
> And output from /proc/PID/fd before fail:
>
> /proc/.../fd# ls -l | awk '{ print $NF; }' | sort | uniq -c
>       1 0
>     698 anon_inode:sync_file
>       4 /dev/dma_heap/cma
>       4 /dev/dma_heap/system
>       4 /dev/dri/card0
>       1 /dev/mpp_service
>       3 /dev/pts/0
>       1 /dev/rga
>     207 /dmabuf:350251-main
>       1 /share/cam/0A210205
>       1 /share/cam/0A210205/186abd710ca.ts
>       1 socket:[780242]

@mcerveny Hopefully the commit 377fa2c will fix this issue. Let me know if it helps.


@mcerveny commented on GitHub (Mar 5, 2024):

Yes, it partially works: "anon_inode:sync_file" is gone, but dmabuf:*-main remains (encoder leak).


@nyanmisaka commented on GitHub (Mar 6, 2024):

> Yes, it partially works, "anon_inode:sync_file" is gone but dmabuf:*-main remains (encoder leak).

@mcerveny From my understanding, MPP async encoding requires `rkmppenc` to retain references to some input frames. MPP signals through callbacks when it is done with them, and they are then released dynamically. But when `codec->close()` is called, these retained frames are released unconditionally.

Can you help me trace in which rkmppenc function the leak occurred?
https://github.com/nyanmisaka/ffmpeg-rockchip/blob/d43f4f54e6c732cd47d5e1ab69b600afd8966897/libavcodec/rkmppenc.c#L800


@mcerveny commented on GitHub (Mar 9, 2024):

It seems to be related to #35. The problem is gone with my quick patch (tested with a loop of 100 input frames * 2000, using different video segments).


@nyanmisaka commented on GitHub (Mar 10, 2024):

Closed by 7a0200b

Reference: starred/ffmpeg-rockchip#18