
Let's save video streamed from an IP camera with FFmpeg.

 

ffmpeg -rtsp_transport tcp -i rtsp://admin:123456@192.168.0.56:554/stream1 -b:a 4k -t 10 -y output.mp4

 

-rtsp_transport = sets the RTSP transport protocol (TCP here).

FFmpeg Protocols Documentation

 

-b:a = audio bitrate. If it is set too high, the warning below is printed; either raise the audio sampling frequency (-ar) or lower the bitrate (-b:a or -ab).

[aac @ 000001b64b5b7c00] Too many bits 8832.000000 > 6144 per frame requested, clamping to max
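
The numbers in the warning come from the AAC frame structure: an AAC frame holds 1,024 samples, and the encoder allows at most 6,144 bits per channel per frame. A minimal sketch of the arithmetic (the 69 kb/s at 8 kHz combination below is an assumption that happens to reproduce the 8832 in the log, not a value taken from this post):

import math

# Bits requested per AAC frame = bitrate * frame_size / sample_rate.
# The AAC encoder caps this at 6144 bits per channel per frame, hence "clamping to max".
def aac_bits_per_frame(bitrate_bps, sample_rate_hz, frame_size=1024):
    return bitrate_bps * frame_size / sample_rate_hz

print(aac_bits_per_frame(69_000, 8_000))    # 8832.0 -> exceeds 6144, warning + clamping
print(aac_bits_per_frame(69_000, 44_100))   # ~1602  -> fine, no warning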

 

-t = duration of the recording (10 seconds here). -y overwrites the output file if it already exists.

 

※ Reference

HikVision Camera RTSP Stream

HikVision Camera RTSP with Authentication
rtsp://<username>:<password>@<IP address of device>:<RTSP port>/Streaming/channels/<channel number><stream number>
NOTE: <stream number> represents the main stream (01) or the sub stream (02)
Example:
rtsp://192.168.0.100:554/Streaming/channels/101 – get the main stream of the 1st channel
rtsp://192.168.0.100:554/Streaming/channels/102 – get the sub stream of the 1st channel
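
If the URL is built in code, credentials containing special characters have to be URL-encoded. A minimal Python sketch of the pattern above (all values are placeholders):

from urllib.parse import quote

username, password = "admin", "pass@word"                  # placeholders; special characters must be URL-encoded
ip, port, channel, stream = "192.168.0.100", 554, 1, "01"  # "01" = main stream, "02" = sub stream

url = f"rtsp://{quote(username)}:{quote(password)}@{ip}:{port}/Streaming/channels/{channel}{stream}"
print(url)  # rtsp://admin:pass%40word@192.168.0.100:554/Streaming/channels/101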

 

https://stackoverflow.com/questions/56423581/save-rtsp-stream-continuously-into-multi-mp4-files-with-specific-length-10-minu

https://butteryoon.github.io/dev/2019/04/10/using_ffmpeg.html

 

 

FFmpeg Formats Documentation

See 4.68 segment, stream_segment, ssegment for splitting the recording into multiple files.
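
The segment muxer can also be driven from Python. A minimal sketch using subprocess (the 10-minute segment length and the output file pattern are assumptions, not values from this post):

import subprocess

# Record the RTSP stream into 10-minute MP4 segments without re-encoding.
cmd = [
    "ffmpeg",
    "-rtsp_transport", "tcp",
    "-i", "rtsp://admin:123456@192.168.0.56:554/stream1",
    "-c", "copy",              # keep the camera's codecs
    "-f", "segment",
    "-segment_time", "600",    # start a new file every 600 seconds
    "-reset_timestamps", "1",  # restart timestamps in each segment
    "-strftime", "1",          # allow date/time patterns in the file name
    "cam_%Y-%m-%d_%H-%M-%S.mp4",
]
subprocess.run(cmd, check=True)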

 


Let's burn (encode) subtitles into a video.

 

● A more detailed subtitle-encoding command

ffmpeg -i LoveStory.mp4 -vf "subtitles='LoveStory.srt':force_style='FontSize=24,PrimaryColour=&HFFFFFF&, BorderStyle=1, Outline=1'" -c:a aac -b:a 160k -c:v libx265 -crf 24 -preset veryfast output.mp4

 

● With the encoding options omitted
ffmpeg -i LoveStory.mp4 -vf "subtitles='LoveStory.srt':force_style='FontSize=24,PrimaryColour=&HFFFFFF&, BorderStyle=1, Outline=1'" output.mp4

 

● Brief option descriptions
-i = input (source) file
-c:a aac = set the audio codec to AAC; copy keeps the original audio codec (faster)
-c:v libx265 = set the video codec to H.265; copy keeps the original video codec (faster), but copy cannot be combined with -vf, since burning subtitles requires re-encoding the video
-b:a = audio bitrate; the default is 128k
-b:v = video bitrate; if it is not set, the encoder default (or a quality mode such as -crf) is used
-preset = trades compression efficiency against encoding time: fast, medium, slow, etc.
-crf = quality, 0-51 for libx265; 0 is lossless
-vf = creates a filtergraph and uses it to filter the stream; this is how the subtitles are specified (Filtering)
For the subtitles part of the filter, see 4.1 Filtergraph syntax and 11.249 subtitles in the FFmpeg Filters Documentation.
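
Since ffmpeg-python is used later in this post, the same burn-in command can also be written in Python. A minimal sketch (the option names are passed through to ffmpeg unchanged):

import ffmpeg

# Burn LoveStory.srt into LoveStory.mp4 with the same options as the command above.
(
    ffmpeg
    .input('LoveStory.mp4')
    .output(
        'output.mp4',
        vf="subtitles='LoveStory.srt':force_style='FontSize=24,PrimaryColour=&HFFFFFF&,BorderStyle=1,Outline=1'",
        **{'c:a': 'aac', 'b:a': '160k', 'c:v': 'libx265', 'crf': 24, 'preset': 'veryfast'}
    )
    .run()
)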

 

Section 5 (Options) of the documentation linked below gives the full description of these options.

ffmpeg Documentation

 

● Minimal form
ffmpeg -i LoveStory.mp4 -vf subtitles=LoveStory.srt output.mp4

 

SMI subtitles are not supported by the subtitles filter; use SRT instead.
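
An existing SMI/SAMI file can usually be converted to SRT with ffmpeg first, assuming the build includes the SAMI demuxer. A minimal sketch (the file names are placeholders):

import subprocess

# Convert the SMI subtitle to SRT, then use the SRT file with the subtitles filter.
subprocess.run(['ffmpeg', '-i', 'LoveStory.smi', 'LoveStory.srt'], check=True)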

 

※ Reference material

FFmpeg_Book.z01 (19.53 MB), FFmpeg_Book.zip (2.84 MB)

 

Section 2.230 of the attached book covers subtitles.

 

 

This time, let's extract subtitles from a video.

 

● Commands

ffmpeg -i topgun.mkv sub.srt

Used without any special options, this extracts the first subtitle stream from the video.

 

ffmpeg -i topgun.mkv -map 0:3 sub.srt

-map 0:3 = extracts the stream with index 3. Stream indexes start at 0, so with the usual layout (0 = video, 1 = audio, subtitles from 2) index 3 is the second subtitle.

When extracting the second subtitle, -map s:1 can be used instead of -map 0:3 (s:0 = first subtitle, s:1 = second subtitle).
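
To check which index each subtitle stream actually has, the streams can be listed with ffmpeg-python's probe(). A minimal sketch:

import ffmpeg

# Print index, type and codec of every stream so the right -map value can be chosen.
info = ffmpeg.probe('topgun.mkv')
for stream in info['streams']:
    print(stream['index'], stream['codec_type'], stream.get('codec_name'))
# Typical output: 0 video h264 / 1 audio aac / 2 subtitle subrip / 3 subtitle subrip ...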

 

If a subtitle stream is reported as subrip (srt) it extracts without problems, but trying to extract a stream reported as hdmv_pgs_subtitle (pgssub) produces the error below. (SRT is text data, while PGS is bitmap data with an alpha channel; the subtitles are drawn over the video as bitmaps.)
Subtitle encoding currently only possible from text to text or bitmap to bitmap

The command below can still extract it, but the resulting sub.sup file has to be converted to text with a separate OCR program.

ffmpeg -i topgun.mkv -map 0:4 -c copy sub.sup

 

The command ffmpeg -i topgun.mkv -vn -an -c:s copy -map s:0 sub.srt also extracts the first subtitle.

-vn = excludes the video streams.

-an = excludes the audio streams.

-sn = excludes the subtitle streams.

 

● Subtitles can also be extracted with Python code like the following.

import ffmpeg
input = ffmpeg.input('topgun.mkv').output('sub.srt', map='s:0').run(quiet=True)

ffmpeg-python

 

● Extracting one video frame at the 2-second mark

ffmpeg -ss 0:02 -i video.mp4 -frames:v 1 -update 1 pic.png

Placing the -ss option before -i is much faster, because ffmpeg seeks in the input and skips decoding the frames before that point.

import ffmpeg
input = ffmpeg.input('video.mp4', ss='0:02').output('pic.png', frames=1, update=1).run(quiet=True)

Written this way, the Python version also applies -ss to the input and is just as fast.

 


Let's detect a person's pose in a camera feed using the synchronous detect() function.

 

import numpy as np
import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
 
def draw_landmarks_on_image(rgb_image, detection_result, bg_black):
  pose_landmarks_list = detection_result.pose_landmarks
  # Black Background
  if bg_black:
      annotated_image = np.zeros_like(rgb_image)
  else:
      annotated_image = np.copy(rgb_image)
  
  # Loop through the detected poses to visualize.
  for idx in range(len(pose_landmarks_list)):
    pose_landmarks = pose_landmarks_list[idx]
 
    # Draw the pose landmarks.
    pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    pose_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in pose_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      pose_landmarks_proto,
      solutions.pose.POSE_CONNECTIONS,
      solutions.drawing_styles.get_default_pose_landmarks_style())
  return annotated_image
 
# Create a PoseLandmarker object.
base_options = python.BaseOptions(model_asset_path='pose_landmarker_full.task')
options = vision.PoseLandmarkerOptions(base_options=base_options, output_segmentation_masks=True)
detector = vision.PoseLandmarker.create_from_options(options)
 
cap = cv2.VideoCapture(0)
 
while True:
    # Load the input frame.
    ret, cv_frame = cap.read()
    if not ret:
        break
    frame = mp.Image(image_format = mp.ImageFormat.SRGB, data = cv2.cvtColor(cv_frame, cv2.COLOR_BGR2RGB))
 
    # Detect pose landmarks from the input image.
    detection_result = detector.detect(frame)
    
    # Process the detection result. In this case, visualize it.
    annotated_frame = draw_landmarks_on_image(frame.numpy_view(), detection_result, True)
    cv2.imshow('sean', cv2.cvtColor(annotated_frame, cv2.COLOR_RGB2BGR))
 
    key = cv2.waitKey(25)
    if key == 27:  # ESC
        break
 
if cap.isOpened():
    cap.release()
cv2.destroyAllWindows()
 

 

This time, let's detect the pose using the asynchronous detect_async().

 

import time
import numpy as np
import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
 
landmark_result = None
 
# The user-defined result callback for processing live stream data.
# The result callback should only be specified when the running mode is set to the live stream mode.
# The result_callback provides:
# The pose landmarker detection results.
# The input image that the pose landmarker runs on.
# The input timestamp in milliseconds.
def print_result(result: vision.PoseLandmarkerResult, output_image: mp.Image, timestamp_ms: int):
    global landmark_result
    landmark_result = result
 
    #print(output_image.numpy_view())
    # output_image can be accessed here, but displaying it with cv2.imshow() from inside this callback does not seem to work.
    # I tried several approaches, and none of them behaved correctly.
        
    # Structure of PoseLandmakerResult
    # mp.tasks.vision.PoseLandmarkerResult(
    # pose_landmarks: List[List[landmark_module.NormalizedLandmark]],
    # pose_world_landmarks: List[List[landmark_module.Landmark]],
    # segmentation_masks: Optional[List[image_module.Image]] = None
    # )
    
    #print('pose landmarker result: {}'.format(result))
    #print("pose landmark: ", result.pose_landmarks[0][0].visibility)
    #print("pose world landmark: ", result.pose_world_landmarks[0][0].visibility)
 
    # pose_landmarks_list = result.pose_landmarks    
    # for idx in range(len(pose_landmarks_list)):
    #     pose_landmarks = pose_landmarks_list[idx]        
    #     for landmark in pose_landmarks:
    #         print("x: %.2f, y: %.2f, z: %.2f visibility: %.2f, presence: %.2f" %(landmark.x, landmark.y,
    #               landmark.z, landmark.visibility, landmark.presence))
 
def draw_landmarks_on_image(rgb_image, detection_result, bg_black):
    # Black Background
    if bg_black:
        annotated_image = np.zeros_like(rgb_image)
    else:
        annotated_image = np.copy(rgb_image)
    
    # Because the asynchronous detect_async() is used, detection_result can be None for the first few frames.
    # Also, when there is no person in the image (frame), detection_result.pose_landmarks is an empty list.
    # Without handling those cases an error occurs.
    if detection_result is None or detection_result.pose_landmarks == []:
        return annotated_image
    
    pose_landmarks_list = detection_result.pose_landmarks
        
    # Loop through the detected poses to visualize.
    for idx in range(len(pose_landmarks_list)):
        pose_landmarks = pose_landmarks_list[idx]

        # Draw the pose landmarks (inside the loop, so every detected pose is drawn).
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
 
    return annotated_image
 
base_options = python.BaseOptions(model_asset_path='pose_landmarker_full.task')
options = vision.PoseLandmarkerOptions(base_options=base_options, running_mode=mp.tasks.vision.RunningMode.LIVE_STREAM,
                                       result_callback=print_result, output_segmentation_masks=False)
# The running mode of the task. Default to the image mode. PoseLandmarker has three running modes:
# 1) The image mode for detecting pose landmarks on single image inputs.
# 2) The video mode for detecting pose landmarks on the decoded frames of a video.
# 3) The live stream mode for detecting pose landmarks on the live stream of input data, such as from camera.
# In this mode, the "result_callback" below must be specified to receive the detection results asynchronously.
detector = vision.PoseLandmarker.create_from_options(options)
 
cap = cv2.VideoCapture(0)
 
while True:
    ret, cv_image = cap.read()
    if not ret:
        break
    frame = mp.Image(image_format = mp.ImageFormat.SRGB, data = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB))
        
    # Sends live image data to perform pose landmarks detection.
    # The results will be available via the "result_callback" provided in the PoseLandmarkerOptions.
    # Only use this method when the PoseLandmarker is created with the live stream running mode.
    # The input timestamps should be monotonically increasing for adjacent calls of this method.
    # This method will return immediately after the input image is accepted. The results will be available via
    # the result_callback provided in the PoseLandmarkerOptions. The detect_async method is designed to process
    # live stream data such as camera input. To lower the overall latency, pose landmarker may drop the input
    # images if needed. In other words, it's not guaranteed to have output per input image.
    detector.detect_async(frame, int(time.time()*1000))
 
    annotated_image = draw_landmarks_on_image(frame.numpy_view(), landmark_result, True)
    cv2.imshow('sean', cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))
    
    key = cv2.waitKey(25)
    if key == 27:  # ESC
        break
 
if cap.isOpened():
    cap.release()
cv2.destroyAllWindows()
detector.close()
 

 

 

 

This time, let's break the detected pose down by body part and print the position information for specific landmarks.

 

 

0 - nose
1 - left eye (inner)
2 - left eye
3 - left eye (outer)
4 - right eye (inner)
5 - right eye
6 - right eye (outer)
7 - left ear
8 - right ear
9 - mouth (left)
10 - mouth (right)
11 - left shoulder
12 - right shoulder
13 - left elbow
14 - right elbow
15 - left wrist
16 - right wrist
17 - left pinky
18 - right pinky
19 - left index
20 - right index
21 - left thumb
22 - right thumb
23 - left hip
24 - right hip
25 - left knee
26 - right knee
27 - left ankle
28 - right ankle
29 - left heel
30 - right heel
31 - left foot index
32 - right foot index
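
The same index-to-name mapping is also exposed programmatically as the PoseLandmark enum in mediapipe.solutions.pose, so a landmark can be picked by name instead of a magic number. A minimal sketch:

from mediapipe import solutions

# Print every landmark index and its name (0 NOSE ... 26 RIGHT_KNEE ... 32 RIGHT_FOOT_INDEX).
for lm in solutions.pose.PoseLandmark:
    print(lm.value, lm.name)

RIGHT_KNEE = solutions.pose.PoseLandmark.RIGHT_KNEE  # == 26, usable as a list index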

 

import time
import numpy as np
import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
 
landmark_result = None
 
def print_result(result: vision.PoseLandmarkerResult, output_image: mp.Image, timestamp_ms: int):
    global landmark_result
    landmark_result = result
 
    if result is None or result.pose_landmarks == []:
        return
 
    print("       Nose(0): (x: %.2f, y: %.2f, z: %5.2f, presense: %.2f, visibility: %.2f)"
          %(result.pose_landmarks[0][0].x, result.pose_landmarks[0][0].y, result.pose_landmarks[0][0].z,
            result.pose_landmarks[0][0].presence, result.pose_landmarks[0][0].visibility))
    print("Right Knee(26): (x: %.2f, y: %.2f, z: %5.2f, presense: %.2f, visibility: %.2f)"
          %(result.pose_landmarks[0][26].x, result.pose_landmarks[0][26].y, result.pose_landmarks[0][26].z,
            result.pose_landmarks[0][26].presence, result.pose_landmarks[0][26].visibility))
 
def draw_landmarks_on_image(rgb_image, detection_result, bg_black):    
    if bg_black:
        annotated_image = np.zeros_like(rgb_image)
    else:
        annotated_image = np.copy(rgb_image)
 
    if detection_result is None or detection_result.pose_landmarks == []:
        return annotated_image
    
    pose_landmarks_list = detection_result.pose_landmarks
 
    for idx in range(len(pose_landmarks_list)):
        pose_landmarks = pose_landmarks_list[idx]

        # Draw the pose landmarks (inside the loop, so every detected pose is drawn).
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
 
    return annotated_image
 
base_options = python.BaseOptions(model_asset_path='pose_landmarker_full.task')
options = vision.PoseLandmarkerOptions(base_options=base_options, running_mode=mp.tasks.vision.RunningMode.LIVE_STREAM,
                                       result_callback=print_result, output_segmentation_masks=False)
detector = vision.PoseLandmarker.create_from_options(options)
 
cap = cv2.VideoCapture(0)
 
while True:
    ret, cv_image = cap.read()
    if not ret:
        break
    frame = mp.Image(image_format = mp.ImageFormat.SRGB, data = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB))
 
    detector.detect_async(frame, int(time.time()*1000))
 
    annotated_image = draw_landmarks_on_image(frame.numpy_view(), landmark_result, True)
    cv2.imshow('sean', cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))
    
    key = cv2.waitKey(25)
    if key == 27:  # ESC
        break
 
if cap.isOpened():
    cap.release()
cv2.destroyAllWindows()
detector.close()
 

 

 

Information about landmark 0 (the nose) and landmark 26 (the right knee) is printed.

 

  • The output contains the following normalized coordinates (Landmarks):
  • x and y: Landmark coordinates normalized between 0.0 and 1.0 by the image width (x) and height (y).
  • z: The landmark depth, with the depth at the midpoint of the hips as the origin. The smaller the value, the closer the landmark is to the camera. The magnitude of z uses roughly the same scale as x.
  • visibility: The likelihood of the landmark being visible within the image.
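
Since x and y are normalized to 0.0-1.0, they have to be scaled by the frame size before drawing or measuring in pixels. A minimal sketch (frame_width and frame_height would come from the capture loop above):

# Convert a normalized landmark to pixel coordinates.
def to_pixel(landmark, frame_width, frame_height):
    return int(landmark.x * frame_width), int(landmark.y * frame_height)

# e.g. x=0.5, y=0.5 on a 640x480 frame -> (320, 240), the center of the image.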

 

※ Reference

A Tutorial on Finger Counting in Real-Time Video in Python with OpenCV and MediaPipe

 

 


A YouTubePlayer provides methods for loading, playing and controlling YouTube video playback.


Copy 'YouTubeAndroidPlayerApi.jar' to 'app/libs' and sync the project with the Gradle files.


<AndroidManifest.xml>

    <uses-permission android:name="android.permission.INTERNET"/>



<activity_main.xml>

    <com.google.android.youtube.player.YouTubePlayerView
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:id="@+id/youTubePlayerView"/>



<MainActivity.java>

public class MainActivity extends YouTubeBaseActivity {
 
    YouTubePlayerView youTubePlayerView;
    YouTubePlayer player;
 
    private static String API_KEY = "AIyaSyDpsLddBj2ISc-NHU4sxWFh4JlcHNELir8";  // Your API Key
    private static String videoId = "Mx5GmonOiKo";  // YouTube video ID from https://youtu.be/Mx5GmonOiKo
 
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
 
        initPlayer();
 
        Button button = findViewById(R.id.button);
        button.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                loadVideo();
            }
        });
 
        Button button2 = findViewById(R.id.button2);
        button2.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                playVideo();
            }
        });
    }
 
    public void initPlayer() {
        youTubePlayerView = findViewById(R.id.youTubePlayerView);
        youTubePlayerView.initialize(API_KEY, new YouTubePlayer.OnInitializedListener() {
            @Override
            public void onInitializationSuccess(YouTubePlayer.Provider provider, final YouTubePlayer youTubePlayer, boolean b) {
 
                player = youTubePlayer;
 
                youTubePlayer.setPlayerStateChangeListener(new YouTubePlayer.PlayerStateChangeListener() {
                    @Override
                    public void onLoading() {
 
                    }
 
                    @Override
                    public void onLoaded(String s) {
                        Toast.makeText(getApplicationContext(), s + " loaded", Toast.LENGTH_SHORT).show();
                    }
 
                    @Override
                    public void onAdStarted() {
 
                    }
 
                    @Override
                    public void onVideoStarted() {
 
                    }
 
                    @Override
                    public void onVideoEnded() {
 
                    }
 
                    @Override
                    public void onError(YouTubePlayer.ErrorReason errorReason) {
 
                    }
                });
            }
 
            @Override
            public void onInitializationFailure(YouTubePlayer.Provider provider, YouTubeInitializationResult youTubeInitializationResult) {
 
            }
        });
    }
 
    public void loadVideo() {
        if (player != null) {
            player.cueVideo(videoId);
            // Loads the specified video's thumbnail and prepares the player to play the video, but does not download any of the video stream
            // until play() is called.
        }
    }
 
    public void playVideo() {
        if (player != null) {
            if (player.isPlaying()) {
                player.pause();
            } else {
                player.play();
            }
        }
    }
}


Your activity needs to extend YouTubeBaseActivity. The layout must also contain the two Buttons (button and button2) that onCreate() looks up.



Run the app and click the LOAD button.


It loads the video.


Play and enjoy the video.

