How to detect the language of undefined subtitle tracks

Hello All,
OS: Ubuntu 20.04
I'd like to know if there is a way to detect the subtitle languages of a video.
ffmpeg -i $file shows "undefined" (und) for each subtitle track.
In VLC, I can see "de", "en", "it", etc.
I do not know why ffmpeg can't detect the two-letter language codes.
Which tool would you recommend for classifying all my video files?

Thanks in advance
Boris

@baris35, please supply more detail, starting with the format of the video file(s): .mp4, .avi, .mkv, .webm ...

Have you searched online for potential answers?

Have you tried ffprobe?

tks

Hello Dear Munke,
Yes, I used a search engine before posting here.
There was a similar question on Reddit, but it got no response.
Extension: mp4

ffmpeg -i LEA.2011.mp4

  Duration: 01:33:08.00, start: 0.000000, bitrate: 2451 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 864x432 [SAR 1:1 DAR 2:1], 2316 kb/s, 25 fps, 25 tbr, 19200 tbn, 50 tbc (default)
    Metadata:
      handler_name    : USP Video Handler
    Stream #0:1(fra): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : USP Sound Handler
    Stream #0:2(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s (default)
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:3(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:4(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:5(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:6(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:7(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:8(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler

In VLC, I can see the languages.
Please see the attached screenshot.
Maybe I need to extract all the subtitles and then use a tool, say a Python script, that automatically detects the language (if there is one) and adds the language code to the filename.
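
A rough sketch of what such a script could look like, assuming the subtitles have already been extracted to .srt files. It only counts common stopwords, so it is a heuristic rather than a reliable detector. The tiny word lists and the names srt_text, guess_language and tagged_name are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Tiny illustrative stopword lists (assumption): a real tool would need far
# larger lists or a dedicated language-identification library.
STOPWORDS = {
    "en": {"the", "and", "you", "is", "not", "that", "what", "have", "with", "this"},
    "de": {"der", "die", "und", "ich", "nicht", "das", "ist", "du", "sie", "ein"},
    "fr": {"le", "la", "et", "je", "pas", "vous", "est", "que", "les", "une"},
    "it": {"il", "che", "non", "di", "è", "sono", "per", "una", "mi", "ti"},
}

def srt_text(srt: str) -> str:
    """Drop cue numbers, timestamp lines and simple markup from SRT content."""
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or "-->" in line:
            continue
        kept.append(re.sub(r"<[^>]+>", "", line))
    return " ".join(kept)

def guess_language(text: str) -> str:
    """Return the code whose stopwords occur most often, or 'und' if none do."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    for code, stops in STOPWORDS.items():
        counts[code] = sum(1 for w in words if w in stops)
    code, hits = counts.most_common(1)[0]
    return code if hits > 0 else "und"

def tagged_name(filename: str, code: str) -> str:
    """Insert the language code before the extension: a.srt -> a.de.srt."""
    path = Path(filename)
    return str(path.with_name(f"{path.stem}.{code}{path.suffix}"))
```

For each extracted file, the idea would be to call guess_language(srt_text(content)) and rename the file with tagged_name. Short or mixed-language tracks can easily be misclassified, so the result is worth checking by eye.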

OK, supplying information like this saves everybody time, since we know what has already been tried.

Does the video in question actually have subtitles in those languages? (Have you tried them?)
Are there any supplemental subtitle files associated with it?

When I download the file, all the subtitles come embedded in it. I do not manually embed or hardcode external subtitles. The weird part is that VLC can see the languages but ffmpeg cannot.

@baris35 , please try

ffprobe -show_streams YOUR-movie-FILE 

and see what that produces. See if you can locate the various languages in the output (maybe under TAG:title).
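
If the plain output is hard to scan, the same information can be requested as JSON and filtered. Below is a minimal sketch, assuming ffprobe is on the PATH. The function names probe_streams and subtitle_languages are only illustrative.

```python
import json
import subprocess

def probe_streams(path: str) -> list:
    """Run ffprobe on the file and return its list of stream dictionaries."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out).get("streams", [])

def subtitle_languages(streams: list) -> list:
    """Return (index, language, title) for every subtitle stream."""
    rows = []
    for s in streams:
        if s.get("codec_type") != "subtitle":
            continue
        tags = s.get("tags", {})
        # Missing tags are reported as 'und' / None instead of raising.
        rows.append((s["index"], tags.get("language", "und"), tags.get("title")))
    return rows
```

Calling subtitle_languages(probe_streams("YOUR-movie-FILE")) gives one row per subtitle track, so a track that carries a language or title tag stands out immediately.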

Also, what version of vlc are you running ?

Dear Munke,
Thank you, but it gives no clue about the languages of the subtitle tracks.
Vlc -> Version 3.0.20 Vetinari (Intel 64bit)
Don't worry, I can live with it.

ffprobe -show_streams lea.2011.mp4
ffprobe version 4.2.7-0ubuntu0.1 Copyright (c) 2007-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'lea.2011.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.29.100
  Duration: 01:33:08.00, start: 0.000000, bitrate: 2451 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 864x432 [SAR 1:1 DAR 2:1], 2316 kb/s, 25 fps, 25 tbr, 19200 tbn, 50 tbc (default)
    Metadata:
      handler_name    : USP Video Handler
    Stream #0:1(fra): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : USP Sound Handler
    Stream #0:2(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s (default)
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:3(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:4(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:5(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:6(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:7(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:8(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
    Metadata:
      handler_name    : SubtitleHandler
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=Main
codec_type=video
codec_time_base=1/50
codec_tag_string=avc1
codec_tag=0x31637661
width=864
height=432
coded_width=864
coded_height=432
has_b_frames=2
sample_aspect_ratio=1:1
display_aspect_ratio=2:1
pix_fmt=yuv420p
level=31
color_range=unknown
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=left
field_order=unknown
timecode=N/A
refs=1
is_avc=true
nal_length_size=4
id=N/A
r_frame_rate=25/1
avg_frame_rate=25/1
time_base=1/19200
start_pts=1939
start_time=0.100990
duration_ts=107289600
duration=5588.000000
bit_rate=2316817
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=139700
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=USP Video Handler
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_time_base=1/48000
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=2
channel_layout=stereo
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=0
start_time=0.000000
duration_ts=268223488
duration=5587.989333
bit_rate=128346
max_bit_rate=128346
bits_per_raw_sample=N/A
nb_frames=261938
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=fra
TAG:handler_name=USP Sound Handler
[/STREAM]
[STREAM]
index=2
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5584000000
duration=5584.000000
bit_rate=64
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1830
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=3
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=58
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1631
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=4
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=61
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1639
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=5
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=63
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1631
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=6
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=60
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1678
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=7
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=61
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=1628
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:language=und
TAG:handler_name=SubtitleHandler
[/STREAM]
[STREAM]
index=8
codec_name=mov_text
codec_long_name=MOV text
profile=unknown
codec_type=subtitle
codec_time_base=0/1
codec_tag_string=tx3g
codec_tag=0x67337874
width=N/A
height=N/A
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/1000000
start_pts=0
start_time=0.000000
duration_ts=5082000000
duration=5082.000000
bit_rate=64
max_bit_rate=N/A
bits_per_raw_sample=N

@baris35 ,

There's definitely 'something' in there; mapping/decoding it requires more investigation.

grep -E '(name|type|text|subtitle)' baris.dump 
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
      handler_name    : USP Video Handler
      handler_name    : USP Sound Handler
    Stream #0:2(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s (default)
      handler_name    : SubtitleHandler
    Stream #0:3(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
    Stream #0:4(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
    Stream #0:5(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
    Stream #0:6(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
    Stream #0:7(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
    Stream #0:8(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
      handler_name    : SubtitleHandler
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
codec_type=video
TAG:handler_name=USP Video Handler
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
codec_type=audio
TAG:handler_name=USP Sound Handler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle
TAG:handler_name=SubtitleHandler
codec_name=mov_text
codec_long_name=MOV text
codec_type=subtitle

Thank you dear @munkeHoller ,
Let me try some more with ffmpeg and similar CLI tools.

Kind regards
Boris

FYI:
Here's a dump from 'Priscilla, Queen of the Desert', an .mkv file, which does show language details ...

priscilla.dump (24.7 KB)

Thank you Dear @munkeHoller ,
As I understand it, the root cause is a bug in the software I use to download from the source URL. What I am going to do is download all the subtitle tracks for each video, check the language of each file, and then re-encode/re-map them.
Maybe a Linux command-line tool that inspects the content of each file and reports its language would make the process easier.
For example, something handier than the command below:

file -e soft priscilla.ell.srt
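
The extract-and-re-map plan above could be scripted roughly as follows. This is only a sketch that builds the ffmpeg command lines. It assumes the languages have already been identified by some other means and that three-letter ISO 639-2 codes (for example deu, eng) are used for the tags. The helper names extract_cmd and retag_cmd are hypothetical.

```python
def extract_cmd(video: str, sub_index: int, out_srt: str) -> list:
    """ffmpeg command that writes the sub_index-th subtitle track as SRT."""
    # 0:s:N selects the N-th subtitle stream of the first input (0-based).
    return ["ffmpeg", "-i", video, "-map", f"0:s:{sub_index}", out_srt]

def retag_cmd(video: str, languages: list, out_video: str) -> list:
    """ffmpeg command that copies all streams and sets subtitle language tags."""
    cmd = ["ffmpeg", "-i", video, "-map", "0", "-c", "copy"]
    for i, code in enumerate(languages):
        # -metadata:s:s:N targets the N-th subtitle stream of the output.
        cmd += [f"-metadata:s:s:{i}", f"language={code}"]
    return cmd + [out_video]
```

Each returned list can be passed to subprocess.run(cmd, check=True). Because retag_cmd uses stream copy, nothing is re-encoded and only the language tags in the new file change.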

Thanks
Boris