LongShOT Leaderboard

48 models

Model Leaderboard

Performance rankings across 16 task categories on LongShOTBench. All scores in %.

Last updated: March 30, 2026

1. Qwen3-Omni-30B-A3B-Thinking, 30B (3B active): 61.52% overall (CP 49.3, RE 71.6, Info 62.8, MM 62.5)
2. Qwen3-VL-32B-Thinking, 32B: 45.66% overall (CP 36.7, RE 52.6, Info 43.7, MM 49.6)
3. Qwen3-VL-30B-A3B-Thinking, 30B (3B active): 42.40% overall (CP 34.2, RE 48.8, Info 39.9, MM 46.6)

CP: Core Perception Tasks; RE: Reasoning Tasks; Info: Information Tasks; MM: Multimodal Tasks

Column groups: Score (Overall); Category (CP, RE, Info, MM); Modality (Visual, Audio, Speech); Duration (Short, Med, Long)

| # | Model | Group | Params | Input | Overall | CP | RE | Info | MM | Visual | Audio | Speech | Short | Med | Long |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Qwen3-Omni-30B-A3B-Thinking | Native Video + Audio | 30B (3B active) | V+A | 61.52 | 49.26 | 71.59 | 62.77 | 62.45 | 59.99 | 53.69 | 61.77 | 58.64 | 60.75 | 58.17 |
| 2 | Qwen3-VL-32B-Thinking | Mid-Size Video VLMs (10B–50B) | 32B | Video | 45.66 | 36.73 | 52.62 | 43.70 | 49.61 | 43.63 | 42.42 | 44.45 | 42.48 | 42.85 | 45.37 |
| 3 | Qwen3-VL-30B-A3B-Thinking | Mid-Size Video VLMs (10B–50B) | 30B (3B active) | Video | 42.40 | 34.21 | 48.82 | 39.92 | 46.64 | 40.21 | 38.74 | 41.06 | 38.58 | 39.33 | 42.27 |
| 4 | Qwen3-VL-8B-Thinking | Small Video/Vision Models (5B–10B) | 8B | Video | 41.68 | 32.90 | 47.74 | 38.70 | 47.39 | 38.70 | 37.37 | 39.45 | 41.47 | 37.88 | 40.46 |
| 5 | Intern-S1-mini | Mid-Size Video VLMs (10B–50B) | 14B | Video | 39.85 | 31.74 | 45.52 | 36.89 | 45.26 | 37.24 | 35.37 | 37.97 | 36.76 | 36.30 | 39.45 |
| 6 | Gemma-3-27B-IT | Mid-Size Video VLMs (10B–50B) | 27B | Frames | 39.65 | 30.78 | 48.26 | 36.55 | 43.02 | 37.15 | 35.08 | 37.96 | 39.94 | 36.17 | 39.09 |
| 7 | Gemma-3-12B-IT | Mid-Size Video VLMs (10B–50B) | 12B | Frames | 38.19 | 29.90 | 45.36 | 34.12 | 43.37 | 35.26 | 33.21 | 36.09 | 33.81 | 34.20 | 37.63 |
| 8 | Qwen3-VL-4B-Thinking | Compact Models (≤5B) | 4B | Video | 36.26 | 28.57 | 41.35 | 33.08 | 42.03 | 33.64 | 32.23 | 34.21 | 35.22 | 32.93 | 35.12 |
| 9 | Qwen3-VL-32B-Instruct | Mid-Size Video VLMs (10B–50B) | 32B | Video | 35.99 | 29.54 | 40.29 | 32.48 | 41.67 | 33.52 | 31.97 | 34.00 | 38.05 | 33.22 | 33.88 |
| 10 | MiMo-VL-7B-RL-2508 | Small Video/Vision Models (5B–10B) | 7B | Video | 35.12 | 26.35 | 41.58 | 31.13 | 41.42 | 32.10 | 30.10 | 32.77 | 31.33 | 31.33 | 33.79 |
| 11 | Gemma-3-4B-IT | Compact Models (≤5B) | 4B | Frames | 34.38 | 25.87 | 40.11 | 30.55 | 40.97 | 31.16 | 30.10 | 31.91 | 34.34 | 30.14 | 33.34 |
| 12 | Qwen3-Omni-30B-A3B-Instruct | Native Video + Audio | 30B (3B active) | V+A | 33.76 | 25.64 | 42.59 | 30.24 | 36.58 | 30.72 | 27.08 | 31.65 | 22.60 | 30.65 | 31.38 |
| 13 | Qwen2.5-VL-32B-Instruct | Mid-Size Video VLMs (10B–50B) | 32B | Video | 31.36 | 26.16 | 35.23 | 28.55 | 35.51 | 29.50 | 27.04 | 29.96 | 31.39 | 28.57 | 31.53 |
| 14 | GLM-4.6V-Flash | Mid-Size Video VLMs (10B–50B) | 9B | Video | 30.46 | 22.74 | 34.75 | 26.67 | 37.69 | 27.20 | 25.33 | 27.77 | 29.44 | 26.08 | 29.63 |
| 15 | Qwen3-VL-8B-Instruct | Small Video/Vision Models (5B–10B) | 8B | Video | 30.01 | 24.38 | 33.17 | 25.02 | 37.46 | 26.64 | 25.30 | 27.05 | 32.80 | 26.01 | 27.86 |
| 16 | Ovis2.6-30B-A3B | Mid-Size Video VLMs (10B–50B) | 30B (3B active) | Video | 29.99 | 22.00 | 33.19 | 26.92 | 37.85 | 27.37 | 25.37 | 27.91 | 32.80 | 26.86 | 28.18 |
| 17 | Qwen3.5-35B-A3B | Mid-Size Video VLMs (10B–50B) | 35B (3B active) | Video | 26.86 | 17.65 | 32.59 | 22.13 | 35.07 | 22.60 | 20.24 | 23.23 | 23.78 | 22.00 | 23.91 |
| 18 | Qwen3.5-27B | Mid-Size Video VLMs (10B–50B) | 27B | Video | 26.64 | 18.19 | 33.27 | 23.04 | 32.04 | 23.70 | 20.60 | 24.32 | 23.24 | 22.88 | 25.58 |
| 19 | Qwen2.5-VL-72B-Instruct | Large Video VLMs (50B+) | 72B | Video | 26.34 | 20.54 | 30.09 | 22.80 | 31.91 | 24.15 | 21.39 | 24.60 | 24.66 | 23.24 | 26.17 |
| 20 | Qwen3-VL-30B-A3B-Instruct | Mid-Size Video VLMs (10B–50B) | 30B (3B active) | Video | 25.82 | 21.53 | 27.91 | 22.46 | 31.37 | 23.87 | 23.15 | 24.26 | 29.50 | 23.32 | 24.81 |
| 21 | InternVL3-78B | Large Video VLMs (50B+) | 78B | Video | 23.58 | 18.47 | 25.12 | 21.16 | 29.58 | 20.89 | 18.71 | 21.21 | 25.07 | 20.31 | 21.80 |
| 22 | InternVL3-38B | Mid-Size Video VLMs (10B–50B) | 38B | Video | 23.48 | 19.29 | 26.79 | 20.07 | 27.75 | 21.14 | 18.92 | 21.49 | 26.08 | 20.65 | 22.13 |
| 23 | Qwen3-VL-4B-Instruct | Compact Models (≤5B) | 4B | Video | 23.34 | 18.33 | 25.42 | 19.37 | 30.25 | 20.67 | 19.13 | 21.00 | 29.03 | 20.10 | 21.71 |
| 24 | Qwen3.5-9B | Small Video/Vision Models (5B–10B) | 9B | Video | 23.30 | 14.63 | 29.24 | 19.88 | 29.47 | 19.05 | 17.28 | 19.55 | 21.18 | 18.53 | 20.15 |
| 25 | Ovis2.5-9B | Small Video/Vision Models (5B–10B) | 9B | Video | 23.19 | 17.50 | 25.84 | 19.16 | 30.26 | 20.31 | 18.76 | 20.60 | 25.01 | 19.69 | 21.50 |
| 26 | Molmo2-8B | Small Video/Vision Models (5B–10B) | 8B | Video | 23.10 | 17.52 | 24.72 | 18.78 | 31.38 | 20.12 | 18.72 | 20.32 | 28.61 | 19.77 | 20.43 |
| 27 | InternVL3-14B | Mid-Size Video VLMs (10B–50B) | 14B | Video | 21.21 | 17.35 | 23.75 | 18.56 | 25.19 | 19.69 | 17.56 | 19.98 | 20.06 | 19.07 | 21.02 |
| 28 | InternVL3.5-8B | Small Video/Vision Models (5B–10B) | 8B | Video | 21.10 | 15.32 | 21.49 | 17.57 | 30.02 | 17.97 | 16.62 | 18.17 | 21.00 | 17.48 | 18.90 |
| 29 | LongVT-RL | Small Video/Vision Models (5B–10B) | 7B | Video | 20.91 | 16.44 | 21.41 | 18.08 | 27.73 | 18.81 | 18.22 | 19.03 | 21.83 | 18.31 | 19.73 |
| 30 | InternVL3.5-38B | Mid-Size Video VLMs (10B–50B) | 38B | Video | 20.73 | 15.80 | 22.06 | 16.41 | 28.65 | 18.42 | 17.22 | 18.66 | 21.42 | 17.76 | 19.76 |
| 31 | InternVL2.5-78B | Large Video VLMs (50B+) | 78B | Video | 20.38 | 16.40 | 21.97 | 16.30 | 26.87 | 17.45 | 16.17 | 17.74 | 20.06 | 16.92 | 18.52 |
| 32 | Ovis2-16B | Mid-Size Video VLMs (10B–50B) | 16B | Frames | 19.84 | 15.00 | 22.43 | 15.09 | 26.86 | 17.44 | 16.22 | 17.75 | 19.94 | 17.07 | 18.17 |
| 33 | InternVL3-8B | Small Video/Vision Models (5B–10B) | 8B | Video | 18.94 | 15.36 | 18.92 | 15.97 | 25.52 | 16.88 | 15.55 | 17.00 | 20.00 | 16.45 | 17.68 |
| 34 | InternVL2.5-38B | Mid-Size Video VLMs (10B–50B) | 38B | Video | 18.86 | 15.96 | 20.31 | 14.13 | 25.02 | 16.87 | 15.97 | 17.17 | 17.29 | 16.25 | 18.23 |
| 35 | Qwen2.5-VL-7B-Instruct | Small Video/Vision Models (5B–10B) | 7B | Video | 18.39 | 15.26 | 20.44 | 17.00 | 20.84 | 17.75 | 16.53 | 18.04 | 20.47 | 17.32 | 18.60 |
| 36 | Kimi-VL-A3B-Thinking | Mid-Size Video VLMs (10B–50B) | 8B (3B active) | Frames | 17.89 | 11.52 | 20.94 | 14.48 | 24.61 | 14.85 | 13.37 | 15.20 | 15.28 | 14.66 | 15.11 |
| 37 | Qwen3.5-4B | Compact Models (≤5B) | 4B | Video | 17.78 | 11.49 | 23.02 | 15.40 | 21.20 | 15.49 | 14.39 | 15.85 | 16.64 | 15.50 | 15.23 |
| 38 | InternVL3.5-4B-Instruct | Compact Models (≤5B) | 4B | Video | 17.11 | 13.84 | 18.02 | 13.39 | 23.20 | 15.33 | 13.75 | 15.57 | 13.39 | 14.90 | 16.43 |
| 39 | Keye-VL-1.5-8B | Small Video/Vision Models (5B–10B) | 8B | Video | 17.09 | 12.81 | 19.26 | 14.33 | 21.94 | 15.12 | 14.56 | 15.37 | 18.29 | 14.55 | 16.24 |
| 40 | MiniCPM-o 4.5 | Native Video + Audio | 8B | Video | 17.05 | 15.08 | 20.05 | 16.92 | 16.13 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 41 | MiniCPM-V 4.5 | Small Video/Vision Models (5B–10B) | 8B | Video | 16.29 | 12.54 | 17.84 | 12.73 | 22.06 | 14.01 | 13.87 | 14.29 | 14.93 | 13.53 | 15.13 |
| 42 | Qwen2.5-Omni-7B | Native Video + Audio | 7B | V+A | 15.89 | 12.49 | 16.92 | 14.36 | 19.81 | 14.75 | 12.97 | 15.08 | 17.17 | 14.73 | 14.81 |
| 43 | Ovis2-8B | Small Video/Vision Models (5B–10B) | 8B | Frames | 15.16 | 11.61 | 16.26 | 12.04 | 20.74 | 12.92 | 12.51 | 13.10 | 12.80 | 12.30 | 14.36 |
| 44 | MiniCPM-o 2.6 | Native Video + Audio | 8B | Video | 14.41 | 11.27 | 14.36 | 11.78 | 20.24 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 45 | Ovis2-4B | Compact Models (≤5B) | 4B | Frames | 13.07 | 9.82 | 13.47 | 10.18 | 18.83 | 11.04 | 10.82 | 11.09 | 10.50 | 10.34 | 12.63 |
| 46 | Kimi-VL-A3B-Instruct | Mid-Size Video VLMs (10B–50B) | 8B (3B active) | Frames | 11.35 | 8.12 | 11.23 | 8.12 | 17.92 | 9.37 | 9.15 | 9.56 | 8.85 | 9.23 | 9.65 |
| 47 | LLaVA-OneVision-7B | Small Video/Vision Models (5B–10B) | 7B | Video | 9.14 | 7.20 | 8.86 | 7.58 | 12.92 | 7.91 | 7.89 | 8.06 | 9.91 | 7.85 | 7.95 |
| 48 | Qwen2-Audio-7B-Instruct | Audio-Only LLMs | 7B | Audio | 8.07 | 6.04 | 7.61 | 6.16 | 12.47 | 7.00 | 6.88 | 7.14 | 6.19 | 6.64 | 7.92 |