IET IMAGE PROCESSING, vol.14, no.5, pp.845-852, 2020 (SCI-Expanded)
There is an urgent need for a robust video quality assessment (VQA) model that can efficiently evaluate the quality of video content across different distortion and content types when no reference video is available. To address this need, this study proposes a novel no-reference (NR) model that relies on the spatiotemporal statistics of the distorted video in the three-dimensional (3D) discrete cosine transform (DCT) domain. As the first contribution, the video content is adaptively segmented into cubes of different sizes and spatiotemporal content, in line with the properties of the human visual system (HVS). The 3D-DCT is then applied to these cubes. As the second contribution, several efficient features (i.e. spectral behaviour, energy variation, distances between spatiotemporal frequency bands, and DC variation) describing the contents of these cubes are extracted. To build the model, these features are then related to the subjective experimental results obtained from the EPFL-PoliMi video database using linear regression analysis. The evaluation results show that the proposed model, unlike many top-performing NR-VQA models (e.g. V-BLIINDS, VIIDEO, and SSEQ), achieves high and stable performance across videos with different contents and distortions.
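As a rough illustration of the transform step described above, the following minimal sketch applies a separable, orthonormal 3D-DCT to a single spatiotemporal cube and derives two simple quantities from it. It is not the authors' implementation: the fixed cube size, the function names (`dct_1d`, `dct_3d`, `cube_features`), and the two features shown (the DC coefficient and the total AC energy) are assumptions made for illustration only. The adaptive HVS-based segmentation and the regression against subjective scores are not reproduced here.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_3d(cube):
    """Separable 3-D DCT of cube[t][y][x], given as nested lists."""
    T, H, W = len(cube), len(cube[0]), len(cube[0][0])
    # Work on a copy so the input cube is left untouched.
    c = [[row[:] for row in frame] for frame in cube]
    # Transform along x (within each row).
    for t in range(T):
        for y in range(H):
            c[t][y] = dct_1d(c[t][y])
    # Transform along y (within each column of a frame).
    for t in range(T):
        for x in range(W):
            col = dct_1d([c[t][y][x] for y in range(H)])
            for y in range(H):
                c[t][y][x] = col[y]
    # Transform along t (across frames at a fixed pixel position).
    for y in range(H):
        for x in range(W):
            tube = dct_1d([c[t][y][x] for t in range(T)])
            for t in range(T):
                c[t][y][x] = tube[t]
    return c


def cube_features(cube):
    """Return (DC coefficient, AC energy) of one cube.

    Hypothetical stand-ins for the paper's features, which are not
    specified in the abstract.
    """
    coeffs = dct_3d(cube)
    dc = coeffs[0][0][0]
    total = sum(v * v for frame in coeffs for row in frame for v in row)
    return dc, total - dc * dc


# Usage: a flat 4x4x4 cube carries all of its energy in the DC term.
flat = [[[2.0] * 4 for _ in range(4)] for _ in range(4)]
dc, ac_energy = cube_features(flat)
```

Because the transform is orthonormal, the squared DC coefficient plus the AC energy equals the sum of squared pixel values in the cube, so a textured or moving cube shifts energy from the DC term into the AC bands. In a full pipeline, per-cube quantities of this kind would be pooled over the video before being regressed against subjective scores.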