Forecasting TV ratings of Turkish television series using a two-level machine learning framework


Akgul B., Kucukyilmaz T.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.30, no.3, pp.750-766, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 30 Issue: 3
  • Publication Date: 2022
  • Doi Number: 10.3906/elk-2105-265
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.750-766
  • Keywords: TV rating forecast, TV rating prediction, episode rating forecast, regression, Google trends
  • TED University Affiliated: Yes

Abstract

A TV rating is a numeric estimate of the popularity of a television program. Forecasting TV ratings is considered an important asset for media investment planning because it can reduce the risks of future ventures. The aim of this study is to develop a machine learning model capable of efficiently forecasting the TV ratings of Turkish TV series in a practical manner. To this end, two prediction models that use an extensive set of features are proposed for forecasting the TV ratings of television series. A contribution of this study is the inclusion of social media-based features derived from search trends around television series, along with an exploration of the viability of using these features in place of temporal features. The study presents an extensive evaluation of the forecast performance of the proposed models. Their performance was evaluated using a data collection composed of ratings and various attributes of series and their episodes aired in Turkish prime-time broadcasts between 2014 and 2018. In the experiments, a theoretical forecast performance was first established with the inclusion of temporal features. Next, a set of practical models was proposed, replacing temporal attributes with social media-based attributes relating to the internet popularity and visibility of the series. The experiments show that the proposed models achieve up to a 1.65% error rate in the theoretical setting and a 7.06% error rate in the practical setting.