This work explores a set of well-studied visual saliency features through seven saliency prediction methods, with the aim of assessing how applicable they are for estimating visual saliency in dynamic virtual reality (VR) environments experienced with head-mounted displays. An in-depth analysis is presented of how saliency methods that make use of depth cues compare to those that rely on purely image-based (2D) features. To this end, a user study was conducted to collect gaze data from participants as they were shown the same set of three dynamic scenes in 2D desktop viewing and in 3D VR viewing with a head-mounted display. The scenes convey varying visual experiences in terms of content and range of depth-of-field, so that the analysis covers a broad array of viewing behaviors. The results indicate that 2D features matter as much as depth in both viewing conditions, yet the depth cue is slightly more important for 3D VR viewing. Furthermore, adding depth as an additional cue to the 2D saliency methods improves prediction in both viewing conditions, and the margin of improvement is greater in 3D VR viewing.