The spontaneity of Level-1 visual perspective taking has been widely discussed. Many studies have confirmed that Level-1 visual perspective taking is spontaneously activated, but this finding has mainly been observed when only a single avatar is present. Scenarios involving two or more avatars have received scant attention, and no suitable experimental paradigm has been developed for them. This study therefore used a paradigm adapted from those of Samson et al. (2010) and Mattan et al. (2015). The virtual scenes used as stimuli were modeled in 3D Max, and the experimental procedures were programmed in E-Prime, which recorded accuracy and reaction time. In contrast to previous studies, this paper explores whether and how multiple avatars affect Level-1 visual perspective taking, and clarifies the influencing factors by varying the conditions.
This study comprises three experiments. Experiment 1 employed a paradigm adapted from the classic “dot-perspective task” (see Figure 1) to investigate whether participants would spontaneously compute another perspective in the presence of a single avatar (the target avatar). Experiment 2 introduced an additional avatar (the irrelevant avatar) to explore how consistency in the number of dots seen by the two avatars would affect the perspective-taking process (see Figure 2). Experiment 3 then excluded the influence of dot-number consistency between the avatars and investigated whether consistency in their lines of sight would affect perspective taking in the presence of multiple avatars (see Figure 3).
The results of the three experiments are shown in Tables 1-3.
In Experiment 1, a 2×2 repeated-measures analysis of variance on accuracy yielded the following results: (1) The main effect of judging perspective was not significant, F(1, 31) = 0.01, p = 0.93. (2) The main effect of self-avatar dot-number consistency was significant, F(1, 31) = 32.31, p < 0.001, ηp² = 0.51, 95% CI = [0.03, 0.07]; accuracy was higher in the consistent condition (M = 0.97, SE = 0.01) than in the inconsistent condition (M = 0.92, SE = 0.01). (3) The interaction between judging perspective and self-avatar dot-number consistency was not significant, F(1, 31) = 0.12, p = 0.93.
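The partial eta squared values reported alongside each F statistic can be cross-checked from the F value and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch of that check, not the authors' analysis code; the function name `partial_eta_squared` is hypothetical, and small discrepancies from the reported values are to be expected because the published F values are rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Experiment 1, accuracy: main effect of self-avatar dot-number consistency,
# reported in the text as F(1, 31) = 32.31 with a partial eta squared of 0.51.
eta_consistency = partial_eta_squared(32.31, df_effect=1, df_error=31)
print(round(eta_consistency, 2))  # prints 0.51
```

The same function applies to the 2×2×2 designs of Experiments 2 and 3, where the error degrees of freedom are 33 and 36 respectively.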
The results for response time on correct trials were as follows: (1) The main effect of judging perspective was significant, F(1, 31) = 53.83, p < 0.001, ηp² = 0.64, 95% CI = [−91.45, −51.66]; response time was shorter when judging from the self’s perspective (M = 732.72 ms, SE = 29.01 ms) than from the avatar’s perspective (M = 804.27 ms, SE = 29.24 ms). (2) The main effect of self-avatar dot-number consistency was significant, F(1, 31) = 96.09, p < 0.001, ηp² = 0.76, 95% CI = [−80.78, −52.96]; response time was shorter in the consistent condition (M = 735.06 ms, SE = 27.38 ms) than in the inconsistent condition (M = 801.93 ms, SE = 30.38 ms). (3) The interaction between judging perspective and self-avatar dot-number consistency was significant, F(1, 31) = 42.86, p < 0.001, ηp² = 0.58. Simple effects analysis showed that, under the self’s perspective, response time was significantly shorter when the self-avatar dot numbers were consistent (M = 715.49 ms, SE = 27.87 ms) than when they were inconsistent (M = 749.94 ms, SE = 30.72 ms), F(1, 31) = 16.12, p < 0.001; under the avatar’s perspective, response time was also significantly shorter when the dot numbers were consistent (M = 754.62 ms, SE = 27.72 ms) than when they were inconsistent (M = 853.92 ms, SE = 31.24 ms), F(1, 31) = 143.93, p < 0.001. However, the consistency effect was larger under the avatar’s perspective, where the line relating response time to self-avatar dot-number consistency was steeper (see Figure 4).
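The Experiment 1 response-time interaction can be made concrete by computing the consistency cost (inconsistent minus consistent) within each judging perspective and then taking the difference between the two costs. The sketch below uses only the cell means reported above; the variable and function names are illustrative, and the labels "altercentric" and "egocentric" follow the usual reading that the cost on self-perspective trials reflects intrusion from the avatar's perspective and vice versa.

```python
# Cell means (ms) for correct-trial response times in Experiment 1, copied from the text.
cell_means_ms = {
    ("self", "consistent"): 715.49,
    ("self", "inconsistent"): 749.94,
    ("avatar", "consistent"): 754.62,
    ("avatar", "inconsistent"): 853.92,
}


def consistency_cost(perspective: str) -> float:
    """Return the slowdown (ms) on inconsistent relative to consistent trials."""
    return (cell_means_ms[(perspective, "inconsistent")]
            - cell_means_ms[(perspective, "consistent")])


# Cost when judging one's own view (the avatar's view intrudes).
altercentric_cost = consistency_cost("self")
# Cost when judging the avatar's view (one's own view intrudes).
egocentric_cost = consistency_cost("avatar")
# Difference of differences: the quantity the interaction term tests.
interaction_contrast = egocentric_cost - altercentric_cost

print(f"{altercentric_cost:.2f} {egocentric_cost:.2f} {interaction_contrast:.2f}")
# prints 34.45 99.30 64.85
```

Both costs are positive, which matches the two significant simple effects, and the cost under the avatar's perspective is roughly three times larger, which is what the steeper line in Figure 4 describes.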
In Experiment 2, a 2×2×2 repeated-measures ANOVA was conducted. The results for accuracy were as follows: (1) The main effect of judging perspective was significant, F(1, 33) = 13.18, p < 0.01, ηp² = 0.29, 95% CI = [0.02, 0.06]. Accuracy was higher when judging from the self’s perspective (M = 0.95, SE = 0.01) than from avatar A’s perspective (M = 0.91, SE = 0.01). (2) The main effect of self-avatar A dot-number consistency was significant, F(1, 33) = 25.10, p < 0.001, ηp² = 0.43, 95% CI = [0.03, 0.07]. Accuracy was higher when the self’s and avatar A’s dot numbers were consistent (M = 0.96, SE < 0.01) than when they were inconsistent (M = 0.91, SE = 0.01). (3) The main effect of avatar A-B dot-number consistency was not significant, F(1, 33) = 0.53, p = 0.47. (4) The interaction between judging perspective and self-avatar A dot-number consistency was significant, F(1, 33) = 10.80, p < 0.01, ηp² = 0.25. When judging from avatar A’s perspective, accuracy was significantly higher when the self’s and avatar A’s dot numbers were consistent (M = 0.96, SE = 0.01) than when they were inconsistent (M = 0.87, SE = 0.02), F(1, 33) = 21.34, p < 0.001; when judging from the self’s perspective, accuracy did not differ significantly between the two conditions, F(1, 33) = 2.60, p = 0.12. (5) The interaction between judging perspective and avatar A-B dot-number consistency was not significant, F(1, 33) = 3.24, p = 0.08. (6) The interaction between self-avatar A dot-number consistency and avatar A-B dot-number consistency was not significant, F(1, 33) = 0.77, p = 0.39. (7) The three-way interaction was significant, F(1, 33) = 6.90, p < 0.05, ηp² = 0.17. Only when judging from the self’s perspective with inconsistent self-avatar A dot numbers was accuracy significantly higher when the avatar A-B dot numbers were consistent (M = 0.96, SE = 0.01) than when they were inconsistent (M = 0.93, SE = 0.01), F(1, 33) = 5.41, p = 0.03. No other simple effect was significant, Fs < 2.45, ps > 0.13. See Figure 5 for details.
The results for response time on correct trials were as follows: (1) The main effect of judging perspective was significant, F(1, 33) = 99.37, p < 0.001, ηp² = 0.75, 95% CI = [−189.28, −125.11]. Response time was shorter when judging from the self’s perspective (M = 796.72 ms, SE = 36.88 ms) than from avatar A’s perspective (M = 953.91 ms, SE = 40.29 ms). (2) The main effect of self-avatar A dot-number consistency was significant, F(1, 33) = 121.44, p < 0.001, ηp² = 0.79, 95% CI = [−79.47, −54.70]. Response time was shorter when the self’s and avatar A’s dot numbers were consistent (M = 841.77 ms, SE = 37.35 ms) than when they were inconsistent (M = 908.86 ms, SE = 38.50 ms). (3) The main effect of avatar A-B dot-number consistency was not significant, F(1, 33) = 0.04, p = 0.85. (4) The interaction between judging perspective and self-avatar A dot-number consistency was significant, F(1, 33) = 75.07, p < 0.001, ηp² = 0.70. When judging from avatar A’s perspective, response time was significantly shorter when the self’s and avatar A’s dot numbers were consistent (M = 888.16 ms, SE = 38.75 ms) than when they were inconsistent (M = 1019.67 ms, SE = 42.48 ms), F(1, 33) = 146.30, p < 0.001; this difference was not significant when judging from the self’s perspective, F(1, 33) = 0.11, p = 0.75. (5) The interaction between judging perspective and avatar A-B dot-number consistency was significant, F(1, 33) = 11.04, p < 0.01, ηp² = 0.25. When judging from the self’s perspective, response time was significantly shorter when avatar A’s and B’s dot numbers were consistent (M = 787.07 ms, SE = 37.09 ms) than when they were inconsistent (M = 806.36 ms, SE = 37.21 ms); this difference was not significant when judging from avatar A’s perspective, F(1, 33) = 2.29, p = 0.14. (6) The interaction between self-avatar A dot-number consistency and avatar A-B dot-number consistency was significant, F(1, 33) = 21.25, p < 0.001, ηp² = 0.39. When the self’s and avatar A’s dot numbers were consistent, response time was significantly shorter when avatar A’s and B’s dot numbers were consistent (M = 825.45 ms, SE = 38.13 ms) than when they were inconsistent (M = 858.10 ms, SE = 37.05 ms), F(1, 33) = 14.42, p < 0.001; when the self’s and avatar A’s dot numbers were inconsistent, response time was significantly shorter when avatar A’s and B’s dot numbers were inconsistent (M = 894.13 ms, SE = 36.78 ms) than when they were consistent (M = 923.58 ms, SE = 41.11 ms), F(1, 33) = 5.63, p < 0.05. (7) The three-way interaction was not significant, F(1, 33) = 2.58, p = 0.12; see Figure 6 for details.
In Experiment 3, a 2×2×2 repeated-measures ANOVA was conducted. The results for accuracy were as follows: (1) The main effect of judging perspective was significant, F(1, 36) = 106.09, p < 0.001, ηp² = 0.75, 95% CI = [0.04, 0.06]. Accuracy was significantly higher when judging from the self’s perspective (M = 0.96, SE = 0.01) than from avatar A’s perspective (M = 0.91, SE = 0.01). (2) The main effect of self-avatar A dot-number consistency was significant, F(1, 36) = 105.11, p < 0.001, ηp² = 0.75, 95% CI = [0.04, 0.06]. Accuracy was significantly higher when the self’s and avatar A’s dot numbers were consistent (M = 0.96, SE = 0.01) than when they were inconsistent (M = 0.91, SE = 0.01). (3) The main effect of avatar A-B line-of-sight consistency was significant, F(1, 36) = 10.36, p < 0.01, ηp² = 0.22, 95% CI = [0.01, 0.03]. Accuracy was significantly higher when avatar A’s and B’s lines of sight were consistent (M = 0.94, SE = 0.01) than when they were inconsistent (M = 0.92, SE = 0.01). (4) The interaction between judging perspective and self-avatar A dot-number consistency was significant, F(1, 36) = 74.76, p < 0.001, ηp² = 0.68. When judging from avatar A’s perspective, accuracy was significantly higher when the self’s and avatar A’s dot numbers were consistent (M = 0.96, SE = 0.01) than when they were inconsistent (M = 0.86, SE = 0.01), F(1, 36) = 129.44, p < 0.001; this difference was not significant when judging from the self’s perspective, F(1, 36) = 0.67, p = 0.42. (5) The interaction between judging perspective and avatar A-B line-of-sight consistency was not significant, F(1, 36) < 0.001, p = 0.99. (6) The interaction between self-avatar A dot-number consistency and avatar A-B line-of-sight consistency was not significant, F(1, 36) = 0.72, p = 0.40. (7) The three-way interaction was significant, F(1, 36) = 12.16, p < 0.001, ηp² = 0.25. Accuracy was significantly higher when avatar A’s and B’s lines of sight were consistent than when they were inconsistent in two cases: when judging from the self’s perspective with consistent self-avatar A dot numbers (M = 0.97, SE = 0.01 vs. M = 0.94, SE = 0.01), F(1, 36) = 10.60, p < 0.01, and when judging from avatar A’s perspective with inconsistent self-avatar A dot numbers (M = 0.88, SE = 0.01 vs. M = 0.84, SE = 0.01), F(1, 36) = 9.52, p < 0.01. The remaining simple effects were not significant, Fs < 1.10, ps > 0.30; see Figure 7 for details.
The results for response time on correct trials were as follows: (1) The main effect of judging perspective was significant, F(1, 36) = 116.77, p < 0.001, ηp² = 0.76, 95% CI = [−152.23, −104.12]; response time was shorter when judging from the self’s perspective (M = 786.71 ms, SE = 28.17 ms) than from avatar A’s perspective (M = 914.89 ms, SE = 27.66 ms). (2) The main effect of self-avatar A dot-number consistency was significant, F(1, 36) = 62.78, p < 0.001, ηp² = 0.64, 95% CI = [−66.50, −39.40]; response time was shorter when the self’s and avatar A’s dot numbers were consistent (M = 824.32 ms, SE = 27.10 ms) than when they were inconsistent (M = 877.27 ms, SE = 27.87 ms). (3) The main effect of avatar A-B line-of-sight consistency was significant, F(1, 36) = 11.20, p < 0.01, ηp² = 0.24, 95% CI = [6.84, 27.73]; response time was shorter when the avatars’ lines of sight were inconsistent (M = 842.16 ms, SE = 27.37 ms) than when they were consistent (M = 858.43 ms, SE = 27.44 ms). (4) The interaction between judging perspective and self-avatar A dot-number consistency was significant, F(1, 36) = 18.53, p < 0.001, ηp² = 0.34. Response time was significantly shorter when the self’s and avatar A’s dot numbers were consistent than when they were inconsistent under both the self’s perspective (M = 774.27 ms, SE = 28.01 ms vs. M = 799.15 ms, SE = 29.05 ms), F(1, 36) = 7.58, p = 0.01, and avatar A’s perspective (M = 874.38 ms, SE = 28.04 ms vs. M = 955.40 ms, SE = 28.11 ms), F(1, 36) = 70.78, p < 0.001. However, the effect of self-avatar A dot-number consistency on response time was larger (a steeper slope) under avatar A’s perspective than under the self’s perspective. (5) The interaction between judging perspective and avatar A-B line-of-sight consistency was significant, F(1, 36) = 12.00, p < 0.01, ηp² = 0.25. When judging from avatar A’s perspective, response time was significantly shorter when the avatars’ lines of sight were inconsistent (M = 896.55 ms, SE = 27.39 ms) than when they were consistent (M = 933.23 ms, SE = 28.37 ms), F(1, 36) = 26.79, p < 0.001; there was no such difference when judging from the self’s perspective, F(1, 36) = 0.07, p = 0.79. (6) The interaction between self-avatar A dot-number consistency and avatar A-B line-of-sight consistency was significant, F(1, 36) = 5.22, p < 0.05, ηp² = 0.13. When the self’s and avatar A’s dot numbers were inconsistent, response time was significantly shorter when the avatars’ lines of sight were inconsistent (M = 863.01 ms, SE = 27.64 ms) than when they were consistent (M = 891.54 ms, SE = 28.61 ms), F(1, 36) = 12.88, p < 0.01; there was no such difference when the self’s and avatar A’s dot numbers were consistent, F(1, 36) = 0.93, p = 0.34. (7) The three-way interaction was not significant, F(1, 36) = 0.96, p = 0.33; see Figure 8 for details.
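For a two-level within-subject factor, the main-effect F(1, n − 1) equals the square of the paired t statistic, so a reported F can be roughly reconstructed from the 95% confidence interval of the mean difference. The sketch below applies this to the Experiment 3 main effect of judging perspective on response time. It rests on an assumption the text does not state explicitly: that the reported 95% CI is a t-based interval on the paired difference between the two levels. The function name `f_from_difference_ci` is hypothetical, and the critical value 2.028 for 36 degrees of freedom is taken from a standard t table.

```python
def f_from_difference_ci(ci_low: float, ci_high: float, t_critical: float):
    """Approximate the mean difference and F(1, df) from a CI on a paired difference.

    Assumes the interval is mean_diff +/- t_critical * SE, which is an
    assumption about how the reported interval was built.
    """
    mean_diff = (ci_low + ci_high) / 2.0      # interval midpoint
    half_width = (ci_high - ci_low) / 2.0     # t_critical * SE of the difference
    se_diff = half_width / t_critical
    t_stat = mean_diff / se_diff
    return mean_diff, t_stat ** 2             # F(1, df) = t(df) squared


# Two-sided 95% critical value of t with 36 degrees of freedom (table value).
T_CRIT_DF36 = 2.028

# Experiment 3, response time, main effect of judging perspective:
# reported 95% CI = [-152.23, -104.12], reported F(1, 36) = 116.77.
mean_diff, f_approx = f_from_difference_ci(-152.23, -104.12, T_CRIT_DF36)
print(f"mean difference = {mean_diff:.3f} ms, approximate F = {f_approx:.1f}")
```

Under that assumption, the interval midpoint agrees with the difference between the two reported condition means (786.71 ms and 914.89 ms), and the reconstructed F lands close to the reported value, with any remainder attributable to rounding in the published figures.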
Based on the results from these three experiments, the conclusions drawn are as follows:
(1) The adapted paradigm proved feasible and replicated the results of previous studies, indicating that Level-1 visual perspective taking was spontaneously activated in the presence of a single avatar. The perspectives of the self and the avatar interfered with each other: on self-perspective trials, the avatar’s perspective was spontaneously activated, producing altercentric intrusion, and on avatar-perspective trials, the self’s perspective produced egocentric intrusion.
(2) In the presence of multiple avatars, the Level-1 visual perspective-taking process remained spontaneous. Furthermore, consistency in the number of objects seen by the avatars produced a group-perspective effect, especially during self-perspective judgments. When the numbers of objects seen by the participant and the target avatar were consistent, the group perspective facilitated the judgment; when they were inconsistent, it impeded the judgment.
(3) When the numbers of objects seen by the avatars were set to be inconsistent, consistency in the avatars’ lines of sight could still draw the participants’ attention to both avatars. This led to interference from the perspective of the irrelevant avatar, which further influenced the spontaneity of Level-1 visual perspective taking, whether the judgment was made from the perspective of the self or of the avatar.
In summary, perspective computation occurs effortlessly, flexibly, and spontaneously in scenarios involving multiple avatars, whether the perspective is that of a target or an irrelevant avatar. The outcomes of perspective taking can either enhance or impede performance in dot-perspective tasks, depending on the relationship among the self, the target avatar, and the irrelevant avatar, with the specific pattern of performance varying by situation.