Comparing the mechanisms of level-1 and level-2 visual perspective taking: Theoretical controversies, behavioral and neuroscientific evidence

doi:10.3724/SP.J.1042.2026.1035

Abstract

Visual perspective taking (VPT), the ability to simulate and understand another's visual experience, is traditionally divided into two levels: level-1 (judging visibility, i.e., “what” is seen) and level-2 (judging appearance, i.e., “how” it is seen). Current theories in this field present two opposing views: the two-systems account proposes that these two processes involve separate but complementary cognitive systems, whereas the single-system account suggests that a unified cognitive system is responsible for both. Both theories, however, struggle to fully explain empirical anomalies. To resolve these inconsistencies, this paper proposes a novel Three-Stage Processing Model. This framework suggests that both levels of VPT undergo three sequential phases: (1) Information Processing, (2) Perspective Simulation, and (3) Information Integration with Response Selection.
Stage 1: Information Processing. In this initial stage, both level-1 and level-2 VPT involve the encoding of spatial relationships between the self, others, and objects. However, the depth and scope of this information processing differ. Behavioral evidence suggests that level-1 VPT primarily involves tracking “line-of-sight” paths, requiring only a relatively shallow representation of whether a physical barrier exists between the agent and the target. In contrast, level-2 VPT demands a more fine-grained spatial representation, including the precise orientation and visual morphology of objects as seen from different angles. While both levels share basic spatial encoding in the occipito-parietal cortex, level-2 VPT triggers more extensive activation in the dorsal attention and frontoparietal control networks to support this greater representational depth.
Stage 2: Perspective Simulation. This stage marks the most significant divergence between the two levels. In level-1 VPT, perspective simulation is a relatively straightforward process that involves quickly tracking the other's line of sight and determining whether an object is visible. This simulation process relies on rapid, non-embodied mechanisms, such as gaze tracking, that do not require significant cognitive resources. In contrast, level-2 VPT engages more complex and embodied processes, often requiring mental rotation or reconfiguration of the reference frame. This embodied simulation involves a shift from the self's reference frame to that of the other, requiring cognitive resources such as body representation and spatial reasoning. Behavioral studies demonstrate that body posture alignment significantly facilitates level-2 VPT but has little effect on level-1 VPT. Neuroscientific data support this, showing that level-2 VPT specifically activates brain regions associated with body representation, such as the Extrastriate Body Area (EBA) and the insula, which are largely inactive during level-1 VPT.
Stage 3: Information Integration with Response Selection. In the final stage, individuals must integrate the information gathered in the first two stages and make a final judgment about the object or the other person's perspective. During this stage, level-1 and level-2 VPT share a common mechanism of integrating information about the other person's intentions and mental states. For instance, when an agent exhibits a goal-directed “reach-to-grasp” action, both level-1 and level-2 VPT performance are enhanced, suggesting a shared understanding of others' psychological states at the response stage. However, level-2 VPT generally requires stronger cognitive control to resolve more complex perspective conflicts. Neural evidence indicates that “social brain” regions, specifically the right temporoparietal junction (rTPJ) and the dorsomedial prefrontal cortex (dmPFC), play a crucial role in managing these conflicts. Although their exact role remains debated, current evidence suggests that these regions are likely recruited at both levels when tasks explicitly require processing social intent or involve high interference.
In conclusion, we propose the Three-Stage Processing Model by integrating evidence from behavioral and neuroscientific research. This model offers a unified framework that accommodates both the similarities and the differences between level-1 and level-2 VPT. To further validate and refine the model, future research should focus on developing experimental paradigms that dissociate the three stages, using high-temporal-resolution techniques to map the model's temporal dynamics, and exploring the conditions that trigger embodied mechanisms in level-2 VPT and their cross-modal integration. This study provides a comprehensive framework that paves the way for a more unified theory of spatial and social cognition.

Key words: visual perspective taking, two-systems account, single-system account, spatial cognition

WANG Jiayin, LI Jing. Comparing the mechanisms of level-1 and level-2 visual perspective taking: Theoretical controversies, behavioral and neuroscientific evidence[J]. Advances in Psychological Science, 2026, 34(6): 1035-1048.
