Objectives A provincial comprehensive atlas is a complex knowledge system that uses thematic maps to comprehensively display the resource endowment and social development level of a province. We restructure the large, complex, and unstructured contents of such atlases to establish the atlas content tupu and to explore the basic content organization rules of provincial comprehensive atlases in China.
Methods First, we construct vocabulary vectors for the thematic subjects presented in the atlases, calculate their semantic similarities, and extract standard expressions of the subject words. Second, following the compilation order of "map group, map sheet, and cartographic index", we derive semantic hierarchical clusters of the subject words and construct the content tupu of each atlas as a tree-structured graph. Finally, we examine the frequent subgraphs of the content tupus across provincial atlases to form an atlas fingerprint that identifies the common features of subject expression in the comprehensive atlases of China's provinces.
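The three steps above can be illustrated with a minimal sketch. It is not the implementation used in this study: the toy vectors, the 0.9 similarity threshold, the atlas names, and the function names are all hypothetical, and counting frequent parent-child edges is used here as a simplified stand-in for frequent subgraph mining.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def standardize(vectors, threshold=0.9):
    """Map each subject word to the first earlier word with a similar vector.

    The threshold is an assumed value chosen for illustration only.
    """
    standard = {}
    representatives = []
    for word, vec in vectors.items():
        for rep in representatives:
            if cosine(vec, vectors[rep]) >= threshold:
                standard[word] = rep
                break
        else:
            # No similar representative found: the word stands for itself.
            representatives.append(word)
            standard[word] = word
    return standard


def normalize(edges, standard):
    """Rewrite (parent, child) edges using the standard expressions."""
    return [(standard.get(p, p), standard.get(c, c)) for p, c in edges]


def frequent_edges(trees, min_support):
    """Return edges that occur in at least min_support trees."""
    counts = Counter()
    for edges in trees:
        counts.update(set(edges))  # count each edge once per tree
    return {edge for edge, n in counts.items() if n >= min_support}


# Toy subject-word vectors (hypothetical values, not learned embeddings).
vectors = {
    "population": (1.0, 0.1, 0.0),
    "inhabitants": (0.95, 0.15, 0.0),
    "landform": (0.0, 0.2, 1.0),
}
standard = standardize(vectors)

# Each atlas is a tree given as (map group, map sheet) edges.
atlas_a = [("society", "population"), ("nature", "landform")]
atlas_b = [("society", "inhabitants"), ("nature", "landform")]
atlas_c = [("society", "population"), ("economy", "industry")]

trees = [normalize(t, standard) for t in (atlas_a, atlas_b, atlas_c)]
fingerprint = frequent_edges(trees, min_support=2)
print(sorted(fingerprint))
```

In this toy example, "inhabitants" is merged into the standard expression "population", so the edge from "society" to "population" is shared by all three trees and enters the fingerprint, while the edge that appears in only one atlas is left out.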
Results The thematic contents of provincial comprehensive atlases are organized hierarchically, and the fingerprint tupu can serve as a framework to guide thematic selection and content organization during comprehensive atlas compilation. The content tupu and fingerprint tupu further reveal that provincial comprehensive atlases in China show obvious clustering and diversity characteristics in content expression.
Conclusions The results provide a basis for the design of provincial comprehensive atlases. In the future, we will explore the content tupus of different types of atlases to enrich the theory of atlas design and compilation.