Behavior-Based Typology: Classifying Investors by Stock Holding Preferences
XIONG Xiong, CHEN Ruoxin, MENG Yongqiang, GAO Ya, LIN Shen
College of Management and Economics/Laboratory of Computation and Analytics of Complex Management Systems (CACMS), Tianjin University; School of Economics and Management, Dalian University of Technology
Summary:
Recognizing, distinguishing, and characterizing the heterogeneous features of micro-level agents is the foundation for advancing micro-level regulatory technologies and for constructing a capital market theory grounded in heterogeneous agent behavior. From the practical perspective of financial risk management, the regulatory question has shifted from “how to regulate” to “the efficiency of regulatory input and output.” Monitoring all micro-level market participants in real time would waste substantial computational resources and yield diminishing marginal returns to regulatory effort. From the theoretical perspective of financial risk management, the behavioral characteristics of micro-agents, their heterogeneity, and their interactions are the underlying drivers of systemic risk emergence in the macro-level capital market viewed as a complex system. Meanwhile, advances in information technology have steadily improved the capacity to acquire and analyze massive micro-level data. Together, technological progress and theoretical demand call for deeper academic inquiry into micro-level heterogeneity. Although prior studies have made some progress, their reliance on “intuitive” classification criteria such as wealth or age has left several issues unresolved. The most salient are: (i) insignificant differences across categories, (ii) failure to identify critical categories, and (iii) the inability to justify the choice of segmentation boundaries. The first two issues substantially reduce the usefulness of classification methods in managerial contexts, whereas the subjectivity of boundary selection severely undermines their generalizability in research applications.
Consequently, existing studies primarily examine how predefined labels affect investor behavioral heterogeneity rather than asking how label boundaries should be drawn so that they effectively distinguish investor behaviors, which leaves a notable research gap. Building on the premise that investor types are fundamentally shaped by their behavior, this study uses data from 96,072 individual investor accounts spanning 2016 to 2021, along with 109 stock factors, to classify and analyze investors based on their holding preferences. The results show that stock price, as a single characteristic, is the most significant determinant of preference divergence among individual investors and thus separates distinct investor types in terms of their holdings. Based on the relative importance of preference divergence and the intercorrelation among indicators, the study ultimately focuses on nine firm characteristics. Using Principal Component Analysis (PCA) for dimension reduction, we summarize the major sources of heterogeneity in investor preferences in two composite dimensions, a price-attention factor and a shell-value factor, which together explain over 50% of the overall heterogeneity across the nine indicators. This outcome not only aligns with existing asset pricing and individual investor studies of the Chinese market but also corroborates the widely cited retail investor adage: “Bet on restructuring (shell value) mid-year and on earnings (low price and low attention) at year-end.” Subsequently, using each investor's position along these two composite dimensions, we apply the K-means clustering algorithm to categorize investors and examine the relationship between the preference-based clusters and investors' demographic and behavioral characteristics.
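The two-step procedure described above (PCA on nine preference indicators, then K-means on the two leading components) can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the 500 investors, the noise level, and the choice of three clusters are assumptions made only for illustration, and the PCA and K-means routines are plain NumPy stand-ins for whatever software the authors used.

```python
import numpy as np

def standardize(X):
    """Z-score each column so no indicator dominates because of its scale."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca(X, n_components):
    """PCA via SVD of already-centered data.

    Returns the component scores and the share of total variance
    explained by each retained component.
    """
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = X @ Vt[:n_components].T
    return scores, explained[:n_components]

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for each point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers

# Synthetic stand-in for investor-level preferences (ASSUMPTION, not real data):
# 500 hypothetical investors, 9 indicators driven by 2 latent factors plus noise.
rng = np.random.default_rng(42)
n_investors, n_indicators = 500, 9
latent = rng.normal(size=(n_investors, 2))
loadings = rng.normal(size=(2, n_indicators))
prefs = latent @ loadings + 0.5 * rng.normal(size=(n_investors, n_indicators))

Z = standardize(prefs)
scores, explained = pca(Z, n_components=2)   # two composite dimensions
labels, centers = kmeans(scores, k=3)        # k=3 is an illustrative choice

print("variance share of first two components:", explained.sum())
print("cluster sizes:", np.bincount(labels, minlength=3))
```

With real account data, each row of `prefs` would hold one investor's measured preference for each of the nine firm characteristics, and the number of clusters would need to be chosen and justified separately.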
The empirical findings indicate that the investor classification derived from stock-holding preferences is driven not by a single demographic label but by a combination of multiple attributes and behavioral patterns. For example, two investor clusters may both display a small portfolio size, older age, longer investment experience, lower education, and a lower turnover ratio, which makes it infeasible to tell them apart using any single demographic label. However, segmentation based on holding preferences reveals that, under a tiered supervision framework, investors with a strong shell-value orientation warrant closer regulatory attention than relatively mature and conservative investors.
Based on these findings, we propose three policy recommendations. First, enhance tiered supervision and risk warning systems. In light of the challenges posed by financial innovation and digitalization, strengthening the regulatory framework requires robust support from supervisory technology (RegTech). Considering the efficiency constraints on comprehensive micro-level financial regulation, we recommend an approach similar to the one presented in this study: identify core investors based on stock-holding preferences in order to accurately target risk-asset holders and potential spillover agents, and implement dynamic, tiered regulation. Second, deliver investor education tailored to heterogeneity. In a market dominated by retail investors, with over 200 million small and medium-sized participants, we recommend using stock-holding preferences as an entry point for investor education. Through regular review and monitoring, regulators can correct behavioral biases such as excessive risk-seeking and gambling tendencies. Third, employ shell-value preference as a quantitative indicator to advance the registration-based system. At present, China is implementing institutional reforms on both the asset and investor fronts. To address the difficulty of quantifying policy feedback on the investor side, we suggest using the technical approach outlined in this study to measure shell-value preferences among different investor groups, thereby providing an empirical basis for tracking policy responses and evaluating the effectiveness of reforms.