The MBTI (Myers-Briggs Type Indicator) is a widely known approach to personality classification. Datasets used for machine-learning-based personality classification with MBTI are highly imbalanced. Handling imbalanced datasets is a significant open problem with a considerable impact on machine learning methods. This paper presents the results of applying different techniques and identifies those that best mitigate the challenge of imbalanced MBTI datasets. Although most of these techniques could also be applied to other problems and areas, such as image and sound processing, natural language processing poses enough challenges of its own to justify focusing on it and on the specific issue of MBTI datasets.
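To make the imbalance problem concrete, the following is a minimal sketch of random oversampling, one commonly used rebalancing technique. It is an illustration only and is not necessarily among the techniques evaluated in the paper. The function name `random_oversample`, the toy posts, and the label counts are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: one frequent type and two rare ones.
texts = ["post a", "post b", "post c", "post d", "post e", "post f"]
labels = ["INFP", "INFP", "INFP", "INFP", "ESTJ", "INTJ"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
counts = Counter(balanced_labels)
```

In this sketch each class ends up with the same number of samples, at the cost of repeating minority-class posts verbatim. In practice, rebalancing of this kind would be applied to the training split only, so that duplicated posts do not leak into evaluation data.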
Čerkez et al. studied this question.