This study models the counts of lexicon-based sentiment cues as a draw from a multinomial distribution. By placing a Dirichlet prior on the document-level sentiment probability vector, we propose a Dirichlet-multinomial Bayesian sentiment analysis framework for three sentiment categories: negative, neutral, and positive. The proposed model extracts sentiment cues through dictionary-based word matching and uses the resulting class-specific frequencies as sufficient statistics for probabilistic inference. The methodology is presented in a proof-based manner, including formal derivations of Dirichlet conjugacy, marginal likelihoods, Bayesian decision rules, and empirical Bayes estimation of hyperparameters.
Kim et al. studied this question.
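To make the pipeline concrete, the following is a minimal sketch of the Dirichlet-multinomial update for the three sentiment classes. It is an illustration under stated assumptions, not the authors' implementation: the toy `LEXICON`, the whitespace tokenizer, the symmetric prior, and the rule of choosing the class with the largest posterior mean probability are all hypothetical choices made here. Only the conjugate update (posterior parameters equal prior parameters plus counts) and the closed-form marginal likelihood follow directly from the standard Dirichlet-multinomial model.

```python
import math

CLASSES = ("negative", "neutral", "positive")

# Hypothetical toy lexicon, for illustration only; a real study would
# use an established sentiment dictionary.
LEXICON = {
    "bad": "negative", "poor": "negative",
    "okay": "neutral", "average": "neutral",
    "good": "positive", "great": "positive",
}


def count_cues(text):
    """Count dictionary matches per class (the sufficient statistics)."""
    counts = dict.fromkeys(CLASSES, 0)
    for token in text.lower().split():
        label = LEXICON.get(token.strip(".,!?;:"))
        if label is not None:
            counts[label] += 1
    return [counts[c] for c in CLASSES]


def posterior_alpha(counts, alpha):
    """Dirichlet conjugacy: posterior parameters are alpha_k + n_k."""
    return [a + n for a, n in zip(alpha, counts)]


def posterior_mean(alpha_post):
    """Posterior mean of the sentiment probability vector."""
    total = sum(alpha_post)
    return [a / total for a in alpha_post]


def log_marginal_likelihood(counts, alpha):
    """Log Dirichlet-multinomial probability of the count vector."""
    n, a0 = sum(counts), sum(alpha)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)
    log_norm = math.lgamma(a0) - math.lgamma(a0 + n)
    log_terms = sum(
        math.lgamma(a + k) - math.lgamma(a) for a, k in zip(alpha, counts)
    )
    return log_coef + log_norm + log_terms


def classify(counts, alpha):
    """Assumed decision rule: pick the class with the largest posterior mean."""
    probs = posterior_mean(posterior_alpha(counts, alpha))
    best = max(range(len(CLASSES)), key=probs.__getitem__)
    return CLASSES[best]


if __name__ == "__main__":
    alpha = [1.0, 1.0, 1.0]  # assumed symmetric (uniform) prior
    counts = count_cues("Great food, good service, but poor parking.")
    print(counts)
    print(posterior_mean(posterior_alpha(counts, alpha)))
    print(classify(counts, alpha))
    print(math.exp(log_marginal_likelihood(counts, alpha)))
```

In an empirical Bayes setting, `log_marginal_likelihood` would be summed over documents and maximized with respect to `alpha`; that optimization step is omitted from this sketch.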