Metric Measure on Bipolar Fuzzy Sets: Mathematical Properties and Applications in Sentiment Analysis
Janet Kez, Mohamed Shenify, Fokrul Alom Mazarbhuiya

Bipolar fuzzy sets provide an effective framework for representing both positive and negative aspects of information. The need for a mathematically rigorous distance measure in bipolar fuzzy environments motivates us to introduce a new real-valued function on the set of bipolar fuzzy sets defined over both discrete and continuous universes of discourse. The proposed function is shown to satisfy all the metric axioms and therefore defines a valid metric on the set of bipolar fuzzy sets. The metric is inspired by the Canberra distance and quantifies the dissimilarity between bipolar fuzzy sets in a normalized and interpretable manner. Its practical utility is demonstrated on a pattern recognition problem, where it successfully recognizes an unknown pattern using known bipolar fuzzy patterns. Using the proposed metric, a bipolar fuzzy C-means clustering algorithm is developed for sentiment analysis, and its time complexity is analysed. Experiments conducted on the IMDb Movie Review Dataset demonstrate that the proposed algorithm outperforms the k-means, fuzzy C-means, and intuitionistic fuzzy C-means algorithms. The proposed bipolar fuzzy C-means algorithm achieves an accuracy of 90.04%, a precision of 90.51%, a recall of 89.01%, an F1-score of 89.75%, a root mean square error of 0.1191, and a silhouette score of 0.75. The findings establish that the proposed metric and the associated bipolar fuzzy clustering approach provide a robust and effective framework for handling sentiment data associated with simultaneous positive and negative opinions.
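
To make the idea of a normalized, Canberra-style dissimilarity between bipolar fuzzy sets more concrete, the following minimal sketch may help. It is not the function proposed in the paper, whose exact form is not given in this abstract. It simply applies the classical Canberra term |a - b| / (|a| + |b|) to the positive and negative membership degrees over a discrete universe and averages the result. The names `canberra_term` and `bipolar_canberra`, the pair representation (positive degree in [0, 1], negative degree in [-1, 0]), and the normalization by 2n are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed representation: one (positive, negative) pair per element of a
# discrete universe, with positive in [0, 1] and negative in [-1, 0].
BipolarSet = Sequence[Tuple[float, float]]


def canberra_term(a: float, b: float) -> float:
    """Classical Canberra term |a - b| / (|a| + |b|), with 0/0 taken as 0."""
    denom = abs(a) + abs(b)
    return 0.0 if denom == 0 else abs(a - b) / denom


def bipolar_canberra(A: BipolarSet, B: BipolarSet) -> float:
    """Illustrative Canberra-style dissimilarity, scaled to lie in [0, 1].

    Hypothetical helper: it averages the Canberra terms of the positive and
    negative degrees over all n elements (2n terms in total).
    """
    if len(A) != len(B):
        raise ValueError("both sets must be defined over the same universe")
    if not A:
        return 0.0
    total = sum(
        canberra_term(pos_a, pos_b) + canberra_term(neg_a, neg_b)
        for (pos_a, neg_a), (pos_b, neg_b) in zip(A, B)
    )
    return total / (2 * len(A))


# Two bipolar fuzzy sets over a two-element universe.
A = [(0.5, -0.5), (1.0, 0.0)]
B = [(0.5, -0.5), (0.0, 0.0)]
print(bipolar_canberra(A, B))  # 0.25
```

In this sketch, identical sets give a value of 0 and the function is symmetric in its arguments. Whether the remaining metric axioms hold for the authors' function is established in the paper itself, not by this illustration.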