Anomaly Detection in Salesforce’s Transactional Data Using Machine Learning Techniques

DOI:

https://doi.org/10.61326/jaasci.v4i1-2.406

Keywords:

Anomaly detection, Artificial intelligence, Isolation forest, Machine learning, Salesforce

Abstract

This paper addresses the data security challenges posed by the volume and complexity of current healthcare systems, which call for dedicated detection techniques. The research focuses on anomaly detection in Salesforce healthcare applications using the Isolation Forest algorithm. Detecting anomalous patterns is particularly important for distinguishing unusual behavior that may signal a security threat, such as fraud or system failure. We apply the Isolation Forest algorithm to the Online Retail Dataset, which serves as a simulation of a real healthcare dataset. Isolation Forest is well suited to outlier detection in Salesforce systems because its data preprocessing and feature engineering steps are tractable, it does not require a labeled training dataset, and it identifies anomalous data points accurately. The method includes data preprocessing steps, such as handling missing values and normalizing features, and the engineering of new features that better capture important customer patterns. The Isolation Forest model is then applied to identify anomalies in the transaction data, achieving an accuracy of 93%, a precision of 0.92, a recall of 0.89, an F1-score of 0.90, and an AUC of 0.95. Consistent with our proposition, these results show that Isolation Forest performs strongly on outlier detection across all evaluation measures. The model is also used in an ongoing surveillance system, where continual examination and learning raise the level of anomaly and outlier detection for the security of healthcare systems. The research uses a publicly available dataset as a simulated Salesforce healthcare dataset in order to remain compliant with data privacy regulations. Because the Isolation Forest algorithm is unsupervised, anomalies are detected autonomously without pre-labeled data.
The primary objective of this study is to develop and evaluate an unsupervised anomaly detection framework specifically tailored for Salesforce healthcare CRM systems. The novelty lies in combining context-rich feature engineering with the Isolation Forest algorithm to handle unlabeled and heterogeneous healthcare data. This framework offers a replicable methodology for enhancing fraud detection and operational security within healthcare CRM environments.
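
The steps named in the abstract (missing-value handling, feature normalization, feature engineering, and unsupervised scoring with Isolation Forest) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The synthetic data, the column names (Quantity, UnitPrice, TotalValue), the median imputation, and the contamination value of 0.02 are all assumptions chosen for illustration. Only scikit-learn's IsolationForest and StandardScaler and standard pandas and NumPy calls are used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler


def make_synthetic_transactions(n_normal=500, n_anomalous=10, seed=0):
    """Build a toy transaction table (hypothetical columns, not the paper's data)."""
    rng = np.random.default_rng(seed)
    normal = pd.DataFrame({
        "Quantity": rng.integers(1, 20, n_normal).astype(float),
        "UnitPrice": rng.uniform(1.0, 15.0, n_normal),
        "is_anomaly": 0,
    })
    anomalous = pd.DataFrame({
        "Quantity": rng.integers(500, 1000, n_anomalous).astype(float),
        "UnitPrice": rng.uniform(200.0, 400.0, n_anomalous),
        "is_anomaly": 1,
    })
    df = pd.concat([normal, anomalous], ignore_index=True)
    # Blank out a few prices so the missing-value step has something to do.
    df.loc[rng.choice(n_normal, 5, replace=False), "UnitPrice"] = np.nan
    return df


def detect_anomalies(df, contamination=0.02, seed=0):
    """Impute, engineer one feature, scale, and score with Isolation Forest."""
    out = df.copy()
    # Missing values: median imputation (an assumed, simple choice).
    out["UnitPrice"] = out["UnitPrice"].fillna(out["UnitPrice"].median())
    # Feature engineering: total value of each transaction line (assumed feature).
    out["TotalValue"] = out["Quantity"] * out["UnitPrice"]
    # Normalization: zero mean and unit variance per feature.
    scaled = StandardScaler().fit_transform(
        out[["Quantity", "UnitPrice", "TotalValue"]]
    )
    model = IsolationForest(
        n_estimators=100, contamination=contamination, random_state=seed
    )
    # fit_predict() returns -1 for outliers and 1 for inliers.
    out["predicted_anomaly"] = (model.fit_predict(scaled) == -1).astype(int)
    # score_samples() is lower for anomalies, so negate it to rank anomalies high.
    out["anomaly_score"] = -model.score_samples(scaled)
    return out


if __name__ == "__main__":
    result = detect_anomalies(make_synthetic_transactions())
    # The labels are used only to evaluate; the model never sees them.
    auc = roc_auc_score(result["is_anomaly"], result["anomaly_score"])
    print("flagged rows:", int(result["predicted_anomaly"].sum()))
    print("AUC on synthetic labels:", round(auc, 3))
```

The labels in this sketch exist only because the data is synthetic. On real, unlabeled CRM data the anomaly scores would have to be validated some other way, for example by manual review of the highest-scoring transactions.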

Published

31-12-2025

Section

Research Articles