Implementation and Experimental Analysis of Random Forest in R Parallel Computing

Riza, Lala Septem

Nur Azizah, Yaya Wihardi, Lala Septem Riza

Department of Computer Science Education, Universitas Pendidikan Indonesia

Abstract

Random forest is a method that builds a model by combining many decision trees, each generated from a bootstrap sample of the training data and a random subset of features. A common problem when running random forest on a single core is long processing time, because the method uses a large amount of training data and builds a large number of trees. This study therefore proposes a random forest implementation with parallel computing in the R programming language. Several datasets are used in the experiments: the Iris dataset, the Wine Quality dataset, and the Pima Indians Diabetes diagnostic dataset. The experimental results show that the computational time of random forest with parallel computing is shorter than that of random forest running on a single processor, which indicates that the implementation was successful.
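
The idea described above can be illustrated with a minimal sketch in R. This is not the authors' implementation: it assumes the CRAN package randomForest together with the parallel package that ships with R, and the worker count, tree counts, and variable names are illustrative choices. Each worker grows its own sub-forest on the Iris data, and the sub-forests are then merged into a single ensemble.

```r
# Minimal sketch, not the paper's code. Assumes the CRAN package
# 'randomForest' is installed; 'parallel' is part of base R.
library(parallel)
library(randomForest)

n_workers <- 2            # illustrative value, not taken from the paper
trees_per_worker <- 250   # 2 x 250 = 500 trees in total

# Baseline: grow all trees on a single core and record the time.
set.seed(1)
t_single <- system.time(
  rf_single <- randomForest(Species ~ ., data = iris,
                            ntree = n_workers * trees_per_worker)
)

# Parallel: each worker grows its own sub-forest from bootstrap samples.
cl <- makeCluster(n_workers)
clusterSetRNGStream(cl, iseed = 1)  # independent, reproducible RNG streams
t_parallel <- system.time({
  sub_forests <- parLapply(
    cl,
    rep(trees_per_worker, n_workers),
    function(nt, data) {
      randomForest::randomForest(Species ~ ., data = data, ntree = nt)
    },
    data = iris
  )
})
stopCluster(cl)

# Merge the sub-forests into one random forest object.
rf_parallel <- do.call(randomForest::combine, sub_forests)

print(rf_parallel$ntree)
print(c(single = t_single[["elapsed"]], parallel = t_parallel[["elapsed"]]))
```

On a dataset as small as Iris, the cost of starting workers and transferring data can outweigh the gain, so the timings printed by this sketch are only indicative. The forest returned by combine also has no out-of-bag error estimate, so accuracy would need to be checked on held-out data.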

Keywords: Decision Tree, Random Forest, R High Performance Computing, Parallel Computing, R Programming Language

Topic: Computer Science

MSCEIS 2018 Conference | http://msceis.conference.upi.edu/2018