COMPARATIVE ANALYSIS OF THE APPLICATION OF FEATURE SELECTION IN RANDOM FOREST REGRESSION FOR STOCK PRICE PREDICTION
DOI:
https://doi.org/10.59003/nhj.v5i3.1641
Keywords:
Feature Selection, Random Forest Regression, Spearman Correlation, Stock Price Prediction, Machine Learning
Abstract
The rapid development of information technology and data mining has encouraged the use of machine learning algorithms in various fields, including the financial sector and capital markets. One of the main challenges in stock price prediction is the large number of available variables, not all of which are relevant to the target variable; irrelevant variables can reduce accuracy and cause overfitting. This study analyzes the benefits of applying feature selection to improve the performance of the Random Forest Regression algorithm for stock price prediction. The dataset consists of ten years of historical stock price data from PT Aneka Tambang Tbk (ANTM). The research used an experimental approach with two models: (1) Random Forest Regression without feature selection and (2) Random Forest Regression with feature selection using the Spearman Correlation method. Model performance was evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), and Mean Absolute Percentage Error (MAPE). The experimental results show that the model with feature selection performed better on all evaluation metrics, with lower error values (MAE: 26.22; RMSE: 51.82; MAPE: 1.32%) and a higher R² (0.9895). These findings indicate that integrating feature selection with Random Forest Regression can improve prediction accuracy, reduce model complexity, and lower the risk of overfitting. Feature selection therefore plays a significant role in making machine learning models more effective for stock price prediction.
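The filter step and the evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold of 0.5, the feature names "open" and "noise", and the synthetic price series are all hypothetical, and the Random Forest model itself is left out (in practice it would come from a library such as scikit-learn).

```python
import math
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_features(columns, target, threshold=0.5):
    """Keep features whose |Spearman rho| with the target meets the threshold.

    The threshold is an assumed value; the abstract does not state one.
    """
    return [name for name, col in columns.items()
            if abs(spearman(col, target)) >= threshold]


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, R2 and MAPE (in percent)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1 - sum(e * e for e in errors) / ss_tot,
        "MAPE": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Synthetic stand-in for a price series (NOT the ANTM data used in the study).
random.seed(0)
close = [1000 + 5 * i + random.gauss(0, 10) for i in range(200)]
columns = {
    "open": [c + random.gauss(0, 5) for c in close],  # tracks the target
    "noise": [random.random() for _ in close],        # unrelated to the target
}
selected = select_features(columns, close)
print("selected features:", selected)
print(regression_metrics([100, 200, 400], [110, 190, 400]))
```

The two models compared in the study would then be trained on all columns and on the selected columns respectively, and scored with the same metric function on held-out data.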
License
Copyright (c) 2025 Emil Agusalim Habi Talib

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
NHJ is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Articles in this journal are Open Access articles published under the Creative Commons CC BY-NC-SA License. This license permits use, distribution, and reproduction in any medium for non-commercial purposes only, provided the original work and source are properly cited.
Any derivative of the original must be distributed under the same license as the original.