COMPARATIVE ANALYSIS OF THE APPLICATION OF FEATURE SELECTION IN RANDOM FOREST REGRESSION FOR STOCK PRICE PREDICTION

Authors

  • Emil Agusalim Habi Talib Universitas Muhammadiyah Makassar
  • Alvina Felicia Watratan STMIK Profesional Makassar
  • Saharuddin Saharuddin STMIK Profesional Makassar

DOI:

https://doi.org/10.59003/nhj.v5i3.1641

Keywords:

Feature Selection, Random Forest Regression, Spearman Correlation, Stock Price Prediction, Machine Learning

Abstract

The rapid development of information technology and data mining has encouraged the use of machine learning algorithms in many fields, including the financial sector and capital markets. One of the main challenges in stock price prediction is the large number of available variables, not all of which are relevant to the target variable; irrelevant variables can reduce accuracy and cause overfitting. This study analyzes the benefit of applying feature selection to improve the performance of the Random Forest Regression algorithm for stock price prediction. The dataset consists of ten years of historical stock price data for PT Aneka Tambang Tbk (ANTM). The research used an experimental approach with two models: (1) Random Forest Regression without feature selection and (2) Random Forest Regression with feature selection using the Spearman Correlation method. Model performance was evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), and Mean Absolute Percentage Error (MAPE). The experimental results show that the model with feature selection performed better on all evaluation metrics, with lower error values (MAE: 26.22; RMSE: 51.82; MAPE: 1.32%) and a higher R² (0.9895). These findings confirm that integrating feature selection with Random Forest Regression can improve prediction accuracy, reduce model complexity, and lower the risk of overfitting. Feature selection therefore plays a significant role in making machine learning models more effective for stock price prediction.
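
The feature-selection step and the five evaluation metrics named in the abstract can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the 0.5 correlation threshold, the function names, the column names, and the toy values are all hypothetical, and the abstract does not say how selected features were chosen beyond "Spearman Correlation". Fitting the Random Forest itself is left out; any regressor could be trained on the selected columns.

```python
import math

def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def select_features(columns, target, threshold=0.5):
    """Keep features whose |rho| with the target meets the (assumed) threshold."""
    return [name for name, col in columns.items()
            if abs(spearman(col, target)) >= threshold]

def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, R2, and MAPE (in percent).

    Assumes y_true is non-constant and has no zeros (true for stock prices).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2, "MAPE": mape}

# Hypothetical toy data: "open" tracks the closing price, "volume" does not.
toy_features = {"open": [100, 102, 101, 105, 107], "volume": [3, 1, 4, 5, 2]}
toy_close = [101, 103, 102, 106, 108]
print(select_features(toy_features, toy_close))
print(regression_metrics([100, 200], [110, 190]))
```

A rank-based measure such as Spearman's rho only assumes a monotonic relationship between a feature and the target, which is one reason it is a common filter for price data that is not linearly related to every candidate variable.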

Published

2025-08-30

How to Cite

Habi Talib, E. A., Watratan, A. F., & Saharuddin, S. (2025). COMPARATIVE ANALYSIS OF THE APPLICATION OF FEATURE SELECTION IN RANDOM FOREST REGRESSION FOR STOCK PRICE PREDICTION. Nusantara Hasana Journal, 5(3), 334–348. https://doi.org/10.59003/nhj.v5i3.1641