International Journal of Academic Research in Accounting, Finance and Management Sciences

Open Access Journal

ISSN: 2225-8329

Applying Data Mining Methods to Predict the Stock Prices of the Top Five Semiconductor Companies in Taiwan

Liang-Ying Wei, She-Ting Yeh

http://dx.doi.org/10.6007/IJARAFMS/v15-i3/26358

Stock trading is a financial tool for increasing passive income. The semiconductor industry has strong development potential in Taiwan and is a popular target for stock investors. This study uses the stock prices of the five Taiwanese semiconductor companies with the highest annual revenue in 2021 as the data source for stock price prediction. The daily closing prices of the five companies from 2012 to 2021 are collected as experimental data, and the data are divided into training and test data sets on an annual basis. Time series analysis is combined with data mining methods, namely random forest, neural network, support vector regression, Gaussian process regression, and k-nearest neighbor regression, to predict stock prices. Mean absolute error (MAE) and root mean squared error (RMSE) are used as the prediction error evaluation metrics. The changes in the error values produced by the different prediction models on each test data set were analyzed and compared. The experiments revealed that, due to the impact of COVID-19 and the Sino-US trade war, the stock prices of the five semiconductor companies rose rapidly from 2020 to 2021. An overall analysis of the prediction results of the five algorithms across the different companies showed that the models, ranked from best to worst predictive performance, were support vector regression (SVR), neural network (NN), random forest (RF), k-nearest neighbor regression (KNN), and Gaussian process regression (GPR). Among them, SVR achieved superior predictive performance across different industries and under both low and high stock price volatility. The results of this study can serve as an objective reference for investors when making investment decisions.
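As a rough illustration of the evaluation setup described in the abstract, the sketch below computes MAE and RMSE for a k-nearest neighbor regression on lagged closing prices, with one period used for training and a later one for testing. It is not the authors' implementation: the price values are made up, the function names (`mae`, `rmse`, `make_lagged`, `knn_predict`), the lag length, and the value of `k` are all assumptions chosen for illustration, and KNN stands in here for any of the five methods the study compares.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def make_lagged(prices, n_lags):
    """Turn a price series into (window of n_lags prices, next price) pairs."""
    count = len(prices) - n_lags
    windows = [prices[i:i + n_lags] for i in range(count)]
    targets = [prices[i + n_lags] for i in range(count)]
    return windows, targets


def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k training windows closest to the query."""
    ranked = sorted(
        (math.dist(window, query), target)
        for window, target in zip(train_X, train_y)
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up closing prices standing in for one training year and one test year.
train_prices = [100.0, 102.0, 101.0, 104.0, 106.0, 105.0, 108.0, 110.0]
test_prices = [109.0, 112.0, 111.0, 114.0, 116.0]

N_LAGS = 3  # assumed window length, not taken from the paper
train_X, train_y = make_lagged(train_prices, N_LAGS)
test_X, test_y = make_lagged(test_prices, N_LAGS)

predictions = [knn_predict(train_X, train_y, window, k=2) for window in test_X]
test_mae = mae(test_y, predictions)
test_rmse = rmse(test_y, predictions)
print(f"MAE={test_mae:.3f} RMSE={test_rmse:.3f}")
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason the two metrics are often reported together when prices move sharply.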

Wei, L.-Y., & Yeh, S.-T. (2025). Applying Data Mining Methods to Predict the Stock Prices of the Top Five Semiconductor Companies in Taiwan. International Journal of Academic Research in Accounting, Finance and Management Sciences, 15(3), 365–380.