Conference Paper (published)
Details
Citation
Squartini S, Hussain A & Piazza F (2003) Attempting to Reduce the Vanishing Gradient Effect through a Novel Recurrent Multiscale Architecture. In: Proceedings of the International Joint Conference on Neural Networks, 2003 (Volume: 4). The International Joint Conference on Neural Networks, 2003, 20.07.2003-24.07.2003. Piscataway, NJ: IEEE, pp. 2819-2824. http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=27472; https://doi.org/10.1109/IJCNN.2003.1224018
Abstract
This paper proposes a possible solution to the vanishing gradient problem in recurrent neural networks, which arises when such networks are applied to tasks requiring the detection of long-term dependencies. The main idea is to pre-process the signal (typically a time series) through a discrete wavelet decomposition, in order to separate short-term from long-term information, and to process each resulting scale with a different recurrent neural network. The partial results obtained at the diverse time/frequency resolutions are then combined through an adaptive nonlinear structure to produce the final output. This preprocessing-based approach is distinct from others reported in the literature to date, as it mitigates the effects of the problem under study without requiring significant changes to the network architecture or learning techniques. The overall system, called the recurrent multiscale network (RMN), is described and its performance is tested on typical tasks, namely the latching problem and time-series prediction.
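The abstract outlines a three-stage pipeline: wavelet decomposition of the input series, one recurrent network per scale, and an adaptive nonlinear combiner. The sketch below illustrates that pipeline in Python using PyTorch and PyWavelets; the class name `RecurrentMultiscaleNet`, the `db4` wavelet, the decomposition level, the hidden sizes, and the MLP combiner are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
import pywt
import torch
import torch.nn as nn

class RecurrentMultiscaleNet(nn.Module):
    """Sketch of the RMN idea: one small RNN per wavelet scale,
    with the per-scale summaries fused by an adaptive nonlinear
    (here, MLP) combiner. Hypothetical names and sizes."""
    def __init__(self, n_scales, hidden_size=8):
        super().__init__()
        self.rnns = nn.ModuleList(
            [nn.RNN(input_size=1, hidden_size=hidden_size, batch_first=True)
             for _ in range(n_scales)]
        )
        self.combiner = nn.Sequential(
            nn.Linear(n_scales * hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, scales):
        # scales: list of (batch, time_i, 1) tensors, one per wavelet scale
        feats = []
        for rnn, x in zip(self.rnns, scales):
            _, h = rnn(x)              # final hidden state summarizes the scale
            feats.append(h.squeeze(0)) # (batch, hidden_size)
        return self.combiner(torch.cat(feats, dim=1))

# Decompose a toy time series into scales with a discrete wavelet transform.
series = np.sin(np.linspace(0, 20 * np.pi, 1024)) + 0.1 * np.random.randn(1024)
coeffs = pywt.wavedec(series, "db4", level=3)   # [cA3, cD3, cD2, cD1]
scales = [torch.tensor(c, dtype=torch.float32).view(1, -1, 1) for c in coeffs]

model = RecurrentMultiscaleNet(n_scales=len(coeffs))
prediction = model(scales)   # e.g., a one-step-ahead prediction target
print(prediction.shape)      # torch.Size([1, 1])
```

The design intuition, consistent with the abstract, is that coarse (approximation) coefficients vary slowly, so the recurrent network assigned to that scale needs to propagate gradients across far fewer effective time steps to capture long-term structure, while fine-scale networks handle short-term detail.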
Status | Published |
---|---|
Publication date | 31/12/2003 |
Publication date online | 31/07/2003 |
Publisher | IEEE |
Publisher URL | http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=27472 |
Place of publication | Piscataway, NJ |
ISBN | 0-7803-7898-9 |
Conference | The International Joint Conference on Neural Networks, 2003 |
Dates | 20/07/2003 – 24/07/2003 |