[1] Liu B & Zhang L. A Survey of Opinion Mining and Sentiment Analysis[M]. Boston, MA: Springer US, 2012.
[2] Aggarwal C & Zhai C. A Survey of Text Classification Algorithms[M]. Boston, MA: Springer US, 2012.
[3] Zhang Y, Jin R & Zhou Z. Understanding bag-of-words model: a statistical framework[J]. International Journal of Machine Learning & Cybernetics, 2010, 1(1-4): 43-52.
[4] Post M & Bergsma S. Explicit and implicit syntactic features for text classification[C]. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2013: 866-872.
[5] Mikolov T, Chen K & Corrado G, et al. Efficient estimation of word representations in vector space[J/OL]. arXiv preprint arXiv:1301.3781, 2013.
[6] Bengio Y, Ducharme R & Vincent P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003: 1137-1155.
[7] Rong X. word2vec Parameter Learning Explained[J/OL]. arXiv preprint arXiv:1411.2738, 2014.
[8] Mikolov T, Sutskever I & Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[J/OL]. arXiv preprint arXiv:1310.4546, 2013.
[9] Pennington J, Socher R & Manning C D. GloVe: Global Vectors for Word Representation[C]. Empirical Methods in Natural Language Processing, 2014: 1532-1543. http://www.aclweb.org/anthology/D14-1162.
[10] Kalchbrenner N, Grefenstette E & Blunsom P. A Convolutional Neural Network for Modelling Sentences[C]. Association for Computational Linguistics (ACL), 2014.
[11] LeCun Y, Bottou L & Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791.
[12] Deerwester S, Dumais S T & Landauer T, et al. Indexing by latent semantic analysis[J]. Journal of the American Society for Information Science, 1990, 41(6): 391-407. doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9.
[13] Devlin J, Chang M, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[J/OL]. arXiv preprint arXiv:1810.04805, 2018.
[14] Ng A Y. Feature selection, L1 vs. L2 regularization, and rotational invariance[C]. Proceedings of the Twenty-First International Conference on Machine Learning, 2004.
[15] Socher R, Huang E & Pennington J, et al. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection[C]. Advances in Neural Information Processing Systems, 2011: 24. pmid: 25152607.
[16] Socher R, Pennington J & Huang E, et al. Semi-supervised recursive autoencoders for predicting sentiment distributions[C]. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011.
[17] Socher R, Perelygin A & Wu J, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank[C]. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013.
[18] Elman J. Finding structure in time[J]. Cognitive Science, 1990, 14(2): 179-211. https://onlinelibrary.wiley.com/doi/abs/10.1207/s15516709cog1402_1. doi: 10.1207/s15516709cog1402_1.
[19] Liu P, Qiu X & Huang X. Recurrent Neural Network for Text Classification with Multi-Task Learning[J/OL]. arXiv preprint arXiv:1605.05101, 2016.
[20] Ghosh S, Vinyals O, Strope B, et al. Contextual LSTM (CLSTM) models for Large scale NLP tasks[J/OL]. arXiv preprint arXiv:1602.06291, 2016.
[21] Lai S, Xu L & Liu K, et al. Recurrent Convolutional Neural Networks for Text Classification[C]. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
[22] Zhang Y & Wallace B. A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification[J/OL]. arXiv preprint arXiv:1510.03820, 2015.
[23] Snoek J, Larochelle H & Adams R P. Practical Bayesian Optimization of Machine Learning Algorithms[C]. Advances in Neural Information Processing Systems 25, 2012.
[24] Bergstra J, Bardenet R, Bengio Y, et al. Algorithms for Hyper-Parameter Optimization[C]. Advances in Neural Information Processing Systems, 2011.
[25] Bergstra J & Cox D. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures[C]. Proceedings of the 30th International Conference on Machine Learning, 2013: 115-123.
[26] Gulcehre C, Moczulski M, Denil M, et al. Noisy Activation Functions[C]. International Conference on Machine Learning, 2016.
[27] Glorot X & Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, 2010, 9: 249-256.
[28] Nair V & Hinton G E. Rectified Linear Units Improve Restricted Boltzmann Machines[C]. Proceedings of the 27th International Conference on Machine Learning, 2010.
[29] Glorot X, Bordes A & Bengio Y. Deep Sparse Rectifier Neural Networks[C]. Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, 2011: 15.
[30] Goodfellow I, Warde-Farley D & Mirza M, et al. Maxout Networks[C]. Proceedings of the 30th International Conference on Machine Learning, 2013, 28(3): 1319-1327. http://dblp.uni-trier.de/db/conf/icml/icml2013.html#GoodfellowWMCB13.
[31] Collobert R, Weston J & Bottou L, et al. Natural Language Processing (Almost) from Scratch[J]. Journal of Machine Learning Research, 2011, 12(1): 2493-2537.