DOI: 10.1145/3397271.3401037

Domain-Adaptive Neural Automated Essay Scoring

Published: 25 July 2020


Automated essay scoring (AES) is a promising yet challenging task. Current state-of-the-art AES models ignore domain differences and cannot effectively leverage data from different domains. In this paper, we propose a domain-adaptive framework to improve the domain adaptability of AES models. We design two domain-independent self-supervised tasks and train them jointly with the AES task. The self-supervised tasks enable the model to capture knowledge shared across domains and act as a regularizer that induces a shared feature space. We further enhance the model's robustness to domain variation via a novel domain adversarial training technique, whose main idea is to train the model with small, well-designed perturbations. We obtain the perturbations via a variation of the Fast Gradient Sign Method (FGSM). Our approach achieves new state-of-the-art performance in both in-domain and cross-domain experiments on the ASAP dataset. We also show that the proposed domain adaptation framework is architecture-free and can be successfully applied to different models.
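To make the perturbation idea concrete, here is a minimal sketch of the standard FGSM step on a toy linear scorer with a squared-error loss. It is not the variation used in the paper, which the abstract does not specify. The model, the parameter values, the function names, and `epsilon` are all hypothetical and chosen only for illustration.

```python
# Illustrative only: standard FGSM on a toy linear scorer, not the paper's variant.
# All names and numbers below are hypothetical.

def predict(weights, bias, features):
    """Linear score: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def squared_error(weights, bias, features, target):
    """Squared difference between the predicted and the target score."""
    return (predict(weights, bias, features) - target) ** 2


def input_gradient(weights, bias, features, target):
    """Gradient of the squared error with respect to the input features."""
    residual = predict(weights, bias, features) - target
    return [2.0 * residual * w for w in weights]


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, features, target, epsilon):
    """Shift each feature by epsilon in the direction that increases the loss."""
    grad = input_gradient(weights, bias, features, target)
    return [x + epsilon * sign(g) for x, g in zip(features, grad)]


weights, bias = [0.5, -1.0, 2.0], 0.1    # hypothetical model parameters
features, target = [1.0, 2.0, 0.5], 1.0  # hypothetical essay representation and score

adversarial = fgsm_perturb(weights, bias, features, target, epsilon=0.1)
clean_loss = squared_error(weights, bias, features, target)
adv_loss = squared_error(weights, bias, adversarial, target)

# An adversarial training step would minimize a combination of both losses,
# for example clean_loss + adv_loss, rather than clean_loss alone.
combined_loss = clean_loss + adv_loss
```

In this toy setting the perturbed input has a higher loss than the clean one, which is what makes it useful as a training signal. In a neural model the gradient would come from automatic differentiation, and the perturbation is often applied to embeddings rather than raw inputs.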

Supplementary Material

MP4 File (3397271.3401037.mp4)
This video is a brief introduction to our work on the paper "Domain Adaptive Automated Essay Scoring".








Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automated essay scoring
  2. domain adaptation
  3. natural language processing
  4. self-supervised learning


  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Key Laboratory of Science Technology and Standard in Press Industry
  • Tencent AI Lab Rhino-Bird Focused Research Program



Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%



Cited By

  • (2024) Dual‐scale BERT using multi‐trait representations for holistic and trait‐specific essay grading. ETRI Journal 46:1 (82-95). DOI: 10.4218/etrij.2023-0324. Online publication date: 28-Feb-2024
  • (2024) Graded Relevance Scoring of Written Essays with Dense Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (1329-1338). DOI: 10.1145/3626772.3657744. Online publication date: 10-Jul-2024
  • (2024) A comparison review of transfer learning and self-supervised learning. Expert Systems with Applications: An International Journal 242:C. DOI: 10.1016/j.eswa.2023.122807. Online publication date: 16-May-2024
  • (2024) A crowdsourcing-based incremental learning framework for automated essays scoring. Expert Systems with Applications: An International Journal 238:PB. DOI: 10.1016/j.eswa.2023.121755. Online publication date: 27-Feb-2024
  • (2023) NC2T: Novel Curriculum Learning Approaches for Cross-Prompt Trait Scoring. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (2204-2208). DOI: 10.1145/3539618.3592027. Online publication date: 19-Jul-2023
  • (2023) Tapping the Potential of Coherence and Syntactic Features in Neural Models for Automatic Essay Scoring. International Journal of Asian Language Processing 32:02n03. DOI: 10.1142/S2717554523500066. Online publication date: 15-Jul-2023
  • (2023) Enhanced cross-prompt trait scoring via syntactic feature fusion and contrastive learning. The Journal of Supercomputing 80:4 (5390-5407). DOI: 10.1007/s11227-023-05640-2. Online publication date: 27-Sep-2023
  • (2023) Learning from Patterns via Pre-trained Masked Language Model for Semi-supervised Automated Essay Scoring. Advanced Intelligent Computing Technology and Applications (497-510). DOI: 10.1007/978-981-99-4752-2_41. Online publication date: 31-Jul-2023
  • (2022) Robust Automated Essay Scoring by Using Attentive Capsule. 2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS) (595-599). DOI: 10.1109/CCIS57298.2022.10016365. Online publication date: 26-Nov-2022
  • (2021) Gated Character-aware Convolutional Neural Network for Effective Automated Essay Scoring. IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (351-359). DOI: 10.1145/3486622.3493945. Online publication date: 14-Dec-2021
