Abstract
Academic English writing poses additional barriers for researchers with
English as a second language (L2 researchers), including linguistic and cognitive
challenges. Writing assistants can combine multiple solutions in
an integrated environment to help writers improve their English. However,
existing solutions are poorly suited to L2 researchers for several reasons,
including their reliance on L1-English training data and their focus on grammar
and spelling checking. As such, this research proposes the development of Scrido,
an L2 writing assistant. The proposed tool will be first-language oriented and
target an academic audience. It will offer high-level style and usage suggestions,
draw on lexical resources, and provide an easy-to-use yet comprehensive user
interface.
Instead of manually crafting text improvement rules, an automated method
for detecting and suggesting improvements on the academic writing of L2
researchers was developed. The method uses machine learning to train an
intralingual neural machine translation model that translates problematic
L1-influenced sentences into improved English sentences. Training data was
automatically compiled by machine-translating the non-English side of a parallel
corpus and using the translated text as a proxy for L1-influenced sentences,
aligned with the reference translation. To provide examples of authentic
usage in the scientific community and reduce the cognitive burden of consulting
external resources, the writing assistant also includes features developed using
a corpus of high-impact papers that was purpose-built for this study.
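As an illustration of the data-compilation step described above, the following
minimal Python sketch pairs machine-translated output with its reference
translation. The function name build_training_pairs, the translate callable, and
the toy lookup table are assumptions made for illustration; they are not taken
from the study's implementation, and any real machine translation system could
stand in for the stub.

```python
from typing import Callable, Iterable, List, Tuple

# (proxy for an L1-influenced sentence, reference translation)
SentencePair = Tuple[str, str]


def build_training_pairs(
    parallel_corpus: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
) -> List[SentencePair]:
    """Turn (non-English sentence, reference translation) pairs into
    (machine-translated proxy, reference translation) training pairs."""
    pairs: List[SentencePair] = []
    for foreign, reference in parallel_corpus:
        proxy = translate(foreign).strip()
        reference = reference.strip()
        # Assumption: skip pairs where either side is empty.
        if proxy and reference:
            pairs.append((proxy, reference))
    return pairs


# Hypothetical stand-in for a machine translation system.
TOY_MT = {
    "Nous avons fait un test.": "We have made a test.",
}


def toy_translate(sentence: str) -> str:
    return TOY_MT.get(sentence, "")


corpus = [
    ("Nous avons fait un test.", "We conducted a test."),
    ("Phrase inconnue.", "Unknown sentence."),  # stub has no output for this
]
pairs = build_training_pairs(corpus, toy_translate)
```

Each resulting pair uses the machine-translated sentence as the source side and
the reference translation as the target side, which is the form an intralingual
translation model would be trained on.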
Automated and human evaluation both indicate that
machine-translated text can be used effectively as training data for the
development of an intralingual translation engine for L2 users. In addition, this
research revealed how user perceptions varied with participants’ demographics
and yielded recommendations for further improving writing assistants through
personalisation according to users’ L2, research area, and level of
English.