Acquisition of the Mandarin ba-Construction by Cantonese Learners

Optionality, defined as the coexistence of two or more variants of a given construction, has been investigated in second language studies. Following the notion of optionality, this study defines the Mandarin ba-construction and its corresponding non-ba-forms as optional variants to examine Cantonese-speaking learners’ acquisition of Mandarin. We designed an elicited production task and an acceptability judgement task to test the disposal and locational displacement types of the ba-construction, and invited Cantonese learners and Mandarin native speakers to take part in the experiments. The Cantonese learners, like learners with other language backgrounds, produced fewer ba-sentences than native speakers, but they produced many more ba-sentences than learners with other language backgrounds and were aware of the constraints on the ba-construction, which is probably due to the existence of the similar zoeng-construction in Cantonese. Although the Cantonese learners were at an advanced level, their performance diverged from that of the native speakers in both production and judgement, which demonstrates the existence of optionality in the Cantonese learners’ interlanguage grammars.


Introduction
The interlanguage of second language (L2) learners is known to be different from both the native language and the target language (Selinker, 1972; Adjemian, 1976; Birdsong, 1989; Yip, 1995; Yuan, 2007), and the notion of 'optionality' has been put forward to provide a new perspective on interlanguage grammars (Sorace, 2000, 2003). Optionality, which is different from variation, refers to linguistic competence, that is, underlying knowledge (i.e. speakers have the optional variants in their minds but might not utter them in production). The Mandarin ba-construction and its corresponding non-ba-forms are defined as two optional variants in this study to help us explore the nature of optionality.

Optionality in interlanguage
Selinker (1972) introduced the term interlanguage to represent a special language system developed by L2 learners that is independent of their first language (L1) and their L2. Interlanguage is thought to be influenced by the linguistic features of both the L1 and L2, even though it systematically behaves like a natural language (Adjemian, 1976; Yip, 1995). The differences between the end state of child learners and that of adult learners have been noted in the literature (Coppieters, 1987; Birdsong, 1992; Sorace, 1993; Papp, 2000). A child achieves native competence in the language (L1) at the end state, whereas the end state of an adult learner always varies and is characterised by different levels of proficiency in the L2 (sometimes termed 'multicompetence') (Cook, 1995). The interlanguage may cease to develop at a certain stage if the learner reaches a steady state (Lardiere, 1998a; White, 2003). Proposed as a new perspective to capture interlanguage grammars, optionality was defined as the coexistence of two or more variants of a given construction within an individual grammar, where the variants 'make use of the same lexical resources and express the same meaning' (Sorace, 2000:93). The syntactic movement of the adverb, as in (1a), and the dropping of the complementiser 'that', as in (2b), can serve as illustrations of optionality.
(1) a. Mary speaks very well English.
(2) b. I think Paul is very clever. (Sorace, 2003:135)
According to Sorace (2000), French speakers of English tend to produce (1a), which has undergone adverbial movement. The omission of the complementiser 'that' in (2b) is another example of optionality. Sorace (2003) further claimed that optionality differs from 'variation' in that optionality refers to 'a state of grammatical competence', which is neither a necessary nor a sufficient condition for variable performance. Optional variants exist in speakers' minds, but this does not guarantee that the variants will occur in the speakers' speech.
Optionality is found in both L1 child and L2 grammars. The optionality in L2 grammars has been investigated in various studies (Sorace, 1993; Lardiere, 1998b; Papp, 2000; Parodi & Tsimpli, 2005; Domínguez & Arche, 2008). L2 optionality differs from L1 optionality in the following ways: (a) most L2 learners have already achieved full L1 competence, so the L1 may serve as an additional source of L2 optionality; (b) developmental optionality in L1 may end with only one variant, whereas L2 optionality tends to persist even at advanced competence levels; and (c) residual optionality is found in L2 ultimate attainment (Sorace, 2000, 2003). Robertson (2000) investigated the acquisition of English articles by Mandarin learners. Unlike English, Mandarin does not use articles. The dialogue in (3) exhibits both the adoption and the avoidance of articles by Mandarin learners. In this case, Mandarin is the learners' additional source of L2 optionality.
B: The left hand side.
A: Yeah, left hand side. And, er, the distance between blue square to red square is about five cm.
And, er . . . (Robertson, 2000:159)
Unlike the optional variants in L1 acquisition, the optional variants emerge alternately in L2 development, and the optionality persists even when the L2 learners have reached advanced proficiency. By that stage, the target variant would be 'strongly but not categorically preferred', but the non-target variant would never be 'completely expunged'; rather, the non-target variant would still appear occasionally in near-native grammars (Sorace, 2000, 2005). Sorace (2000) also proposed that L2 grammars have a 'much greater degree of tolerance' for the optional variants than native grammars.
Apart from the distinction in word order, there are further constraints on the ba-NP and ba-VP. The ba-NP must be either definite or generic (Cui, 1995; Lü, 2002 [1955]; Mullie, 1932:178-185). The constraint on the ba-VP is that bare verbs are prohibited; namely, the VP must be morphologically complex, and should be either (a) a resultative verbal complement (RVC), (b) a verb plus an aspect marker, or (c) a verb plus a locative prepositional phrase (PP) (Li & Thompson, 1981:463-491; Liu, 1997). An alternative view comes from Feng (1995, 2001), who argued that the ba-construction might be subject to a prosodic constraint and thus a bare-verb construction might also be allowed. Moreover, Yang (1998a, 1998b) proposed that the ba-construction must satisfy the aspectual features [+telic] and [+perfective], so the VP must be complex to indicate the telicity of the activity. Wang (1987 [1943]:124-130) put forward the notion of disposal and suggested that the ba-construction must carry a semanteme that deals with something. The ba-construction may also be used in other cases (see Lü (1994) for a summary).
Lü (1994) classified 1,094 ba-sentences into six semantic types and calculated the proportion of each type. The results showed that the disposal and locational displacement (location) types occurred the most frequently, with the former constituting 49.8% of all the sentences (546 out of 1,094) and the latter 27.8% (305 out of 1,094). Based on Lü's classification, this study investigated these two kinds of ba-construction: the location ba-construction as illustrated in (6a) and the disposal ba-construction as illustrated in (6b). The ba-NPs in the location type undergo locational displacement, as in 'being moved away' in (6a), while the ba-NPs in the disposal type are affected by the action, as in 'being erased' in (6b).
Previous studies on L2 acquisition of the ba-construction mainly focused on the development of the ba-construction in L2 grammars (Du, 2006; Huang & Yang, 2004; Jin, 1993; Jin, 2010; Wen, 2012; Yu, 2000). Mandarin belongs to the Sino-Tibetan language family, but the L1s of the participants in the previous studies were either Indo-European languages (English and Norwegian) or Altaic languages (Japanese and Korean). Different methods were adopted in these studies, among which the elicited production task was the most popular. In general, these studies showed that although participants of different levels exhibited some awareness of the constraints on the ba-construction and were able to produce correct ba-sentences, they used the construction much less often than the native control groups. However, no satisfactory explanations were provided for this finding.
In Cantonese, the zoeng-construction is regarded as the counterpart of the Mandarin ba-construction (Tang, 2002). As shown in (7) and (8), the zoeng-construction shares a very similar syntactic structure with the ba-construction, but there are some apparent differences between them (Cheung, 1992; Chen, 2005). For example, the ba-construction accepts all kinds of subject nouns, but the subject of the zoeng-construction must be a [+animate] agent. Also, while the ba-construction is widely used in Mandarin, the use of the zoeng-construction is very restricted, i.e. it occurs only in disposal cases and in very formal situations. Still, the Cantonese zoeng-construction may have some effect on the interlanguage of Cantonese learners and serve as a source of L2 optionality.

This study
To fill the gaps in the field, this study attempts to observe and explain the acquisition data from a different perspective, namely, by positing the existence of optionality in the participants' interlanguage grammars. We define optionality as involving two variants (the ba-construction, as in (4b), and the non-ba-construction, as in (4a)) and investigate whether optionality exists in the acquisition of the Mandarin ba-construction by Cantonese speakers. The following are the research questions this study attempts to address:
1) Do the L2 learners have a preference for the ba-construction or the non-ba-construction?
2) Is there any difference between the acquisition of the disposal ba-construction and the acquisition of the location ba-construction?
3) How tolerant are the L2 learners of ba-sentences and the corresponding non-ba-sentences?

Methodology
We designed an elicited production experiment and an acceptability judgement experiment, which were conducted one-on-one in a quiet room. All the participants gave their written informed consent prior to the experiments, and the production experiment always preceded the judgement experiment.

Participants
The same participants were recruited for the two experiments. Their background information is provided in Table 1. The Cantonese group consisted of 30 Cantonese-speaking undergraduates from the Chinese University of Hong Kong who were born and raised in Hong Kong. A background questionnaire was prepared to assess the participants' language backgrounds and guarantee their proficiency in Mandarin. Additionally, an independent test of Mandarin listening and comprehension was administered at the end of each experiment. The highest possible score for the independent test was 20, and participants' scores ranged from 17 to 20 (M = 18.97), from which we can conclude that the Mandarin proficiency of the participants was at the advanced level. Twenty native speakers of Mandarin formed the control group. Since the Chinese spoken in Northern China is closest to Standard Mandarin, only those who were born and raised in Northern China were recruited.

Experiment 1: Elicited production
The first experiment was an elicited production test with 24 sets of pictures as the stimuli. Among the 24 sets of pictures, 12 each describe one situation that involves either the case of location or the case of disposal as target sentences (six sets for each case), and the other 12 are fillers. Figures 1 and 2 are examples of the stimuli for the target sentences, and Figure 3 illustrates a case of the fillers. The picture stimuli used in this test were originally designed and prepared by the author and our artists.
Because students in Hong Kong are required to learn written Chinese (which generally follows the grammar of Mandarin), the data were collected orally to avoid the influence of the written form (Chinese characters). During the experiment, the participants were instructed to describe each set of pictures in one sentence in Mandarin. They were allowed to repeat or correct the sentences they produced. Before the experiment, there was a practice session with another two sets of pictures for the participants to become familiar with the task. The entire task was audio-recorded to guarantee that the answers were recorded accurately.
To process the data, the target sentences in the recordings were first transcribed. Next, the author manually coded all the sentences and identified whether each sentence was a ba-sentence or a non-ba-sentence. The ba-sentences were coded as 'ba' (labelled as '1') and the non-ba-sentences as 'non-ba' (labelled as '0'). For the data analysis, we fitted linear mixed-effects models using the 'lme4' package (Bates, Mächler, Bolker & Walker, 2015) in R (R Core Team, 2018). In model construction, Response ('ba' or 'non-ba') was the dependent variable. Language (i.e., L1, 'Mandarin' or 'Cantonese') and Type ('Disposal' or 'Location') were included as the fixed effects, and Participant and Trial were added as the random effects. Likelihood ratio tests were used to determine whether the effects of the independent variables reached significance. Besides the quantitative analysis, we also examined the ba-NP and ba-VP in detail to see whether the constraints on the ba-construction had been acquired.

Experiment 2: Acceptability judgement
In the acceptability judgement test, there were 48 trials in total. Twelve of them were sentences with the ba-construction, as in (4b), and another 12 were simple declaratives, as in (4a). The remaining 24 were filler sentences. There were six disposal ba-sentences and six location ba-sentences, all of which are grammatical, while their corresponding simple declaratives are either barely acceptable or unacceptable. The grammaticality of the fillers also varies. All the stimuli were randomised and recorded by two native speakers of Mandarin prior to the experiment.
During the experiment, the participants were presented with the audio stimuli and asked to make a judgement after listening to each Mandarin sentence. Since linguistic data occasionally fail to provide a 'clear-cut division' between fully acceptable/grammatical and fully unacceptable/ungrammatical, and gradience always arises because the data vary in their degree of acceptability/grammaticality (Sorace & Keller, 2005), we designed a rating scale in which the participants were required to rate each sentence on a five-point Likert scale (Likert, 1932) from 'Completely unacceptable' (marked as '1') to 'Completely acceptable' (marked as '5'). Our rating scale of acceptability judgement was similar to that used in Ma (2008), except that no specific descriptions of levels 2, 3 and 4 were provided to the participants in our experiment.
For the data analysis, we applied ordinal logistic regression modelling to the rating responses. Although we presented a five-point rating scale to the participants, it would be unsafe to assume that the distance from '1' to '2' is equivalent to the distance from '2' to '3'. The advantage of ordinal logistic regression is that the model assumes only the ordering of the responses and makes no assumption about the distances between them (Harrell, 2015). In the models, Response was included as the dependent variable, and Ba ('ba' or 'non-ba'), Language ('Mandarin' or 'Cantonese') and Type ('Disposal' or 'Location') were the independent variables. The models were fitted with the 'MASS' package (Venables & Ripley, 2002) and the figures were plotted with the 'ggplot2' package (Wickham, 2016). As before, likelihood ratio tests were used to determine whether the effects of the independent variables reached significance.
Table 2 is an overview of the production data. The Cantonese learners produced 252 ba-sentences, which constituted 70% of the 360 sentences they produced in total. The Mandarin speakers produced 212 ba-sentences, accounting for 88% of all their sentences. We first fitted linear mixed-effects models with all the data we collected. There was an effect of Language (χ²(1) = 8.684, p = .003), indicating that the Mandarin group produced more ba-sentences than the Cantonese group. Neither the effect of Type (χ²(1) = .513, p = .474) nor the interaction of Language and Type (χ²(1) = 1.379, p = .240) reached significance, from which we can infer that the participants did not distinguish between the two types of ba-sentences in their production. Next, we analysed the production data of each group separately. Again, the models showed that each group performed similarly across the two types of ba-sentences (for the Mandarin group: χ²(1) = 1.313, p = .252; for the Cantonese group: χ²(1) = .123, p = .726).
Lastly, we separated the data by type of ba-sentence to see whether the two groups differed within each type. According to the models, the Mandarin group produced more ba-sentences than the Cantonese group for both the disposal type (χ²(1) = 9.545, p = .002) and the location type (χ²(1) = 5.728, p = .017).
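As a rough arithmetic check on the production figures reported here, the following sketch recomputes the group percentages from the reported counts and converts the reported likelihood ratio statistics into p-values. It relies on the fact that a chi-square variable with one degree of freedom has the upper-tail probability erfc(√(x/2)). The Mandarin denominator of 240 (20 speakers × 12 target trials) is an assumption inferred from the design, and all function and variable names are hypothetical; this is an illustration in Python, not the R code used in the study.

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For df = 1, P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Counts reported for the production task.
cantonese_ba, cantonese_total = 252, 360  # 360 is stated in the text
mandarin_ba, mandarin_total = 212, 240    # 240 is assumed: 20 speakers x 12 trials

cantonese_rate = cantonese_ba / cantonese_total
mandarin_rate = mandarin_ba / mandarin_total

# Likelihood ratio statistics reported in the text, each with df = 1.
reported_stats = {
    "Language (all data)": 8.684,
    "Type (all data)": 0.513,
    "Language x Type": 1.379,
    "Language (disposal only)": 9.545,
    "Language (location only)": 5.728,
}
p_values = {name: chi2_sf_df1(stat) for name, stat in reported_stats.items()}

print(f"Cantonese ba-sentence rate: {cantonese_rate:.1%}")
print(f"Mandarin ba-sentence rate:  {mandarin_rate:.1%}")
for name, p in p_values.items():
    print(f"{name}: chi2(1) = {reported_stats[name]:.3f}, p = {p:.3f}")
```

The recomputed values can be compared directly with the percentages and p-values given in the text; the closed-form tail probability is used only because every test reported here has a single degree of freedom.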

Elicited production
We then examined the ba-NPs and ba-VPs in detail. Table 3 shows the different types of ba-NPs produced by the two groups of participants, the distribution of which was similar across groups. More than half of the ba-NPs were bare nouns, which normally refer to objects already known to both the speaker and the listener, so bare nouns are well in line with the [+definite] constraint on the ba-NP. In addition, the participants replaced the nouns with pronouns in multiclausal sentences. The remaining ba-NPs were all combinations of a modifier and a head noun, where the modifier was either a possessive element or a relative clause preceding the noun. Both the pronouns and the modified nouns are definite. Therefore, no errors were found concerning the semantic requirement on the ba-NP.
Table 4 presents the distribution of ba-VPs produced by the two groups. A similar distribution pattern was again observed for the two groups. Of all the types, 'RVC (+ le)' was the most frequently used type of ba-VP, followed by 'V + P (+ le) + L', 'V + P (+ le) + N' and 'V + le + (C)'. These four types constituted 95.07% of all the ba-sentences (95.26% for the Cantonese group and 94.86% for the Mandarin group). They clearly indicate the results of changes to the objects, which fits the semantic and pragmatic requirements of the ba-construction. A closer examination of the ba-NPs and ba-VPs showed that the Cantonese learners have acquired the constraints on the ba-construction and are able to produce appropriate ba-sentences. Although the Mandarin speakers produced more ba-sentences than the Cantonese learners at the group level, three Cantonese learners produced 12 ba-sentences each; that is, they used the ba-construction for all the target sentences, which is more often than some of the native speakers did. At the end of the next section, we compare the ratings of these three participants with those of the Mandarin group.

Acceptability judgement
This part reports the results of the acceptability judgement task. The responses to the 24 target sentences (12 ba-sentences and 12 corresponding non-ba-sentences) were analysed and are presented in Figure 4. There were significant effects of the variables Ba ('ba' or 'non-ba'; χ²(1) = 185.734, p < .001) and Language ('Mandarin' or 'Cantonese'; χ²(1) = 5.638, p = .018), but no effect of the variable Type ('Disposal' or 'Location'; χ²(1) = 0.692, p = .405). An interaction between Ba and Language was also found (χ²(1) = 55.142, p < .001), suggesting divergence between the two groups in the rating of the ba-sentences and non-ba-sentences. We then separated the data into the two language groups and fitted different models to examine the effects of Ba and Type on the rating scores. For the Mandarin speakers, there were effects of Ba (χ²(1) = 181.491, p < .001) and Type (χ²(1) = 9.554, p = .002) as well as an interaction between the two variables (χ²(1) = 5.335, p = .021). Post-hoc tests revealed two points: (a) the Mandarin speakers rated the ba-sentences higher than the non-ba-sentences (p < .001); and (b) for the non-ba-sentences, the Mandarin speakers rated the Location type lower than the Disposal type (p < .001). For the Cantonese speakers, only the effect of Ba reached significance (χ²(1) = 44.195, p < .001). Although the Cantonese speakers also gave higher scores to the ba-sentences than to the non-ba-sentences (p < .001), they did not distinguish the Location and Disposal types for the non-ba-sentences (p = .594).
Next, we divided the data into ba- and non-ba-sentences, and tested whether the effects of Language and Type on the rating scores reached significance. For the ba-sentences, there was an effect of Language (χ²(1) = 55.113, p < .001), and the Mandarin speakers rated the ba-sentences higher than the Cantonese learners (p < .001). No effect of Type or interaction between Language and Type was found, suggesting that each group rated the two types of ba-sentences consistently. For the non-ba-sentences, there were effects of Language (χ²(1) = 8.245, p = .004) and Type (χ²(1) = 5.424, p = .020) as well as an interaction between the two variables (χ²(1) = 13.919, p < .001), indicating that the ratings of the non-ba-sentences differed by Type and that, overall, the Mandarin speakers gave lower scores than the Cantonese learners. Although the two groups did not differ in rating the Disposal type non-ba-sentences (p = .585), the Mandarin speakers gave lower scores than the Cantonese learners to the Location type non-ba-sentences (p < .001).
Lastly, we extracted the rating data from the three Cantonese learners (C03, C27 and C33) who produced 12 ba-sentences in the production experiment to compare them with the Mandarin rating data. The results are presented in Figure 5. We first fitted models with data from the three learners as a group, and then fitted models for each learner separately. At the group level, there was only a marginal trend for Language on the rating scores (χ²(1) = 2.675, p = .102), hinting that the three learners may still have behaved differently from the Mandarin speakers. In addition, the Mandarin speakers gave higher scores to the ba-sentences than the three Cantonese learners did (χ²(1) = 9.845, p = .002).
At the individual level, Participant C03 did not differ from the Mandarin group (χ²(1) = .136, p = .712), either for the ba-sentences (χ²(1) = .285, p = .593) or for the non-ba-sentences (χ²(1) = .944, p = .331), while the other two participants' ratings of the ba-sentences differed from those of the Mandarin group (χ²(1) = 9.845, p = .002 for Participant C27 and χ²(1) = 9.845, p = .002 for Participant C33). Nevertheless, an obvious difference can be seen in the data: the three learners, unlike the native speakers, used only four of the five points on the rating scale. The point they did not choose is 'Completely unacceptable' (as shown in Figure 5), indicating that the learners were more tolerant of the non-ba-sentences than the native speakers, as predicted by Sorace (2000).
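The ordinal logistic regression applied to the rating data can be illustrated with a minimal proportional-odds sketch. It shows how ordered thresholds and a single linear predictor yield probabilities for the five rating categories without assuming equal distances between scale points, using the parameterisation logit P(Y ≤ k) = θ_k − η. The threshold values and the coefficient for Ba below are hypothetical numbers chosen for illustration, not estimates from this study, and the function names are likewise assumptions.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def rating_probabilities(thresholds, eta):
    """Category probabilities under a proportional-odds model.

    P(Y <= k) = logistic(theta_k - eta) for each threshold theta_k, so K - 1
    increasing thresholds define K ordered categories.
    """
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical thresholds between ratings 1|2, 2|3, 3|4 and 4|5.
# They are deliberately unevenly spaced: only their order matters.
THRESHOLDS = [-2.5, -1.0, -0.2, 1.5]

# Hypothetical coefficient for Ba ('ba' = 1, 'non-ba' = 0).
BETA_BA = 2.0

non_ba = rating_probabilities(THRESHOLDS, eta=0.0)
ba = rating_probabilities(THRESHOLDS, eta=BETA_BA)

for rating, (p_non_ba, p_ba) in enumerate(zip(non_ba, ba), start=1):
    print(f"rating {rating}: non-ba {p_non_ba:.3f}, ba {p_ba:.3f}")
```

With a positive coefficient, probability mass shifts towards the higher ratings for ba-sentences, which is the kind of pattern the effect of Ba describes; the thresholds themselves are estimated from the data in an actual model fit.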

Discussion
This study asked the following research questions: (a) Do the L2 learners have a preference for the ba-construction or the non-ba-construction? (b) Is there any difference between the acquisition of the disposal ba-construction and the acquisition of the location ba-construction? (c) How tolerant are the L2 learners of ba-sentences and the corresponding non-ba-sentences? The L2 learners produced 70% ba-sentences in the production task with very few errors, and the result revealed a clear preference for the ba-construction over the non-ba-construction. With regard to the two types of ba-construction tested in this study, the L2 learners did not show any difference between the disposal and location ba-sentences in either task. When they were required to rate the acceptability of the Mandarin sentences, the L2 learners, like the native speakers, gave higher scores to the ba-sentences than to the non-ba-sentences, but divergence was found in the rating: the L2 learners did not rate the ba-sentences as high as the native speakers did, nor did they rate the non-ba-sentences as low as the native speakers did.
It can be seen from our data that the Cantonese learners performed relatively well in the two tasks; they produced more target ba-sentences than non-target forms in production, and they also gave higher ratings to the ba-sentences in the acceptability judgement task. Our data suggest that the ba-construction is learnable and acquirable for the Cantonese learners. Although the L2 learners did not use the ba-construction as frequently as the L1 speakers, they were aware of the constraints on the ba-construction and were able to express themselves with appropriate ba-sentences.
Compared with the learners in previous studies on the L2 acquisition of the ba-construction, the Cantonese learners in this study performed much better, as they produced noticeably more ba-sentences than learners with other L1s (e.g. only 27.1% were produced by advanced English learners in Wen (2012)). A plausible explanation is that the similarities in linguistic features between Cantonese and Mandarin may have contributed to the more successful acquisition of the ba-construction by the Cantonese learners, whereas the absence of a counterpart in the L1s may result in less successful acquisition by learners with other language backgrounds (Mai, 2016). Without any prior knowledge of such a structure in their L1s, it is very difficult (if not impossible) for learners to acquire the complex structure and the constraints of the ba-construction, and this consequently makes the learners tend to avoid using it (Yu, 2000). However, one might argue that the observed divergence between the Cantonese learners and English learners may result from differences in the test design rather than from the L1s. To test whether the claim of L1 influence holds true, speakers with other L1s should take part in the same experiment for a direct comparison with our data from Cantonese learners.
Although the Cantonese learners in our study outperformed the learners from previous studies, there was still divergence between the Cantonese learners and native speakers in both the production and judgement tasks, suggesting that the Cantonese learners have not reached nativelike competence in the use of the ba-construction. In the production task, three Cantonese learners adopted the ba-construction for all the target trials, which might appear to be a counterexample to this claim. But when we examined the judgement data of the three learners in detail, we found that none of them rejected the non-ba-sentences as strongly as the native speakers (as shown above, they did not choose 'Completely unacceptable'). Thus, the 'residual optionality' in our data adds evidence to the existing literature on incompleteness in L2 ultimate attainment (Cheng & Tang, 2016; Zheng, 2018). According to the Interface Hypothesis (Sorace, 2011; Sorace & Filiaci, 2006), narrow syntactic properties can be fully acquired by L2 learners, while properties at the interface of different linguistic domains (e.g. the syntax-pragmatics interface) are particularly challenging and may not reach nativelike competence in ultimate attainment. The ba-construction, as reviewed above, has syntactic and semantic constraints and is not a purely syntactic structure. It is therefore not surprising that the advanced learners of Mandarin did not show nativelike performance in our tasks.

Conclusion
In this study, we conducted an elicited production experiment and an acceptability judgement experiment to explore the acquisition of the Mandarin ba-construction by Cantonese learners. The Cantonese learners, like learners with other language backgrounds, produced fewer ba-sentences than native speakers, but they produced many more ba-sentences than learners with other language backgrounds and were aware of the constraints on the ba-construction, which is probably due to the existence of the similar zoeng-construction in Cantonese. Although the Cantonese learners were at an advanced level, their performance diverged from that of the native speakers in both the production and the judgement experiments, which demonstrates the existence of optionality in the Cantonese learners' interlanguage grammars.