Comparing Afrikaans Morpho-syntactic Variation in a South African Underdeveloped Context

Abstract: This paper is a qualitative and quantitative evaluation of a particular methodology for uncovering dialectal morpho-syntactic variation. It outlines the challenges facing linguists on a project to document morpho-syntactic variation in Afrikaans in the Cape Provinces of South Africa and presents some novel data on the typology of expletives in Afrikaans. South Africa's linguistic situation is precarious, rapidly changing and driven by underdevelopment. This poses particular challenges, at both practical and methodological levels, to the organization of a comparative dialectological survey; these challenges are perhaps slightly different from the issues faced in the European contexts where the methodology was developed. The preliminary results have uncovered previously unknown variation with respect to expletives.

the only West-Germanic language to systematically lack verbal agreement morphology of any kind, it includes a variety of constructions which are not found in other West-Germanic languages and it evolved relatively recently (since 1652) in a context of intense language contact, thus providing insights into creole formation, linguistic diffusion and the theoretical nature of language change.
In order to explore these issues further, a project was set up to study morpho-syntactic variation in the Afrikaans spoken in the Northern, Western and Eastern Cape provinces. The project uses methodologies that were originally developed and widely tested in European contexts, so the question remains as to their applicability in the underdeveloped contexts of the rural Cape provinces. This paper explores some of the results of the project to date, arguing that while the method does return valid results, the roles of the interactants in the data-collection process have a crucial bearing on the overall "instrument". It will be argued that the methodologies cannot be treated as context-free and that a sensitivity to pragmatics is essential to their success.
The Afrikaans Variation Project is a three-year collaborative project between the Linguistics Departments of Rhodes University, Stellenbosch University and the Meertens Institute in the Netherlands. As such it creates a space for collaboration both nationally and internationally. The project has several aims, chief of which is to document the typology of morpho-syntactic variation in Afrikaans, specifically focussing on the Afrikaans spoken in the Cape Provinces. In doing so, it develops a network of language respondents who can hopefully constitute a useful community-based linguistic resource. The data will be made available publicly and online, a first for a project of this kind in South Africa. A preliminary database with a partial set of data can be found at http://www.meertens.knaw.nl/sanpad enduser/; this database will be extended as more results become available during the course of 2010. Hopefully, the entire project will ultimately contribute in a small way towards retaking the cultural discourse space that has been colonized by normative perceptions of Afrikaans and what it means to be a speaker of the language.
This paper develops its argument in three broad sections. Section 2 qualitatively evaluates the methodology, Section 3 provides a quantitative assessment and Section 4 concludes the article with some implications about the ways in which linguists view their instruments.

A qualitative assessment of the methodology
The methodology is broadly similar to the semi-structured interview technique described by Cornips and Poletto (2005), Cornips and Jongenburger (2001) and Cornips (2006). In particular, it is similar to the method used in Belgium during the SAND project (Barbiers et al. 2007) where fieldworkers used a questionnaire/stimulus-sheet to elicit (i) indications of whether a particular linguistic variable was used in a particular speech community and (ii) a paraphrased utterance from a speaker of the local variety.
Interviews were conducted in a home or neutral venue by an L1 field interviewer, assisted by another interviewer who occasionally asked questions but otherwise remained in the background. Each interview took about 45 minutes to complete, with an additional 15 minutes of general discussion at the beginning. Early in the project, interviews took substantially longer, but with practice they averaged around 45 minutes. In an attempt to relax the respondents and allow them to get used to the interview environment, interviews commenced with general questions about the participant's background and personal details, as well as requests for stories about the area (e.g. local ghost stories). After a while, the interviewer started using the questionnaire as a stimulus to elicit indirect grammaticality judgements (Cornips and Jongenburger 2001). Here is an extract from a particular stimulus set relating to the distribution of the past-participle prefix ge- in verb clusters (i.e. the IPP effect and anti-IPP effect). ①
(1) a. Ek weet dat Jan die huis gebou het. Yes: 1 2 3 4 5
I know that Jan the house pst-build have.
"I know that Jan (has) got the house built."
① Note that since Afrikaans has no preterite past tense (past tense is expressed periphrastically with an auxiliary), the examples in (1) are ambiguous between present perfect and past perfect readings.
All West-Germanic languages which use a prefix (e.g. "ge-") to indicate past tense (1a) have complementary distribution of the prefix with a left-adjacent modal or aspectual verb; in other words, the ge- prefix is absent in these contexts (1b). This is known as the Infinitivum Pro Participio (henceforth IPP) puzzle. Thus examples (1a,b) illustrate the pattern which is also found in Dutch and similar languages. What is particularly striking about Afrikaans is that in some colloquial varieties the prefix ge- makes an appearance, either on the aspectual/linking verb (1c) or sometimes on the lexical verb itself (1d). It has also been claimed that in some varieties of Afrikaans, such as Orange River Afrikaans, the ge- prefix is occasionally doubled (1e) (Du Plessis, p.c.). Examples (1c) through (1e) are, as far as we know, unique to Afrikaans. From the perspective of the questionnaire, what is important to take from these examples is simply that they test all possible combinations of the past tense marker. Respondents were asked to provide a series of judgements, in accordance with the procedure outlined by Cornips and Jongenburger (2001:57). They were asked (i) whether the stimulus was used in the particular speech community and (ii) what paraphrase the respondent him- or herself would actually use. In addition, (iii) follow-up questions were sometimes asked about how frequent the construction was, whether it was used by older or younger speakers, etc. An extract from an interview is provided below (A = Main interviewer, S = Respondent).
(2) a. A: Dan's daar hierdie sinnetjie: Die kinders weet dat Jan die huis gebou het.
Then there is this sentence: the children know that Jan built the house
S: mmm.
A: Die kinders weet dat Jan die huis gebou het.
The children know that Jan built the house
S: Ja. Dis algemeen, ja.
Yes. It's common, yes
A: Algemeen.
What these examples demonstrate is that the responses to the questions were seldom binary and that there was a continual need for the interviewers to ask for disambiguation, ask follow-up questions, subtly challenge the respondent to provide qualifying information, etc. As such, the interviewers dynamically perform an evaluative role. Related to this issue of dynamic interpretation is the use of the Likert scale. The use of numbers proved unwieldy as respondents did not seem to "get" the numbers. They consistently preferred verbal cues such as baie, min mense praat so, algemeen, baie selde etc. However, the interviewers experienced problems in mapping a verbal response such as algemeen to a numeric scale. The result was that the Likert scale was largely abandoned and the verbal responses were included in the transcriptions. It is important to note, however, that this change in interview practice does not entail an endorsement of binary grammaticality judgements. It is merely an acceptance that linguists wanting to use the data must be prepared to interpret it after the fact in the light of these qualifying statements by the respondents.

Piloting
The questionnaire was based on examples of variation from existing literature. At a workshop, Dutch and South African academics contributed ideas on areas of known variation which could be included in the survey. Naturally, not all of these variables could ultimately be represented in the questionnaire, which included questions on EPP, IPP, verb clusters, Verb Projection Raising and various types of doubling. There was also a broader range of stimulus styles. For instance, some stimuli were translation tasks, while others were sentences with "gaps" where the respondent had to insert a word. In addition, many stimuli included multiple variables (e.g. embedded verb-second combined with saam "together"). Written paraphrase and translation tasks were excluded by design, as it was felt that, owing to constraints on functional literacy, respondents might have had trouble with this kind of task.
The first version of the questionnaire was piloted at Paardekraal (EC) in mid 2007. Some questions were added and some discarded and/or rephrased. This process continued during fieldwork in the North-Eastern Cape. In particular, the "gap"-type stimuli were problematic, as the respondents simply could not parse a sentence with a word missing. In addition, stimuli with multiple variables proved difficult insofar as it was not always clear which variable the respondent was responding to. This required additional questioning to disambiguate the answers. Researchers from Stellenbosch also undertook fieldwork in the Western Cape. These fieldwork experiences were then brought to bear on the questionnaire at a subsequent workshop, where the questionnaire was slimmed down substantially and a number of additional questions were added. The questionnaire was then made available on the web site of the project so that other members of the research group could comment on it.

2.2 Data handling
Interviews were recorded digitally in uncompressed format (WAV, 44.1 kHz) using a Marantz PMD660 solid-state recorder, a SanDisk Ultra II 4 GB CompactFlash card as the recording medium and an Electro-Voice RE90L omnidirectional lavalier microphone with phantom power. The sound files were annotated in Praat. Separate tiers were used to annotate the interviewer, assistant interviewer, respondent, stimulus number, additional notes and a pragmatically informed judgement of whether the respondent had agreed or disagreed with the stimulus question. The data were extracted to a searchable MS Access database. The audio of the respondents' responses was also extracted so that the precise response to any stimulus can be listened to and interpreted independently.
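To make the idea of flattening tiered annotations into a searchable table concrete, the following is a minimal sketch only. It uses Python's built-in sqlite3 module rather than the MS Access database used by the project, and the table layout, column names, respondent codes and the 'unclear' judgement value are all hypothetical; the sample rows are invented for illustration, although the utterances in them are taken from the extract in (2).

```python
import sqlite3

# Hypothetical flat layout: one row per stimulus presented to one respondent.
# The columns loosely mirror the annotation tiers described in the text.
SCHEMA = """
CREATE TABLE response (
    respondent   TEXT NOT NULL,
    stimulus_no  TEXT NOT NULL,
    interviewer  TEXT,   -- interviewer tier (transcribed speech)
    assistant    TEXT,   -- assistant interviewer tier
    reply        TEXT,   -- respondent tier, verbal cues kept verbatim
    notes        TEXT,   -- additional notes tier
    judgement    TEXT CHECK (judgement IN ('agree', 'disagree', 'unclear'))
)
"""


def build_database(rows):
    """Create an in-memory database and load the annotation rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO response VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def judgements_for(conn, stimulus_no):
    """Return (respondent, reply, judgement) tuples for one stimulus."""
    cur = conn.execute(
        "SELECT respondent, reply, judgement FROM response "
        "WHERE stimulus_no = ? ORDER BY respondent",
        (stimulus_no,),
    )
    return cur.fetchall()


# Invented sample rows; respondent codes R01/R02 are placeholders.
SAMPLE_ROWS = [
    ("R01", "1a", "Die kinders weet dat Jan die huis gebou het.", None,
     "Ja. Dis algemeen, ja.", None, "agree"),
    ("R02", "1a", "Die kinders weet dat Jan die huis gebou het.", None,
     "mmm.", "needs a follow-up question", "unclear"),
]

conn = build_database(SAMPLE_ROWS)
for row in judgements_for(conn, "1a"):
    print(row)
```

Keeping the verbatim reply next to the coded judgement reflects the practice described above: the coded value is an interpretation, and a later user of the data can always go back to what the respondent actually said.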

Participants
To date, most of the research has been done in the Cape Provinces, the only provinces with a large Afrikaans majority. In the Western Cape, Afrikaans is spoken by more than 50% of the population and by more than 70% in the Northern Cape. In the Eastern Cape, Afrikaans is spoken by a majority in the western municipalities, with Xhosa gradually becoming the majority language in the east (Statistics South Africa 2007). Participants were selected according to the following criteria: (i) that they speak Afrikaans as L1 and use it predominantly at home; (ii) that they be born in the area and have not moved away from the area for more than seven years; (iii) that they be older than 55 and preferably even older; (iv) that they be preferably from the working class or lower-middle class; (v) that two men and two women from each datapoint be selected. The participants were drawn from the Northern and Eastern Cape regions.
In addition to being geographically distant, the Northern Cape is known for its regional variation (including Namakwaland Afrikaans (Links 1989) and Orange River varieties (Rademeyer 1938, Du Plessis 1984, Van Rensburg 1984, 1989a, Verhoef 1988, Henning 1983, Roux 1988, Du Plessis 1985, Van Rensburg 1989b, 1990, Webb 1993)), while the Eastern Cape is supposedly the region from which standard Afrikaans originated.
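The selection criteria above can be read as a simple filter plus a quota per datapoint. The sketch below is illustrative only and is not the procedure the fieldworkers followed: the Candidate record, its field names and the sample pool are hypothetical, and ranking eligible candidates from oldest to youngest is just one assumed way of honouring the "preferably even older" preference. Criterion (iv), social class, is a preference rather than a hard requirement and is left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    sex: str             # "M" or "F"
    age: int
    l1_afrikaans: bool   # Afrikaans as L1 and main home language
    born_in_area: bool
    years_away: int      # total years spent living outside the area


def is_eligible(c: Candidate) -> bool:
    """Hard criteria (i)-(iii) from the text."""
    return c.l1_afrikaans and c.born_in_area and c.years_away <= 7 and c.age > 55


def select_for_datapoint(candidates, per_sex=2):
    """Criterion (v): up to `per_sex` eligible men and women, oldest first."""
    chosen, counts = [], Counter()
    for c in sorted(candidates, key=lambda c: c.age, reverse=True):
        if is_eligible(c) and counts[c.sex] < per_sex:
            chosen.append(c)
            counts[c.sex] += 1
    return chosen


# Invented pool for a single datapoint.
pool = [
    Candidate("A", "F", 72, True, True, 0),
    Candidate("B", "F", 58, True, True, 3),
    Candidate("C", "F", 64, True, True, 10),   # away for too long
    Candidate("D", "M", 61, True, True, 7),
    Candidate("E", "M", 49, True, True, 0),    # too young
    Candidate("F", "F", 80, True, False, 0),   # not born in the area
]

print([c.name for c in select_for_datapoint(pool)])
```

In this invented pool only one man qualifies, so the quota of two men is not met; this mirrors the difficulty, reported later in the paper, of finding respondents of the required demographic at some datapoints.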
To provide an idea of the respondents, here are some details from those that are currently in the database. These are the same respondents whose data are cited later in this paper.

Participant selection and the observer's paradox
One means of ameliorating the observer's paradox is to gain entry into a linguistic network. Since many networks include patterns of kinship, allegiance and obligation, having an insider vouch for one is an excellent technique for gaining entry into a particular network. This provides the fieldworker with an intermediate status between insider and outsider. For this reason, we used the "friend-of-a-friend" technique (Milroy 1980, 1987) as much as possible, where, according to Milroy, "it was possible to equip myself with a status which was neither that of insider, nor that of outsider, but something of both, a friend of a friend, or more technically, a second order network contact" (Milroy 1980:44). On arrival in a data location, the fieldworker would opportunistically ask people s/he met if they knew somebody s/he could talk to. For instance, in Kakamas (NC), we asked people working at a local supermarket if they knew a pastor in the area. It just so happened that a pastor was leaving the shop at that moment. In the rush to meet him before he got into his car, the fieldworker ran headlong into a glass door. Since the fieldworker had been embarrassed in an effort to speak to him, this created an immediate sense of closeness between the pastor and the fieldworker, and he was quite happy to provide additional contacts from the community whom we could talk to. Furthermore, in approaching these contacts, we were able to use the name of the pastor as a kind of network capital with which to gain access. In another case, in Cradock (EC), we approached an office worker at the place we were staying. On hearing we were researching Afrikaans dialects, she mentioned she was doing an ATKV writing course. Since one of the original applicants for the research project was Prof. Du Plessis, who was the head of the ATKV writing school, this provided a shared acquaintance which facilitated communication in the network.
In some cases, this initial contact personally introduced us to additional respondents. In other cases, we simply introduced ourselves along the lines of: "We have just talked to X and s/he thought you would be able to help us". In addition, we asked every respondent if they knew somebody else we could talk to, or somebody in the next town we could contact. An example of the effectiveness of this technique occurred in Keimoes (NC), where we were conducting an interview with the respondent named Sally. During the interview a friend, Sophia, arrived and we were able to start an interview with her. During this interview, tea was served and the interview was interrupted. However, the recorder continued to run and there was no sign that the respondent was uncomfortable with it. After a fairly lengthy time, the interview was continued. A similar situation occurred in Riemvasmaak (NC), where the interview included a cigarette break and a neighbour arriving on the scene. However, in neither case was there a noticeable shift of register. These experiences are quite comparable with the experiences of Milroy (1980) in Belfast, where the friend-of-a-friend technique enabled her to record informal discussions in the vernacular. On some occasions, the respondent to whom we had been referred was not at home. On one particular occasion in Somerset East (and a similar experience occurred in De Aar), a neighbour came out to see who was knocking and we started talking. Interestingly, the fact that we already had an appointment with an (absent) neighbour seemed to serve as an endorsement similar to Milroy's friend-of-a-friend status, and we were invited inside to conduct an interview and obtain further references.
A valid critique of this method is that it did not seem to assist us in accessing communities of "marginals". By "marginals", I mean those whose structural position lies beyond the established class structure, who tend to live in informal and semi-formal settlements in deep poverty (Milroy 1980, citing Lloyd 1979). Farm workers and seasonally employed agricultural workers fall into this category, as do certain inhabitants on the outskirts of rural Northern Cape towns. There were occasions when we entered these settlements (e.g. in Augrabies, Sutherland and Carnarvon) and attempted to engage with people. While people were often willing to sit for an interview and/or would refer us to respondents, the interviews themselves were stilted affairs, with some respondents providing anomalous answers and/or simply being led by the interviewer. Occasionally, the interviewers also felt a sense of danger. In such cases the interviewer tended to cut short the interviews and leave. It seems that these communities are beyond the range of the methodology described here.

Some practical problems
In addition to the philosophical issue of the independence of the "instrument" and the pragmatic inferences required in using the Likert scale, a number of problems were encountered, some more serious than others. At a purely practical level, the distances across the research area proved to be intimidating. For instance, over 16 300 km have been travelled to date, taking a toll on time and resources and placing constraints on how much data can be collected during a single field trip.
There are also various sociological factors that have been interesting to negotiate. At some data points, it was very difficult to make contact with older people. For instance, in the far northern Cape, when we asked to be referred to grootmense "great-people" (i.e. elders), some of the people we met were apparently younger than 60. This reflects the harshness of life in this area. However, if we tried hard enough we were invariably able to find some older people. Alcohol abuse is very common in the entire northern Cape, with several respondents also making reference to it during interviews. The incidence of Foetal Alcohol Syndrome is the highest in the world: 10% severe incidence among children under 12 months and 50% partially affected in the same group (http://www.scienceinafrica.co.za/2006/march/fas.htm (Accessed 14 September 2008)). Nevertheless, it should be noted that we did not experience these types of problems consistently, as the social networks we utilized differed from place to place. For instance, in areas where we met respondents based on personal social networks, the respondents tended to differentiate themselves from those in the same community who abused alcohol. Anecdotally then, it seems that many rural communities are divided by patterns of alcohol use and abuse.
It is also worth mentioning that where difficulties were experienced in meeting respondents of the required demographic, this may be partially attributable to the harshness of rural life in South Africa. Ideally, dialect research should be conducted in an environment with low volumes of migration to and from the area, focussing on respondents who are older, linguistically conservative and geographically immobile. The reality of rural South African life is very different. Farm economies frequently involve migrant labour, often from outside the region (e.g. some respondents mentioned that seTswana L2 Afrikaans was a different variety) and also within the region (many of our respondents had travelled extensively within the region in their lifetimes). Moreover, development in South Africa and linguistic change are closely linked. Economic development often goes hand-in-hand with urbanization and concomitant rural depopulation. Urbanization is also fuelled by local dynamics of land ownership and drought: between 1994 and 2004, for example, 2.3 million farm workers were displaced and another 900 000 evicted, an increase over the previous decade (Nkuzi Development Association report http://www.nkuzi.org.za/docs/Evictions Summary.pdf (Accessed 16 March 2009)). This motivates linguistic change in two ways: first, rural varieties become less sustainable as urbanization of the youth occurs; second, urban areas become linguistic melting pots that lead to future varieties.

2.6 Validity issues
Another issue we faced was that many respondents were nominally aware of the standard and were aware that they might not speak the standard. This is indicated in the quotations below, where respondents use overwording of lexical sets relating to ideologies of linguistic purity: suiwer "pure", reg praat "correct speaking", verbeter "improved" and its opposite mors hom "messing it up", nie reg nie "incorrect", agterstevoor "back-to-front". In the following examples, all data are from the North East Cape (Kakamas, Keimoes, Riemvasmaak).

I know it is dancing tonight
A: Ja.

No, no no uhuh
On the other hand, anecdotally, it seems that what the respondents considered "suiwer" Afrikaans (the term used by the respondent) could differ substantially from what is considered "standard" (5a), e.g. as reified in the media and defined in prescriptive language guides. Many respondents were aware and proud of regional differences in their language (5b,c,d) and were quite adamant that they had not been influenced by external varieties or the standard (5d). In doing this, they often displayed considerable pride in the regional variety, as illustrated by the following extract.

A quantitative assessment of reliability
Reliability is, of course, a central concern, especially if there is the risk of slightly noisy data (e.g. as a result of the use of interlects or the respondent accommodating to the standard). The key question is how to determine that the questionnaire method is reliable, notwithstanding the endorsements of other projects using similar methods (Cornips and Poletto 2005, Cornips and Jongenburger 2001, Cornips 2006, Cornips and Corrigan 2005, Barbiers et al. 2007). Assuming that noise is systematic, there are two parts to the argument:
(i) Compare the responses to stimuli with what is expected based on existing literature and other sources:
i. Known attested constructions which informants judge as attested;
ii. Known unattested constructions which informants judge as unattested;
iii. Constructions which are known to be unattested in the Standard language but known to be attested in dialects and which are judged by informants to be attested in their dialects.
(ii) Look at disconfirming evidence in a particular section and check to see whether the results are internally systematic and consistent.
Section 3.1 explores (i) while Section 3.2 looks at (ii).

Comparison with what is expected
One way of ascertaining the general reliability of the questionnaire is to compare the expected responses to various control stimuli against the actual responses from the participants. The respondents responded reliably to these stimuli in the anticipated way. In the following examples, the percentage reflects the number of respondents who indicated judgements as expected (n=19); in other words, this indicates the percentage of reliable answers. Note that this only indicates the clear "yes" and "no" percentages. For some examples, respondents either expressed uncertainty or gave responses that could not be reliably interpreted as either "yes" or "no"; these numbers are typically very low in the examples below and are included as part of the "noise". There were many examples in the database where the respondents identified constructions as occurring in their communities. It is interesting to note the very high degree of agreement between speakers, ranging from 100% to 89% (n=19). This both shows that the test is reliable (at least for these examples) and provides an informal estimate of the amount of "noise" in the sample. Based on this I will take the noise level to be 6%-11%. Incidentally, this is consistent with the 3% to 11% "non response" for the same stimulus type established by Cornips and Jongenburger (2001:57), an encouraging sign.
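The arithmetic behind these percentages is worth making explicit, since with n=19 each respondent accounts for a little over five percentage points. The sketch below is illustrative only: the function names are hypothetical and the counts of 19, 18 and 17 are chosen for the example rather than taken from the project's tallies. It treats every answer that is not a clear expected answer as noise, as described above.

```python
def agreement(expected_count: int, n: int = 19) -> float:
    """Percentage of respondents whose judgement matched the expected one."""
    if not 0 <= expected_count <= n:
        raise ValueError("expected_count must lie between 0 and n")
    return 100.0 * expected_count / n


def noise(expected_count: int, n: int = 19) -> float:
    """Share of answers that were unexpected, uncertain or uninterpretable."""
    return 100.0 - agreement(expected_count, n)


for k in (19, 18, 17):
    print(f"{k}/19 as expected: agreement {agreement(k):.1f}%, noise {noise(k):.1f}%")
```

On these assumed counts, one or two divergent answers out of 19 already amount to roughly 5% and 11% respectively, which is of the same order as the informal noise band quoted above.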
The next set of data are examples which are unattested in standard Afrikaans and which were identified as being unattested in local varieties. Note that some of these (e.g. the first example), while attested in Dutch dialects, are certainly not expected in Afrikaans ones. So these judgements confirm what is expected based on readings of the literature. Again, it is worth noting the very high percentages which confirm the informal noise level suggested above. The last question (8d) is particularly interesting because it is indeed ungrammatical in standard Afrikaans, but was expected based on the literature (Robbers 1997). The respondents (in the North-West Cape province) identified the stimulus as 100% unattested. However, they then went on to provide the correct non-standard version i.e. Ek het die paal (in die grond) gestamak "I made the pole (in the ground) pst-stand-make", which is equally unattested in standard Afrikaans, but is attested in dialectal Afrikaans. In other words, the stimulus was effective in being able to elicit non-standard varieties. In addition, the examples in (8b) and (8c) with separable prefixes are important because they can be contrasted with examples with inseparable prefixes (see (9) below). Thus respondents were able to distinguish between attested non-standard forms (i.e. those below) and unattested non-standard forms (those above).
The next group of examples are unattested in standard Afrikaans but are identified as attested in local varieties. Examples such as these are attested in the dialect literature (Calitz 1957, Rademeyer 1938, Heiberg 1950) and in some colloquial varieties of standard Afrikaans (Donaldson 1993). They are indicated as being attested by the respondents, as would be expected. Once again, the percentages associated with these examples are very high and consistent, and they confirm the noise level established previously.
(9) Examples which are unattested in standard Afrikaans but are identified as attested dialectally by the respondents (in accordance with observations by Calitz 1957, Rademeyer 1938, Heiberg 1950)
The indirect grammaticality judgements are also stable with respect to minimal pairs. In other words, not only are the grammatical ones identified as attested, but the ungrammatical ones are identified as being unattested. These data indicate that for known variables (based either on the literature or on standard Afrikaans), the respondents respond as predicted and that task effects seem to be relatively negligible. ①

A case study of expletive use: exploring responses in the context of a particular section
Having demonstrated that the questionnaire elicits the expected responses in particular cases, I will now explore the responses from a particular section of the questionnaire relating to expletive use. Not all the data display such categorical effects as those indicated above. There are many stimuli which elicit percentages that fall between 11% and 89%. This raises a number of questions. First, are these "in-between" results simply wrong (i.e. a result of task effects, inconsistent methodology, respondent fatigue, respondent "confusion", transcribing errors, etc.)? If not, then how should they be interpreted?
One way of evaluating the implications for validity of non-categorical responses is to consider them in context with the other stimuli. For this reason, I will examine the question set relating to Extended Projection Principle (EPP)/expletive effects. I will demonstrate that non-categorical responses exist in the same context as categorical ones and that, consequently, the non-categorical responses cannot simply be dismissed as "noise", respondent error, respondent confusion, or a failure of the reliability of the test.
① Fourie (1985) suggested that respondents were not able to parse syntactically complex stimuli. Our experience does not bear this out; the examples cited above reflect the ability of respondents to produce reliable judgements in complex syntactic (word order, embedded clauses) as well as morphological (ge- placement on particle verbs, IPP, comparatives, preposition marking) domains. This is not to say that certain types of stimuli are unproblematic. For example, in our experience, the presence of various nominals within various positions in a verbal cluster (i.e. "Verb Projection Raising") made it difficult to elicit reliable responses, as respondents did not appear to be able to hear the difference between the various possibilities.
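The distinction between categorical and "in-between" results can be stated as a simple banding rule. The following is a sketch under stated assumptions rather than a procedure used in the project: the function name and labels are hypothetical, the band edges of 11% and 89% are taken from the text, and treating the edge values themselves as categorical is an assumption. The percentages in the example are those reported for the expletive stimuli in this section.

```python
LOW, HIGH = 11.0, 89.0  # band edges from the text; edge handling is an assumption


def classify(pct: float) -> str:
    """Place an 'attested' percentage in one of three bands."""
    if not 0.0 <= pct <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if pct >= HIGH:
        return "categorically attested"
    if pct <= LOW:
        return "categorically unattested"
    return "non-categorical"


# Percentages reported in this section for the expletive stimuli.
for pct in (90.8, 82.6, 78.1, 60.8, 55.3, 41.7, 3.6):
    print(f"{pct:5.1f}%  {classify(pct)}")
```

A rule of this kind only sorts the results; it does not interpret them. The argument of this section is precisely that the middle band should not be equated with error.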
To set the scene, Afrikaans, like Dutch, has an optional expletive in embedded position. This is indicated by the contrast between (11a,b). It is known independently that both options are attested, and the difference in the percentage scores reflects the personal preferences of the respondents. However, the overall high percentages of 78.1% and 90.8% would seem to indicate that both are grammatical (as expected).
(11) a. I know that tonight at Jan pst-barbecue is.aux
"I know that there will be a barbecue at Jan's house tonight" (78.1%)
b. Ek weet dat daar vanaand by Jan gebraai word.
I know that there tonight at Jan pst-barbecue is.aux
"I know that there will be a barbecue at Jan's house tonight" (90.8%)
c. Ek weet dat dit vanaand by Jan gebraai word.
I know that it tonight at Jan pst-barbecue is.aux
"I know that there will be a barbecue at Jan's house tonight" (82.6%)
However, in colloquial Afrikaans, there is also a 3SG.NEUT expletive, namely dit (11c). The percentage of 82.6% falls short of a categorical judgement (the shortfall exceeds the 10% noise level established earlier) but is still a clear majority; it is far from the 0-10% range for the examples known to be ungrammatical (see above).
It might be argued that the 3SG expletive in (11c) is actually a true deictic pronoun which interferes with the judgements of the respondents, leading to lower-than-expected percentages. On this view, the reading for (11c) would be something along the lines of "something (e.g. meat) is braaied at Jan's house". However, such a reading is not appropriate for (12a) using dans, since it is unlikely that there is a clear deictic referent. In addition, (12b) is incompatible with a deictic reading.
(12) a. On New Year's Eve is.aux it pst-dance / pst-drink across the whole land
"On New Year's Eve, there will be drinking/dancing across the whole land" (41.7%)
b. Op die plaas word dit vandag gewerk.
On the farm is.aux it today pst-work
"On the farm, there will be working (going on) today" (60.8%)
The percentages here are beyond the range of unattested examples (which were in the 0-10% range demonstrated earlier). The fact that these percentages exceed the informal noise level shows that they are relevant in some way, although exactly in which way is a matter of interpretation. In addition, the daar/there-type expletive has a larger distribution than in standard Afrikaans or Dutch. This is most apparent in weather predicate contexts, where a daar/there-type expletive unexpectedly appears in the responses of a slight majority of respondents.
There rain(.agr) tonight
"There's raining tonight" (55.3%)
The there-type expletive is unexpected; nevertheless, about half of all respondents judged it to be attested. This result is unlikely to be the result of methodological error, for a number of reasons. First, respondents did not reverse their judgements, even when questioned by the interviewer, and were even able to answer questions on the semantics of the construction, i.e. whether the daar expletive was locative in nature or not (it always is).

Yes, there rains tonight. It means now a little way away, a distance
Alexa: [afstand, ja.]

Yes.
Second, it is found in virtually every datapoint that has been visited, so it is geographically widespread. This means it is not simply the product of a failure in methodology at a specific data point. Third, it appears regardless of the identity of the interviewer, so it cannot easily be attributed to the various interviewer effects or the Rutledge effect (Bailey and Tillery 1999). ① The next argument that these judgements are not simply artefacts of the method is that the non-categorical judgements co-occur with clear, categorical ones, such as those for hy and sy in the same stimulus set (15).
(15) a. Hy reën(t) vanaand.
① The daar/there-type expletive in weather contexts first surfaced in a pilot interview in Paardekraal (EC) with myself as the interviewer. It was also elicited by two different interviewers and by a third-year student who collected data for another project in the Knysna area in 2008.
She rain(.agr) tonight
"It's raining tonight" (3.6%)
c. Dit reën(t) vanaand.
All examples refer to a particular raining event located in space and time (i.e. one cannot infer a pragmatic effect in one example that cannot be inferred in the others). Furthermore, they are cognitively comparable; in fact, it could be argued that the non-referential expletive it is more cognitively difficult simply because it does not refer to any particular entity. In addition, these options are not beyond the realms of linguistic possibility: hy "he" is apparently used as an expletive in the Western Cape in areas where "gender neutralization" of pronouns has taken place. Nevertheless, it was not consistently attested in our data area. Respondents were also very clear that the feminine 3SG pronoun was not used in this way. These arguments strongly suggest that these elicited judgements (a) cannot be explained away on methodological, cognitive or pragmatic grounds and (b) accordingly do reflect underlying linguistic judgements about the I-languages of the speakers.
Another possible explanation is that the 3SG pronoun is a reduced pronoun and forms a suffix on the weather predicate itself. In other words, when the respondents say Daar reën't "There rains", they actually mean Daar reën dit "There rains it". Such a response could easily be mistaken for a daar/there-type expletive, especially as the respondents always emphasized the strongly locative character of the daar "there" in weather contexts.
There.loc rain it
"It's raining over there" (Standard Afrikaans)
This may be plausible as an explanation of the origin of the daar/there-type expletive in this context, but it is inadequate as a synchronic explanation because the suffix 't can be doubled by a true 3SG expletive. Thus the suffix is not pronominal in nature and this cannot be the explanation for the presence of daar.
There is evidence that the expletive daar "there" in weather contexts is actually a locative adverb rather than an expletive per se. All speakers who accepted (15d) emphasized the locative reading of the expletive. Under this interpretation, there are pragmatic effects: certain speakers accept the sentence with a locative reading, while others reject the non-locative reading. What cannot be excluded is that the latter group actually accept the locative reading, in which case the percentage of those accepting (15d) is presumably much higher than 55.3%. Furthermore, it also suggests that there is a null expletive in weather verb contexts. This is not beyond the realms of linguistic possibility since there is independent evidence that null expletives exist in Afrikaans (see example (11a)). However, if this hypothesis were ultimately proved correct, then it would constitute evidence against the possibility that the strange expletive effects discussed above are the result of methodological noise.
The next argument that these non-categorical judgements cannot be simply dismissed as "error" or "noise" is that they are relatively stable across a number of different syntactic contexts. This has already been demonstrated above where it has been shown that a daar/there-type expletive occurs both in embedded contexts and in weather-verb contexts. In both these contexts, the percentages indicate a non-categorical judgement although the percentages themselves differ.
It look if the weather tomorrow nice will be
"It looks as if the weather will be nice tomorrow" (96.4%)
b. Hy lyk of die weer môre mooi sal wees.
He look if the weather tomorrow nice will be
"It looks as if the weather will be nice tomorrow" (6%)
c. Sy lyk of die weer môre mooi sal wees.
She look if the weather tomorrow nice will be
"It looks as if the weather will be nice tomorrow" (6%)
d. Daar lyk of die weer môre mooi sal wees.
There look if the weather tomorrow nice will be
"It looks as if the weather will be nice tomorrow" (32.5%)
e. Daar lyk of daar miskien iemand in die tuin sit.
There look if there perhaps somebody in the garden sit
"There looks as if there is perhaps somebody sitting in the garden" (42.8%)
In addition, daar/there-type expletives can occur in raising contexts and can even be doubled by a true locative daar, thereby apparently increasing the degree of acceptability.
There on the mountains rain there tonight
"There on the mountains it's raining tonight" (45.0%)
Thus, it appears that there is no simple way of discounting non-categorical responses in terms of method. The results appear to reflect some property of the language used by the respondents. Exactly how to interpret this, of course, is still an open question. ①
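The three-way distinction running through this discussion (responses within the noise range, categorical acceptance, and the non-categorical band in between) can be made explicit with a short sketch. The percentages are those reported for the examples in this section, and the 10% ceiling is the noise level referred to above. The 90% floor for "categorical" acceptance, the descriptive labels and the function names are assumptions introduced purely for illustration; the paper does not state an upper threshold.

```python
# Acceptance rates (% of respondents) as reported for the examples in this
# section. The descriptive labels are illustrative, not the paper's own.
REPORTED = {
    "dit, embedded clause (braai)": 82.6,
    "dit, impersonal passive (gedans/gedrink)": 41.7,
    "dit, impersonal passive (gewerk)": 60.8,
    "daar, weather verb": 55.3,
    "sy, weather verb": 3.6,
    "dit, lyk of": 96.4,
    "hy, lyk of": 6.0,
    "sy, lyk of": 6.0,
    "daar, lyk of (weather complement)": 32.5,
    "daar, lyk of daar": 42.8,
    "daar doubled by locative daar, weather verb": 45.0,
}

NOISE_CEILING = 10.0      # from the text: ungrammatical examples fall in 0-10%
CATEGORICAL_FLOOR = 90.0  # ASSUMPTION: mirror image of the noise ceiling


def classify(pct, noise_ceiling=NOISE_CEILING, categorical_floor=CATEGORICAL_FLOOR):
    """Sort one acceptance rate into one of three bands."""
    if pct <= noise_ceiling:
        return "rejected"
    if pct >= categorical_floor:
        return "accepted"
    return "non-categorical"


def summarise(rates):
    """Map every example label to its band."""
    return {label: classify(pct) for label, pct in rates.items()}


if __name__ == "__main__":
    for label, band in summarise(REPORTED).items():
        print(f"{REPORTED[label]:5.1f}%  {band:<16} {label}")
```

Under these assumed thresholds, every hy and sy stimulus falls in the rejected band and only the dit variant of the lyk of example counts as categorically accepted, while all the daar examples and the remaining dit examples land in the intermediate band that the argument above is concerned with. Moving the assumed upper threshold changes only where "accepted" begins, not the contrast with the noise range.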

Summary, questions and implications
I conclude, then, that there is little evidence to suggest that the non-categorical results are "noise" or "error" or a result of respondent "fatigue", task effects etc. They must tell us something about the linguistic variety in question. How this is interpreted is, of course, still an open question. I would like to suggest that they indicate dispreferred forms which are nevertheless attested, albeit in smaller numbers. Perhaps there are more subtle distinctions that further stimuli could disambiguate. Ultimately, the non-categoricality indicates that we have found more variation than we originally anticipated.
What is very clear from the results demonstrated in this paper is that the notion of a linguistic "instrument" must be problematized. For some linguists there may be an implicit understanding that a linguistic "instrument" is something like a context-free questionnaire which can be applied consistently and systematically across a range of respondents and contexts, thereby yielding quantitative results. However, this view ignores the evaluative roles played by both the respondents and field workers, which constitute a major component of an interview situation and thus of the instrument itself. The "instrument" consists of the combination of both the questionnaire (a set of stimuli) and the interviewer working in a particular context with a respondent. For instance, at an early stage in the piloting of the questionnaire it was realized that for certain stimuli which mentioned drinking and dancing, respondents would often react to the pragmatic context rather than to the stimulus per se. For example, in response to the stimulus Daar word gedans/gedrink op die plaas "There is dancing/drinking on the farm", older respondents would sometimes respond negatively since to affirm the stimulus could presumably entail a value judgement (given the contextual use and abuse of alcohol mentioned earlier in this paper). This required the interviewer to intervene and rephrase the stimulus in a way that was pragmatically more acceptable. Another example of dynamic interpretation occurs when a respondent provides an answer that may be ambiguous and/or when it is unclear whether a respondent would actually use the construction in question. When this occurs, it is the role of the interviewer to take action by rephrasing or repeating the question or by asking for qualifying information. The interviewer thus plays an interpretive role in evaluating responses to the stimuli, assessing whether to repeat stimuli, rephrase them etc. In addition, the respondent also plays an interpretive role insofar as s/he must assess whether a particular kind of stimulus is used in his/her language community, by whom, and whether it is widely attested or not. This links to the broader question of whether an instrument can ever be neutral or objective and able to be applied without interpretive input from the interviewer. I would argue that an objective instrument, in this sense, is an impossibility and that the interviewer is an important part of the instrument itself.
① The kinds of variation evidenced here are not beyond the realms of theoretical possibility. For example, Parrot (2001) discusses use of "weak" pronoun expletives in English and De Vos (2009, ms.) shows that there-type expletives can occur in English weather-constructions under certain circumstances. As such, one likely interpretation of the results is that there is a subset of speakers who speak a variety which allows this type of construction.
Ultimately, the skill of the interviewer is an important component of the success or otherwise of the interview. This is taken for granted in many types of descriptive, sociolinguistic and ethnographic research and, given its pervasiveness (indeed unavoidability) in the discipline, I don't see it as being particularly problematic so long as it is acknowledged. ① The implication of this is that due care must be taken when administering a questionnaire of this nature, and that it be applied in a contextually sensitive way by trained linguists.
① In fact, this issue occurs in many disciplines, including the "hard" sciences and indeed, is probably present in all research. Some examples are illustrative. (1) An ultrasound scanner can be a high-precision instrument, providing measurement data that can be used to determine the date of birth of a foetus, disease diagnostics (e.g. nuchal fold measurement) etc. However, anybody who has seen an ultrasound scan will know that there is a lot of "noise" and that the gray blobs and patches are only meaningful to a specialist. Furthermore, the specialist interprets these and adjusts the angle and pressure of the probe etc. in order to achieve the best resolution. Although all these decisions are interpretive in character, they are not perceived as detracting from the overall reliability of the instrument. (2) Similarly, a gas spectrometer is a high-precision instrument capable of detecting trace elements in a sample. However, it is not simply a matter of injecting a sample and waiting to see what comes out the other side (except in routine cases). There may be other elements in the mix that can drown out the element being looked for (adsorption effects). A specialist will know that detection of certain elements can be improved by derivatization (using catalysts, filters, suppressants etc.). Once again, the instrument can be used to its maximum
As such, the questionnaire should be treated more as a set of stimuli for discussion than as a fixed set of questions that require binary answers. While it may be suggested that a contextually defined "instrument" would provide invalid and unreliable results, this paper demonstrates that this is not actually the case and that such an instrument can provide interesting results.