From the perspective of language curriculum development, choice of teaching method is but one phase within a system of interrelated curriculum development activities. Choice of teaching approach or method, materials, and learning activities is usually made within the context of language program design and development. When the director of a language school or institution announces to the staff that an incoming client group will consist of forty-five Japanese businessmen requiring a six-week intensive course in spoken English, the teachers will not leap to their feet and exclaim "Let's use Silent Way!" or "Let's use Total Physical Response!" Questions of immediate concern will focus on who the learners are, what their current level of language proficiency is, what sort of communicative needs they have, the circumstances in which they will be using English in the future, and so on. Answers to such questions must be made before program objectives can be established and before choice of syllabus, method, or teaching materials can be made. Such information provides the basis for language curriculum development. Curriculum development requires needs analysis, development of goals and objectives, selection of teaching and learning activities, and evaluation of the outcomes of the language program. Let us consider each of these briefly (for a fuller discussion see Richards 1984).
NEEDS ANALYSIS
Needs analysis is concerned with identifying general and specific language needs that can be addressed in developing goals, objectives, and content in a language program. Needs analysis may focus either on the general parameters of a language program (e.g., by obtaining data on who the learners are, their present level of language proficiency, teacher and learner goals and expectations, the teacher's teaching skills and level of proficiency in the target language, constraints of time and budget, available instructional resources, as well as societal expectations) or on a specific need, such as the kind of listening comprehension training needed for foreign students attending graduate seminars in biology. Needs analysis focuses on what the learner's present level of proficiency is and on what the learner will be required to use the language for on completion of the program. Its aim is to identify the type of language skills and level of language proficiency the program should aim to deliver. Needs analysis acknowledges that the goals of learners vary and must be determined before decisions about content and method can be made. This contrasts with the assumption underlying many methods, namely, that the needs and goals of learners are identical, that what they need is simply "language," and that Method X is the best way to teach it.
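By way of illustration only, the kind of information a needs analysis yields can be tabulated before any decisions about objectives or method are made; the following sketch uses invented learner data and categories, not figures drawn from any actual program.

```python
# Illustrative sketch with invented data: summarizing needs-analysis
# responses before program objectives are set. Field names and categories
# are hypothetical.
from collections import Counter

learners = [
    {"name": "A", "proficiency": "elementary",   "goal": "business meetings"},
    {"name": "B", "proficiency": "intermediate", "goal": "telephone calls"},
    {"name": "C", "proficiency": "elementary",   "goal": "business meetings"},
]

# Tally present proficiency levels and stated communicative needs; these
# distributions, not a preferred method, drive the choice of objectives.
proficiency_profile = Counter(l["proficiency"] for l in learners)
needs_profile = Counter(l["goal"] for l in learners)

print(proficiency_profile)  # e.g., Counter({'elementary': 2, 'intermediate': 1})
print(needs_profile)        # e.g., Counter({'business meetings': 2, 'telephone calls': 1})
```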
FORMULATION OF OBJECTIVES
Information obtained from needs analysis is used in developing, selecting, or revising program objectives. Objectives detail the goals of a language program. They identify the kind and level of language proficiency the learner will attain in the program (if the program is successful). Sometimes program objectives may be stated in terms of a proficiency level in a particular skill area or in the form of behavioral objectives (descriptions of the behaviors or kinds of performance the learners will be able to demonstrate on completion of the program, the conditions under which such performance will be expected to occur, and the criteria used to assess successful performance). The American Council on the Teaching of Foreign Languages has developed provisional proficiency guidelines for use in planning foreign language programs - "a series of descriptions of proficiency levels for speaking, listening, reading, writing, and culture in a foreign language. These guidelines represent a graduated sequence of steps that can be used to structure a foreign-language program" (Liskin-Gasparro 1984: 11).
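The three components of a behavioral objective described above (the performance, the conditions, and the assessment criteria) lend themselves to a simple structured record; the example below is a minimal sketch with hypothetical content rather than an objective taken from any actual program.

```python
# Minimal sketch (hypothetical content): a behavioral objective recorded as
# performance + conditions + criterion, the three components described above.
from dataclasses import dataclass

@dataclass
class BehavioralObjective:
    performance: str  # what the learner will be able to do
    conditions: str   # circumstances under which the performance is expected
    criterion: str    # standard used to judge successful performance

objective = BehavioralObjective(
    performance="take part in a ten-minute job interview in English",
    conditions="face to face, with an unfamiliar interviewer",
    criterion="answers all questions with no breakdown in communication",
)
print(objective)
```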
Decisions about program goals and objectives, whether expressed in terms of behavioral objectives, proficiency levels, or some other form, are essential in language program design. Without clear statements of objectives, questions of content, teaching and learning activities and experiences, materials, and evaluation cannot be systematically addressed. In cases where a specific method is being considered for use in a language program, it is necessary for the program planner to know what the objectives of the method are and the kinds of language proficiencies it seeks to develop. The program planner can then compare the degree of fit between the method and the program goals. However, methods typically fail to describe explicitly the objectives they are designed to attain, leaving teachers and learners to try to infer objectives from the materials and classroom activities themselves.
SELECTION OF TEACHING AND LEARNING ACTIVITIES
Once decisions have been made about the kinds and levels of language proficiency the program is designed to bring about, teaching and learning activities can be chosen. Classroom activities and materials are hence accountable to goals and objectives and are selected according to how well they address the underlying linguistic skills and processes learners will need in order to attain the objectives of the program, that is, to acquire specified skills and behaviors or to attain a particular level of language proficiency. At this phase in language curriculum development, teachers and program developers first select different kinds of tasks, activities, and learning experiences, whose effectiveness in meeting program goals they then test. This activity is often referred to as the domain of methodology in language teaching. It involves experimentation, informed by the current state of the art in second language learning theory and by research into the teaching and learning of reading, writing, listening, and speaking. Curriculum developers typically proceed with caution, since there is a great deal that is unknown about second language acquisition and little justification for uncritical adoption of rigid proposals.
At this phase in curriculum development, choice of a particular method can be justified only when it is clear that there is a close degree of fit between the program goals and objectives and the objectives of the method. Information concerning the kinds of gains in language proficiency that the method has been shown to bring about in similar circumstances would also be needed here, if available. When a close degree of fit between method and program objectives is lacking, a choice can be made through "informed eclecticism." By this we mean that various design features and procedures are selected, perhaps drawn from different methods, that can be shown to relate explicitly to program objectives. Most language teaching programs operate from a basis of informed eclecticism rather than by attempting to rigidly implement a specific method. A policy of uninformed eclecticism (which is how the term eclectic or eclectic method is often used), on the other hand, would be where techniques, activities, and features from different methods are selected without explicit reference to program objectives.
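The logic of informed eclecticism can be made concrete in a small sketch: candidate activities, wherever they are drawn from, are retained only if they can be linked explicitly to a program objective. The activities, objectives, and links below are invented for illustration.

```python
# Hypothetical sketch of "informed eclecticism": activities drawn from
# different methods are kept only if they relate explicitly to a stated
# program objective. All content here is invented for illustration.
program_objectives = {"oral fluency in meetings", "listening to presentations"}

candidate_activities = [
    {"activity": "role-played negotiations", "source": "Communicative Language Teaching",
     "serves": {"oral fluency in meetings"}},
    {"activity": "minimal-pair pronunciation drill", "source": "Audiolingual Method",
     "serves": set()},
    {"activity": "note-taking from recorded talks", "source": "ESP materials",
     "serves": {"listening to presentations"}},
]

# Retain only activities with an explicit link to at least one objective.
selected = [a for a in candidate_activities if a["serves"] & program_objectives]
for a in selected:
    print(f'{a["source"]}: {a["activity"]}')
```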
EVALUATION
Evaluation refers to procedures for gathering data on the dynamics, effectiveness, acceptability, and efficiency of a language program for the purposes of decision making. Basically, evaluation addresses whether the goals and objectives of a language program are being attained, that is, whether the program is effective (in absolute terms). In cases where a choice must be made between two possible program options geared to the same objectives, a secondary focus may be on the relative effectiveness of the program. In addition, evaluation may be concerned with how teachers, learners, and materials interact in classrooms, and how teachers and learners perceive the program's goals, materials, and learning experiences. The relatively short life span of most language teaching methods and the absence of a systematic approach to language program development in many language teaching institutions are largely attributable to inadequate allowance for program evaluation in the planning process. In the absence of a substantial database informing decisions about how effective a language program is or how its results are achieved, chance and fashion alone often determine program adoption and adaptation. Consequently much has been written about the design of language teaching courses, methods, syllabuses, and materials, but little has been published about the impact on learners of programs, approaches, methods, instructional strategies, and materials. The relationships among the different components of language curriculum development are summarized in Figure 11.1. In order to illustrate relevant issues in the evaluation of methods, we will outline the different dimensions of evaluation that could be applied to the approaches and methods we have discussed in this book.
[Figure 11.1 Language curriculum development process (diagram not reproduced; its labels include Method, analyzed as approach, design, and procedure)]
Evaluating methods
If adequate evaluation data were available about the methods we have analyzed, we could expect to find answers to such general questions as the following:
What aspects of language proficiency does the method address?
With what kinds of learners (children, adults, etc.) is the method most effective?
Is the method most effective with elementary, intermediate, or advanced learners?
What kind of training is required of teachers?
Under what circumstances does the method work best? (E.g., has it been found to be effective with learners from different cultural backgrounds?)
How have teachers and students responded to the method?
How does the method compare with other methods (e.g., when used to attain a specified type of competency)?
Do teachers using the method use it in a uniform manner?
Answers to questions like these would enable decisions to be made about the relevance of specific methods to particular kinds of language programs. In order to answer these kinds of questions we look to four kinds of data: descriptive data, observational data, effectiveness data, and comparative data. Let us consider each of these in turn.
Descriptive data
Descriptive data are objective (as far as possible) descriptions and accounts, usually by teachers, of specific procedures used in teaching according to a particular method. They may take the form of amplified records of lesson plans, with detailed comments on the exact steps followed. Evaluation specialists sometimes refer to these as "thick descriptions," by which is meant "literal description of the activity being evaluated, the circumstances under which it is used, the characteristics of the people involved in it, the nature of the community in which it is located, and the like" (Guba and Lincoln 1981: 119). David Cohen refers to the use of such descriptions in foreign language teaching as
detailed first person description... that fixes vivid perceptions in time and prevents their deterioration into TEFL folklore and even myth. Such a history of a teaching year is of applied value both pedagogically in the language classroom and in terms of an ordered system of guided curriculum development. It provides a reliable "organizational memory" and, over time, becomes the framework for an integrative longitudinal analysis of student cohorts as they move from level to level within the ability streams of an ongoing English language program. (Cohen 1984: 30)
Sylvia Ashton-Warner's book Teacher exemplifies many of the characteristics of "thick description." Here is part of her commentary on the use of key vocabulary in teaching reading.
The words, which I write on large tough cards and give to the children to read, prove to be one-look words if they are accurately enough chosen. And they are plain enough in conversation. It's the conversation that has to be got. However, if it can't be, I find that whatever a child chooses to make in the creative period may quite likely be such a word. But if the vocabulary of a child is still inaccessible, one can always begin him on the general Key Vocabulary, common to any child in any race, a set of words bound up with security that experiments, and later on their creative writing, show to be organically associated with the inner world: "Mommy," "Daddy," "kiss," "frightened," "ghost."
"Mohi," I ask a new five, an undisciplined Maori, "what word do you want?"
"Jet!"
I smile and write it on a strong little card and give it to him. "What is it again?"
"Jet!"
"You can bring it back in the morning. What do you want, Gay?"
Gay is the classic overdisciplined bullied victim of the respectable mother.
"House," she whispers. So I write that, too, and give it into her eager hand.
"What do you want, Seven?" Seven is a violent Maori.
"Bomb! Bomb! I want bomb!"
So Seven gets his word "bomb" and challenges anyone to take it from him.
And so on through the rest of them. They ask for a new word each morning and never have I to repeat to them what it is. And if you saw the condition of these tough little cards the next morning you'd know why they need to be of tough cardboard or heavy drawing paper rather than thin paper.
(Ashton-Warner 1965: 32-3)
We have found that for most of the approaches and methods we have reviewed, there is a lack of detailed description. Most methods exist primarily as proposals, and we have no way of knowing how they are typically implemented by teachers. The protocols in the procedure section of each chapter represent an attempt to provide at least partial descriptions of how methods are used in the classroom.
Observational data
Observational data refer to recorded observations of methods as they are being taught. Such data can be used to evaluate whether the method as it is implemented actually conforms to its underlying philosophy or approach. The observer is typically not the teacher, but a trained observer with a note pad, tape recorder, video equipment, or some other means of capturing the moment-to-moment behaviors of teachers and learners in the classroom. Gathering observational data is much more problematical than obtaining descriptive data, but ultimately more essential, since it provides a more accurate record of what actually occurred, relying as it does on an outsider's observations rather than on what the teacher thought occurred or should occur. Classroom observation studies are a well-established and reasonably noncontroversial part of educational reporting in other fields, and we should expect reports in language teaching to be equivalent in quality to those in general education. Studies carried out in L2 classrooms in recent years have highlighted the potential contribution of observational studies to method evaluation.
Long and Sato (1983), for example, looked at language use in classes taught by teachers trained in "communicative" methodology and compared it with the language of real communication outside of classrooms (native speakers addressing nonnatives of the same level of proficiency as the classroom learners). They found the type of language used by the "communicative" teachers to be very different from the language of natural communication outside the classroom. The teachers' language shared many of the features of the mechanical question-and-answer drills characteristic of audiolingual classrooms. Such studies emphasize the need for empirical study of the classroom processes (i.e., the types of interactions between learners and learners, learners and teachers, learners and materials) as well as the classroom discourse (i.e., the types of utterances, question-and-answer exchanges, turn taking, feedback, and so on) that characterize methods as they are actually used in the classroom, as opposed to how they are described by writers on methods. Observed differences between methods at the level of classroom processes and classroom discourse may be less marked than differences at the descriptive or theoretical level.
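The kind of analysis such observational studies involve can be sketched briefly: coded classroom turns are tallied and the distribution of question types compared with that of conversation outside the classroom. The transcript and codes below are invented; they merely illustrate the sort of display-versus-referential question coding used in classroom discourse studies such as Long and Sato's.

```python
# Illustrative sketch with an invented transcript: tallying coded teacher
# questions from classroom observation. "display" = question whose answer
# the teacher already knows; "referential" = genuine information question.
from collections import Counter

coded_turns = [
    ("T", "display"),      # "What color is this pen?"
    ("T", "referential"),  # "What did you do at the weekend?"
    ("T", "display"),
    ("T", "display"),
    ("S", "referential"),  # a student question, excluded from the teacher tally
]

teacher_questions = Counter(code for speaker, code in coded_turns if speaker == "T")
total = sum(teacher_questions.values())
for code, n in teacher_questions.items():
    print(f"{code}: {n} ({n / total:.0%} of teacher questions)")
```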
Swaffar, Arens, and Morgan (1982), for example, conducted a study of differences between what they termed rationalist and empiricist approaches to foreign language instruction. By a rationalist approach they refer to process-oriented approaches in which language learning is seen as an interrelated whole, where language learning is a function of comprehension preceding production, and where it involves critical thinking and the desire to communicate. Empiricist approaches focus on the four discrete language skills. Would such differences be reflected in differences in classroom practices?
One consistent problem is whether or not teachers involved in presenting materials created for a particular method are actually reflecting the underlying philosophies of these methods in their classroom practices. (Swaffar et al. 1982: 25)
Swaffar et al. found that many of the distinctions used to contrast methods, particularly those based on classroom activities, did not exist in actual practice.
Methodological labels assigned to teaching activities are, in themselves, not informative, because they refer to a pool of classroom practices which are used uniformly. The differences among major methodologies are to be found in the ordered hierarchy, the priorities assigned to tasks. (1982: 31)
The implications of these findings for the study of methods are profound. They suggest that differences among methods of the kind highlighted in the present analysis need to be complemented by observational studies of methods as they are implemented in classrooms. For example, what kinds of techniques and strategies do teachers operating with different methods use for such tasks as clarifying meanings of words, eliciting repetition, giving feedback, correcting errors, giving directions, and controlling learner behavior? What patterns of turn taking are observed? What is the nature of teacher and learner discourse, both quantitatively and qualitatively, and how do these, as well as the other features noted here, vary according to level? We know a great deal about methods and approaches at the level of philosophy and belief, that is, in terms of how the advocates of a particular method believe a method or technique should be used; but few data are available on what actually happens to methods when teachers use them in the classroom. It is no exaggeration to say that in reality, there is virtually no literature on the Natural Approach, Communicative Language Teaching, the Silent Way, and so on; what we have is a number of books and articles on the theory of these methods and approaches, but almost nothing on how such theory is reflected in actual classroom practices and processes. Hence the crucial question is, Do methods really exist in terms of classroom practices, or do teachers, when using methods, in fact transform them into more complex but less distinctive patterns of classroom processes?
Effectiveness data
The third kind of information needed is data on the extent to which particular methods have been found to be effective. What is needed minimally for specific methods is (1) documented studies of instances where a method has been used with reference to a specific set of objectives and (2) reliable and valid measures of gains in proficiency made by learners relative to the objectives. Our profession will indicate its maturity by the candor with which we are able to design, carry out, and report measures of effectiveness in something like normal teaching circumstances. The need to provide such data is considered normal in most other areas of educational planning, but data of this kind are virtually nonexistent in the literature on language teaching methods. It is surely not too much to demand of method promoters documentation of instances where students have made gains in proficiency from being taught according to a particular approach or method. To demonstrate this, it is necessary not only to compare pretest and posttest results (and state clearly what is being tested) but to show that the results were achieved as a result of the method rather than despite it.
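At its simplest, the documentation called for here is a pretest/posttest comparison against the stated objectives; the sketch below uses invented scores and shows only the arithmetic of gain scores, not a full evaluation design.

```python
# Hedged sketch with invented scores: the minimal effectiveness record is a
# documented pretest/posttest comparison on a measure tied to the program's
# objectives.
pretest  = [42, 55, 38, 61, 47]   # proficiency scores before the course
posttest = [58, 70, 49, 72, 60]   # scores on the same (or an equated) measure afterwards

gains = [post - pre for pre, post in zip(pretest, posttest)]
mean_gain = sum(gains) / len(gains)
print(f"mean gain: {mean_gain:.1f} points")

# Note: a positive mean gain does not by itself show that the gain was achieved
# because of the method rather than despite it; that requires some form of
# comparison or control, as discussed under comparative data below.
```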
The St. Lambert French immersion program in Canada offers perhaps the closest one can come to a model evaluation study of this kind. In that project, a reasonably large number of students have been followed longitudinally over a six-year period, and their language progress and language attitudes have been measured against the standard of cohort groups of monolingual French and monolingual English students. An outline of the domains of the evaluation and summary statements of results in four of the domains will suffice to suggest the findings:
The evaluation covered seven separate domains:
1. English language arts.
2. French language arts.
3. French and English speaking skills.
4. French phonology.
5. Achievement in content subjects.
6. Intelligence.
7. Attitudes toward French Canadians, English Canadians, European French, and self.
In the area of English Language Arts (as measured by the Metropolitan Achievement Tests and the Peabody Vocabulary Test), the students in the experimental class performed as well as their English peers who had been educated in their native language.
In the area of French Language Arts, the bilingual students when compared with native French-speaking students are somewhat behind in vocabulary knowledge; write compositions in French which, although they contain no more grammatical errors, are less rich in content; and score at approximately the 60th percentile on a test of French achievement.
When asked to tell in English about a film they had been shown, the bilingual students performed similarly to their English instructed counterparts on all measures taken which included the number of episodes, details, and inferences recounted, as well as the number of false starts, grammatical self-corrections, and content self-corrections made. When asked to tell in French about the film, the bilingual students made more grammatical and content self-corrections than native French students but otherwise performed similarly to them.
A number of phonological traits not characteristic of French native speakers were noted in the speech of many of the bilingual children. They included the diphthongization of the mid-vowels, the aspiration of voiceless stops, and inappropriate placing of stress on the first syllable. (Swain and Barik 1978: 33)
Comparative data
The most difficult kind of data to provide is that which offers evidence that one method is more effective than another in attaining program objectives. St. Pierre (1979) describes the conventional method for such evaluations:
Both experimental and quasi-experimental evaluations exhibit many of the same ideal characteristics. Program goals subject to evaluation are selected, success criteria are stated, measures are selected/constructed, an evaluation design is developed, treatment and comparison groups are formed, data are collected and analyzed, conclusions about the effectiveness of the program are drawn, and a report is written. (St. Pierre 1979: 29)
However, the history of attempts at method comparisons should be kept in mind. Since the 1950s a number of ambitious attempts have been made to test the comparative effectiveness of methods. Most often, researchers have been unable to demonstrate the effectiveness of specific methods. For example, a major large-scale investigation of the Audiolingual Method (Smith 1970), like other methods studies before it, failed to demonstrate that the Audiolingual Method had any significant impact on improvement of language learning. As Kennedy observes,
The repeatedly ambiguous results of these and other attempts to demonstrate experimentally the superiority of one or another foreign language teaching method suggest, it would seem, not only that it is extremely difficult to compare methods experimentally, but, more important, that methodology may not be the critical variable in successful foreign language teaching. (Kennedy 1973: 68)
Critics of the conventional model have noted that "not all sciences are experimental; not all aspire to be. An approach to evaluation that stresses the experimental test of causes is not ipso facto a more scientific approach" (Glass and Ellett 1980: 223).
One way to minimize the difficulties of large-scale comparative method evaluations is through studies that are much more restricted in scope. An example of an evaluation of this kind is a study by Wagner and Tilney (1983). The method they examined was derived from Suggestopedia (Lozanov 1978) and Superlearning (Ostrander, Schroeder, and Ostrander 1979). Advocates of Superlearning claim that learners can learn 2,000 lexical items in twenty-three hours by studying just three hours a day. Wagner and Tilney designed a study to evaluate these claims. In their study, twenty-one subjects were randomly assigned to one of three experimental treatments or modes of vocabulary presentation. The experimental group received German language training with Superlearning methodology. A second group received the same Superlearning methodology but without the use of Baroque music - the use of which is a key feature of Lozanov's method. A third group received language training in a traditional classroom format and served as a control group. Levels of vocabulary learning in each group were compared. The results revealed no significant improvement across the five-week experimental period. When modes of presentation were compared, those subjects taught by a traditional classroom method learned significantly more vocabulary than those taught according to Superlearning principles. Although this study involved a very limited number of subjects, it suggests how the specific claims of a method can be tested before a commitment is made to implementation on a wider scale.
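The statistical comparison in a small three-group design of this kind is straightforward to sketch; the scores below are invented and are not the Wagner and Tilney data, but the analysis (a one-way comparison of group means) illustrates how such claims can be put to the test.

```python
# Illustrative sketch with invented scores: comparing vocabulary posttest
# results across three conditions, mirroring a small three-group design.
from scipy.stats import f_oneway

superlearning_with_music = [21, 18, 25, 19, 22, 20, 23]
superlearning_no_music   = [20, 22, 19, 24, 21, 18, 20]
traditional_classroom    = [27, 30, 26, 29, 31, 28, 25]

f_stat, p_value = f_oneway(superlearning_with_music,
                           superlearning_no_music,
                           traditional_classroom)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# A significant F would be followed up with pairwise comparisons to see which
# conditions differ; with so few subjects per group, statistical power is low.
```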
None of the four levels of evaluation we have described here can be considered sufficient in itself. Descriptive data often lack reliability; they record impressions and recollections rather than facts. Observational data record processes and interactions but do not enable us to determine how these affect learning outcomes. Effectiveness data record results, but do not always tell us how or why these results were brought about. Comparative data likewise compare outcomes, but fail to take account of processes and actual classroom behaviors. The need for an integrated approach to evaluation is consequently stressed:
1. Evaluation ... can be seen as a continuing part of management rather than as a short-term consulting contract.
2. The evaluator, instead of running alongside the train making notes through the windows, can board the train and influence the engineer, the conductor and the passengers.
3. The evaluator need not limit his concerns to objectives stated in advance; instead he can also function as a naturalistic observer whose enquiries grow out of his observations.
4. The evaluator should not concentrate on outcomes; ultimately it may prove more profitable to study just what was delivered and how people interacted during the treatment process.
5. The evaluator should recognize (and act upon the recognition) that systems are rarely influenced by reports in the mail. (Ross and Cronbach 1976: 18)
Unfortunately, evaluation data of any kind are all too rare in the vast promotional literature on methods. Too often, techniques and instructional philosophies are advocated from a philosophical or theoretical stance rather than on the basis of any form of evidence. Hence, despite the amount that has been written about methods and teaching techniques, serious study of methods, either in terms of curriculum development practice or as classroom processes, has hardly begun. Few method writers locate methods within curriculum development, that is, within an integrated set of processes that involve systematic data gathering, planning, experimentation, and evaluation. A method proposal is typically a rationale for techniques of presentation and practice of language items. Seldom is it accompanied by an examination of outcomes or classroom processes. Language teaching has evolved a considerable body of educational techniques, and the quest for the ideal method is part of this tradition. The adoption of an integrated and systematic approach to language curriculum processes underscores the limitations of such a quest and emphasizes the need to develop a more rigorous basis for our educational practice.
Bibliography
Ashton-Warner, S. 1965. Teacher. New York: Bantam.
Cohen, D. N. 1984. Historical TEFL: a case study. RELC Journal 15(1): 30-50.
Curran, C. 1976. Counseling-Learning in Second Languages. Apple River, Ill.: Apple River Press.
Glass, G. V., and F. S. Ellett. 1980. Evaluation research. Annual Review of Psychology 31: 211-28.
Guba, E. G., and Y. S. Lincoln. 1981. Effective Evaluation: Improving the Usefulness of Evaluation Results Through Responsive and Naturalistic Approaches. San Francisco: Jossey-Bass.
Kennedy, G. 1973. Conditions for language learning. In J. W. Oller and J. C. Richards (eds.), Focus on the Learner, pp. 66-80. Rowley, Mass.: Newbury House.
Liskin-Gasparro, J. 1984. The ACTFL proficiency guidelines: an historical perspective. In T. Higgs (ed.), Teaching for Proficiency, pp. 11-42. Lincolnwood, Ill.: National Textbook Co.
Long, M. H. 1984. Process and product in ESL program evaluation. TESOL Quarterly 18(3): 409-25.
Long, M. H., and C. Sato. 1983. Classroom foreigner talk discourse: forms and functions of teachers' questions. In H. Seliger and M. Long (eds.), Classroom Oriented Research in Second Language Acquisition, pp. 268-86. Rowley, Mass.: Newbury House.
Lozanov, G. 1978. Suggestology and Outlines of Suggestopedy. New York: Gordon and Breach.
Ostrander, S., L. Schroeder, and N. Ostrander. 1979. Superlearning. New York: Dell.
Richards, J. C. 1984. Language curriculum development. RELC Journal 15(1): 1-29.
Ross, L., and L. J. Cronbach. 1976. Review of the Handbook of Evaluation Research. Educational Researcher 5(9): 9-19.
Smith, P. D. Jr. 1970. A Comparison of the Cognitive and Audiolingual Approaches to Foreign Language Instruction. Philadelphia: Center for Curriculum Development.
Swaffar, J. K., K. Arens, and M. Morgan. 1982. Teacher classroom practices: redefining method as task hierarchy. The Modern Language Journal 66(1): 24-33.
St. Pierre, R. G. 1979. The role of multiple analyses in quasi-experimental evaluations. Educational Evaluation and Policy Analysis 1(6): 5-10.