The use of scoring rubrics to determine clinical performance in the operating suite
Accepted 24 June 2008. Published online 28 August 2008.
Summary  This research evolved out of the need to examine the validity and inter-rater reliability of a set of performance-based scoring rubrics designed to measure competencies within the operating suite.
Method  Both holistic and analytical rubrics were developed, aligned to the ACORN Standard [Australian College of Operating Room Nurses Standard NR4, 2004. ACORN Competency Standards for Perioperative Nurses: Standard NR4: The Instrument Nurse in the Perioperative Environment. Australian College of Operating Room Nurses Ltd, Adelaide] and underpinned by the Dreyfus model (1981). Three video clips that captured varying performance of nurses performing as instrument nurses in the operating suite were recorded and used as prompts by expert raters, who judged the performance using the rubrics.
Results  The study found that the holistic rubrics led to more consistent judgments than the analytical rubrics, yet the latter provided more diagnostic information for intervention purposes. Despite less consistency, the Analytical Observation Form had sufficient construct validity to satisfy the requirements of criterion referencing as determined by the Item Separation Index (Rasch, 1960), including high internal consistency and greater inter-rater reliability when average ratings were used.
Conclusion  The study was an empirical investigation of the use of concomitant Analytical and Holistic Rubrics to determine various levels of performance in the operating suite, including inter-rater reliability. The methodology chosen was theoretically sound and sufficiently flexible to be used to develop other competencies within the operating suite.
Introduction  The context of the research investigation was in the field of perioperative nursing, with the focus of the study being on the development and testing of psychometric properties of the scoring rubrics when assessing the performance of the instrument nurse in the operating suite. Background  Despite 40 years of nursing research history into the development of a valid and reliable method of assessing the clinical performance of nurses, there are no universally accepted tools. This remains a matter of concern for the profession (Dolan, 2003, Robb et al., 2002, Watson et al., 2002). Therefore the development of analytical, validated tools for clinical assessment is justified (Chambers, 1998, Dumas et al., 2000, Dunn et al., 2000, Failla et al., 1999, Garland, 1996, Keonig et al., 2003, Nahas and Yam, 2001, Pelletier et al., 2000, Watson et al., 2002). The assessment of candidates in clinical practice presents a multitude of problems and is an issue that will not be easily resolved, with educationalists experiencing difficulties in developing a tool that includes objectivity, validity and reliability for the assessment of clinical competencies in nursing (Chambers, 1998, Wigens and Westwood, 2000, Wiles and Bishop, 2001). Ensuring any instrument developed to measure the construct of interest, in this case the operating room nurse competency, is reliable and valid is vital (Watson et al., 2002). While there is consensus that all instruments must be valid and reliable (Gillis and Bateman, 1999, Watson et al., 2002) these technical terms are applied to the instrument without sufficient evidence to support their development (Gillis and Bateman, 1999, Watson et al., 2002). There is little empirical nursing research into factors that influence the reliability and validity of assessment tools (Watson et al., 2002). 
Problems of objectivity, validity and reliability are experienced during the assessment of clinical competence in the nursing profession (Robb et al., 2002). Even when agreement has been reached on a definition of competence, and on the performance required to demonstrate it, the assessment requires some form of measurement. The person rating the performance is required to make a judgement about the performance of the candidate, giving rise to the problems surrounding subjectivity involved in assessment (Watson et al., 2002). Inter-rater reliability remains an issue where observation is the method of collecting evidence for determining the level of competence of nurses in specialty areas of practice, in which a multitude of factors impact on the assessment due to the complexity of the clinical setting. The ideal tool would therefore be simple and concise, and should be able to be used by multiple assessors in a wide range of settings (Wiles and Bishop, 2001). Method  Aim The unit of competency under consideration in this study was The Instrument Nurse from the Australian College of Operating Room Nurses Standards (ACORN, 2004). The aim of the study was to examine the validity and inter-rater reliability of a set of performance-based scoring rubrics designed to measure the competency of ‘The Instrument Nurse’. Research aims The research questions addressed in this investigation included:
1. To what extent were the ratings of the nurse educators consistent when using analytical and holistic rubrics to judge the performance levels of perioperative nurses?
2. What was the relationship between perceived competence and performance levels as defined by the Dreyfus Model of Skill Acquisition?
3. Does the use of analytical scoring rubrics in determining varying levels of clinical performance produce sufficient measures of inter-rater reliability?
4. What were the psychometric properties of the Analytical Observation Form?
Instrument development Development of the background questionnaire A questionnaire was developed to collect descriptive data from the assessors involved in the survey. Development of the Analytical Observation Form The unit of competency considered for the instrument required for the study was the ACORN Standard The Instrument Nurse (ACORN, NR4, 2004), which provides a minimum standard against which the performance of the instrument nurse could be assessed. A rubric is defined as “an assessment tool that uses clearly defined evaluation criteria and proficiency levels to gauge student achievement of those criteria” (Truemper, 2004, p. 562). A set of rubrics specifically designed to be assessed against the ACORN standard was developed, based on the criterion-referenced rating scale described by Benner (1984) and Bondy (1983), and underpinned by the Dreyfus model (1981). The rubrics contain descriptive statements clearly describing each level of behaviour (Gillis and Griffin, 2005), organised into categories that vary along a continuum (Allen, 2003), revealing to the candidate the criteria against which their work will be judged (Huba and Freed, 2000). A detailed analysis of the unit of competency was therefore undertaken to develop an initial item pool consisting of a set of specifications for the assessment of the instrument nurse. Each step of the rubrics development requires anticipating different qualities of performance from different candidates, providing the assessor with the most information for making a judgement. There were 12 items relating to areas of practice of the instrument nurse in the development of the instrument. Behavioural descriptors were developed for each item and numerically coded on a scale of one to four, where four represented the highest level of performance. Each item included an option for the assessor to select “not observed in the video”, which was coded zero.
The codes enabled a continuum of increasing competence to be demonstrated. Table 1 shows a sample item and its descriptors.

| Item | Quality indicator 1 | Quality indicator 2 | Quality indicator 3 | Quality indicator 4 |
|---|---|---|---|---|
| Setting up a sterile trolley | Would require continual supervision and prompting during setting up of trolley | Would require some prompting and assistance especially during handling of unfamiliar instrumentation | Becoming familiar with setting up equipment | Demonstrates independent practice, and would be able to adapt by modifying practice |
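The coding scheme described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the study's instrument: the names `NOT_OBSERVED`, `QUALITY_INDICATORS` and `tally_codes` are hypothetical, the descriptors are those of the sample item only, and the example ratings are made up.

```python
from collections import Counter

# Code 0 is the "not observed in the video" option; codes 1 to 4 are the
# quality indicators, with 4 the highest level of performance.
NOT_OBSERVED = 0

# Descriptors for the sample item "Setting up a sterile trolley".
QUALITY_INDICATORS = {
    1: "Would require continual supervision and prompting during setting up of trolley",
    2: "Would require some prompting and assistance especially during handling of unfamiliar instrumentation",
    3: "Becoming familiar with setting up equipment",
    4: "Demonstrates independent practice, and would be able to adapt by modifying practice",
}


def tally_codes(codes):
    """Count how many raters chose each code for a single item."""
    valid = {NOT_OBSERVED, *QUALITY_INDICATORS}
    for code in codes:
        if code not in valid:
            raise ValueError(f"invalid rating code: {code!r}")
    return Counter(codes)


# Made-up ratings from five raters for one item (not data from the study).
print(tally_codes([4, 4, 3, 0, 4]))
```

A tally of this kind is one way to inspect the pattern of ratings for an item before any reliability statistic is calculated.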
Development of the holistic competence rubrics A number of studies have shown that analytical rubrics can be used by subject matter experts to identify broad band levels of performance as well as determine cut points for competent/not yet competent decisions (Bateman, 2003, Connally, 2004, Connally et al., 2003, Griffin et al., 2004, Nicholson, 2004). The final stage of the instrument development was the development of holistic rubrics to be used by subject matter experts to identify broad band levels of performance and to determine the cut-off between competent and not yet competent decisions (Bateman, 2003, Connally et al., 2003, Connally, 2004, Griffin et al., 2004). Four band levels were developed in accordance with the Dreyfus Model of Skill Acquisition, namely: Beginning Practitioner, Developing Practitioner, Consolidating Practitioner and Effective Practitioner. The band levels Beginning, Developing and Consolidating Practitioner contained the majority of the descriptors, as these are the expected performance levels of candidates with less than 6 months experience in the operating suite. Greater emphasis was also placed on the lower levels of the instrument to allow for greater discrimination of candidates at or below the Developing Practitioner level. The instrument was designed to collect information about the expected cut-off between competent and not yet competent performance. The information was to be used to explore the relationship between competence decisions and the Dreyfus Model of Skill Acquisition for perioperative nurses against the ACORN Standard. Development of the video clips Video clips were produced to match the ACORN Competency Standard (ACORN NR4, 2004) and Analytical Observation Form.
Recording of the video clips was completed by the researcher, an experienced perioperative nurse. Awareness of the sterile field and restricted areas in the Operating Suite was therefore maintained, reducing the risk of inadvertent contamination during the recording of the video clips. Five perioperative nurses (RNs) with various levels of proficiency and experience as instrument nurses were recorded in the video clips. These nurses included expert, consolidating and developing practitioners, categorised according to Benner (1984), so that the range of competencies and experiences likely to be encountered in the operating suite was captured and could be measured against the standard. The final video clips used in the study were:
• Video clip A depicted the performance of an expert practitioner.
• Video clip B depicted the performance of a practitioner at a developing level of practice.
• Video clip C depicted a perioperative nurse performing at the level of consolidating practitioner.
Once the perioperative nurses were recruited, the surgeons, anaesthetists and perioperative nurses in the operating suite were approached to obtain informal consent prior to recording of the video clips. Confidentiality of the ‘video actors’ was also ensured by staff wearing face masks during the surgical procedure, making identification of the person difficult. Editing was undertaken to prevent identification of the perioperative nurse, thereby decreasing bias that may have occurred during data collection. Final editing resulted in a 10 min video clip which showed the perioperative nurse functioning as an instrument nurse in the operating suite. Data collection  Sample composition Forty perioperative nurses (nurse unit managers, nurse educators and preceptors) employed in public and private hospital operating suites in Victoria participated in the study.
These nurses provide clinical support for graduate and postgraduate students and were recruited through a letter of invitation to participate in the study. Staff employed at eight metropolitan and one rural hospital involved in educating graduate and postgraduate students within the Operating Suite were invited to participate in the study. The order in which video clips were viewed by the participants was changed for each group to minimise the risk of observer fatigue. Participants were instructed to complete the instrument by making an independent judgment of each segment using the Analytical Observation Form and completing the Holistic Competence Rubrics. Participants were also requested to supply demographic information by completing the background questionnaire. Each participant was assigned a unique identification code therefore names and contact details were not collected. Participants signed the consent form and placed the three completed instruments, one for each video, background questionnaire and consent form in an envelope which was collected by the researcher on completion of the survey. Each survey was facilitated by the researcher. Rubric and questionnaire review and pilot testing the video clips  Prior to data collection a panel of five expert perioperative nurses with previous workplace assessment experience evaluated the Analytical Observation Form, the Holistic Rubrics and the appropriateness of the edited video clips. They were asked to assess whether they addressed the requirements of the ACORN standards, whether the sampling population would understand the questionnaire and the clarity of the instructions for completing the questions. Only minor changes were required to be made to the Analytical Observation Form to clarify the performance criteria. Ethical considerations  The intrusiveness of video clips compared to observational fieldwork needed to be considered when the study was being designed. 
Patients are vulnerable in the operating suite and it was important to protect their privacy by editing the video clips to ensure their identity was not revealed. The intrusive nature of video recording was addressed by providing adequate preparation of the patient and perioperative nurse before the video clips were recorded. Instructions were provided prior to recording of the videos to reduce the anxiety of appearing on camera (Roberts et al., 1996), including adequate preparation of the perioperative nurses for the expected outcome of the recording session. Patients’ identity was protected at all times using an anaesthetic screen and surgical drapes, and by editing the video clips to remove any material permitting clear identification of the patient. Ethics approval was granted by the Human Research and Ethics Committees of the hospitals involved in the study. Data analysis  Calibrating the instrument using item response modelling procedures Each item in the Analytical Observation Form was calibrated using Item Response Modelling (IRM) (Rasch, 1960). Two measures were used to determine the precision and accuracy of the Analytical Observation Form:
1. The standard error of measurement for each of the item difficulty estimates, which provides an estimate of its precision. The standard error of measurement for each item was calculated by determining the difference between the true (or modelled) item difficulty and the estimated item difficulty using the responses of all raters to that particular item (Wright and Stone, 1979).
2. A measure of the extent to which the data fitted the Rasch model, to investigate how accurately the variable can predict performance ability.
Errors within the evidence pertaining to the performance observed include the complexity of the clinical context (Dunn et al., 2000).
This evidence would need to be considered during observation of nurses because competency is attributed according to inferences drawn from performing complex tasks (Griffin, 1995, Masters, 1993) and continual consideration of the patient’s physiological status. Therefore, the Item mean square range of 0.5–1.7 was considered to be acceptable in the current study (Wright et al., 1994). Traditional approaches to estimating reliability include Cronbach’s alpha co-efficient of reliability (Burns and Grove, 2001). When estimating internal consistency it is assumed that the raw score is composed of two components, a true score and an error component. Cronbach’s (1971) approach expresses reliability as the ratio of the true score variance to the variance of the observed scores. The measurement may vary from 0.00 to 1.00. According to Griffin (1997), the closer the co-efficient is to 1.0 the more internally consistent the instrument is, reflecting item sampling adequacy in the development of the rubrics. To examine the reliability of the judgments made using the Analytical Observation Form within each video segment, the data were analysed using an intraclass correlation (ICC) ANOVA analysis of ratings. In the study design, the raters all rated the same video segments using the same rubrics. Two measures of reliability were examined:
1. The reliability of the raters, referred to as ICC Single, in which the unit of analysis is the individual item ratings.
2. The reliability of the mean of all the raters, referred to as ICC Average, where the unit of analysis is the mean rating of the item.
The ICC Average measure produces higher inter-rater reliability estimates than the ICC Single measure. ICC Single is the important measure of inter-rater reliability in settings in which individuals make independent judgments. ICC Average is important in instances where a panel of experts are independently involved in any decision-making. Therefore, the ICC Average is a more appropriate estimate of inter-rater reliability.
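The two ICC measures can be illustrated with a short calculation. The sketch below assumes a two-way random effects model with absolute agreement (often written ICC(2,1) and ICC(2,k)); the source states only that a two-way random effects ANOVA was run, so the exact variant is an assumption, and the function name `icc_two_way_random` and the example ratings are hypothetical rather than taken from the study.

```python
def icc_two_way_random(ratings):
    """Return (ICC Single, ICC Average) for a table of ratings.

    Rows are the rated units (here, items) and columns are raters.
    Assumes a two-way random effects model with absolute agreement.
    """
    n = len(ratings)       # number of rated units (rows)
    k = len(ratings[0])    # number of raters (columns)
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Reliability of one rater's rating.
    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Reliability of the mean rating across all k raters.
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Made-up ratings: six rated units (rows) scored by four raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = icc_two_way_random(example)
print(f"ICC Single = {single:.2f}, ICC Average = {average:.2f}")
```

In this toy table the raters differ markedly in their overall severity, so the single-rater estimate is low while the estimate for the mean of the four raters is considerably higher, which is the pattern the two measures are designed to separate.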
Discussion/results  A descriptive analysis of the patterns of ratings was undertaken to determine agreement among raters using the Analytical Observation Form, the Holistic Performance Level Rubrics and the Holistic Competence Rubric for each video clip. Patterns of ratings across raters for the 12 item Analytical Observation Form A descriptive analysis of the patterns of rating to determine inter-rater agreement across the 12 items revealed certain trends. Figures 1 and 2 show the results for video clips A and B respectively. Certain items appear to have produced the greatest variability, with no predominant quality indicator emerging for the observations related to video clip A. The results for video clip C, as shown in Fig. 3, indicate a high level of agreement in two of the more difficult indicators, with a few raters selecting the lower indicators. There appeared to be a high level of consistency for items 2 ‘Gowning and gloving’, 4 ‘Setting up a trolley’, 5 ‘Sterile field’, 7 ‘Surgical count’ and 8 ‘Prepping and draping’. It is possible that these items were easier to observe in the video clips and therefore produced a more consistent pattern of rater agreement. Item 1, ‘Surgical scrub’, was rated consistently and had more variability among the quality indicators across all three video clips. The item was easy to observe in the video clips but, due to the editing required in order to present a 10 min video clip of the performance of the instrument nurses, the actual scrub time of 5 min was not observed, which may have influenced the scores for the item. There was less consistency among raters for items 9 ‘Standard and additional precautions’ and 10 ‘Surgical team interventions’. It is possible the video clips did not provide sufficient opportunity for the raters to observe the tasks being performed. Most raters consistently did not observe items 3 ‘Risk management’, 6 ‘Patient positioning’, 11 ‘Prosthesis and implants’ and 12 ‘Environmental control’.
These items were performed and shown in the video clips but there may not have been adequate opportunity for the raters to view them during the study. Patterns of ratings across raters using the holistic performance level rubric and the holistic competence rubric An analysis of the patterns of ratings for each video clip was undertaken to determine the level of agreement among raters in determining the overall level of performance according to the Holistic Performance Level Rubric, and the overall competency level (competent versus not yet competent on a dichotomous scale), for each of the video clips. These results are shown in Tables 2 and 3 respectively.

| Level of performance | Video A (n = 40) | Video B (n = 40) | Video C (n = 40) |
|---|---|---|---|
| Effective Practitioner | 55% | | 70% |
| Consolidating Practitioner | 35% | | 22.5% |
| Developing Practitioner | 10% | 35% | 7.5% |
| Beginner Practitioner | | 62.5% | |

| Level of performance | Video A (n = 40) | Video B (n = 40) | Video C (n = 40) |
|---|---|---|---|
| Competent | 97.3% | 83.3% | 97.3% |
| Not yet competent | 2.7% | 16.7% | 2.7% |

The Holistic Performance Level Rubric, which comprised a four-point behaviourally anchored rating scale (1 = Beginner, 2 = Developing, 3 = Consolidating and 4 = Effective practitioners), produced higher levels of agreement among the 40 raters than the 12-item Analytical Observation Form. In fact, the level of consistency of ratings was greatest for the dichotomous Holistic Competence Rubric, followed by the Holistic Performance Level Rubric and then the Analytical Observation Form. These findings suggest that the finer the distinction to be made, the more ratings tend to vary. Although a higher level of inter-rater reliability was found using the dichotomous Holistic Competence Rubric, it provides little diagnostic information about the development of the candidate. The compromise may be to use the Holistic Performance Level Rubric, in which each of the four band levels has a profile description that can be used for intervention purposes (Griffin et al., 2004). Adopting this practice means the level of specificity of the scoring rubric will depend on the purpose of the assessment (Gillis and Griffin, 2005). The relationship between perceived competence and performance levels as defined by the Dreyfus model of skill acquisition Cross tabulation was performed to determine the interaction between the holistic ratings of the performance level and the overall dichotomous rating of the instrument nurse as competent/not yet competent, to provide an indication of the level of performance that would be acceptable to nursing. Ninety per cent of the raters who selected the first performance level descriptor ‘Beginning Practitioner’ for the three video clips agreed that the performance indicated that the practitioner was ‘Not yet competent’. Sixty per cent of the raters who selected the second performance level descriptor ‘Developing Practitioner’ also thought that the three video clips reflected not yet competent performance at this level.
There was 100% agreement among raters that performance level descriptors 3 ‘Consolidating Practitioner’ and 4 ‘Effective Practitioner’ represented competent performance, which indicates that a rating of 3 suggests the practitioner is competent. Reliability of the analytical scoring rubric The results of internal reliability scores of the Analytical Observation Form for each video clip rated by the same forty raters are shown in Table 4.

| | Internal reliability score (n = 40) |
|---|---|
| Video Clip A | 0.98 |
| Video Clip B | 0.97 |
| Video Clip C | 0.98 |

The Analytical Observation Form used in the present study had high internal consistency (0.97 to 0.98). Furthermore, the 2-way ANOVA random effects model, analysed using SPSS, produced acceptable inter-rater reliability estimates. The ICC Single measure had reliability of 0.51 to 0.61, whilst the ICC Average measure had excellent reliability (0.98 for each video clip to two decimal places) (Cicchetti and Sparrow, 1981, Fleiss, 1981). These results have implications for assessing nursing clinical practice. Usually only one rater assesses the candidate’s performance in the operating suite. Therefore, the results obtained for the single and average measures of inter-rater reliability in the current study using the rubrics suggest that, while current practice appears to be acceptable in the perioperative setting, it could be improved if the average ratings of multiple, independent assessors were used in future assessments. Both measures of ICC are included to show the reliability of both one and multiple raters across the items. See Table 5. The results indicate that the inter-rater reliability of the instrument was greater when the average rating of the assessors was used to calculate reliability.

| | Video A (n = 40) | Video B (n = 40) | Video C (n = 40) |
|---|---|---|---|
| ICC Single | 0.612 | 0.508 | 0.545 |
| ICC Average | 0.984 | 0.976 | 0.980 |
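The gap between the single and average measures follows from the number of raters whose ratings are averaged. As a hedged check, the sketch below applies the Spearman-Brown step-up formula to the reported ICC Single values with 40 raters per video. The function name `average_from_single` is hypothetical, and the sketch assumes the two measures were derived from the same two-way random effects analysis, which the source does not spell out.

```python
def average_from_single(icc_single, n_raters):
    """Spearman-Brown step-up: reliability of the mean of n_raters ratings."""
    return n_raters * icc_single / (1 + (n_raters - 1) * icc_single)


# ICC Single values reported above, with 40 raters per video clip.
reported_single = {"A": 0.612, "B": 0.508, "C": 0.545}
for video, single in reported_single.items():
    stepped_up = average_from_single(single, 40)
    print(f"Video {video}: ICC Average is approximately {stepped_up:.3f}")
```

Under these assumptions the stepped-up values agree with the reported ICC Average figures to three decimal places. Calling the same function with a smaller `n_raters` gives a rough indication of what a panel of only two or three assessors might achieve, although that use is an extrapolation and not a result of the study.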
The calibration of the Analytical Observation Form was based on the assumption that responses to the 12 items were determined by the raters’ observation of the three video performances and that the responses could be used to measure the underlying latent continuum. The underlying continuum was constructed to describe increasing levels of competence in the competency unit The Instrument Nurse. In total 120 observations were made, 40 observations per video. Both Classical and Rasch analyses of the scale data were undertaken. The results from the Rasch analysis revealed that, with the exception of two items (i.e. 4 and 9), all items fitted the model according to the rule of thumb for making clinical observations (Wright et al., 1994). A review of Item 9 indicated that while the items did fit the Rasch (1960) model, the video may not have provided sufficient evidence for assessing the item. Therefore it should be retained if the rubrics are implemented as an assessment tool for use in real time assessment, in which case it should be reworded. The relative difficulty of items indicated there may be as many as five band levels for the scale, as opposed to the four levels predicted from the Dreyfus Model. It was evident from the analysis that the five performance levels cover a wide range of difficulty. The most difficult item on the Analytical Observation Form was related to caring for and maintaining implants and prostheses, followed by control of environmental factors within the perioperative environment. The inclusion of the four steps for prosthesis and implants in the most difficult level indicates that the item should be removed if the video clips are used for further research, or be validated by using the instrument in the clinical setting. Limitations of the study This study had several limitations and these include:
• The study was limited to one unit of competency within the perioperative setting.
• The use of the video clips during the survey required editing in order to meet the ethical considerations for the study. This presented various difficulties during the survey due to the lack of sound and identification of the person being observed once surgery commenced. All members of the sterile team wore green gowns and therefore certain movements of the ‘actor’ in the video clip, such as prepping and draping, made the observation more difficult.
• Editing was required by the ethics committee and included removing the sound and de-identifying the person performing the role of the instrument nurse, which made viewing of certain tasks more difficult, and may have resulted in the items receiving a score of ‘not observed’. Item 11 was present in video clips A and C and was consistently not observed in any of the video clips. The lack of sound may have affected the observation process because certain quality indicators would have been more apparent if sound was present due to the communication that occurs between the surgeon and instrument nurse.
Despite these limitations, some conclusions can be reached on the basis of the data analysed and an evaluation of the use of the scoring rubrics in the Operating Suite explored. Clinical implications The Analytical Observation Form and the Holistic Performance Level Rubric satisfy the requirements of criterion referencing and hence competence assessment which enhances the content, construct and criterion validity as well as the inter-rater reliability of the assessment. It also minimises the implementation costs associated with the assessment because it enables the same evidence to be used to report a range of assessment outcomes such as competent/not yet competent and the performance level achieved in terms of the Dreyfus model as well as providing meaningful information for summative, diagnostic, and formative assessments.
The inter-rater reliability of the instrument was greater when the average ratings of assessors were used to calculate the reliability rather than the individual rating. In the nursing profession it is usual practice to use a single rater. Inter-rater reliability would therefore be enhanced if future assessments were made using the average ratings of more than one independent assessor. However, researchers would need to investigate the practical implications of such a change in the hospital setting, in particular the operating suite. Conclusion  The current study provided an empirical investigation of the use of both Analytical and Holistic Rubrics in determining various levels of performance in the operating suite, including inter-rater reliability. The methodology chosen was theoretically sound and was sufficiently flexible to be used for the development of other competencies within the operating suite (Nicholson, 2005). Given that there is limited literature on the factors that impact on inter-rater reliability, including the development of valid and reliable assessment processes in nursing, there is evidence to support further research in this area. Furthermore, as the current study was limited to metropolitan hospitals within Melbourne and included only one rural hospital, a much larger study that would include nurses from rural and interstate hospitals is required to generalise the findings. A more detailed exploration of inter-rater reliability and the use of assessment scoring rubrics in determining the competency level of perioperative nurses is required, especially given the concerns surrounding the competencies required by nurses working within complex areas such as the operating suite. References
1. Allen, M., 2003. Using Scoring Rubrics (retrieved 07.08.04). <http:www//calstate.edu/AcadAff/SLOA/links/using_rubrics.shtml>.
2. Australian College of Operating Room Nurses, 2004.
ACORN Competency Standards for Perioperative Nurses: Standard NR4: The Instrument Nurse in the Perioperative Environment. Australian College of Operating Room Nurses Ltd, Adelaide.
3. Bateman, A., 2003. The appropriateness of professional judgement to determine rubrics in competency based assessments. Unpublished Master of Assessment and Evaluation Thesis, Springer, Victoria, Australia.
4. Benner P. From Novice to Expert. Excellence and Power in Clinical Nursing Practice. London: Addison-Wesley Publishing Company; 1984.
5. Bondy K. Criterion-referenced definitions for rating scales in clinical evaluation. Journal of Nursing Education. 1983;22(9):376–382.
6. Burns, N., Grove, S.K., 2001. The Practice of Nursing Research. Conduct, Critique, & Utilization, fourth ed. WB Saunders Company, Philadelphia.
7. Chambers M. Some issues in the assessment of clinical practice: a review of the literature. Journal of Clinical Nursing. 1998;7:201–208.
8. Cicchetti DV, Sparrow SA. Developing criteria for establishing interrater reliability for specific items: applications to assessment for adaptive behaviour. American Journal of Mental Deficiency. 1981;86(2):127–137.
9. Connally, J., 2004. A multi source management approach to the assessment of higher order competencies. Unpublished Doctoral Dissertation, The University of Melbourne, Victoria, Australia.
10. Connally, J., Jorgensen, K., Gillis, S., Griffin, P., 2003. An integrated approach to the assessment of higher competencies. Refereed paper presented at the Sixth VET Research Association Conference “The changing face of VET”, Sydney, 9–11 April.
11. Cronbach LJ. Test validation. In: Thorndike RL, editor. Educational Measurement. second ed. Washington DC: American Council of Education; 1971. p. 442–507.
12. Dolan G.
Assessing student nurse clinical competency: will we ever get it right?. Journal of Clinical Nursing. 2003;12:132–141. MEDLINE |
CrossRef
13. Dumas L, Villeneuve J, Chevrier J. A tool to evaluate how to learn from experience in clinical settings. Journal of Nursing Education. 2000;39(6):251–258.
14. Dunn SV, Lawson D, Robertson S, Underwood M, Clark R, Valentine T, et al. The development of competency standards for specialist critical nurses. Journal of Advanced Nursing. 2000;31(2):339–346.
15. Failla S, Maher MA, Duffy CA. Evaluation of graduates of an associated degree nursing program. Journal of Nursing Education. 1999;38(2):62–68.
16. Fleiss JL. Statistical Methods for Rates and Proportions. second ed. New York: John Wiley & Sons; 1981.
17. Garland GA. Self report of competence. A tool for the staff development specialist. Journal of Nursing Staff Development. 1996;12(4):191–197.
18. Gillis S, Bateman A. Assessing in VET: Issues of Reliability and Validity. Adelaide: National Centre for Vocational Education Research; 1999.
19. Gillis, S., Griffin, P., 2005. Principles underpinning graded assessments in VET. A Critique of Prevailing Perceptions. Assessment Research Centre, The University of Melbourne.
20. Griffin P. Competency assessment: avoiding the pitfalls of the past. Australian New Zealand Journal of Vocational Education Research. 1995;3(2):34–59.
21. Griffin P. Assessing and reporting outcomes. In: Griffin P, Smith P, editors. Outcome-Based Education: Issues and Strategies for Schools. Canberra: ACSA; 1997. p. 10–20.
22. Griffin, P., Gillis, S., Calvitto, L., 2004. Connecting competence and quality: scored assessment in Year 12 VET. NSW Board of Vocational Education and Training.
23. Huba M, Freed J. Using Rubrics to provide feedback to students. In: Huba M, Freed J, editors. Learning-centred Assessment on College Campus: Shifting the focus from teaching to learning. Boston: Allyn & Bacon; 2000. p. 151–200.
24. Keonig K, Johnson C, Morana CK, Ducette JP. Development and validation of a professional behavior assessment. Journal of Allied Health. 2003;32(2):86–91.
25. Masters, G., 1993. Certainty and probability in assessment of competence. Paper presented at the VEETAC National Assessment Research Forum on Competency-based Assessment Issues, Sydney.
26. Nahas VL, Yam BMC. Hong Kong nursing students’ perceptions of effective clinical teachers. Journal of Nursing Education. 2001;40(5):233–237.
27. Nicholson, K., 2004. Trial of a Standard Referenced Framework for the defining and Measuring of the Manutention Competency. Unpublished Master of Assessment and Evaluation Thesis, The University of Melbourne, Victoria, Australia.
28. Nicholson, P., 2005. Nurse educators use of scoring rubrics to determine varying levels of clinical performance in the perioperative setting. Unpublished Master of Assessment and Evaluation Thesis, The University of Melbourne, Victoria, Australia.
29. Pelletier D, Duffield C, Nagy S, Mitten-Lewis S. Australian nurse educators identify gaps in expert practice. The Journal of Continuing Education in Nursing. 2000;31(5):224–231.
30. Rasch G. Probabilistic Models for Some Intelligence and Attainment Tests. Chicago: University of Chicago Press; 1960.
31. Robb Y, Fleming V, Dietert C. Measurement of clinical performance of nurses: a literature review. Nurse Education Today. 2002;22:293–300.
32. Roberts BL, Srour M, Winkelman C. Videotaping: an important research strategy. Nursing Research. 1996;45(6):334–335, 338.
33. Truemper CM. Using scoring rubrics to facilitate assessment and evaluation of graduate-level nursing students. Journal of Nursing Education. 2004;43(12):562–564.
34. Watson R, Stimpson A, Topping A, Porock D. Clinical competence assessment in nursing: a systematic review of the literature. Journal of Advanced Nursing. 2002;39(5):421–431.
35. Wigens L, Westwood S. Issues surrounding educational preparations for intensive care nursing in the 21st century. Intensive and Critical Care Nursing. 2000;16:221–227.
36. Wiles LL, Bishop JF. Clinical performance appraisal: renewing graded clinical experience. Journal of Nursing Education. 2001;40(1):37–39.
37. Wright BD, Stone MH. Best Test Design. Chicago: MESA Press; 1979.
38. Wright BD, Linacre JM, Gustafson JE, Martin-Lof P. Reasonable mean-square fit values. Rasch Measurement Transactions. 1994;8(3):370.
a School of Nursing, Level 5, 234 Queensberry Street, Carlton, 3010 Melbourne, Victoria, Australia
b Assessment Research Centre, The University of Melbourne, Victoria 3010, Australia
c Department of Endocrinology and Diabetes, St. Vincent’s Health, 41 Victoria Parade, Fitzroy, Victoria 3065, Australia
Corresponding author. Tel.: +61 3 8344 9415; fax: +61 3 9347 4172.
PII: S0260-6917(08)00079-8 doi:10.1016/j.nedt.2008.06.011 © 2008 Elsevier Ltd. All rights reserved. | |
|