This post is part of a series describing the assessments used to develop the Institute for Research Design in Librarianship (IRDL).
Participants submitted a research proposal with their applications and then revised it during and after the Summer Research Workshop. Once the revised proposals were submitted, one month after the end of the workshop, we conducted a review process to compare the pre-workshop proposals (the proposal submitted as part of the application) with the post-workshop proposals (the proposal as revised during and after the workshop) on multiple criteria.
We developed the evaluation process to learn whether there was a change in how the Scholars wrote about their research plans from before the workshop to after. To measure any change, we created a rubric that guided us to look in the proposals for mastery of specific components of their research plans. One of the components, for example, was scope: whether the design the Scholar selected was appropriate for their project’s needs. We wanted the Scholar to design a project that was reasonable for a novice researcher to complete, given their needs for information, resources, and time. On the rubric for that component, the score depended on how many components of the proposal could feasibly be completed, based on the information needed, resources needed, and timeline identified. A Scholar would score a 1 (Beginning) if few of the components could feasibly be completed, a 2 (Developing) if some could, a 3 (Accomplished) if many could, and a 4 (Exemplary) if the majority could.
During the workshop, in group discussions and one-on-one consultations with experts, we worked with each Scholar to design their projects so that they would have success within the one-year program time frame. This often meant treating a project as a phased project, so they fully completed a part of the project within the one year, before moving on to the next phase. If a Scholar was committed to completing their full project in one year, we discussed alternatives for the amount or type of data that could be gathered or prompted them to seek help from a colleague to share the work. Our expectation for each Scholar was that they would leave the workshop with a project that was in scope for their individual work-life situation, so they would be more likely to be successful in completing it. We expected to see in their revised proposal specifics about the steps they would take within the year, with the various phases of the project mapped to a timeline.
Each of the components measured on the rubric was specifically addressed during the workshop. The other components were: significance and purpose, wanting the Scholar to demonstrate a consideration of the potential impact of their research; research question/hypothesis, wanting the Scholar to produce a project that relies on a well-constructed research question or hypothesis; literature review organization, wanting the Scholar to construct a clear review of the literature; literature review explanation, wanting the Scholar to make effective use of the literature to support the thesis in their proposed project; methods – research design, wanting the Scholar to select methods relevant to their problem statement; methods – context, population, and sampling, wanting the Scholar to select appropriate members of the population to include in the project; and methods – procedures, wanting the Scholar to consider the permissions, treatments, and data gathering. All of these components were measured on a 1 to 4 scale, with a rationale for each.
To prepare the proposals for evaluation, the office of the LMU Director of Assessment, Laura Massa, redacted each one to remove any identifying information (personal names, institution names) and anonymized it, assigning only a manuscript number. Massa’s team produced an evaluation packet for each reviewer, which included a mix of pre-workshop and post-workshop proposals; each reviewer was presented with a unique mix of proposals. A printed scoring worksheet was attached to each proposal. Three of the proposals were removed from the evaluation process so that they could be used during the norming process.
We then invited a group of regional librarians to act as evaluators over a two-day period. The in-person review was held in a large room in the LMU library. We provided breakfast, lunch, and snacks on each day, and offered an honorarium to each reviewer. At the outset of the review process, each participant signed a form saying that they would honor the intellectual property presented in the proposals and not attempt to conduct any of the research described in the proposals they read.
During the first day of the evaluation process, Massa spent the majority of the day norming the rubric. This entailed talking through each of the components of the rubric and describing what each measurement meant. The participants then attempted to score a sample proposal and shared their scores with the group. Massa led a discussion about the scores, to resolve any variation among the scores, so that each reviewer had a clearer understanding about what each possible score meant. The group then scored another sample proposal and then again shared their scores and discussed any variation, so that the reviewers in the end were scoring in the same way.
The rest of the first day and the full second day were given to quiet time for the reviewers to evaluate the proposals in their packets. The reviewers were not told which proposals were pre-workshop and which were post-workshop; each proposal appeared in two separate reviewers’ packets. Massa was available for consultation throughout the review process, referring any reviewer with a scoring question back to the earlier norming discussions. When the reviewers completed their evaluations, they gave the entire packet and scoring sheets back to Massa and were excused.
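The packet setup described above (anonymized manuscripts, each one read by two different reviewers) can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure Massa’s office used; the function name `build_packets`, the manuscript numbers, and the reviewer labels are all hypothetical.

```python
import random


def build_packets(manuscript_ids, reviewers, seed=None):
    """Place every manuscript in the packets of two different reviewers.

    Hypothetical helper for illustration only. Reviewer labels must be
    distinct, and the manuscript IDs are assumed to be already anonymized
    so that nothing reveals whether a proposal is pre- or post-workshop.
    """
    if len(reviewers) < 2:
        raise ValueError("need at least two reviewers")

    rng = random.Random(seed)
    shuffled = list(manuscript_ids)
    rng.shuffle(shuffled)  # mixes pre- and post-workshop manuscripts

    packets = {reviewer: [] for reviewer in reviewers}
    n = len(reviewers)
    for i, manuscript in enumerate(shuffled):
        # Consecutive slots in a round-robin always land on two
        # different reviewers and keep packet sizes nearly equal.
        first = reviewers[(2 * i) % n]
        second = reviewers[(2 * i + 1) % n]
        packets[first].append(manuscript)
        packets[second].append(manuscript)
    return packets


# Made-up example: 20 manuscript numbers shared among 5 reviewers.
packets = build_packets(range(101, 121), ["R1", "R2", "R3", "R4", "R5"], seed=1)
```

A round-robin like this keeps the reading load even, but it does not by itself guarantee that every reviewer’s mix is unique; that would need an extra check.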
Massa’s office calculated the scores for each Scholar and for the group overall, using a paired-samples statistic and effect sizes in SPSS. Massa returned to us an anonymized report and data showing that in each case the post-workshop score was greater than the pre-workshop score. This is what we were hoping to learn: what the Scholars gained from the workshop was evident in the writing about their research projects, and they had increased their mastery of every component that was addressed in the curriculum.
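For readers who have not run a paired comparison, here is a minimal Python sketch of the kind of calculation involved. The actual analysis was done in SPSS, and I do not know exactly which test or effect size formula was reported, so this assumes a paired-samples t statistic and Cohen's d computed on the pre/post differences. The function name and the scores are made up for illustration; they are not IRDL data.

```python
import math
import statistics


def paired_summary(pre, post):
    """Paired-samples t statistic and Cohen's d for matched scores.

    Assumption: d is the mean of the differences divided by the standard
    deviation of the differences. Other definitions of d exist.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")

    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    return {
        "n": n,
        "df": n - 1,
        "mean_diff": mean_diff,
        "t": mean_diff / (sd_diff / math.sqrt(n)),
        "cohens_d": mean_diff / sd_diff,
    }


# Hypothetical rubric scores (1 to 4) for one component, six Scholars.
pre_scores = [2, 2, 1, 3, 2, 2]
post_scores = [3, 3, 3, 4, 3, 4]
summary = paired_summary(pre_scores, post_scores)
```

The t statistic would still need to be compared against a t distribution with `df` degrees of freedom to get a p-value, which a statistics package handles.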
Linked here is the version of the rubric and the scoring sheet in use from 2022-2024.
My reflection on the use of this tool for assessing the program
Once Massa left the position and we worked with a new Assessment officer at LMU, the weakness of a rubric as an evaluation method became apparent to me. The norming process, which ensures that each reviewer understands each component and its measurement in the same way, requires a commitment on the part of the person leading it. The new person we worked with did not challenge the group as strictly as Massa had to come to an agreement on when a component should be scored high or low. Without the group’s agreement on what was truly exemplary (a 4) or was rather developing (a 2), many reviewers defaulted to the accomplished category, a 3. The reviewers used this middle score of 3 throughout their evaluations, and the result was that the statistics did not show any gains from pre- to post-workshop proposals. This was a striking change from the first few years, when there was a clear improvement in mastery in the post-workshop proposals. It points to the need for a consistent and assertive leader in the norming process, to make sure that the rubric is used as intended. This is, in my opinion, a weakness in using this method for an important assessment.
In the first few years of using the rubric, we invited regional librarians to participate in person as evaluators, without requiring that they had conducted original research themselves, only that they were familiar with the process and interested in participating. Some of the evaluators did have experience conducting research, however, which produced an uneven group for the evaluation. This is also a weakness of using a rubric as an evaluation method. Seeking a homogeneous group would have produced a more consistent evaluation. Practically, however, recruiting such a group from within one geographical area is difficult.
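One way to catch this clustering early, before the statistics come back flat, is to look at how the scores are spread across the four rubric levels. The sketch below is my own illustration, not something we ran at the time; the function names and the 60% threshold are assumptions chosen for the example.

```python
from collections import Counter

LEVELS = (1, 2, 3, 4)  # Beginning, Developing, Accomplished, Exemplary


def score_distribution(scores):
    """Return the share of scores that fall at each rubric level."""
    if not scores:
        raise ValueError("no scores to summarize")
    counts = Counter(scores)
    return {level: counts[level] / len(scores) for level in LEVELS}


def dominant_level(scores, threshold=0.6):
    """Return the level holding at least `threshold` of all scores, or None.

    The 0.6 threshold is an arbitrary, hypothetical cutoff.
    """
    shares = score_distribution(scores)
    level, share = max(shares.items(), key=lambda item: item[1])
    return level if share >= threshold else None


# Made-up scores from one reviewer's worksheet.
worksheet = [3, 3, 3, 2, 4, 3, 3, 3]
shares = score_distribution(worksheet)
flagged = dominant_level(worksheet)
```

If one level dominates a reviewer’s worksheets partway through the first day, that is a signal to pause and return to the norming discussion.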
The cost of this assessment tool
For the first few years of this assessment, the person running the rubric process was employed at LMU and we included a percentage of her time (salary and fringe) on the grant. After that person left, we used a portion of the grant funds to pay for the time of the person who ran the process.
In addition to paying for food throughout the day, we paid for parking and an honorarium for each librarian participating in the two-day assessment.
Earlier posts in this series:
Introduction post, Confidence scale, Research networks of the Scholars, External review, Post-workshop survey

