Review Matters

Seeking Your Input on Simplifying Review Criteria

Bruce Reed

February 27, 2020

Over the past several years we have heard consistent concerns about the complexity of review criteria and the administrative load of peer review. CSR shares the concern that the current set of standards has the unintended consequence of dividing reviewer attention among too many questions, thus reducing focus on scientific merit and increasing reviewer burden. Each element was intended to make review better, but we worry that the cumulative whole may in fact distract from the main goal of review — to get input from experts on the scientific and technical merit of the proposed work.

To address these concerns, CSR has convened a working group of our advisory council, charged with recommending changes to research project grant review criteria that will improve review outcomes and reduce reviewer burden. The group is co-chaired by Tonya Palermo and me, and includes some of our council members, other members of the scientific community, and the NIH Review Policy Officer from the Office of Extramural Research.

We would like to hear your thoughts on the issue. How might review criteria be modified to obtain the best evaluations of scientific merit? You can provide feedback directly to me at bruce.reed@nih.gov, to feedback@csr.nih.gov, or to any member of the working group. Before you fire off that email, though, read on.

First, be aware that the current criteria derive from multiple regulations; changes that conform well to them are more feasible than those that don’t. The Code of Federal Regulations (42 C.F.R. Part 52h.8) requires that research project applications be evaluated based on significance, investigators, innovation, approach, and environment. Protections for humans, animals, and the environment, adequacy of inclusion plans, and budget must also be evaluated. The “21st Century Cures” Act (Public Law 114-255) requires attention to rigor and reproducibility and to aspects of clinical trials. That said, there is room for improved implementation.

Second, consider how simplified criteria might also help address some of the issues below:

  • Multiple studies show that reviewer ratings of Approach carry the most (perhaps too much) weight in determining overall impact scores. Yet, aspects of rigor and reproducibility are too often inadequately evaluated. Can better criteria help?
  • Review is often criticized as being risk-averse, as too conservative. If you agree, how might revised criteria help?
  • How can criteria be defined to give the applications of all investigators, regardless of their race, ethnicity, gender, career stage, or setting, fair hearing on a level playing field?

Third, focus on the criteria for R01s. The criteria for training grants (F’s, K’s, T’s) and for SBIR/STTR grants are different. Addressing criteria for R01s would be a great start.

Finally, please be patient. Getting from good ideas to a revised set of criteria is a complex, multi-level process that will include NIH’s Office of Extramural Research, eRA, NIH Institutes and Centers, Office of the General Counsel, and other relevant stakeholders. This is a preliminary effort to get your input on what changes we should think about. Were we to propose regulatory changes, we would ask for additional public input. We are starting a conversation. Share your ideas.

Members of the CSR Advisory Council Working Group

Co-Chairs

Palermo, Tonya M., Ph.D.
Professor, Department of Anesthesiology and Pain Medicine
Principal Investigator and Associate Director, Center for Child Health, Behavior and Development
Seattle Children’s Research Institute

Reed, Bruce, Ph.D.
Deputy Director
Center for Scientific Review
National Institutes of Health

Members

Amero, Sally, Ph.D.
Review Policy Officer
Office of Extramural Research
National Institutes of Health

Corbett, Kevin D., Ph.D.
Associate Professor
Department of Cellular and Molecular Medicine
University of California, San Diego, School of Medicine

Gao, Jinming, Ph.D.
Professor of Oncology, Pharmacology, and Otolaryngology
Co-Leader, Cell Stress and Nanomedicine Program
Simmons Comprehensive Cancer Center
UT Southwestern Medical Center

George, Alfred L., M.D.
Chair, Department of Pharmacology
Director, Center for Pharmacogenomics
Magerstadt Professor of Pharmacology
Northwestern University

Hurd, Yasmin L., Ph.D.
Professor, Department of Psychiatry,
Neuroscience, Pharmacological Sciences
Director, Addiction Institute of Mount Sinai
Icahn School of Medicine at Mount Sinai

Janelsins-Benton, Michelle C., Ph.D.
Associate Professor
Departments of Surgery, Neuroscience, and Radiation Oncology
University of Rochester Medical Center

King-Casas, Brooks, Ph.D.
Associate Professor
Fralin Biomedical Research Institute
Department of Psychology
Virginia Polytechnic Institute and State University

Kroetz, Deanna L., Ph.D.
Professor, Department of Bioengineering and Therapeutic Sciences
Director, Pharmaceutical Sciences and Pharmacogenomics Graduate Program
University of California, San Francisco

López, José A., M.D.
Professor of Medicine, Hematology
Member of Bloodworks NW Research Institute
Adjunct Professor, Biochemistry, Mechanical Engineering, and Pathology
University of Washington, School of Medicine
 
Comments are now closed. If you have thoughts to share with CSR or questions, please email us at feedback@csr.nih.gov.

72 Comments on "Seeking Your Input on Simplifying Review Criteria"

  1. Liliane Windsor says:

    There are some excellent points in these comments. I agree that minimizing reviewer burden is key. Keep the application short and have admin review other sections such as IRB, etc. Innovation is not helpful as a category in and of itself, because if traditional methods are appropriate, the applicant should not be penalized on innovation. That said, I think innovation can be a part of significance and still be considered. Truly innovative applications should be rewarded. I do not like the idea of separating the significance and approach reviews. I think that it is critical to consider how one informs the other.

  2. Mark Gomelsky says:

    I strongly support the suggestion that has been raised in this blog about removing ENVIRONMENT from the scored criteria and moving it into an “Acceptable”/“Unacceptable” category. I am a biomedical scientist at a medium-size research university. Many reviews of my proposals cite the lack of specific expertise at my institution as a Weakness in the ENVIRONMENT section. Indeed, because our university has no medical school, I fill in missing expertise by collaborating with scientists located elsewhere. Lower ENVIRONMENT scores put researchers like myself at a competitive disadvantage compared to our colleagues at large medical schools, irrespective of a proposal’s merits. If NIH is serious about supporting biomedical research throughout the country, as I know it is, it needs to move ENVIRONMENT from the scored criteria to the “Acceptable”/“Unacceptable” category. It is neither fair nor reasonable to use the perceived “richness” of an institution as a predictor of the potential impact of a specific project.

  3. Sean O'Connor, M.D., Indiana University School of Medicine says:

    My perspective derives from 30 years as an applicant, a serial panel member, and a teacher of grant writing to young faculty – all regarding human research.
    Reviewer burden is substantial, dominated by the time required to comprehend the import of elements of a proposal that are not in the specific area of the reviewer’s expertise. It is the responsibility of the applicant to articulate the consequence of choices made and alternatives considered. Call it grantsmanship if you like, but my experience is that scores are inversely correlated with the energy required by the reviewer to understand the applicant’s intent.
    The page limit for communicating scientific merit should NOT be increased beyond the current 13. Judgement of merit should NOT require reading anything but the 13 pages. The trend towards moving substantive information (e.g. recruitment criteria, analytical power, resources, etc.) to peripheral sections should be discouraged.
    All five current review domains (significance, investigators, innovation, approach, and environment) are necessary for adequate assessment of scientific merit. But these are insufficient. A sixth domain, rigor and reproducibility, should be included – perhaps as a sub-domain of the approach – but ONLY if an explicit definition of that domain is provided both for reviewers and applicants. Current perceptions of that domain vary so substantially across reviewers that its contribution to scientific merit is lost.
    I recommend consideration of a change in review process: a 2-stage review by the same three reviewers. First, send out ONLY the 13-page core of the application and have the assigned reviewers submit scores for each domain based on this information alone, due 2 weeks before the panel meeting. Any reviewer who misses this deadline should be disqualified from subsequent service. Second, have the SRA determine the fraction of grants to be further reviewed, based on the mean AND standard deviation (discounting outliers); send the whole grant to the same reviewers – one week before the panel meeting – for consideration of peripheral issues.

  4. Anonymous says:

    Two-stage review process. Stage 1 uses “blinded” review (i.e., the identity of the applicant(s) is not revealed to reviewers) and focuses on specific aims, background & significance, and approach (no innovation category – redundant/useless). Top-scoring proposals then move forward to Stage 2, in which reviewers are unblinded and scores can be re-adjusted based on the applicant(s)’ bio, facilities & resources, and other pertinent info. Important: eliminate the “investigator” category (a highly subjective element). Stage 3 involves discussion and final scoring of top-scoring proposals. Stages 1 and 2: online; Stage 3: online conference (no more physical meetings, please). Certain aspects of the application, e.g., special budgetary considerations, biohazards, etc., should be evaluated by NIH administrators.

  5. EMILY A KESHNER says:

    I like the idea of a two-stage process where Approach is reviewed separately from Significance and Innovation. This would help the reviewers focus on shaping scientific direction and allow more innovative (and interdisciplinary) ideas to be considered even if reviewers were unfamiliar with the methods. The first stage would review the aims and the background, and the second stage would focus on the approach. Environment and Investigators could be ranked as acceptable/not acceptable (with an explanation) in the first stage and influence whether the grant should move forward. A bulleted summary of 2-3 major strengths and weaknesses by each reviewer might help objectify judgements on the quality of the proposal.

  6. Jeffrey Kieft says:

    A couple other minor points:

    1. I find the current format for the biosketch to be not very useful. I almost always PubMed-search the investigator to get a feel for overall productivity, collaborative papers, etc. In the current biosketch format, papers can be listed twice, and the descriptions of scientific contributions are often too long, tedious to read, and overly self-promoting. A personal statement is OK (although it needs a word limit!), but after that, I just want a list of publications.
    2. Letters of support should be in a standard format (like the NSF) and 2-3 sentences. “I agree to support this work” is all I really need to see.
    3. Do the equipment and facilities sections really ever affect a score? Perhaps they enter into Environment, but I suspect most reviewers skim them (at best). I have never heard them discussed at study section. I get why they are there, but are they used?

  7. Jeffrey Kieft says:

    I find the Innovation section of proposals to be generally not helpful – it takes up space and often constitutes a convoluted attempt by the PI to come up with some way the proposed work is “innovative.” Reviewers often have trouble with this criterion and fall back on making this a score based solely on whether the techniques used are new or are established – which should not matter as long as they are appropriate.

    In my mind, if a proposal is addressing an unexplored and important problem (and thus is Significant) then it is Innovative because it has identified and is addressing a key unknown. If the techniques are appropriate (new or old), then the Approach is solid. If there is creativity in the proposal, that speaks well to the Investigator. Innovation can be rolled into these.

  8. Kent Thornburg says:

    As someone who has reviewed for study sections over many years, I agree with others on the following points.
    Reviewers should not have to comment on environment. Human subjects and vertebrate animal sections should be left to trained CSR administrators.
    I like the Significance section, which should be used to articulate innovation, potential importance, and impact. No need for the Innovation section.
    Premise should be replaced with rationale as a subsection of Approach. Rigor should be a subsection of Approach.
    Voting outside the range by more than one point should not be allowed. I have seen reviewers give a 9 to applications where the recommended range was 1-2. This allows a veto of funding by a single individual.
    The application should remain at 12 pages. I remember the onerous task (and poorly written applications) of reviewing 25 pages.
    I do not agree with a 2-tier review process.

  9. David S. Carrell says:

    I agree with several others that innovation is currently given too much weight, unnecessarily inducing innovation for the sake of innovation. For many research projects traditional methods are more than adequate, and in some cases traditional methods are ideal. Such projects should not be penalized for lack of innovation.

  10. Arturo Hernandez says:

    I think that the current system should be amended so that reviewers continue to evaluate applications as they are evaluated now. But rather than having an absolute numerical score, the task should be to identify the top 3rd, middle 3rd, and lower 3rd of applications. The top 3rd will then be awarded based on a lottery system. This ameliorates three problems:

    1) Humans are not computers and cannot provide exact numerical scores for all these different criteria.

    2) The system right now does not necessarily reward the projects with greatest impact. Some applications that fall below the line for funding lead to higher impact research.

    3) It reduces the feeling that applicants are somehow being mistreated or are the victims of some form of bias. The current system assumes that reviewers can tell the difference between an application at the 12th percentile and one at the 21st percentile. If the funding cut line is at the 15th percentile, the current system is allowing very small differences to determine success. A lottery would actually make it random and thus relieve everyone from feeling that there is some nefarious process going on during review.

    Finally, I think that some percentage (10 or so) of the middle third should be funded, again via lottery. NIH and other funding agencies wonder how to introduce innovation into the system. This is the way. Creating randomness is likely to lead to more innovation and a bit more risk-taking from applicants. They will no longer feel they are going to take a hit for proposing something that is more innovative.

  11. Rachael S says:

    All of the yes/no criteria (human subjects research, etc.) and peripherals such as resource sharing plans and VAS should be evaluated by an administrative panel, not by the primary scientists doing the reviewing. Any possible concerns could be flagged prior to study section for scientific review. Investigator/Environment should be combined and scored as an “adequate” or “not adequate” criterion. Otherwise, I’m very happy with Significance/Innovation/Approach, and I see no problem with the fact that Approach so often drives the scores. I disagree that the first paragraph should be taken out, as I agree that it provides an important narrative. I like the idea of giving a rubric (say, a major vs. minor checkbox, or a score) to each bullet point.

  12. Nora Engel says:

    1. Rigor, reproducibility, adherence to animal and human subject safety guidelines, inclusion and sharing plans should all be addressed by an administrative body separate from the committee that evaluates SIIAE and budget. Concerns could be flagged and forwarded to the study section for further discussion.

    2. The Innovation section is too subjective and should be eliminated.
    3. I agree with Dr. Fleiszig that Significance should be judged separately from Approach, but without revealing the Investigator name or Institution, by different members of the same study section so that they could arrive at some consensus. Only after that, reveal the Investigator name and Institution. This reduces bias of all sorts, including against mid-career investigators (who are the ones most suffering lapses in funding right now), against women and minorities, etc.

  13. Elisabeth Smela says:

    Perhaps a small fraction of awards could be made by lottery, with due diligence (IRB, for example) but without the usual review. It would be interesting to compare impact and outcomes between research funded by the existing system and this group. In particular, would we find that the current system produces more incremental work? Would we find that a lottery results in a more equitable distribution of funding? Would we find that it made no difference at all how proposals were chosen?

    I agree that review is often risk-averse. I also agree that emphasis on approach leads to fault-finding and “this will never work” rejections, which are sometimes made from ignorance. There is an element of micromanaging by reviewers that can sap creativity, kill alternative ideas, and rob researchers of agency. Why do we assume that researchers cannot adapt or entirely change their approach if necessary? More emphasis on “this would be awesome if it would succeed” might mitigate this problem.

  14. Warren Kruger says:

    Innovation is the least useful section. It leads to people thinking more about the newest, hottest technique as opposed to focusing on answering questions in an intelligent way. If the work is innovative, that will be reflected in the significance and/or approach. In my opinion too many reviewers focus on approach, and not enough emphasis is given to significance. Things like Investigator and Environment seem to be more like a pass/fail than a real determinant of score.

  15. Jeffrey Petruska says:

    As a reviewer, I think the current system is fairly good. A lot of minutia has crept into the effort that can be taken off of the reviewer’s plate and addressed in JIT. How seriously applicants take the minutia varies wildly, but the reviewers are required to cover all of it. Putting many of these items into JIT can increase how seriously the applicants/institutions take those items as it can delay their check.
    It is very clear that Approach is the greatest driver of scoring. I have no problem with this, and think that parsing it out further would offer little benefit. It cannot stand alone, and the value of the Approach is naturally factored (amplified or diluted) according to Significance and Innovation. I will also agree with the many comments regarding Environment and Investigator essentially being treated as secondary criteria. They factor into discussions and scoring only as binary functions – either generating some serious concern (very rare) or as acceptable (most common). In some cases they can play into considerations of Approach, which may or may not be appropriate. Since they are legislatively necessary, they can’t go away, but also can be treated by reviewers with a wide range of fastidiousness.
    The tendency for applications to drift into providing significant experimental detail outside of the 12 pages is a point of internal debate for me. On one hand, it increases the sheer number of pages that must be read, and the sites where potentially-vital information resides. On the other hand, separation of some of the important-but-dry information from the main body can allow a very clear reading of the story in which the Significance and Approach emerge cleanly.
    If I were not worried about the possible unintended consequences of shifting some burden back to the applicant institutions (I’m not a fan of this sort of answer in principle), I would suggest that applications be required to carry certifications from institutional internal review processes. It could simply be to cover some of the oft-repeated concerns – power analyses, effort/budget, statistical plan, etc. The question is how to give the policy some teeth so that it does not simply become a meaningless check box. The real problem is that all of the measures I conceive lead to bad outcomes for the individual applicants (unintended consequences). The approach would generally be to require the certification, but if the certified items turn out to be flawed, it would indicate the certification was not taken seriously or the plan/environment was flawed.
    Finally, in my estimation, the factor that drives reviewer effort to the greatest degree is the quality of grantsmanship. Poorly-written grants take a lot of effort. Well-written grants take far less. There is little that can be done on this end to improve that factor, save perhaps additional specific guidance on the required sections.

  16. David Mankoff says:

    I think that Innovation is under-emphasized in most current reviews in favor of an Approach where the ideal is that proposed methods are well tested and supported by fairly complete preliminary data, almost to the point of having carried out a substantial component of the proposed research. Review guidance to weight innovation more heavily in the overall score, and an advisory that not all aspects of the Approach for a 4- or 5-year R01 need to be fully worked out at the time of grant submission, might help restore an emphasis on new and innovative science.

  17. Xian-Jie Yang says:

    I agree that rigor, reproducibility, animal and human subject regulation, inclusion of specific subjects and categories, sharing plans, etc. should all be addressed by an administrative body. If they have questions, these can be sent to the study section for discussion. “Environment” is also an item that has never been an important factor in ranking applications. In contrast, evaluating whether the “investigator” has the expertise and capability to carry out the proposed study is important, even though this evaluation can be subjective at times and influenced by the reputation or “fame” of an investigator.

    There is no need to return to the 25-page R01 grant application, which increases the burden on reviewers, especially when the application is poorly or densely written. If a proposal can’t sufficiently describe and explain the research in 13 pages, a longer format won’t help.

    Ratings on Significance and Approach tend to drive the score and should be the main focus for the reviewers. Scoring significance and impact can be tricky because of the current tendency to emphasize translational research. Some good proposals can suffer lower scores because of reviewers’ nitpicking on perceived or real flaws. This is especially the case when Reviewer #1 fails to understand or fully appreciate the proposal and does a poor job presenting it to the whole panel.

    One way to eliminate reviewer bias and reduce reviewer burden is to have a two-step review process. First, evaluate the scientific merit of proposals only, without revealing the names and institutions of the applicants to reviewers. This step could involve more reviewers but use a simple checkbox scoring format. Second, recalibrate the top tier of grants (30% if the funding rate is 15%) by revealing the applicants’ information and engaging in more in-depth discussions. This would reduce existing personal bias in study sections and lessen the reviewer burden.

  18. Stephanie Eisenbarth says:

    In agreement with many of the comments above, the innovation section often seems arbitrary and not so useful in the review. Can we just eliminate it? Or, alternatively, incorporate it as part of the Significance section.

  19. Mike Summers says:

    Study Section reports written in the ’80s, ’90s, and early 2000s contained extensive discussions and included a summary statement that generally did a good job of summarizing the discussion. Grant reviews today often comprise little more than three or four sentences for each of the several bulleted review topics. Peer review is not perfect, but it is the best system available for evaluating merit. I would prefer to see peer review strengthened by involving more reviewers, each with fewer reports to write, but with reports that contain more substantive insights into the critiques and discussion.

  20. Anonymous1 says:

    Thank you for the opportunity to provide input. Certainly, this is a difficult challenge.
    First, we need a better knowledge-management approach built into PubMed that includes ratings for evidence-level quality, rigor, and innovation; can be filtered by a standard set of study-design types, the translational science spectrum categories, and NIH funding and type; and somehow synthesizes data to identify where we know enough and where there are knowledge gaps. Currently, the approach to identifying innovation, gaps, and initiatives is not particularly systematic or unbiased.
    Second, NIH might recommend standard criteria for each of the translational science categories, and/or for study designs, that reflect quality checklists, i.e., a grading rubric for the Approach.
    Third, innovation and significance need to undergo a blinded review. Scoring can be biased by pedigree, past NIH funding, who knows whom, and grantsmanship.
    Fourth, create a calculated score from the subscores, and overweight innovation in the total score calculation to address the conservative approach to reviews.
    Finally, to address the level-playing-field issue, I would suggest reviewing the biosketches, preliminary work, etc. under separate cover and blinding the review of the significance, innovation, approach, etc.

    • Sarah Gaffen says:

      For most grants it is hard to see how they could be anonymized – most projects are built on what the PI has previously done, and a legitimate part of the review criteria is whether the PI in question can actually accomplish the work.

  21. Terje Dokland says:

    While it is true that evaluating applications is a considerable burden, I don’t think it is writing the review that is burdensome, but reading the applications. Maybe a more standardized format for applications that clearly states where to put background/rationale, preliminary data, approach and data analysis would streamline the review process.

    I would hesitate to shorten the review itself, and fear that you would end up with a process more similar to the NSF’s, which is great for the reviewers but much less helpful to the applicants. Personally I think the overall impact paragraph is good and I like the strengths/weaknesses format.

    With regards to the criteria, it may make sense to split the “approach” part into separate sub-criteria, such as “rigor of prior research” or “foundation” (AKA “premise” in oldspeak), “feasibility”, and “methodology”. I think it would also make sense to consider “innovation” as part of approach rather than its own criterion. It is typically difficult to evaluate and tends to play little role in the overall scoring.

  22. James Gern says:

    I agree with some of the other commenters that there is little value in the scores for Environment, and the Innovation criterion could probably just be folded into Approach. The Approach and Significance sections are the most important, but additional instructions that differentiate these concepts would be helpful to most reviewers.

    I am not sure about whether the Investigator criterion should be kept or eliminated. It doesn’t enter into the scoring very much, but sometimes valid concerns about productivity, training or prior scientific achievement are raised. I would like to see some research as to whether this criterion holds back underrepresented investigators. If so, consider eliminating it.

    I definitely don’t agree that grant applications should be increased back to 25 pages, as in the past. This led to inclusion of needless detail (and a lot of wishful thinking), increased the burden on the reviewers, and tended to obscure the main message of the application. If anything, consider shortening further. Most of the scoreable information is present in the abstract alone.

    Finally, to reduce burden on reviewers, consider discussing a smaller percentage of grants (maybe 40%). These days, if 15% of grants will be funded, there is not much point in reviewing half.

  23. Nick Crispe says:

    I’ve served twice as a regular study section member, once before the current criteria and once while they were in force. I think the current criteria make sense and see no reason to change them. Like at least one other commentator, I’m at a loss for how to evaluate significance. If the criterion were renamed “Significance of the problem addressed,” that would help, and it would make clear that the significance of the project as proposed derives from distinct criteria. It must follow that an average proposal addressing a problem of outstanding significance still gets an average score, but significance of the problem could legitimately come into play for proposals along the borderline.

    The problem of conservatism will not be addressed by changing the criteria, and it has clearly not been addressed by special categories of grants supposedly for non-mainstream ideas. If you read the abstracts of the Transformational awards that end up funded, they just read like any other R01. My suggestion is that Study Sections be given a special category of score, intended to be used less than once per review session, for a grant that's genuinely off-the-wall (code OTW). These might go to a distinct pathway of secondary review that is obligated to spend set-aside funds on a subset of these eccentric grants. To avoid conservative bias at this stage, allocate funds to OTW proposals at random. Accept that most OTW proposals will get nowhere, but now and then one will. That's the genuine innovation that you can't specify, ask for, or produce to order.

  24. Greg Carter says:

    Regarding risk aversion, I think the “count the weaknesses” approach in current review guidelines overly weighs risk and under-emphasizes reward. It probably also brings Approach to the forefront, since that’s the area in which we can most confidently identify weaknesses. I have recently served on a study section (non-R01) in which we were instructed to start every score with a 5 and adjust up and down accordingly, which focuses reviewers on positive score drivers as well as negative ones.

  25. Mukesh Nyati says:

    I suggest that reviews have three steps involving three different expert groups.

    Step 1: This should be a blinded review to assess the novelty and significance of the work. I think at least 50% of the score should come from this step. This should be done by experts in the field.

    Step 2: This step is an open review where the record of the team/PI, the technical aspects, the approach of the work, etc. would be judged; it could have a 30% impact on the final score. One additional thought I have on the approach section is that it doesn’t have to be innovative or novel; it just needs to be sound enough to address the question posed in the application. Many reviewers fault the approach section for not being innovative. This review can involve statisticians and other technical experts.

    Step 3: This part of the review could involve regulatory experts to judge all the other important aspects of an application that most scientific reviewers either ignore or are not expert in. This review can have about a 20% impact on the final score.

  26. Steve Schwartz, Fred Hutchinson Cancer Research Center says:

    I am surprised by the high proportion of commenters who believe that the “Approach” section receives too much weight. Ultimately, is not the essence of successful science the application of rigorous methods? Indeed, the “Rigor and Reproducibility” criteria should be part of the “Approach.” That these terms were added later as review criteria mostly spoke to concerns that some (presumably large) percentage of reviewers were not paying attention to the details of the “Approach” section of the applications. That is not to say that how “Approach” is evaluated and discussed couldn’t use some substantial improvement. When I serve on a review panel and have not been assigned to review an application, I try to read the written critiques specifically for their comments about the approach so that I can see if I have any useful comments to add during the meeting. In the past I’ve been able to spot serious threats to validity that were missed by others, or occasionally identify misplaced concerns about the validity of particular methods. Unfortunately I rarely have time to do this for all of the applications that are discussed. The 2-step approach that was used for the ARRA applications helped alleviate some of this problem.

  27. Patricia C. Heyn, PhD, FGSA, FACRM says:

    My suggestion is to create a pre-peer-review checklist tool with which the investigators involved in a grant submission would pre-review their application, using a checklist that includes all the criteria for their application (similar in criteria and format to what we use in the study sections). As part of grant submission, the team would complete a pre-peer-review involving all investigators on the proposal, including one external, non-author/participating investigator, using the Investigator Pre-Review Checklist tool.

    The Investigator Initiated Pre-Review Checklist Tool will accomplish the following:
    1) The grant proposal research team will have a chance to critically review their application and make additional improvements before submitting for funding consideration.
    2) It will increase the research team participation and transparency related to their contribution and scientific capacity with the proposal.
    3) The study section panel members will have the checklist tool as a guide to confirm eligibility as well as scientific merit consideration.
    4) It will increase efficiency during the panel meetings since reviewers will spend less time reviewing the usual “missing/incomplete” technical content, or lack of attention to details, that many of the submissions have.
    5) It will also improve scientific advancements as the original investigators will need to critically review their own scientific work and make the recommendations and needed changes before submitting to the funding agency.

    Also, many international funding agencies request that the background/introduction section be organized as a systematic review/meta-analysis evidence synthesis; this important section has great influence in the application, showing that it reflects the most rigorous and current methodological science.

    My 2 cents !

  28. Sheila Collins says:

    I strongly disagree with the recommendation above that “One way to reduce Reviewer burden would be to eliminate that paragraph in favor of some sort of ranked-criterion Review approach.” The overall impact paragraph is a very important and valuable way to provide feedback, including sometimes nuanced points of view and perspective. As someone still serving on a regular study section, I do not find the review criteria and process overly burdensome. The real time-sink is reading the proposal and looking up cited literature or other aspects in order to be certain you as a reviewer have your facts and your impressions straight.

    Another recommendation above is to lengthen the page limit beyond 12. I think this is a terrible idea, as I remember the days when grants were 25 pages – and it was even more onerous and time-consuming to read them. A well-written grant can convey ideas and their significance within the 12 pages as they are now.

  29. Alicia McDonough says:

    I appreciate the opportunity to comment – this is a good idea. I am doing my third 5-year deployment since the 1990s (just received my $200.00 honorarium for the Feb panel, yippee!). I score the current process very good to excellent. I find all the categories are important for one application or another. Understanding the criteria/categories on which we need to comment before I read through is very helpful. Further shortening or simplifying won’t save us the time needed to understand what we are reading, check the literature, etc.

    My suggestion: Rigor and Reproducibility and SABV and power calculations are all very important and some applicants “miss the boat.” I like it when the proposal clearly discusses these in a section with a header. It becomes “boilerplate,” I know, but it’s hard to dissect the proposal to find these sometimes. That’s my only suggestion.

  30. Sanford Bernstein says:

    I suggest that the order of the application and the review document be changed to make them more logical and to have the two match. The main body of the text (specific aims +12 pages) should follow the abstract page. Following this, the investigator should write one paragraph as to why they and their team are capable of doing the project. Then they can write one paragraph as to why they have the appropriate facilities. The biosketch/budget/detailed facilities/animal and human subjects/resource sharing/clinical trial, etc should all be at the back of the application, essentially as appendices. The scored categories on the review sheet should be in the order: significance/innovation/approach/investigator/environment. If rigor and reproducibility are considered critical, then a separate section in the application should address this and it should be scored separately. The reorganization will save reviewer time in organizing the review and put the less critical material at the back of the application for reference as needed.

  31. Karen Pierce, Ph.D. says:

    My comment relates specifically to Reviewer time burden. Although the overall impact summary paragraph is a great way to integrate all of the information presented in each of the scored sections, it takes considerable time to write. One way to reduce Reviewer burden would be to eliminate that paragraph in favor of some sort of ranked-criterion Review approach. For example, the importance level of each of the critique comments could be rated on a scale of 1-5 to give Program Staff a clear picture of what strengths or flaws/issues were most important to the Reviewer. The end result of this approach should be similar to what would have been written in the overall impact summary paragraph but much less time-consuming for the Reviewer (and potentially more useful for Program Staff, who have to decide the importance and weight of each comment).

  32. SMZ says:

    Here are my suggestions based on my experience as an applicant and reviewer.

    1) I disagree with the two-step review process suggested here, which would make the cycle of getting funding longer. It is especially impractical for writing a renewal R01. It would create more stress and workload for applicants, reviewers, and NIH. I remember NSF used this method a few years ago and recently abandoned it.
    2) I agree with the idea of removing two criteria (investigators and environment) because they are highly subjective. Alternatively, the two criteria could be combined into one, reducing the overall impact of these highly subjective factors on the score.
    3) The logical order of subsections in the proposal (Strategy) should be Significance, Approach, and Innovation. Without reading the Approach, it is hard to know the innovation of the project.

  33. guido silvestri says:

    I would suggest that the rigor, reproducibility, adherence to animal and human subject regulation, inclusion of specific subjects and categories, and sharing plans should all be addressed by an administrative body separate from the committee that evaluates scientific impact and budget. Potential concerns could be flagged and forwarded to the study section for further discussion.

  34. Kevin M. King, University of Washington says:

    1. It would be helpful to find a way to enforce some kind of review standards across committees, if there are concrete principles that are broadly agreed upon. For example, reviewers frequently request or comment on statistical significance tests from pilot data for R01s when that is known to be inappropriate. Investigators vary widely in how directly their power analysis actually reflects the statistical analysis (and aims) that they propose, and power frequently focuses on a single dimension of the study rather than informing all aims (which is inappropriate).

    2. Innovation should be strongly downweighted, or the criteria should be changed to reflect the nature of a study’s potential contribution. Innovation is far too over-valued, especially in the context of bodies of prior research that may have been selectively reported, p-hacked, HARKed, or simply weak. Studies that aim to replicate prior work, especially when that prior work may be questionable, should be much more strongly valued.

    3. It would help to have more guidance on the Environment criterion. Its scores often have a bimodal distribution.

    4. Rigor and reproducibility criteria should be much more clearly articulated and more central to the review, and should be connected to specific behaviors that will occur in the course of a study. Reviewers should be commenting on how PIs are instituting procedures to ensure that others would be able to directly reproduce analyses, share data, and pre-register study aims, analyses, and procedures (or pre-register the process for coming to those decisions) to avoid selective reporting, HARKing, p-hacking, and other behaviors that undermine the quality of information produced from grants.

  35. Emilia Bagiella says:

    As a member of a study section that reviews multicenter clinical trials, I find that the amount of material currently assembled by applicants requires hours and hours of reading. Investigators use the 12-page limit for background and significance, while study design, inclusion criteria, statistics, data management, and study conduct are referenced and found in an increasing number of attached documents (study protocol, statistical analysis plan, human subjects, management, resources, etc.). Reading all this information for the assigned applications requires a huge amount of time, which often is not given to the reviewers. It is also very difficult to find the supporting material for the review criteria across the different documents.
    The innovation section is often uninformative and carries almost no weight in the overall score. There is general confusion about the difference between significance and overall impact, and these, in my experience, are interpreted differently across study sections; therefore reviewing for different study sections becomes difficult. The environment section is also often not informative and rarely moves the overall score.

  36. Patrick Griffith says:

    The 2-committee process has merit.
    The 1st subcommittee should rate proposals primarily on unique targets and written specific aims.
    The composition of the subcommittee would depend on the announced RFA.
    The most creative should receive a score of 1.0 up to 2.5 – no limit on the number of applicants getting such scores.
    The 2nd committee would then review ALL aspects of the submissions and fund only those with scores of 1.0 up to 2.0.

  37. Paul Brookes says:

    A suggestion I made to CSR many years ago (as part of a different RFI) was to change the order in which proposal materials are presented. Although the research proposal component of the R01 is only 12 pages, as a reviewer one has to wade through 50+ pages of biosketches, budgets, and admin before actually getting to any science! It is inevitable that unconscious bias creeps in during such a linear read-through. As such, reviewers may have made up their minds before they even get to the science part. If the objective is to score based on scientific merit, then the science part needs to come first in the document, followed by the other material. This would be incredibly simple to implement – just change the NIH submission software that compiles the final PDF so it puts the research part first.

    An evolution of this idea is as follows… As part of the electronic review process, make ONLY the 12 page research proposal available to reviewers (with PI name and institute redacted). Let them assess the proposal and score it for scientific merit. Then, ONLY after they have submitted the merit score (some combination of approach, significance and innovation), would the other parts of the proposal be made available for review, to allow scoring of investigator and environment criteria.

    Given that scores already have to be entered well before study section, CSR could give reviewers a break here… science scores on part 1 due a week before the panel, and maybe part 2 scores due a couple of days later. Preliminary scores could be based solely on part 1 (remembering that the objective is to judge proposals on scientific merit).

    • Margaret Woland Sullivan, Ph.D., Professor Emeritus, Behavioral Health and Nursing Science, Rutgers University says:

      I like the suggestion of blind review for significance and innovation as an initial step in the review process. I believe this is the only way to offer a level playing field for R01s among investigators with varying years of experience, and it would allow more rising stars to be competitive. Many of the comments suggest that environment and investigator are lesser-weighted criteria, and I agree. However, I have seen some potentially exciting work sunk by a less stellar “pedigree.” Having these separately evaluated, and then recalibrating the scores among the top-ranking grants using the criterion of whether the individual is competent to carry out the work in his/her environment, would help.

  38. Arturo Hernandez says:

    I would propose creating a lottery system for the top 30% of applications. The data pretty clearly show that reviewers are able to identify the most meritorious applications. However, going from a score in the 33 range to the 12 range is much more difficult and subject to bias or to looking for conservative applications. I would also suggest that a smaller portion of applications in the 30-45% range be randomly given funding.

    The current system is creating unpleasant feelings amongst applicants. Everyone feels that reviewers or program people or some other aspect of the “system” is biased against them. Why not leave it to chance? It is much more gratifying to know that actually getting below the cut line will happen at some point.

    The current system assumes that humans can precisely scale their judgements at the upper end. It’s a flawed assumption, and using a random system with some filtering will help to alleviate some of that pressure.

    Furthermore, this will allow reviewers NOT to feel compelled to add additional possible constructive criticism. Right now reviewers feel that an application has to be perfect in order to score below the 10-15th percentile.

  39. Jeff Bulte - Johns Hopkins University says:

    The scoring system is flawed, and funding often becomes arbitrary. Many reviewers do not like to spread scores too much: they are afraid to bark up the wrong tree, and they want to be on the safe side with the other two reviewers rather than be an outlier, so most scores range between 3 and 5 and can then be adjusted at the discussion (they tend to merge in the middle). For example, in the same study section, a score of 38 became the 16th percentile and a 40 became the 23rd percentile; both had 3s and 4s from the reviewers. A score of 38 or 40 is totally arbitrary with the whole panel voting (many may not have read the grant in detail), yet such scores fall just below or above the funding cut-off. I think we should spend more time on, or have a second look at, scores between 3 and 4. Also, there is a lot of BS with “innovation”: if you just published a great idea in Nature and build the grant on that, it is not innovative anymore because it has been published. But you need those preliminary data for an R01, otherwise the grant is not supported. There should be a more active step-in role for the SRO at the discussion.

  40. no says:

    I was a standing member of a study section and currently hold multiple R01 grants, some of which were submitted several times before the award, so I have some experience with the NIH review process. My comments are below.
    1) The Innovation criterion can be deleted from the regular R01 review criteria, in my view. When I write R01 grants, I typically find writing the Innovation section challenging. If the grant contains technical innovation, it is easier, but if the innovation is conceptual, it can easily fall under Significance. When I look at my scores, the significance and innovation scores are often identical. And NIH now has a different mechanism that focuses on technically innovative grants. R01 review can easily do without Innovation.
    2) As a reviewer, I was asked to also comment on the potential impact of an application in scoring. I believe Impact and Significance can be merged into one.
    3) For rigor and reproducibility, I suggest that we make a category outside of the 12+1 page limit. Our research is mainly based on mice and primary cultures from rats, so we end up providing a detailed experimental plan under the Vertebrate Animals section. In it, we provide power analyses for each Aim and sub-Aim, which requires showing very detailed experimental plans and settings, such as how many sets of negative and positive controls, replicate numbers, and so forth. Research based on cell lines and in vitro systems could benefit from such additions as well, without having to justify rigor and reproducibility within the 12-page limit.
    4) On reviews being too conservative: I agree that they have been. It is well acknowledged that you have to present almost all of what you propose to do as preliminary data in order to get the award, so feasibility is taking up too much weight in the Approach section, in my view. Perhaps we could make a separate category that combines a Risk vs. Impact assessment and would incorporate Investigator. If one has produced consistently impactful, significant papers over the years, the risk will not be as high compared to those who propose high-risk applications without a proven history. We have a separate Innovation mechanism for that at NIH.

  41. Ernesto Marques says:

    The criteria required by law (significance, investigators, innovation, approach, and environment, plus evaluation of protections for humans, animals, and the environment, adequacy of inclusion plans, and budget) are very good. The main issue, as pointed out, is that reviewers often have very little time to read and understand the proposal. So in general, simpler projects have an advantage because they are easier for the primary reviewers to understand and present to the group. Excellent but very complex projects often get good scores, but not good enough to be funded, and are often misunderstood.

    I think the process could work better if proposals were evaluated in steps, interactively, using much simpler documents to describe them. Perhaps one step could cover significance, investigators, innovation, approach, and environment in a very simplified version of the current document, and in a second step the selected projects would undergo a more detailed analysis of the approach.

    Multiple studies show that reviewer ratings of Approach carry the most (perhaps too much) weight in determining overall impact scores. Yet, aspects of rigor and reproducibility are too often inadequately evaluated. Can better criteria help?

    I think that the current criteria are very good but need to be examined with more care and in more depth. However, the process needs to be more productive. The current system demands that an enormous amount of information be submitted together, demanding a lot of effort from the applicant and at the same time resulting in poor reviews. A lot of effort from the applicant, NIH, and reviewers is wasted. The current triaging system reduces the amount of time spent debating applications considered less worthy, but the triage is based on a superficial evaluation of the preliminary scores.

    Review is often criticized as being risk-averse, as too conservative. If you agree, how might revised criteria help?

    That is correct as well. I think it is because more complex ideas are more difficult for the reviewer to understand and to defend to the group. Simpler, dumbed-down projects whose concepts anyone can pick up in 1 or 2 minutes have a huge advantage.

    How can criteria be defined to give the applications of all investigators, regardless of their race, ethnicity, gender, career stage, or setting, fair hearing on a level playing field?

    I think a multi-step evaluation would work better for all. One step could be done with a much simplified version of the current document, focused more on significance, investigators, innovation, and environment, with only a preliminary overview of the approach. Once past this stage, the approach would be evaluated in much greater depth, together with protections for humans, animals, and the environment, adequacy of inclusion plans, and budget. At this point a project could be resubmitted once after the review.

  42. Steven Feldman says:

    Given the specifications in the regulations, the current system seems entirely reasonable, addressing specifically each point the regulations call for. I have the sense from serving on study sections that the current system allows reviewers– within the constraints of the review criteria– to meet the specifications of the regulations while providing a good overall sense of the quality and value of the grant applications.

  43. Nanette Bishopric, Georgetown University says:

    As a 10+ year member of standing study sections, I think the rigor, reproducibility, adherence to animal and human subject safety guidelines, inclusion and sharing plans should all be addressed by an administrative body separate from the committee that evaluates SIIAE and budget. Concerns could be flagged and forwarded to the study section for further discussion.

    Approach is unquestionably the dominant driver in critiques, and that may be appropriate, since this is where most “fatal flaws” lie. But I believe the scoring of Approach is so dominant that it is often reflected back into the Significance score – as in, this can’t work, so the proposal (rather than the problem) isn’t “significant.” This can mean the difference between getting helpful critiques for revision and going unscored. In some cases I’ve seen grants scored poorly not because the methods were inappropriate, but because a reviewer preferred different ones, without explaining why they were better. On the other hand, I’ve seen reviewers so infatuated with the Approach that they overlook the incremental nature of the output.

    I agree with Dr. Fleiszig that Significance and Innovation should be judged separately from Approach, but by different members of the same study section so that they could arrive at some consensus.

  44. Elise Weerts says:

    The page limitations, under which the entire application fits in 12 pages, require applicants to make choices about what to focus on, and applications are getting very compact. Having designated form pages for key elements would ensure adequate space is allocated and could include guidance on required elements as part of submissions. At first glance, the clinical record for human research that meets the criteria for a clinical trial seemed like a great way to ensure this. BUT the unclear instructions regarding non-duplication in the application, versus requirements that clinical trial materials also appear in the body of the approach, make reviewing very difficult and have added to the number of pages to review (increasing reviewer burden). Instead, reduce duplication by having designated pages for the following sections for ALL applications (not just clinical trials, but all research projects involving animal and human studies): 1) a drop-down box for primary study purpose (basic, treatment, health service, etc.), 2) tables of outcomes or measures, 3) statistical design and power, 4) the overall structure of the study team, and 5) the study timeline. Have clear instructions that this information should be in the designated sections to reduce duplication, and allow applicants sufficient space for significance and innovation. The approach is how reviewers determine reproducibility, so the fact that it is a driving score is not surprising. The terms used in the instructions and review criteria can often be confusing to applicants. For example, there has been much discussion of what exactly “rigor of prior research” means; this phrase does no better than “scientific premise” for new grant writers. I usually tell young investigators to think of this as the supporting evidence/rationale for the research, and the model/hypothesis (if hypothesis-driven). Finally, the instructions/terms are currently slanted towards promoting less risky research. If a study is truly cutting-edge or groundbreaking, it will be unclear whether it will be reproducible, as the proposed study will be the first step.

  45. Chunyu Liu says:

    I think the investigator and environment evaluations can mostly be skipped. First, those evaluations are quite subjective. Second, I rarely see applications from investigators who are not qualified to submit or to do the research, or from institutions that lack a working environment. Except for a few extremely good ones, the majority are similar, and giving a score of 2-4 does not differentiate them at all.
    So I recommend removing them from the 1-10 score matrix. We should put more time and effort into evaluating the science, the study design, and innovation.

    Significance is also a tricky one. Given that NIH emphasizes clinical, translational, and basic research, it is rare and difficult to conclude objectively that a study is not significant or not important; we can always argue for significance. Criticizing another’s proposal as not significant is typically biased by personal opinion.

    In short, I think investigator, environment, and significance can be removed from the scoring matrix. It is OK to have them as Yes/No, with the text justification optional. That would save me time when writing critiques, leaving more time to evaluate study design and innovation.

  46. Bijan Najafi, PhD says:

    Based on my experience on panels and on review critiques received for my own applications, it seems clinical studies are often hammered for minor weaknesses in statistical analysis, the lack of a biostatistician on the team (even when the analytical plan is fair and the investigators have published prior studies with high rigor), and other minor issues in the approach, even when the proposed study has high significance, is highly innovative, and has a strong investigative team relevant to its scope. I feel this can make clinical studies less competitive than basic science studies. The definition of rigor of prior studies is also vague. A preliminary study can be hammered simply because a reviewer does not believe the preliminary results and wishes to see a much larger sample size with rigorous control of all confounders before believing them. The irony is that the application is actually seeking funding to validate that observation in a larger sample with good control of confounders. So the definition of rigor of prior studies and the scope of the proposed study become contradictory.

  47. Avi Nath says:

    It is not unusual to see applications with really good science being scored poorly due to issues of grantsmanship. Examples include comments on font, lack of detail about meetings with mentors, too few or too many training courses, lack of detail in collaboration letters, etc. These issues with grantsmanship can be commented upon but should not be included in the scoring process. Like the animal, human subjects, and budget discussions, they can be mentioned after the grant has already been scored.

    • Oliver Wilson says:

      This is an excellent point and would address many of the other issues raised in this forum. Everyone seems to agree that we need to focus on good science and try to eliminate comments that drag down scores for nitpicky things.

  48. JP Pandey says:

    ALL submitted proposals should be discussed. At present, a single reviewer can make it “Not Discussed”, and the applicant has no idea what the Study Section as a whole thinks of the application. Occasionally, the three reviewers contradict one another, making it impossible for the applicant to make a coherent response. I realize that during the review process anyone can ask that a particular application be discussed; in practice, this hardly ever happens for fear of alienating some vocal/powerful members of the panel.

    At present, there is too much emphasis on the “preliminary data”. You get paid for what you have already done (using money from an unrelated grant)! This is contrary to the spirit of scientific research/investigation. (The history of science is replete with examples of great discoveries that would never have been funded using these criteria.) A greater emphasis should be on the SOLID scientific rationale for the proposed studies.

    • CSR Admin says:

      Correction – a single reviewer can pull an application up for discussion. It’s the opposite for not-discussed applications – all panel members must agree.

  49. Sokol Todi says:

    Several things have become clear to me over the years I have participated in study sections. I hope this information, and my thoughts on potential solutions, might help.

    – It is clear that the category-specific scores mean nothing in terms of the overall impact score and percentile. If we stick to the current system, it might be best to eliminate those scores and show only strengths and weaknesses. Applicants pay too much attention to numbers that almost never add up to the actual impact score, and this detracts from their ability to prepare revisions properly.
    – The innovation score is really useless. It never seems to carry any real weight, and it seems to be adjusted to reflect the preliminary score assigned.
    – The investigator score also carries little to no weight as currently used by reviewers. For example, reviewers will grumble in Approach about a PI not having a specific expertise (even a simple technical skill that is easy to acquire), even though they give that PI a score of 1 or 2. That just does not make sense: Approach is driven to a higher number (a worse outcome), even though a PI deemed at least outstanding or exceptional should presumably be able to learn new things. (Who doesn’t learn new things?)
    – The environment score is also rather useless and does not seem to drive the overall score.

    So, yes, Approach and Significance carry the most weight. But reviewers often also confuse Significance with Approach. Providing even more detail to reviewers on what Significance and Approach mean, and what verbiage they need to address, will not fix this, as you might have noticed from attending study sections. (How many reviewers actually abide by the specific verbiage requirements in setting their scores, and how many simply adjust their language when required to do so, without changing their scores?)

    Perhaps it is worth considering a single overall score, from 1 to 6 (how many grants in total do you see scored worse than a 6?), that encompasses significance and approach together. The rest of the criteria, if there need to be others, would be rated as (1) exceptional, (2) outstanding, (3) excellent, (4) adequate, or (5) inadequate, without room for comments (yes, I can see how some people might complain about this): innovation (in my mind, the single least useful criterion of them all), investigator expertise in the proposed work, investigator productivity, and environment (we should do away with this one as well), alongside some new ones such as perceived impact on the field, feasibility of timely completion, and scientific rigor. From these category-rated criteria, alongside the overall score, a single number is then derived and used for ranking before the meeting, on a total scale from 1 to 10 (the formula here is easy; happy to send one if needed). Ideally, the significance-and-approach score would carry no more than 30% of the total value used to rank.

    At the meeting, discussion is based on the overall score for significance and approach, and the entire room can comment on specific criteria if they deem the ones given by the three assigned reviewers improper. The final impact score is a weighted average on a scale of 1 to 10, based on significance and approach, which are the nearly sole focus of the discussions anyway; at the end of the study section, the discussed applications are curved on the 1-to-10 scale to counter inflation.

    Happy to chat and expand on anything, if it helps.
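
    A minimal sketch of how the composite pre-meeting ranking described above might be computed. The 30% weight cap, the 1-6 and 1-5 scales, and the final 1-10 range come from the comment; the function, its name, and every other choice below are illustrative assumptions, since the comment leaves the exact formula open:

      # Hypothetical composite pre-meeting ranking, sketching the scheme above.
      def composite_rank(overall, categories, overall_weight=0.30):
          """overall: combined significance/approach score, 1 (best) to 6 (worst).
          categories: dict of criterion -> rating, 1 (exceptional) to 5 (inadequate).
          Returns a single 1-10 value for ranking before the meeting."""
          overall_norm = (overall - 1) / 5.0   # map the 1-6 scale onto 0-1
          cat_norm = sum((r - 1) / 4.0 for r in categories.values()) / len(categories)
          blended = overall_weight * overall_norm + (1 - overall_weight) * cat_norm
          return round(1 + 9 * blended, 2)     # map the blend back onto 1-10

      # Example: strong overall score, mixed category ratings.
      print(composite_rank(2, {"impact": 2, "feasibility": 3,
                               "rigor": 2, "productivity": 1}))  # about 3.1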

  50. Raymond Runyan says:

    First item on the review form: If successful, will the proposed research significantly advance the research area? Does the research alter paradigms, test new theories, or alter medical procedures?

    Second item on the review form: Does the described level of rigor and reproducibility give confidence that the proposed advancements will be adequately tested?

    Third question: Are there issues or problems with the approach that would prevent the optimal progression of the research project?

    Big issue: In the current review process, one reviewer can effectively veto a grant by assigning it a high (poor) score. This allows biases to prevent review of otherwise strong grants by less established investigators. If the review process is simplified sufficiently, adding a fourth reviewer would improve fairness. Alternatively, limiting the initial score range to a maximum of 6 would remove the power to veto consideration.
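
    A small arithmetic illustration of the capped-score idea, assuming NIH's convention that 1 is the best preliminary score and 9 the worst; the review scores below are made up:

      # With no cap, one outlier reviewer can sink the average enough to keep
      # an application out of the discussed range; capping at 6 bounds that pull.
      def mean_score(scores, cap=None):
          if cap is not None:
              scores = [min(s, cap) for s in scores]
          return sum(scores) / len(scores)

      reviews = [2, 2, 9]  # two enthusiastic reviewers, one effective veto

      print(round(mean_score(reviews), 2))         # 4.33 without a cap
      print(round(mean_score(reviews, cap=6), 2))  # 3.33 with scores capped at 6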

  51. Gitendra Uswatte says:

    I want to second the idea of submitting a letter of intent. It helps keep the focus on the big picture at the first cut.

  52. Gitendra Uswatte says:

    Make the now-not-so-new human subjects and clinical trials section a just-in-time (JIT) component. Some reviewers read it, some do not, so for submitters this section is often a waste of time. For reviewers who are faithful to the instructions and read these sections, it imposes a substantial additional burden with little incremental value for determining whether a project should be funded.

  53. Christos Coutifaris says:

    I like Suzanne Fleiszig’s suggestion of a two-step review. The first step can be done without a face-to-face meeting. I would also recommend including, under either the Overall Evaluation section or the Approach, a subsection where the reviewer can state any “fatal flaws” of the proposal (or state NONE). Even though such notations are informally expressed by reviewers, I believe it would help to have them as part of the FORMAL review process. Doing this would help streamline and focus the review/discussion and potentially spread the scores. It would also help applicants prioritize and address THE most critical review points in their revised applications.

  54. Boris Zaslavsky says:

    The most important criterion, in my view, is whether the study would advance theoretical or practical knowledge, assuming it yields the results the author envisions. General statements such as “the study is important for the development of cancer treatment” should be viewed seriously.

  55. Victor Mark says:

    Eliminate the category of Innovation. Guidance for assigning a score for “Innovation” is lacking. Furthermore, a great many promising applications must, of necessity, draw upon earlier work; such a promising line of research would be faulted for not being “innovative.” In contrast, a truly “innovative” application would be risky because of its lack of prior related work. It is also a strain to compare one application to another on “Innovation.” The judgment is so subjective as to be useless; I doubt that the inter-rater reliability of “Innovation” scores for a particular application would be high.

  56. Dr. Bellur S. Prabhakar says:

    A substantial proportion of applications receive poor scores on approach, which drags down the overall score. Perhaps an initial group could evaluate significance, innovation, investigators, environment, and programmatic priority. This evaluation could be summarized and sent forward with the application for secondary review. Applications in the upper third after the first review would move to the technical review (i.e., of the scientific approach). This would allow full and thorough review of more than twice the number of applications likely to be funded. Applications that fail to receive fundable scores in the second stage could be revised and resubmitted; having already been judged of higher significance, they could go directly to the second stage for technical re-review.

  57. Tom Brenna says:

    I can confirm from my experience that the same issues are problems: Approach weighting and risk aversion. I am not sure how the criteria can be fixed. I will, however, take time to say that the change more than 20 years ago from 25-page limits to 13-page limits has been a monumental failure, leading to the predictable doubling of proposal submissions, an attendant huge increase in workload, and far deeper reaches into the eligible reviewer pool.

    Probably most important is the discrimination against interdisciplinary research. By definition, innovative proposals that link disparate areas must include major components unfamiliar to all reviewers. Putting together fields A and B that usually do not go together, and having the proposal reviewed by an expert in A and an expert in B, does not help: neither understands the other field. The space limitation means no tutorial can be provided to explain essential elements and why a concept from A applies to B. One fix that may help is to insist that reviewers comment only in their areas of expertise. Better is to bring back the 25-page limit.

    And while we’re at it, bring back study section membership named in advance. It makes no sense to ask overburdened NIH SRAs to choose “experts in the field.” It is far better if I have a list of the persons on a study section, can look up what they know, and then write to their expertise. We would avoid preposterous errors of core chemical and biological principles. We would also avoid most of the seemingly random scoring from proposal version to version, specifically the previously very rare big drop in score for a revised application. Twenty-five pages is less work than 13, and assigning the study section for four years and then leaving those reviewers alone was a better system. I’ve had more grants in the last 25 years than before that, but I still greatly prefer the 25-page limit.

  58. Candace L Floyd says:

    It is my view that approach scores often drive the overall score. This category also includes assessments of preliminary data, feasibility, and the adequacy of proposed methods. Perhaps if the approach evaluation were further divided to evaluate these three constructs separately, reviewers could send a clearer message about which elements of “approach” need improvement or are driving the score.

    • Wayne Frankel says:

      I like this idea; perhaps not by adding additional scores per se, but at least by making the instructions clear that each of these three aspects requires comment.

  59. Laura Hanson says:

    First, a simple endorsement for the current review criteria. No process is perfect, but as a participant on both sides (reviewed and reviewer) the criteria are robust and — used correctly — can allow for effective peer review. Further, the weight given to Approach can seem unbalanced, but rigorous methods are critical for successful implementation of the protocol and validity of the science — brilliant ideas don’t get anywhere without effective implementation.

    Consider an internal review of funded R01s to examine metrics of success by sampling those at the extremes of scoring (e.g., the last percentile to make it over the line vs. the very top percentile scores). Metrics could include the impact and number of publications, R01 re-funding cycles, or even more qualitative assessments of high-impact science. While that range is limited to applications achieving funding, it might hold clues about peer review: are the most successful grants also earning the highest overall scores? Are they marked by high scores within one of the 5 major scoring domains? Are they marked by “score spread” among reviewers, which may suggest more controversial or risk-taking ideas?
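
    A rough sketch of how such an internal analysis might look, assuming a table of funded R01s is available; the file name and every column name below are hypothetical, not real NIH data elements:

      # Compare success metrics for grants funded at the scoring extremes.
      import pandas as pd

      df = pd.read_csv("funded_r01s.csv")  # hypothetical extract of funded awards

      top = df[df["percentile"] <= 1]                            # very top scores
      edge = df[df["percentile"] >= df["percentile"].max() - 1]  # just over the line

      for label, group in (("top", top), ("edge", edge)):
          print(label,
                group["publications"].mean(),       # publication output
                group["renewed"].mean(),            # share re-funded at renewal
                group["reviewer_score_sd"].mean())  # "score spread" among reviewers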

  60. anonymous says:

    Many of the criteria for R01 review listed above could be handled as a “box check” reviewed administratively by NIH staff. The reviewers and study section could then use their limited time productively to focus on the two main issues for which their expertise is required and best suited: 1) the (unique) qualifications of the investigator(s) for the proposed project, and 2) a single rating for the research project that encompasses significance, innovation, and approach. Environment would also be handled as an administrative check box. I have served as an NIH reviewer and investigator. The problem with review right now, as noted above, is mission creep: the process has been micromanaged essentially to death, so that we have lost sight of the big picture composed of the investigator and the research proposal. I applaud NIH/CSR for now addressing this.

  61. Suzanne Fleiszig says:

    As a follow-up to my previous idea of having two committees (the first reviews only significance and innovation, with only the top 50% going forward to the second committee for review of the other criteria): if the investigator submitted only a letter of intent with their idea (for the initial significance and innovation review), it would save both investigators and reviewers an incredible amount of time and effort, and turnaround to investigators could be much faster.

  62. Suzanne Fleiszig, UC Berkeley says:

    Run applications through two committees. The first evaluates significance and innovation; only the top half proceed to a second committee, which evaluates the other aspects. This would reduce the burden per individual and allow the right expertise on each committee: big-picture thought leaders and clinician scientists on the first; detail-oriented thinkers and specific experts in methods/approaches on the second.

    • RJ says:

      I think this is a great idea. It may be the only way to increase emphasis on significance and innovation.

    • Francis P. Crawley, Good Clinical Practice Alliance - Europe (GCPA) & Strategic Initiative for Developing Capacity in Ethical Review (SIDCER), Brussels says:

      Suzanne, this may indeed be a good proposal. What also might help is a self-assessment on the part of the applicant. Perhaps a kind of SWOT analysis as a cover sheet.
