scholar-evaluation

Systematically evaluate scholarly work using the ScholarEval framework, providing structured assessment across research quality dimensions (problem formulation, methodology, analysis, and writing) with quantitative scoring and actionable feedback.

Score scholarly work across research quality dimensions

Source: K-Dense AI
License: MIT
First documented

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

  • evaluate this paper
  • score this manuscript
  • assess research quality
  • ScholarEval review
  • rate this study

What it does

scholar-evaluation is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It turns Claude into a structured evaluator that applies the ScholarEval framework to scholarly work — producing quantitative scores across research quality dimensions (problem formulation, methodology, analysis, writing) alongside actionable feedback for each dimension.

A session produces a scored evaluation report: dimension-level scores, an overall rating, strengths, weaknesses, and specific improvement recommendations — structured for use in grant panels, research training programs, or self-assessment workflows.
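The skill itself returns a prose report rather than a fixed schema. Purely as a reading aid, here is a minimal Python sketch of the fields that report covers; the dataclass and field names are illustrative assumptions, not anything defined in SKILL.md.

```python
# Illustrative sketch only: the skill emits a prose report, and this schema
# (including every field name) is an assumption, not part of the SKILL.md spec.
from dataclasses import dataclass, field


@dataclass
class DimensionScore:
    dimension: str   # e.g. "methodology"
    score: float     # numeric rating on whatever scale the rubric uses
    rationale: str   # supporting evidence drawn from the manuscript text


@dataclass
class EvaluationReport:
    manuscript: str
    dimension_scores: list[DimensionScore] = field(default_factory=list)
    overall_rating: float = 0.0
    strengths: list[str] = field(default_factory=list)
    weaknesses: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)  # prioritized, most important first
```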

When to use it

Reach for it when:

  • You’re on a grant review panel and need a consistent scoring framework across multiple applications
  • You’re running a research training program and want structured, reproducible feedback on student work
  • You want a quantitative self-assessment of your own manuscript before submission

When not to reach for it:

  • Writing the narrative text of a peer review — use peer-review
  • Evaluating evidence quality for a clinical decision — use scientific-critical-thinking

Install

Copy the SKILL.md from K-Dense AI’s scholar-evaluation folder into .claude/skills/scholar-evaluation/ in your project.
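If you keep a local clone of the repo, the copy can be scripted. A minimal sketch in Python follows; the clone location is an assumption, so adjust the source path to your checkout.

```python
# Sketch: install the skill into a project, assuming a local clone of the
# scientific-agent-skills repo sits next to the project directory.
from pathlib import Path
import shutil

src = Path("../scientific-agent-skills/scholar-evaluation/SKILL.md")  # adjust to your clone
dst = Path(".claude/skills/scholar-evaluation")                        # where Claude Code looks for project skills

dst.mkdir(parents=True, exist_ok=True)
shutil.copy2(src, dst / "SKILL.md")
print(f"installed {src} -> {dst / 'SKILL.md'}")
```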

Once installed, any of the trigger phrases listed above (“evaluate this paper”, “score this manuscript”, and so on) will activate the skill in Claude Code.

What a session looks like

A typical session has three phases:

  1. Manuscript intake. Provide the paper text or file. Claude identifies the research type and calibrates which ScholarEval dimensions apply (e.g., the dimensions relevant to clinical research differ from those for computational work).
  2. Dimension scoring. Each ScholarEval dimension is evaluated independently with a score and supporting rationale drawn from the manuscript text.
  3. Report generation. Scores are aggregated, strengths and weaknesses are summarized, and prioritized improvement recommendations are listed — structured for easy comparison across a set of papers (see the aggregation sketch after this list).
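This page doesn't say how ScholarEval combines dimension scores into the overall rating, so treat the equal-weight average below as an assumption used only to illustrate the aggregation step; the dimension names and scores are hypothetical.

```python
# Sketch of phase 3: aggregate per-dimension scores into an overall rating.
# The equal-weight average is an assumption, not ScholarEval's definition.
def overall_rating(dimension_scores: dict[str, float]) -> float:
    return sum(dimension_scores.values()) / len(dimension_scores)


scores = {  # hypothetical output of phase 2 for one manuscript
    "problem_formulation": 4.0,
    "methodology": 3.5,
    "analysis": 3.0,
    "writing": 4.5,
}
print(f"overall: {overall_rating(scores):.2f}")  # overall: 3.75
```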

Receipts

Where it works well:

  • Grant panel workflows where consistency across reviewers matters — the framework produces comparable scores across different reviewers using the same rubric
  • Training contexts where students need specific, dimension-level feedback rather than general comments

Where it backfires:

  • Highly interdisciplinary work that doesn’t map cleanly onto standard research quality dimensions
  • The scoring can produce confident-seeming numbers for dimensions where the manuscript is ambiguous; scores should be treated as a starting point for discussion, not final verdicts

Pattern that works: use the dimension scores to structure discussion in panel review, not to make yes/no decisions mechanically; the framework’s value is in consistent vocabulary, not algorithmic selection.
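One way to put that into practice is to collate per-dimension scores from multiple reviewers and surface where they diverge, then spend panel time on those dimensions rather than on the aggregate numbers. A hypothetical sketch, with reviewer scores invented for illustration:

```python
# Hypothetical sketch: find the dimensions where two reviewers' ScholarEval
# scores for the same application diverge most, to focus panel discussion.
reviewer_a = {"problem_formulation": 4.0, "methodology": 2.5, "analysis": 3.0, "writing": 4.0}
reviewer_b = {"problem_formulation": 4.0, "methodology": 4.0, "analysis": 3.5, "writing": 4.0}

disagreements = sorted(
    ((dim, abs(reviewer_a[dim] - reviewer_b[dim])) for dim in reviewer_a),
    key=lambda pair: pair[1],
    reverse=True,
)
for dim, gap in disagreements:
    if gap > 0:
        print(f"discuss {dim}: reviewers differ by {gap:.1f}")
```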

Source and attribution

Originally authored by K-Dense Inc. The canonical SKILL.md lives in the scholar-evaluation folder of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.