What is a role-graph and why does your assessment library need one?
A normalized graph of roles → skills → questions. Why it changes coverage, calibration, and JD parsing.
The flat-list problem
Most assessment libraries are a flat list of questions tagged with a skill name. Java. SQL. Python. System design.
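This works until you try to do anything beyond exact-skill lookup. Take a job description (JD) that reads "Senior Backend Engineer with Spring Boot, 5+ years, must know @Transactional propagation." A flat list can't tell you which of its "Java" questions actually test @Transactional propagation, or whether they are pitched at a senior level.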
So the platforms ask the customer to tag questions manually — pick from a dropdown of 200 skills. Coverage gaps stay invisible. The question marked "java.spring" is doing the work of "java.spring.security" and "java.spring.transactional" and "java.spring.config" all at once. Difficulty calibration is anchored to "the Java tag" instead of to the specific concept.
What a role-graph is
A role-graph is a normalized tree where:
- Roles decompose into skills.
- Skills decompose into sub-skills (and sub-sub-skills, until the leaf is a coherent topic).
- Each leaf has tagged format-mix preferences (how much MCQ vs coding vs SJT for that topic) and difficulty bands (which levels are reasonable to test for that topic).
- Each leaf points to the questions authored for it, with their IRT (item response theory) calibration metadata.
A real example for a real role:
Senior Backend Engineer (Java Spring Boot, 5+ yrs)
├── Java 21 fundamentals (band: 3-4 · MCQ × 4, Coding × 2)
├── Spring Boot
│ ├── @Transactional propagation (band: 4 · SJT × 2, Coding × 1)
│ ├── Configuration & profiles (band: 3 · MCQ × 3)
│ └── Security & filters (band: 4 · SJT × 1, Coding × 1)
├── JPA / Hibernate
│ ├── N+1 query patterns (band: 4 · Coding × 2)
│ └── Optimistic vs pessimistic locks (band: 5 · SJT × 1)
├── PostgreSQL
│ ├── Index design (band: 4 · MCQ × 2, SQL × 1)
│ └── Explain plans (band: 5 · SQL × 2)
└── System design
  ├── Idempotency (band: 4 · SJT × 1)
  └── Rate limiting (band: 4 · SJT × 1)
That's not "the Java tag." That's a structured map of what you're actually testing. Different concept entirely.
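As a rough illustration of the structure described above, here is a minimal in-memory sketch in Python. The class name `Node`, its field names, and the helper `leaves` are assumptions made for this example, not the production schema; the node names and format mixes are copied from the tree above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in the role-graph. Field names are illustrative assumptions."""
    name: str
    children: list[Node] = field(default_factory=list)
    band: tuple[int, int] | None = None                  # (min, max) difficulty band, leaves only
    format_mix: dict[str, int] = field(default_factory=dict)   # e.g. {"MCQ": 4, "Coding": 2}
    question_ids: list[str] = field(default_factory=list)      # questions authored for this leaf

    @property
    def is_leaf(self) -> bool:
        return not self.children


def leaves(node: Node):
    """Yield every leaf under `node`, depth-first, in declaration order."""
    if node.is_leaf:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


role = Node("Senior Backend Engineer (Java Spring Boot, 5+ yrs)", children=[
    Node("Java 21 fundamentals", band=(3, 4), format_mix={"MCQ": 4, "Coding": 2}),
    Node("Spring Boot", children=[
        Node("@Transactional propagation", band=(4, 4), format_mix={"SJT": 2, "Coding": 1}),
        Node("Configuration & profiles", band=(3, 3), format_mix={"MCQ": 3}),
        Node("Security & filters", band=(4, 4), format_mix={"SJT": 1, "Coding": 1}),
    ]),
])

leaf_names = [leaf.name for leaf in leaves(role)]
# Total items a pack would contain if every leaf's format mix were honored.
items_per_pack = sum(sum(leaf.format_mix.values()) for leaf in leaves(role))
```

Only leaves carry a band and a format mix; internal nodes exist purely to group them, which is what keeps the traversal simple.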
Why this matters
1. Coverage gaps surface automatically.
Walk the graph. Every leaf either has questions or doesn't. The empty cells are your authoring backlog. With a flat list, you don't know what you don't know — every "Java" question looks like coverage.
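The walk can be sketched in a few lines. This example assumes a simplified representation (nested dicts for internal nodes, a list of question IDs at each leaf); the function name `empty_leaves` and the question IDs are hypothetical.

```python
def empty_leaves(tree: dict, path: tuple = ()) -> list:
    """Return the full path of every leaf that has no questions attached."""
    gaps = []
    for name, value in tree.items():
        here = path + (name,)
        if isinstance(value, dict):       # internal node: keep walking
            gaps.extend(empty_leaves(value, here))
        elif not value:                   # leaf with an empty question list
            gaps.append(" > ".join(here))
    return gaps


# Hypothetical slice of the graph; question IDs are made up for illustration.
graph = {
    "Spring Boot": {
        "@Transactional propagation": ["q-101", "q-102"],
        "Configuration & profiles": [],
    },
    "PostgreSQL": {
        "Index design": ["q-310"],
        "Explain plans": [],
    },
}

backlog = empty_leaves(graph)
```

Here `backlog` holds the two uncovered leaves, which is exactly the authoring backlog the paragraph above describes.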
2. Calibration becomes scientific.
IRT difficulty estimates anchor to specific concepts. "How hard is @Transactional propagation in Spring Boot at band 4" is a real question with a real answer measured against a real reference panel. "How hard is Java" isn't.
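For readers unfamiliar with IRT, the two-parameter logistic (2PL) model is one common formulation. Which model is used here is not stated, so treat this as background rather than a description of our calibration pipeline. It gives the probability that candidate j answers item i correctly:

```latex
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + e^{-a_i(\theta_j - b_i)}}
```

Here \theta_j is the candidate's ability, b_i is the item's difficulty, and a_i is its discrimination. Anchoring to a leaf means b_i is estimated for items on one specific concept, so the estimate describes that concept rather than an average over everything tagged "Java."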
3. JD parsing becomes mechanical.
When a customer uploads a JD, the parser doesn't have to guess at "what is this role." It matches the JD's stated skills against the role-graph, picks the matching subtree, and generates a format-mix that mirrors the role's expected depth. JD-Forge's 30-second SLA is only possible because the role-graph turns parsing into traversal.
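A toy version of "parsing as traversal" might look like the following. It is a sketch under loud assumptions: skill extraction is reduced to case-insensitive substring matching, and the names `LEAVES`, `select_leaves`, and `format_mix` are invented for this example. A real parser would need to do considerably more.

```python
# Hypothetical slice of the graph: (skill, leaf) -> (band, format mix).
LEAVES = {
    ("Spring Boot", "@Transactional propagation"): (4, {"SJT": 2, "Coding": 1}),
    ("Spring Boot", "Configuration & profiles"): (3, {"MCQ": 3}),
    ("PostgreSQL", "Index design"): (4, {"MCQ": 2, "SQL": 1}),
    ("PostgreSQL", "Explain plans"): (5, {"SQL": 2}),
}


def select_leaves(jd_text: str) -> list:
    """Pick every leaf whose parent skill is named in the JD (case-insensitive)."""
    text = jd_text.lower()
    return [path for path in LEAVES if path[0].lower() in text]


def format_mix(paths) -> dict:
    """Sum the per-leaf format mixes into one pack shape."""
    total = {}
    for path in paths:
        for fmt, count in LEAVES[path][1].items():
            total[fmt] = total.get(fmt, 0) + count
    return total


jd = ("Senior Backend Engineer with Spring Boot, 5+ years, "
      "must know @Transactional propagation.")
picked = select_leaves(jd)
mix = format_mix(picked)
```

Because the selection is a pure function of the JD text and the graph, running it twice on the same input yields the same pack shape.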
4. The graph compounds.
Every JD parsed adds nodes the graph didn't have before. Every successful customer drive shows us where a node calibrates harder or easier than its IRT estimate. Every leak that retires questions adds new variants to the leaves. The graph is the moat.
What it doesn't do
A role-graph does not magically tell you what questions are good. SME validation still happens. Bias detection still happens. Anti-leak rotation still happens. The graph is the organization — a precondition for all the other operations to be efficient.
A role-graph isn't a knowledge graph in the academic sense (no edges across concepts, no inference). It's a routing tree for a content engine. Pragmatic, not philosophical.
How we built ours
It started as a flat list of 200 skills. We onboarded 18 SMEs across Tech Core and India Stack, gave each one a role, and asked them to break it into sub-skills. The result was inconsistent: different SMEs decomposed the same kind of role differently. So we wrote an SME-onboarding doc with worked examples and a structured template. Now the variance is controlled.
The role-graph is in PostgreSQL with a role_graph_node table and self-referencing parent_id. It's nothing exotic. The work is in the discipline of authoring it.
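To show how a self-referencing parent_id supports subtree lookups, here is a small runnable sketch. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so it runs anywhere; the recursive CTE syntax is shared by both. Only the table name role_graph_node and the parent_id column come from the description above. The other columns, the sample rows, and the lookup by name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE role_graph_node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES role_graph_node(id),  -- NULL for a root
    name      TEXT NOT NULL                            -- assumed column
);
INSERT INTO role_graph_node (id, parent_id, name) VALUES
    (1, NULL, 'Senior Backend Engineer'),
    (2, 1,    'Spring Boot'),
    (3, 2,    '@Transactional propagation'),
    (4, 2,    'Configuration & profiles'),
    (5, 1,    'PostgreSQL'),
    (6, 5,    'Index design');
""")

# Start at the named node, then repeatedly join children onto the rows found so far.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM role_graph_node WHERE name = ?
    UNION ALL
    SELECT n.id, n.name, s.depth + 1
    FROM role_graph_node AS n
    JOIN subtree AS s ON n.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, id;
"""

rows = conn.execute(SUBTREE_SQL, ("Spring Boot",)).fetchall()
```

In PostgreSQL the same query would typically use a driver-specific placeholder instead of `?`, and an index on parent_id is the usual way to keep each join step cheap.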
We expose graph traversal in the ReadyBank API: pass a role string, get the matching subtree, get the questions. JD-Forge uses the same traversal internally. Stack-Vault customers get a private overlay of the graph mapped to their stack.
A worked example: Spring Security → JWT
A subtree we use a lot in Talpro India's Java backend hiring drives:
Java → Spring
├── Spring Security → JWT
│ ├── Authentication filter chain (band: 4 · SJT × 1, Coding × 1)
│ ├── Token signing (HS256 vs RS256) (band: 4 · MCQ × 2)
│ ├── Refresh-token rotation (band: 5 · Coding × 1, SJT × 1)
│ └── Stateless vs session-based trade-offs (band: 4 · SJT × 1)
└── Spring Data → Repository injection (band: 3 · MCQ × 2)
When a JD lands with "Spring Security required, JWT preferred, 5+ yrs," JD-Forge parses to that subtree directly. The format mix is set per leaf, the difficulty bands are calibrated, and the AI-draft step generates new variants only against leaves that need rotation under anti-leak. The output is repeatable: same JD, same graph traversal, same pack shape (modulo anti-leak variants). That repeatability is what makes a 30-second SLA possible.
Why the role-graph powers all three SKUs
It's tempting to assume the role-graph is just an internal taxonomy. It isn't — it's load-bearing across all three customer-facing products.
- ReadyBank API uses it to route customer queries into the right subtree. A platform integration can request "Senior Spring backend questions, band 4-5, 30 items" and the API walks the graph deterministically. No fuzzy LLM tagging in the request path; that's the point.
- JD-Forge uses it to do parse-then-traverse on uploaded JDs. The 30-second SLA is graph-mechanical: parse → walk → AI-draft only against leaves needing rotation → SME-validate → ship.
- Stack-Vault customers get a private overlay — their own subtree forks where the customer's tech stack diverges from canonical (e.g., an automotive customer's embedded stack adds nodes that aren't in the shared graph). The overlay is exclusive per Constitution SO-10; it doesn't appear in any other Stack-Vault or in shared ReadyBank.
Without a single normalized graph, you'd build three different content models — one per SKU — and they'd drift. The graph is the unifier.
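One way to picture the private overlay is as a merge that produces a per-customer view while leaving the shared graph untouched. The sketch below assumes the same nested-dict representation as a simplification; `with_overlay`, the node names, and the question IDs are hypothetical and do not describe the actual Stack-Vault implementation.

```python
import copy


def with_overlay(shared: dict, overlay: dict) -> dict:
    """Return a customer view: the shared tree plus that customer's extra nodes.

    The shared tree is deep-copied and never mutated, so overlay nodes
    cannot leak into it or into another customer's view.
    """
    merged = copy.deepcopy(shared)
    for name, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(name), dict):
            merged[name] = with_overlay(merged[name], value)   # fork inside an existing subtree
        else:
            merged[name] = copy.deepcopy(value)                # brand-new node or leaf
    return merged


# Hypothetical data for illustration only.
shared = {"Java": {"Spring Boot": ["q-101"]}}
overlay = {
    "Java": {"In-house framework": ["q-902"]},
    "Embedded Automotive": {"Customer-specific leaf": ["q-901"]},
}

view = with_overlay(shared, overlay)
```

After the merge, `view` contains both the canonical and the customer-specific nodes, while `shared` is unchanged.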
M3 vs M12 trajectory
We track the graph's growth as a leading indicator. At M0 (April 2026) the graph had ~200 leaves with 530 validated questions hanging off them. At M3 (July 2026) the target is 5,000 questions across 800 leaves. At M12 the target is 40,000+ questions across ~3,500 leaves spanning General Tech, India Stack, and AI-era role families.
The growth isn't proportional — early waves bias toward depth (filling out high-traffic leaves with many calibrated questions), later waves bias toward breadth (adding new role families like AIPE, embedded automotive, financial services). Wave 2 already added Salesforce CPQ, SAP ABAP, Oracle HCM Cloud, Finacle/Flexcube and Embedded Automotive — leaves that didn't exist in Wave 1.
What this means in practice: when a customer asks "do you cover X?" the answer is binary at the leaf. Either the leaf exists and there are calibrated questions on it, or the leaf is in next quarter's authoring backlog. No wishful "we have Java content."
See it work: ReadyBank deep-dive · JD-Forge · Stack-Vault