Reading engineering capability without looking at the source code

You can evaluate the capability of an engineering organization without ever reading the source code. In fact, the information that lives outside the code often reflects the organization’s true state more accurately than the code itself.

Reading source code as if it were a financial statement

Mapping engineering evaluation onto a financial framework helps clarify what each kind of information is for.

Source code = balance sheet (B/S). It captures the state of accumulated technical assets and technical liabilities up to today. The quality, structure, and size of the codebase are the result of every judgment the organization has made along the way. But a balance sheet alone does not tell you “how much is the business earning right now?”

Flow data from the development process = income statement (P/L). PR merge frequency, deployment speed, commit cadence, and bug-fix cycles show how much value the engineering organization is producing at this moment. The closest financial analogue is operating margin: a direct read on current productivity.

Relying on code review alone is like making an investment decision while looking only at the balance sheet. You only see the engineering organization clearly when you read the P/L-equivalent flow data alongside it.

As with the balance sheet, code reads differently in context

Just as the figures on a balance sheet cannot be judged in isolation, the state of source code must be interpreted against business stage and team size.

A startup (CTO plus a few engineers) with rough code is not necessarily a red flag. In financial terms, this is a company deliberately taking on leverage to accelerate growth. If the team is taking on technical debt to prioritize market validation, and the P/L equivalent (deployment frequency and feature release cadence) is healthy, that can be a defensible decision. What matters is whether the leadership team knows why the code looks the way it does.

The same rough code in a large team (20–50 engineers) means something very different. The total debt is likely beyond the organization’s capacity to service it, the equivalent of balance-sheet insolvency. Often no one understands the codebase end to end, and the P/L-equivalent velocity is starting to drop.

The reductive judgment “messy code = weak engineering” mirrors the financial fallacy that “high leverage = bad business.” Evaluation must always be done in the context of business stage, team size, and strategic intent.

The assumption “you can only judge engineering by reading the code” is what makes technical due diligence (tech DD) feel like work only an engineer can do. That assumption is wrong. This article lays out a framework that non-engineer investors and M&A professionals can use to assess engineering capability. At the end, we provide an interactive on-page diagnostic tool you can use directly.

Why information outside the code matters

Code review has fundamental limits.

Limit 1: You see only a snapshot

Code shows the state at this instant. You cannot tell from the code whether something written six months ago is still running, or whether it was rewritten last week. You cannot distinguish between an organization that “can write good code” and one where “the engineer who could write good code left last week, and only the residue remains.”

Limit 2: You cannot see why the decisions were made

You can distinguish good code from bad, but you cannot read off why the architecture is what it is, why the stack ended up this way, or whether the team is aware of the technical debt. Decision quality lives in the process and in conversation, not in the code.

Limit 3: You cannot see the organization

Beautiful code written by one star engineer and code written collaboratively by ten average engineers can be nearly indistinguishable from the artifact alone. But in an investment or acquisition context, what matters is whether the organization has reproducible engineering capability.

The principle is not “don’t read the code” but “prioritize what lives outside the code.” That is the right starting posture for non-engineer evaluation.

Reading engineering capability from the development process

If you can get access to the GitHub or GitLab repository, you can learn a great deal without reading the code itself. Three indicators carry most of the signal.

1. Pull-request patterns

Pull requests (PRs) are where code review actually happens. The culture of the engineering organization is laid bare in them.

Figure 1: Reading the engineering organization through PR patterns

A repository whose review comments are nothing but “LGTM” is a warning sign. The form of review is being observed, but the substance (actual quality checking) has likely broken down.
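One way to put a number on the “LGTM only” pattern is to measure what share of merged PRs received no substantive review comment. The following is a minimal sketch, not a standard metric: the input shape (a list of dicts with a `review_comments` list), the function names, and the set of approval-only phrases are all assumptions made for illustration.

```python
import re

# Assumed list of approval-only phrases; extend it to match the team's habits.
RUBBER_STAMP = re.compile(
    r"^\s*(lgtm|looks good( to me)?|ok|\+1|:\+1:)?[\s.!]*$",
    re.IGNORECASE,
)


def is_rubber_stamp(comments):
    """True when a PR has no comments, or every comment is approval-only."""
    return all(RUBBER_STAMP.match(comment) for comment in comments)


def rubber_stamp_share(prs):
    """Fraction of PRs whose review carried no substantive comment.

    prs: hypothetical export, one dict per merged PR, each with a
    'review_comments' list of comment strings.
    """
    if not prs:
        return 0.0
    stamped = sum(1 for pr in prs if is_rubber_stamp(pr["review_comments"]))
    return stamped / len(prs)
```

A high share is a prompt to read a sample of PRs by hand, not a verdict in itself: small, low-risk changes legitimately attract short approvals.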

2. Deployment and commit frequency

Looking at commit history and deployment records over the last 30–90 days reveals the team’s actual development velocity. Two angles matter.

(1) Consistency of frequency: Are commits piling up daily, or do hundreds land in one push on a single day? The latter pattern hints at code management bottlenecked on a few people.

(2) Trend in frequency: Compared to three months ago, are commits increasing or decreasing? Organizations whose technical debt is pulling down velocity tend to show a gradual decline in commit cadence.
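Both angles can be approximated from a plain list of commit dates. The sketch below is illustrative only: it assumes the dates have already been exported (for example by parsing the output of `git log --date=short --format=%ad`), and the function name, the 90-day default, and the "newer half versus older half" trend measure are choices made here, not part of the original framework.

```python
from collections import Counter
from datetime import date, timedelta


def cadence_summary(commit_dates, today, window_days=90):
    """Summarise consistency and trend of commits in the last `window_days`.

    commit_dates: iterable of datetime.date, one entry per commit.
    """
    start = today - timedelta(days=window_days)
    recent = [d for d in commit_dates if start < d <= today]
    per_day = Counter(recent)
    total = len(recent)

    # Consistency: how much of the window's work landed on one single day.
    busiest_day_share = max(per_day.values()) / total if total else 0.0

    # Trend: relative change between the older and newer half of the window.
    midpoint = today - timedelta(days=window_days // 2)
    newer = sum(1 for d in recent if d > midpoint)
    older = total - newer
    trend = (newer - older) / older if older else None

    return {
        "total": total,
        "active_days": len(per_day),
        "busiest_day_share": busiest_day_share,
        "trend": trend,
    }
```

A `busiest_day_share` close to 1.0 corresponds to the "hundreds in one push" pattern, and a clearly negative `trend` corresponds to declining cadence. Commit counts depend heavily on squash and merge conventions, so compare a team against its own history rather than against other teams.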

3. State of issues and bug-tracking

The state of the issue tracker mirrors the organization’s capacity to manage problems.

Observation	Healthy signal	Caution signal
Issue volume	Reasonable open-and-close pace	Hundreds open, neglected
Prioritization	Organized via labels and milestones	Stacked without prioritization
Bug resolution speed	Critical bugs resolved within days	Months-old bugs still untouched
Regressions	Same problem does not recur	Same bug logged multiple times
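The rows of the table above can be turned into rough numbers from an issue export. This is a sketch under stated assumptions: the field names (`title`, `severity`, `opened`, `closed`), the 90-day staleness threshold, and the use of identical titles as a proxy for repeated bugs are all hypothetical and will need adjusting to the tracker in question.

```python
from collections import Counter
from datetime import date
from statistics import median


def issue_health(issues, today, stale_after_days=90):
    """Rough health indicators from an issue export.

    issues: list of dicts with 'title', 'severity', 'opened' (date)
    and 'closed' (date, or None while the issue is still open).
    """
    open_issues = [i for i in issues if i["closed"] is None]
    stale = [
        i for i in open_issues
        if (today - i["opened"]).days > stale_after_days
    ]

    # Days from open to close for critical bugs that were resolved.
    critical_days = [
        (i["closed"] - i["opened"]).days
        for i in issues
        if i["severity"] == "critical" and i["closed"] is not None
    ]

    # Crude regression proxy: the same title logged more than once.
    titles = Counter(i["title"].strip().lower() for i in issues)
    repeated = sorted(t for t, n in titles.items() if n > 1)

    return {
        "open": len(open_issues),
        "stale_open": len(stale),
        "median_critical_days": median(critical_days) if critical_days else None,
        "repeated_titles": repeated,
    }
```

Title matching will miss regressions that were logged under different wording, so treat an empty `repeated_titles` list as "nothing obvious" rather than "no regressions."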

Reading engineering capability from documentation and incident history

Documentation and the record of past incident response are often more eloquent than the code.

What documentation reveals about organizational maturity

The simple fact that “there is no documentation” is itself evaluation material. Mature engineering organizations have a habit of writing documentation so that new members can ramp themselves up. Organizations that rely on oral transmission lose institutional knowledge the moment a key person leaves.

Verify the following three categories of documentation. Each comes with an example of the form it should take.

(1) Architecture documentation

A document that describes the overall system structure and the relationships between major components. Look for content like the following.

TITLE: System architecture overview

## High-level layout
[Diagram or text describing the overall system]

## Major components
- Frontend: Next.js (Vercel)
- Backend API: Node.js / Express (AWS ECS)
- Database: PostgreSQL (RDS) + Redis (ElastiCache)
- Auth: Auth0

## Data flow
User request -> CloudFront -> ALB -> ECS -> RDS

## External dependencies
- Payments: Stripe
- Email: SendGrid
- Monitoring: Datadog

## Scalability: current state and issues
Current bottleneck: single-node DB write
Plan: add read replicas in 2025 Q3

If the only “documentation” is “we’ll explain it verbally,” onboarding new members will take months. If documents at this level are absent, that is a warning sign.

(2) ADRs (Architecture Decision Records)

Records of technical decisions. Whether the rationale survives — “why did we choose this database?” “why did we not go with microservices?” — speaks to both the quality and the continuity of the organization’s judgment.

TITLE: ADR-012: Adopt Auth0 as our authentication platform

**Status**: Accepted (2024-08-15)

**Context**
With more than 50,000 users, multiple customers are asking for SSO and MFA.
In-house implementation carries security risk and significant maintenance cost.

**Decision**
Adopt Auth0.

**Alternatives and reasons for rejection**
- Firebase Auth: Strong dependency on Google. Conflicts with our multi-cloud posture.
- Cognito: Strengthens AWS lock-in. Conflicts with our multi-cloud posture.
- In-house: Without security specialists in-house, the implementation risk is too high.

**Trade-offs**
- Pro: MFA, SSO, and compliance-readiness available quickly
- Con: Increased monthly cost. MAU-based pricing requires scenario modeling at scale.

**Review date**: 2025-02-15

When records of this kind accumulate, you can confirm that technical decisions are being made on the basis of explicit rationale.

(3) Incident response runbooks

When something goes wrong in production, can anyone follow a documented procedure to respond? Verify whether such runbooks exist.

TITLE: Runbook: DB connection exhaustion

**Trigger**: Alert "DB connections > 80%" fires

**Impact**: Latency and timeouts across all API endpoints

**Pre-checks**
1. Check current connections: `SELECT count(*) FROM pg_stat_activity;`
2. Check long-running queries: `SELECT pid, query, state, query_start FROM pg_stat_activity ORDER BY query_start;`

**Response**
1. If a specific query is the cause: `SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid = [offending pid];`
2. If a sudden traffic spike is the cause: confirm autoscaling is firing (ECS in the AWS console)
3. If unable to resolve immediately: enable maintenance mode (procedure in docs/maintenance.md)

**Escalation**
If unresolved within 30 minutes, contact [on-call lead]

“Only that person can handle it” is the most acute form of key-person risk. An organization without runbooks, where incident response lives only in one head, is one departure away from operational collapse.
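With read access to a checkout of the repository, a first pass over the three documentation categories can be automated. The sketch below only checks whether files exist at some commonly used locations. Every path pattern in it is an assumption, since teams keep these documents in many places (including wikis outside the repository), and the function name is hypothetical.

```python
from pathlib import Path

# Assumed locations only; adjust each pattern to the repository at hand.
DOC_PATTERNS = {
    "architecture": ["ARCHITECTURE.md", "docs/architecture*.md"],
    "adr": ["docs/adr/*.md", "docs/decisions/*.md"],
    "runbook": ["docs/runbooks/*.md", "runbooks/*.md"],
}


def documentation_inventory(repo_root):
    """Map each documentation category to the matching files found."""
    root = Path(repo_root)
    return {
        category: sorted(
            path.relative_to(root).as_posix()
            for pattern in patterns
            for path in root.glob(pattern)
        )
        for category, patterns in DOC_PATTERNS.items()
    }
```

An empty list for a category is a question to ask the team ("where do your runbooks live?"), not proof that the documents do not exist. Presence also says nothing about quality, which still has to be judged by reading a few of the files.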

What incident history reveals about engineering culture

Whether the organization conducts a post-mortem (after-action review) after a major incident reflects its ability to learn. A well-run post-mortem looks like this.

TITLE: Post-mortem: 2024-11-15 Payments API down for four hours

**Impact**
- Window: 2024-11-15 14:32 - 18:41 (4h 9m)
- Affected users: ~2,800 (failed payment attempts)
- Estimated revenue loss: ¥3,200,000

**Timeline**
14:32 Alert fires (error rate above 5%)
14:40 On-call engineer begins triage
15:10 Root cause identified (Stripe webhook endpoint failed handshake after cert renewal)
16:00 Mitigation (rolled back the cert to the prior version)
18:41 Permanent fix complete, full service restored

**Root cause**
The certificate auto-renewal script did not include the Stripe webhook
receiving endpoint in the renewal target.

**Prevention (with owners and deadlines)**
1. Add a webhook endpoint reachability check on cert renewal to CI (Tanaka / end of November)
2. Add a Stripe-reachability health check (Suzuki / first week of December)
3. Update runbook with the cert-renewal pre-test procedure (Yamada / end of November)

**Lessons**
External API reachability must be considered explicitly as part of the
cert-renewal blast radius.

Three things to verify.

Are the prevention measures concrete? “Be careful” or “double-check” do not prevent recurrence. What matters is whether structural fixes — “added an automated test,” “appended this to the runbook” — are recorded with owners and deadlines.
Are the same problems recurring? Look across two years of incident reports for the same incident type repeating. Recurrence is evidence that root cause was never addressed.
Is the granularity of incidents appropriate? Whether small incidents are also recorded, or only large ones, reveals the organization’s incident sensitivity.
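The first two checks can be partly mechanized once incident reports are in a structured form. The following is a minimal sketch: the record layout (`date`, `type`, `actions` with `text`, `owner`, `deadline`), the list of vague phrases, and the function name are assumptions for illustration, and assigning a consistent `type` to each incident is itself a manual judgment.

```python
from collections import Counter

# Assumed examples of wording that signals a non-structural fix.
VAGUE_PHRASES = ("be careful", "double-check", "pay attention")


def review_incidents(incidents):
    """Flag recurring incident types and weak prevention measures.

    incidents: list of dicts with 'date', 'type' and 'actions'; each
    action is a dict with 'text', 'owner' and 'deadline'.
    """
    type_counts = Counter(i["type"] for i in incidents)
    recurring = {t: n for t, n in type_counts.items() if n > 1}

    weak_actions = []
    for incident in incidents:
        for action in incident["actions"]:
            text = action["text"].lower()
            vague = any(phrase in text for phrase in VAGUE_PHRASES)
            unowned = not action.get("owner") or not action.get("deadline")
            if vague or unowned:
                weak_actions.append((incident["date"], action["text"]))

    return {"recurring_types": recurring, "weak_actions": weak_actions}
```

The third check, granularity, is easier to judge by eye: compare how many small incidents are recorded relative to large ones over the same period.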

Five questions for the engineering interview

In interviews with engineers, particularly the CTO and tech leads, build on the information you have already gathered and ask the following five questions. The aim is not to “verify the right answer” but to “observe the soundness of the thinking process.”

Figure 2: Five interview questions and the lens for each

Q1. What technical problem do you want to fix right now?

If the answer is “nothing in particular” or “everything is running well,” either there is no awareness of the issues or the leader is not being candid. Every engineering organization has problems it wants to improve. A CTO who can name two or three concrete items immediately is genuinely tracking the state of the system.

A healthy answer, for example: “Query performance on a particular page has degraded — there is an N+1 query problem we’ve left in. We’ve raised the priority and start work on it next week.”

Q2. If user count grew 10x from today, where would the system break?

Be wary of answers like “It will be fine” or “we have a scalable architecture.” Truly capable engineers know exactly where the bottlenecks in their system are and have plans to address them.

A healthy answer, for example: “The auth server breaks first. The current architecture isn’t built for horizontal scaling there, so we have a stateless redesign on our six-month roadmap.”

Q3. Are there past technical decisions you regret?

The ability to talk about failures measures both learning ability and honesty. A CTO who cannot discuss failures either does not see the issues or only gives surface-level answers. Engineers who can describe specific failures and what they learned from them are credible.

Q4. What do you weight most when hiring engineers?

Watch for whether the answer is generic (“strong people,” “good problem-solving”) or specific. “Skills needed when migrating from monolith to microservices” or “people who fit a writing-first documentation culture” — that level of specificity reflects mature hiring philosophy.

Q5. How do you manage technical debt?

“We don’t have technical debt” is an immediate caution flag. “We keep a technical debt list and re-prioritize it quarterly against feature work” — that response describes a mature organization that recognizes and manages its debt.


When to bring in a technical expert

This framework has limits. Add an expert code review in the following cases.

When AI/ML or proprietary algorithms are at the core of competitive advantage. Out-of-code evaluation cannot confirm the substance of the technical edge.
When security is foundational to the business (finance, healthcare, personal data). Vulnerability assessment requires specialized expertise.
For large-scale systems (more than 20 engineers, hundreds of thousands of users). Checklist-style evaluation cannot catch every risk.
When the diagnostic returns 10 or more "×". The depth of the issues warrants more than a surface review.

Summary

You can evaluate the strength of an engineering organization without reading source code — the development process, documentation, incident history, and conversations with the engineers themselves carry most of the signal. If code is the balance sheet, these are the income-statement equivalents: flow data that reveal how the organization actually operates. With the framework above, even a non-engineer can cover the key questions of a technical DD.

PR patterns: review quality and dependence on individuals
Commit and deploy frequency: development velocity and consistency
Documentation state (ADRs, runbooks, architecture diagrams): the organization's ability to manage knowledge collectively
Incident history and post-mortems: capacity for learning and recurrence prevention
Five questions: how the CTO frames problems and reasons through them

Use the result to identify areas that warrant a deeper look from a technical expert — that's where this kind of evaluation pays off.

Tied Inc. supports VCs and corporate M&A teams with technical due diligence. We can help you apply this framework, layer in expert code review, and structure findings into investment or PMI terms — see our services for investors.