Guidelines for AI-driven legacy code modernization
AI will not be able to refurbish legacy systems at the push of a button. Still, with proper guidance and oversight, AI tools can speed up code modernization projects.
Most enterprise organizations run on yesterday's software.
A significant chunk of core business systems are legacy, and a large portion of IT budgets is often dedicated to keeping them alive. Legacy tech can restrict growth and block strategy; many companies identify it as a major barrier to digital transformation and a main driver of IT spend. This impacts team productivity directly, as developers end up devoting their time to legacy system maintenance.
AI won't fix that magically. Used well, however, it changes the slope of a team's productivity curve: instead of flat output or incremental gains, AI can help teams modernize faster over time.
Generative AI can read code and logs, suggest safe refactors, generate tests, and sketch platform moves. People still make the calls -- set the target architecture, check behavior and decide when to ship. Learn some practical ways AI can drive legacy code modernization efforts.
Challenges of legacy code
Legacy code hurts in two ways: what's in the code and what's around it.
Inside the code, there can be years of tangled dependencies, hidden side effects, mixed concerns, deprecated libraries and little in the way of tests or documentation. Data and behavior are coupled in surprising places; a seemingly harmless change in a handler ripples into reporting jobs or batch interfaces.
Around the code, environments are difficult to recreate, deploys are fragile, observability is thin and compliance and security posture lags. Addressing this isn't as simple as modifying the code. It's important to keep SLAs, manage risk and coordinate teams while choosing the right approach.
The hardest part of working around fragile systems is getting the order of changes -- code, configuration and data -- right and making sure the system continues to function. That means setting clear requirements, testing before changing anything, breaking work into safe steps, moving data without hiccups and avoiding the temptation to throw everything out and start over. Teams should work incrementally with guardrails and rollback mechanisms to gain progress without betting the business.
7 approaches to legacy code modernization
Modernizing legacy applications involves more than a rewrite plan. It is a business decision about how to reduce risk, improve operations and unlock new strategies. A good starting point is application rationalization. Take inventory, score each system on value, cost to maintain, technical fitness and business risk, then decide which ones to invest in, consolidate, keep as-is or retire. With that short list in hand, development teams can match each system to a practical application modernization path that fits the budget, timeline and cloud roadmap.
1. Encapsulate
Wrap the legacy system with stable APIs or events so teams can build new features around it without changing the core. This protects consumers from outdated interfaces and buys time to plan deeper changes. Encapsulation often pairs well with a strangler-style migration, where new capabilities live outside while old ones shrink over time.
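A facade in this style can be only a few lines. As a minimal sketch, assuming a legacy HTTP endpoint at legacy-host and invented field names (ORD_STS, ORD_TOT), a Flask wrapper that presents a clean contract:

# facade.py -- strangler-style wrapper (illustrative; legacy URL and fields are assumptions)
from flask import Flask, jsonify
import requests

app = Flask(__name__)
LEGACY_BASE = "http://legacy-host:8080"  # assumed legacy service

@app.route("/api/v1/orders/<order_id>")
def get_order(order_id):
    # Forward to the legacy endpoint, then reshape the response
    r = requests.get(f"{LEGACY_BASE}/OrderServlet", params={"id": order_id}, timeout=5)
    r.raise_for_status()
    legacy = r.json()
    # New consumers see a stable contract, whatever happens behind it
    return jsonify({"id": order_id, "status": legacy.get("ORD_STS"), "total": legacy.get("ORD_TOT")})

New features integrate against /api/v1/orders, and the servlet behind it can later be moved or rewritten without touching them.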
2. Rehost
Lift and shift the workload to new infrastructure with minimal code changes -- for example, moving a VM from a data center to the cloud. This approach provides quicker wins in reliability, cost transparency and basic automation, but the legacy code and architecture stay the same. Use this when speed matters and deeper changes would add undue risk.
3. Replatform
Move to a modern runtime or managed service with small, targeted changes. Common examples are migrating to containers, a managed database or serverless front doors such as AWS API Gateway. This approach can improve operations and scalability without redesigning the application. This is a solid middle ground that provides cloud benefits quickly while only slightly modifying the code.
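As a minimal sketch of how small these edits can be -- assuming a Python service on psycopg2 and invented environment variable names -- the replatform change is often just configuration:

# Before: connection details hardcoded for the data center database
# conn = psycopg2.connect(host="db01.internal", dbname="orders", user="app", password="...")
import os
import psycopg2

# After: the same code points at a managed database through environment config
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],  # e.g., the managed Postgres endpoint
    dbname=os.environ.get("DB_NAME", "orders"),
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)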
4. Refactor
Restructure the code to improve maintainability without changing external behavior. Refactoring can involve splitting large modules, adding tests, removing dead code and updating libraries. The goal is to make the legacy code easier to maintain and safer to evolve. Refactoring pays off when the business logic is still valuable but the codebase has aged.
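As a hedged before-and-after sketch (the function is invented for illustration), a refactor splits a tangled routine into named steps while inputs and outputs stay identical:

# Before: parsing, business rules and formatting mixed in one function
def process(line):
    parts = line.strip().split(",")
    total = float(parts[2]) * (1.07 if parts[1] == "taxable" else 1.0)
    return f"{parts[0]}:{total:.2f}"

# After: same behavior, but each step is testable on its own
def parse(line):
    sku, tax_class, amount = line.strip().split(",")
    return sku, tax_class, float(amount)

def apply_tax(amount, tax_class):
    return amount * (1.07 if tax_class == "taxable" else 1.0)

def process_refactored(line):
    sku, tax_class, amount = parse(line)
    return f"{sku}:{apply_tax(amount, tax_class):.2f}"

A golden-file test pack, like the one shown later in this article, proves the two versions agree before the old one is deleted.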
5. Rearchitect
Change the system's design to meet new quality goals such as scalability, resilience or speed of delivery. Examples include moving from a monolith to well-bounded services, adopting event-driven patterns or decoupling data stores. This is where architecture leads the transformation and often unlocks the most long-term agility.
6. Rebuild
Rewrite the application from the ground up while preserving scope and core behavior. This approach offers a clean slate for technology choices and testing practices, which can dramatically improve velocity. Rebuilds are best when the current implementation blocks change, yet the domain logic still matches how the business works.
7. Replace
Retire the custom system and adopt a commercial or SaaS product. This can cut ongoing cost and risk if the market provides what the business needs. Replacing works well for commodity capabilities such as HR, finance or CRM. The tradeoff is customization flexibility, so validate the replacement's fit and integration early.
These paths are not mutually exclusive. Many teams encapsulate first, rehost or replatform to stabilize operations, then refactor and rearchitect targeted areas. Rebuild or replace can follow once the organization is ready. The sequence depends on risk tolerance, available talent and the urgency of digital transformation.
How can AI enable legacy code modernization?
AI is not a magic wand. It is an efficient colleague who reads everything, keeps perfect notes and does the boring parts without complaining. Used well, AI shortens the path from "we should modernize" to running software. Below are eight hands-on examples of AI-assisted code modernization tasks, generated by Claude Code CLI.
1. Turn legacy code into living specs and tests
Capture what the system really does today in plain English and executable tests. This approach freezes current behavior so developers can improve structure, fix design issues or change technology without losing business rules.
Prompt
"Scan ./cobol/ and ./jcl/ plus sample outputs in ./samples/ratings.csv. Produce a concise human spec of rating rules and a Python test pack that locks in current behavior. Use golden files based on ratings.csv. Keep the language business friendly."
Output
Premium rules v1
- If driver_age < 25 and vehicle_class in {"sport","performance"} apply surcharge S3 = 0.12
- If prior_claims >= 2 apply penalty P2 = 0.07
- Multi-vehicle discount 0.05 if household_vehicles >= 2
- Round to nearest cent after all adjustments
---
# tools/build_golden.py
import csv, json, os

os.makedirs("tests/golden", exist_ok=True)
with open("samples/ratings.csv") as f:
    for i, row in enumerate(csv.DictReader(f), start=1):
        case = {"input": row, "expected_total": float(row["total_premium"])}
        with open(f"tests/golden/case_{i}.json", "w") as o:
            json.dump(case, o, indent=2)
print("golden files written")
---
# tests/test_rating_rules.py
import glob, json, math
from legacy_adapter import calc_premium  # wrapper around COBOL batch calc

def cases():
    for path in glob.glob("tests/golden/*.json"):
        with open(path) as f:
            yield path, json.load(f)

def money_eq(a, b):
    return math.isclose(a, b, rel_tol=0, abs_tol=0.01)

def test_golden_parity():
    for path, case in cases():
        got = calc_premium(case["input"])
        assert money_eq(got, case["expected_total"]), f"mismatch in {path}"

def test_young_driver_surcharge_applies():
    base = {"driver_age": 22, "vehicle_class": "sport", "prior_claims": 0, "base": 1000}
    assert money_eq(calc_premium(base), 1120.00)
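Once the golden tests lock in behavior, the same spec can guide a clean reimplementation that those tests validate. A minimal sketch, assuming the adjustments sum before being applied to the base (consistent with the 1,120.00 expectation above):

# rating.py -- candidate reimplementation of the documented rules (illustrative)
def calc_premium(case):
    base = float(case["base"])
    adjustment = 0.0
    if int(case["driver_age"]) < 25 and case["vehicle_class"] in {"sport", "performance"}:
        adjustment += 0.12  # surcharge S3
    if int(case.get("prior_claims", 0)) >= 2:
        adjustment += 0.07  # penalty P2
    if int(case.get("household_vehicles", 0)) >= 2:
        adjustment -= 0.05  # multi-vehicle discount
    return round(base * (1 + adjustment), 2)  # round to nearest cent last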
2. Apply minor, targeted code changes at scale
Make many small, mechanical edits across one or more codebases to meet a platform or runtime goal. Examples include dependency upgrades, API swaps, logging standards or container readiness updates. The business logic stays the same while the code becomes consistent and compatible.
Prompt
Create an OpenRewrite recipe to move services from Java 8 to 17 and Spring Boot 2.7 to 3.2. Replace Apache HttpClient with java.net.http.HttpClient.
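OpenRewrite expresses this as a declarative recipe backed by type-aware transformations. As a rough Python sketch of the underlying codemod idea only -- not the recipe itself, and with paths and import strings invented -- a script that swaps one deprecated import across a tree looks like this:

# codemod.py -- toy mass edit (real migrations should use AST-aware tools like OpenRewrite)
import pathlib

OLD = "import org.apache.http.client.HttpClient;"
NEW = "import java.net.http.HttpClient;"

for path in pathlib.Path("src").rglob("*.java"):
    text = path.read_text()
    if OLD in text:
        path.write_text(text.replace(OLD, NEW))
        print(f"rewrote {path}")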
3. Front legacy systems with clean API facades
Put a clean API in front of a legacy system and route traffic through it. The team can attach new features to the facade while gradually moving or rewriting internals behind the scenes. This reduces coupling and buys time to evolve the core safely.
Prompt
From ./traffic/*.har derive OpenAPI v3 for Orders with POST /orders and GET /orders/{id}. Generate an Express facade that maps to the legacy endpoints.
4. Discover usage patterns and guide application rationalization
Use real usage and cost data to decide what to keep, consolidate or retire. The goal is to shrink scope before modernizing, so investment flows to the few applications that matter while redundant or idle ones are removed.
Prompt
Analyze api_logs.csv and service_costs.csv. List endpoints unused in 180 days, cluster services by usage and cost and suggest retire or consolidate actions.
Output
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
logs = pd.read_csv("api_logs_features.csv")  # per-service features derived from api_logs.csv: req_per_day, latency_p95_ms, distinct_flows, service
costs = pd.read_csv("service_costs.csv") # monthly_cost, service
df = logs.merge(costs, on="service")
X = StandardScaler().fit_transform(df[["req_per_day","latency_p95_ms","monthly_cost","distinct_flows"]])
df["cluster"] = KMeans(n_clusters=4, n_init="auto").fit_predict(X)
candidates = df[(df["req_per_day"] < 5) & (df["monthly_cost"] > 2000)]
print("retire candidates:\n", candidates[["service","monthly_cost"]].sort_values("monthly_cost", ascending=False))
---
"Retire: hr-letter-2, hr-letter-3 (zero traffic 6 months). Consolidate into hr-letter-1.
Replace: expense-legacy with SaaS-X (85 percent feature overlap). Integration cost low.
Keep: payroll-core (high traffic, unique features)."
5. Synthesize infrastructure, runbooks and smoke tests for safer rehosting
Turn tribal setup knowledge into infrastructure as code (IaC), a clear cutover plan and quick health checks. The lift and shift then carries fewer surprises because provisioning and validation are scripted, repeatable and easy to audit.
Prompt
From /runbooks/windows-farm.md, /scripts/*.ps1, and /configs/*.ini generate Terraform for EC2, a runbook titled Cutover Tuesday 7 pm, and a bash smoke test that hits /healthz and creates a sample order.
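The prompt asks for a bash smoke test; the same checks can be sketched in Python for consistency with the other samples (the host and payload here are assumptions):

# smoke_test.py -- post-cutover health checks (illustrative)
import requests

BASE = "https://orders.example.com"  # assumed new environment

# 1. Service answers
assert requests.get(f"{BASE}/healthz", timeout=5).status_code == 200

# 2. Core flow works end to end
r = requests.post(f"{BASE}/orders", json={"sku": "TEST-001", "qty": 1}, timeout=10)
assert r.status_code in (200, 201), r.text
print("smoke test passed, order id:", r.json().get("id"))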
6. Extract domains and events to unlock a better architecture
Identify natural boundaries in code and data, then define the events those domains share. For example, teams could split a monolith into well-bounded services in small steps, improving scalability and delivery speed without a risky big bang.
Prompt
Analyze ./src and docs/erd.png. Propose bounded contexts for catalog, pricing, inventory, checkout. Produce JSON schemas for key events and minimal Kafka producer and consumer samples.
Output
from kafka import KafkaConsumer
import json
consumer = KafkaConsumer("price-events", bootstrap_servers=["kafka:9092"], value_deserializer=lambda v: json.loads(v.decode()))
for msg in consumer:
    evt = msg.value
    # TODO update read model or invalidate cache
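The matching producer side, sketched with kafka-python (the event fields are assumptions; in practice they would follow the generated JSON schemas):

# producer sketch: publish a price-change event (illustrative)
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode(),
)
producer.send("price-events", {"sku": "ABC-123", "new_price": 17.99, "currency": "USD"})
producer.flush()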
7. Guide a rebuild with scaffolds, seeds and property-based tests
Start fresh on a modern stack while proving the new behavior matches the old one during the transition. Scaffolding speeds up setup and dual-run or replay harnesses, while property-based tests catch edge cases early.
Prompt
Generate a Go HTTP service with /price. Add a Python parity test that compares new pricing to the legacy function. Provide a small dual-write adapter for order creation.
Output
// cmd/pricing/main.go
package main

import ("encoding/json"; "log"; "net/http"; "time")

type PriceRequest struct{ SKU string; Qty int }
type PriceResponse struct{ SKU string; Price float64; Currency string }

func priceHandler(w http.ResponseWriter, r *http.Request) {
    start := time.Now()
    var req PriceRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil { w.WriteHeader(400); return }
    resp := PriceResponse{SKU: req.SKU, Price: 17.99, Currency: "USD"} // TODO call domain
    json.NewEncoder(w).Encode(resp)
    log.Printf("path=/price sku=%s qty=%d latency_ms=%d", req.SKU, req.Qty, time.Since(start).Milliseconds())
}

func main() { http.HandleFunc("/price", priceHandler); log.Fatal(http.ListenAndServe(":8081", nil)) }
# tests/test_parity.py
from hypothesis import given, strategies as st
from new_impl import price as new_price
from legacy_impl import price as legacy_price
@given(sku=st.from_regex(r"[A-Z0-9\-]{3,12}"), qty=st.integers(min_value=1, max_value=10))
def test_new_matches_legacy_for_many_inputs(sku, qty):
    assert round(new_price(sku, qty), 2) == round(legacy_price(sku, qty), 2)
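The prompt also asks for a dual-write adapter. A minimal sketch of the idea, with endpoints and payloads assumed: the legacy system stays the system of record, the new service receives a mirrored write and mismatches are logged rather than surfaced to users.

# dual_write.py -- transition-period adapter (illustrative)
import logging
import requests

log = logging.getLogger("dual_write")

def create_order(order):
    # Legacy remains the system of record during the transition
    legacy = requests.post("http://legacy-host/orders", json=order, timeout=10)
    legacy.raise_for_status()
    try:
        # Mirror to the new service; failures are logged, never user-facing
        new = requests.post("http://new-host:8081/orders", json=order, timeout=10)
        if new.status_code != legacy.status_code:
            log.warning("dual-write mismatch: legacy=%s new=%s", legacy.status_code, new.status_code)
    except requests.RequestException as exc:
        log.warning("new-service write failed: %s", exc)
    return legacy.json()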
---
8. Make buy-versus-build decisions with total cost of ownership and fit simulations
Compare long-term cost, feature fit and integration effort across three paths: keep plus refactor, rebuild or replace with SaaS. The outcome is a clear, numbers-backed plan that shows where custom code still adds value.
Prompt
Using usage.csv, vendor_pricing.csv, and run_costs.csv, compute five-year NPV for keep plus refactor, rebuild, and replace with SaaS at a 10 percent discount rate. List top integration hotspots and generate a TypeScript adapter stub for the vendor journal API.
Output
# thousands of dollars per year; illustrative streams standing in for the CSV-derived figures
keep_refactor = [300, 120, 120, 120, 120]
rebuild = [800, 90, 90, 90, 90]
saas = [400, 70, 70, 70, 70]

def npv(stream, rate=0.10):
    # year 0 is undiscounted; later years discount at 10 percent
    return sum(v / ((1 + rate) ** t) for t, v in enumerate(stream))

print("NPV keep+refactor:", round(npv(keep_refactor), 1))  # 680.4
print("NPV rebuild:", round(npv(rebuild), 1))              # 1085.3
print("NPV replace with SaaS:", round(npv(saas), 1))       # 621.9
---
Top integration hotspots:
1. Journal posting and reconciliation
2. User provisioning and SSO mapping
3. Historical report backfill
At its best, AI turns modernization plans into working code the user can see and trust. It writes plain-English specs and tests that lock in business rules, runs code mods that clean up APIs and fronts brittle systems with simple facades so teams can improve the core behind them. It trims scope by showing what to keep or retire, scripts rehosting with IaC and smoke checks, and suggests clear domain boundaries with events.
When a rebuild is right, AI proves parity before cutover, and a smart replacement strategy backs the call with numbers and adapter stubs. The result is steady, low-risk progress: commits, tests and runbooks that make legacy code easier to maintain and primed for a cloud-ready architecture.
Limitations of AI-driven code modernization
AI can speed up modernization, but it has real limits to plan around.
Context and data quality. Models only see what they're fed. Outdated specs, missing logs and edge cases outside the sample set lead to wrong conclusions and brittle code.
Behavior coverage. Generated tests and golden files can lock in existing bugs or miss cross-system side effects. Subject matter experts, exploratory testing and production safeguards are still necessary.
Security, privacy and compliance. Code generation can introduce vulnerabilities, mishandle secrets or reuse licensed snippets. Guardrails, SAST/DAST, SBOMs and clear data handling rules are still necessary.
Architecture and change management. AI can propose edits, but it cannot own trade-offs, sequencing or organizational alignment. Without governance, CI quality bars and rollback paths, mass changes raise risk.
Used with these limits in mind, AI turns strategy into steady, observable progress, but it is a power tool, not an autopilot.
Nick Femia is a Tech Lead and full-stack engineer with over six years of experience driving product engineering and AI innovation.