How Verify Was Built: Engineering Trust in Planning Compliance
A technical deep-dive into the architecture decisions that make every compliance claim traceable to its regulatory source.
Three DCPs. Three LEPs. Multiple SEPPs.
Every actionable clause traceable to its gazette source
The Problem We Set Out to Solve
Property development compliance in NSW involves navigating three tiers of planning controls: State Environmental Planning Policies (SEPPs), Local Environmental Plans (LEPs), and Development Control Plans (DCPs). A single property might be subject to hundreds of provisions across dozens of documents, each with different applicability rules based on zoning, land use, lot characteristics, and geographic location.
Traditional approaches either overwhelm users with every possible requirement or, worse, use AI to "summarize" regulations—creating liability when summaries miss critical controls. We chose a different path: build a system where every compliance claim traces directly to its regulatory source.
The provision IS the requirement. When users see a planning requirement, they see the exact clause text as published in the official instrument. This isn't a summary or interpretation—it's the authoritative source.
The Data Pipeline: From PDFs to Addressable Requirements
Source Document Processing
We built document-specific extractors for each regulatory instrument. The Inner West LEP alone required parsing 100+ pages of structured legal text, with special handling for:
- Part 6 site-specific provisions that apply only to particular Lot/DP combinations
- Heritage schedules with thousands of individually listed properties
- Key sites with bespoke height and FSR controls
Each extractor preserves the provision's structural metadata: clause number, parent section, applicable zones, and the exact PDF page where the text appears.
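As a rough illustration, the metadata described above could be carried on a record like the one below. This is a minimal sketch, not Verify's actual schema: the class name, field names, and the sample zone, part, and page values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provision:
    """One extracted clause plus the structural metadata kept alongside it."""
    instrument: str                 # source document, e.g. "Inner West LEP 2022"
    clause: str                     # clause number, e.g. "4.3(2A)"
    parent_section: Optional[str]   # enclosing part or section, if any
    zones: tuple[str, ...]          # applicable zones; empty = not zone-limited
    pdf_page: int                   # page in the source PDF where the text appears
    text: str                       # exact clause text, never a paraphrase

    def citation(self) -> str:
        """Human-readable pointer back to the source document."""
        return f"{self.instrument}, clause {self.clause} (PDF p. {self.pdf_page})"


# Placeholder values: the zone, parent section, page, and text are invented.
example = Provision(
    instrument="Inner West LEP 2022",
    clause="4.3(2A)",
    parent_section="Part 4",
    zones=("R1",),
    pdf_page=30,
    text="(exact clause text goes here)",
)
print(example.citation())
```

Making the record immutable (`frozen=True`) is one way to reflect the idea that extracted clause text should not be altered after extraction.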
The 4-Layer Filtering Model
Not every provision applies to every property. Verify implements a cascading filter:
- Layer 1: Provisions applying council-wide (tree preservation, parking rates)
- Layer 2: Controls triggered by development type (boarding house provisions only for boarding house applications)
- Layer 3: Requirements contingent on development characteristics (height controls only when proposing buildings above existing)
- Layer 4: Location-specific controls (Marrickville Town Centre provisions)
This filtering reduces noise dramatically. A residential alteration in Stanmore doesn't need to see industrial setback requirements or heritage provisions for buildings in Balmain.
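The cascade can be sketched as a single predicate that a provision must pass at every layer. The `Rule` and `Query` types, their field names, and the sample rules below are hypothetical and exist only to show the shape of the check; they are not taken from Verify's codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    clause: str
    development_type: Optional[str] = None  # None = applies to any development type
    trigger: Optional[str] = None           # None = not contingent on a characteristic
    precinct: Optional[str] = None          # None = applies council-wide


@dataclass(frozen=True)
class Query:
    development_type: str
    characteristics: frozenset
    precinct: Optional[str]


def applies(rule: Rule, query: Query) -> bool:
    """A rule survives only if no layer rules it out."""
    if rule.development_type not in (None, query.development_type):
        return False
    if rule.trigger is not None and rule.trigger not in query.characteristics:
        return False
    if rule.precinct not in (None, query.precinct):
        return False
    return True


# Invented sample rules, one per layer.
RULES = [
    Rule("tree-preservation"),
    Rule("boarding-house-controls", development_type="boarding_house"),
    Rule("height-control", trigger="increases_height"),
    Rule("town-centre-controls", precinct="Marrickville Town Centre"),
]

query = Query("dwelling_alteration", frozenset(), precinct=None)
matched = [r.clause for r in RULES if applies(r, query)]
print(matched)
```

A rule with every optional field left unset behaves as a council-wide provision, so it is shown for every query.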
Address-Driven PDF Extraction
Early versions extracted every page from every planning PDF. This was wasteful and slow. We refined the approach to be address-driven: only extract and cache PDF page images for provisions that could actually apply to queryable addresses.
For example, LEP Part 6 contains site-specific provisions for particular properties. Rather than extract all 50+ Part 6 pages, we mapped each clause to its applicable Lot/DP identifiers:
Clause 6.17 → Lot 1, DP 963000 → Page 75
Clause 6.14 → Lots at 145-155 Parramatta Rd → Page 73
Clause 6.18 → Lot 2, DP 547231 → Page 76
When a user queries an address, the system checks if any Part 6 clauses apply to that Lot/DP. Only then does it retrieve the relevant PDF page image.
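A minimal sketch of that lookup, assuming the mapping is held as a dictionary keyed by (lot, DP). The structure and function name are illustrative. Clause 6.14 is left out because the article identifies it by street address rather than by Lot/DP.

```python
# (lot, deposited plan) -> list of (clause, PDF page) pairs.
# Entries are the two Lot/DP examples quoted in the article.
PART6_INDEX = {
    ("1", "963000"): [("6.17", 75)],
    ("2", "547231"): [("6.18", 76)],
}


def part6_pages(lot: str, dp: str) -> list:
    """Return the Part 6 clauses (and their pages) that apply to a Lot/DP.

    An empty list means no site-specific clause applies, so no page
    image needs to be retrieved for this address.
    """
    return PART6_INDEX.get((lot, dp), [])


print(part6_pages("1", "963000"))
print(part6_pages("9", "000000"))
```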
The Actionable Classifier
Raw regulatory documents contain more than just requirements:
- Objectives and aims — important context, not checkable requirements
- Definitions — necessary for interpretation, not actionable themselves
- Procedural text — how to lodge applications, not development controls
- Notes and examples — clarifying commentary
We developed a classifier to distinguish actionable controls from non-actionable text. The classifier examines:
- Linguistic markers ("must", "shall", "is required to" vs "aims to", "seeks to")
- Clause structure (numbered requirements vs explanatory paragraphs)
- Section context (controls sections vs interpretation sections)
After iterative refinement, the classifier achieves zero false negatives on our test corpus—no genuine requirements are incorrectly filtered out. We prioritized eliminating false negatives over false positives; showing an extra provision is far less harmful than missing a real requirement.
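The linguistic-marker signal alone might look something like the sketch below. It covers only the first of the three signals and uses only the marker phrases quoted above; the function name, labels, and the "needs_review" fallback are assumptions. The fallback reflects the stated priority: when in doubt, keep the provision visible rather than filter it out.

```python
import re

# Marker phrases quoted in the article; a real classifier would need more.
OBLIGATION = re.compile(r"\b(must|shall|is required to)\b", re.IGNORECASE)
ASPIRATION = re.compile(r"\b(aims to|seeks to)\b", re.IGNORECASE)


def classify(text: str) -> str:
    """Label a clause using linguistic markers only.

    Obligation markers are checked first, so a clause containing both
    kinds of phrase is kept as actionable.
    """
    if OBLIGATION.search(text):
        return "actionable"
    if ASPIRATION.search(text):
        return "non_actionable"
    # No marker found: do not filter it out on this signal alone.
    return "needs_review"


print(classify("A dwelling must provide one parking space."))
print(classify("This plan aims to protect local character."))
```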
Council-Specific Adaptations
Inner West Council's three predecessor DCPs each use different organizational structures:
- Heavily precinct-based. 102 distinct precincts with location-specific controls. Generic provisions are minimal.
- 77% generic provisions applying council-wide. Uses "C-markers" (C1, C2, C3...) for discrete controls within sections.
- PC/DS format separating Performance Criteria from Design Solutions, requiring both to be displayed together.
Precinct Boundary Mapping
Making precinct-specific provisions addressable required digitising the precinct maps from each former council's DCP. These maps—often published only as static PDFs—needed to be converted into queryable geographic boundaries.
We extracted precinct boundaries using a combination of georeferencing (aligning PDF maps to real-world coordinates) and polygon digitisation. Each precinct boundary was then stored as a geographic shape that can be queried against any street address. When a user enters an address, the system geocodes it to coordinates and performs a spatial lookup to determine which precinct (if any) contains that point.
This geolocation layer is what makes the 4-layer filtering model work in practice. A property at 42 Smith Street doesn't just get "Marrickville DCP" provisions—it gets the specific controls for the Marrickville Town Centre Precinct, the Dulwich Hill Village Precinct, or whichever boundary contains that address. Without this spatial mapping, precinct-based DCPs would be unusable at scale.
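The spatial lookup itself is a point-in-polygon test. The sketch below uses a plain ray-casting check in place of whatever spatial library or database the production system uses, which the article does not name. The precinct names and square boundaries are toy values on an abstract grid, not real coordinates.

```python
def point_in_polygon(x: float, y: float, polygon: list) -> bool:
    """Ray casting: count how many polygon edges a ray going right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_precinct(x: float, y: float, precincts: dict):
    """Return the name of the first precinct containing the point, else None."""
    for name, polygon in precincts.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


# Toy boundaries for illustration only.
PRECINCTS = {
    "Precinct A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Precinct B": [(10, 0), (14, 0), (14, 4), (10, 4)],
}

print(find_precinct(2, 2, PRECINCTS))
print(find_precinct(7, 2, PRECINCTS))
```

Returning `None` for a point outside every boundary keeps the "which precinct (if any)" case explicit, so no precinct controls are attached to an address that falls outside all mapped boundaries.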
Verify's data model accommodates all three DCP structures through a unified schema with council-specific metadata fields.
Processing Economics: The Tiered Approach
Enriching thousands of provisions with AI classification would be prohibitively expensive if done naively. We implemented a three-tier processing pipeline:
- Regex Layer — Fast pattern matching handles ~70% of provisions (clear objectives, obvious definitions)
- Batch LLM Layer — Groups similar provisions for efficient batch classification
- Selective Deep Enrichment — Expensive per-provision analysis only for ambiguous cases
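The three tiers can be sketched as a pipeline in which each stage only sees what the previous stage could not settle. Everything here is hypothetical: the regex patterns, the confidence threshold, and the two stub functions standing in for the batch and per-provision model calls.

```python
import re

# Illustrative patterns only; real instruments need a much richer set.
OBJECTIVE = re.compile(r"^\s*the objectives? of", re.IGNORECASE)
DEFINITION = re.compile(r"\bmeans\b", re.IGNORECASE)


def regex_tier(text: str):
    """Tier 1: cheap pattern matching. Returns a label, or None if unsure."""
    if OBJECTIVE.search(text):
        return "objective"
    if DEFINITION.search(text):
        return "definition"
    return None


def run_pipeline(texts, batch_classify, deep_classify, threshold=0.8):
    """Send each provision to the cheapest tier that can settle it."""
    labels, pending = {}, []
    for text in texts:
        label = regex_tier(text)
        if label is None:
            pending.append(text)
        else:
            labels[text] = label
    if pending:
        # Tier 2: one batched call for everything the regex tier skipped.
        for text, (label, confidence) in zip(pending, batch_classify(pending)):
            # Tier 3: per-provision analysis only when the batch result is weak.
            labels[text] = label if confidence >= threshold else deep_classify(text)
    return labels


# Stubs standing in for model calls, so the sketch runs offline.
deep_calls = []

def fake_batch(batch):
    return [("control", 0.95) if "must" in t else ("unclear", 0.4) for t in batch]

def fake_deep(text):
    deep_calls.append(text)
    return "non_actionable"


TEXTS = [
    "The objective of this clause is to protect trees.",
    "Building height means the vertical distance from ground level.",
    "A boarding house must provide one motorcycle space.",
    "Refer to the site map.",
]
result = run_pipeline(TEXTS, fake_batch, fake_deep)
print(result)
print(len(deep_calls))
```

In this toy run only one of the four provisions reaches the expensive tier, which is the cost-saving behaviour the tiering is meant to produce.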
What Verify Deliberately Does NOT Do
Just as important as what the system does is what it refuses to do:
- No merit assessment: the system shows requirements; it doesn't judge whether a development is "good"
- No council decision prediction: consent authority discretion cannot be algorithmically predicted
- No approximate data: if we don't have verified coordinates or boundaries, we don't show fake ones
- No hallucinated sources: every PDF link is verified to exist before display
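The last rule, checking that a source exists before linking to it, can be as simple as the sketch below. The cache layout (`<cache>/<instrument>/page-<n>.png`) and the function name are assumptions made for the example; the article says only that page images are cached and that links are verified.

```python
from pathlib import Path
from typing import Optional


def verified_page_link(cache_dir: Path, instrument_id: str, page: int) -> Optional[str]:
    """Return a path to the cached page image only if the file really exists.

    Returning None lets the caller show the provision without a link
    rather than display a link that leads nowhere.
    """
    image = cache_dir / instrument_id / f"page-{page}.png"
    return image.as_posix() if image.is_file() else None
```

A caller would render the link when the function returns a path and omit it otherwise.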
Verification and Audit Trail
Every provision displayed includes:
- Source document identifier (e.g., "Inner West LEP 2022")
- Clause/section reference (e.g., "Clause 4.3(2A)")
- PDF page number for direct verification
- Processing metadata (extraction date, classification confidence)
Users—and their lawyers—can verify any requirement against the gazetted source document.
Built for Regulatory Change
Planning instruments change. SEPPs get amended, LEPs are updated, DCPs are revised. A compliance system that can't keep pace with regulatory change becomes a liability rather than an asset.
Verify's provision-based architecture was designed with this reality in mind. Because each provision is a discrete, structured data object with its own metadata, updating the system when regulations change is straightforward:
- Replace outdated provisions — When a clause is amended, we update that specific provision record while preserving its relationships and history
- Add new instruments — New SEPPs or DCP amendments slot into the existing schema without architectural changes
- Retire repealed controls — Superseded provisions are marked inactive rather than deleted, maintaining audit history
- Track effective dates — Users see the provisions that apply today, not last year's version
This modularity means regulatory updates that would require weeks of manual review can be processed and deployed in hours. The same extraction and classification pipeline that built the initial database handles ongoing maintenance—keeping the system current without rebuilding from scratch.
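One way to model amendment, retirement, and effective dates together is to keep every version of a clause and close off the old one when a new one takes effect. This is a sketch under assumed names (`Version`, `amend`, `in_force`) with invented clause text and dates, not a description of Verify's storage layer.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    clause: str
    text: str
    effective_from: date
    effective_to: Optional[date] = None  # None = still in force


def amend(history: list, clause: str, new_text: str, when: date) -> list:
    """Close the current version of a clause and append its replacement.

    Nothing is deleted, so the superseded text stays available for audit.
    """
    updated = [
        replace(v, effective_to=when)
        if v.clause == clause and v.effective_to is None
        else v
        for v in history
    ]
    updated.append(Version(clause, new_text, when))
    return updated


def in_force(history: list, on: date) -> list:
    """Versions that apply on a given date."""
    return [
        v for v in history
        if v.effective_from <= on and (v.effective_to is None or on < v.effective_to)
    ]


# Invented dates and text, for illustration only.
history = [Version("4.3", "original text", date(2022, 1, 1))]
history = amend(history, "4.3", "amended text", date(2024, 7, 1))

print([v.text for v in in_force(history, date(2023, 6, 1))])
print([v.text for v in in_force(history, date(2025, 1, 1))])
```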
The Result
Verify transforms the compliance checking process from "read 2,000 pages and hope you didn't miss anything" to "here are the provisions that actually apply to your property, with links to verify each one."
The system doesn't replace professional judgment. It eliminates the mechanical drudgery of document trawling, letting planners and developers focus on design responses rather than regulatory archaeology.
Every claim the application makes is traceable. Every provision links to its source. The system acknowledges what it doesn't know rather than guessing. That's how we built trust into planning compliance.
Verify processes planning instruments for Inner West Council, covering the former Marrickville, Leichhardt, and Ashfield local government areas: three DCPs, three LEPs, and applicable SEPPs.