URL canonicalization and normalization guide
This document defines the standard flow for query-string canonicalization in page/controllers. Use it for all new route work and when touching existing page logic.
Why this exists
Canonical URLs make route behavior predictable and secure by:
- removing unknown query parameters,
- normalizing known parameters to expected types,
- preventing duplicate URL variants for the same page state,
- reducing controller-specific ad-hoc redirect logic.
Shared helper
All route canonicalization must use:
app/helpers/url_canonicalizer.php
Core functions:
- app_url_build_query_from_policy(array $sourceQuery, array $policy): array
- app_url_redirect_to_canonical_query(string $appRoot, array $currentQuery, array $canonicalQuery): void
- app_url_build_internal(string $appRoot, array $query): string
- app_url_policy_value(string $targetKey, array $rule, array $sourceQuery)
Standard controller flow
For GET routes, follow this order:
- Resolve request context (app_root, user/session state, etc.).
- Resolve a defensive GET guard ($isGetRequest) from $_SERVER['REQUEST_METHOD'].
- Define canonical policy rules for the route.
- Build canonical query from $_GET.
- Redirect if current query differs from canonical query.
- Continue regular page logic (rendering, DB loading, etc.).
Reference pattern:
require_once APP_PATH . 'helpers/url_canonicalizer.php';

$isGetRequest = strtoupper((string)($_SERVER['REQUEST_METHOD'] ?? 'GET')) === 'GET';
if ($isGetRequest) {
    $canonicalPolicy = [
        'page' => [
            'type' => 'literal',
            'value' => 'example',
        ],
    ];
    $canonicalQuery = app_url_build_query_from_policy($_GET, $canonicalPolicy);
    // Keep example URLs constrained to supported route state.
    app_url_redirect_to_canonical_query((string)$app_root, $_GET, $canonicalQuery);
}
Policy rule types
Supported rule type values:
- literal: fixed value from policy (value)
- string: trimmed scalar string
- int: integer with optional bounds (min, max)
- enum: string limited to allowed values
- bool_flag: emits value_true for truthy request inputs
- string_list: normalized list values (optionally unique)
Useful options:
- source: map canonical key from another source key
- default: fallback value
- include_if: callable gate to include rule conditionally
- omit_if: drop key when value equals sentinel
- transform: callable value transformer
- validator: callable final validator
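As an illustration, a policy that combines several rule types and options might look like the sketch below. The key names (type, value, min, max, allowed, value_true, unique, source, default, omit_if, transform) come from the lists above. The route name, the parameter names, and the exact shape each option expects are assumptions, so confirm them against url_canonicalizer.php before reusing any of this.

```php
<?php
// Hypothetical policy for an imaginary "orders" listing route.
// Option shapes are assumed for illustration; check url_canonicalizer.php.
$canonicalPolicy = [
    // Fixed route identifier, never taken from the request.
    'page' => ['type' => 'literal', 'value' => 'orders'],
    // Fixed set of states; the default is dropped from the URL (assumed omit_if shape).
    'status' => [
        'type' => 'enum',
        'allowed' => ['open', 'paid', 'cancelled'],
        'default' => 'open',
        'omit_if' => 'open',
    ],
    // Bounded pagination; p=1 is treated as noise and omitted.
    'p' => ['type' => 'int', 'min' => 1, 'max' => 500, 'default' => 1, 'omit_if' => 1],
    // Canonical key "q" read from the request key "search", lowercased.
    'q' => [
        'type' => 'string',
        'source' => 'search',
        'transform' => static fn ($v) => strtolower((string)$v),
    ],
    // Emits '1' only when the request input is truthy.
    'archived' => ['type' => 'bool_flag', 'value_true' => '1'],
    // De-duplicated list of tag values.
    'tags' => ['type' => 'string_list', 'unique' => true],
];
```

Keeping each rule on one or two lines with a short comment makes the allowed query surface easy to review at a glance.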
Route design rules
When adding canonicalization:
- Always include page as literal.
- Keep allowed query set minimal.
- Use enum for fixed states (tab, action, status, etc.).
- Use int with bounds for IDs and pagination.
- Use omit_if to avoid noisy defaults in URLs (for example p=1).
- Preserve only query keys that materially represent page state.
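To make the effect of these rules concrete, the sketch below evaluates a small policy against a request query. sketch_build_query is a simplified, hypothetical stand-in written for this guide. It is not the real app_url_build_query_from_policy, it handles only literal, enum, and int, and the real helper may differ in details such as defaults and output types.

```php
<?php
// Simplified stand-in for illustration only; NOT the real
// app_url_build_query_from_policy from url_canonicalizer.php.
function sketch_build_query(array $source, array $policy): array
{
    $out = [];
    foreach ($policy as $key => $rule) {
        $raw = $source[$key] ?? null;
        switch ($rule['type']) {
            case 'literal':
                // Always taken from the policy, never from the request.
                $value = $rule['value'];
                break;
            case 'enum':
                $value = (is_string($raw) && in_array($raw, $rule['allowed'], true)) ? $raw : null;
                break;
            case 'int':
                $value = filter_var($raw, FILTER_VALIDATE_INT, ['options' => [
                    'min_range' => $rule['min'] ?? PHP_INT_MIN,
                    'max_range' => $rule['max'] ?? PHP_INT_MAX,
                ]]);
                $value = ($value === false) ? null : $value;
                break;
            default:
                $value = null; // Rule types not covered by this sketch.
        }
        if ($value === null) {
            continue; // Invalid or missing input is dropped.
        }
        if (array_key_exists('omit_if', $rule) && $value === $rule['omit_if']) {
            continue; // Skip noisy defaults such as p=1.
        }
        $out[$key] = (string)$value;
    }
    return $out; // Keys absent from the policy never reach the output.
}

$policy = [
    'page' => ['type' => 'literal', 'value' => 'orders'],
    'tab'  => ['type' => 'enum', 'allowed' => ['open', 'closed']],
    'id'   => ['type' => 'int', 'min' => 1],
    'p'    => ['type' => 'int', 'min' => 1, 'max' => 500, 'omit_if' => 1],
];
$canonical = sketch_build_query(
    ['page' => 'ORDERS', 'tab' => 'open', 'id' => '42', 'p' => '1', 'utm_source' => 'mail'],
    $policy
);
// $canonical is ['page' => 'orders', 'tab' => 'open', 'id' => '42']
```

The unknown utm_source key and the default p=1 both disappear, and page is forced to the policy literal regardless of what the request sent.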
What not to canonicalize as page URLs
Do not force page-style canonicalization on non-page endpoints that intentionally behave as API/callback streams, for example:
- JSON suggestion endpoints,
- payment webhook/callback handlers,
- binary/document output handlers,
- static asset streaming handlers.
For these endpoints, keep strict input validation and explicit allowlists as currently implemented.
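For contrast, a non-page endpoint typically rejects bad input rather than redirecting. The sketch below shows one way an explicit allowlist check could look for an imaginary JSON suggestion endpoint. The function name, the term and limit parameters, and the bounds are all hypothetical and do not describe any existing handler.

```php
<?php
// Hypothetical validator for an imaginary JSON suggestion endpoint.
// It rejects bad input instead of redirecting to a canonical URL.
function sketch_validate_suggest_input(array $query): ?array
{
    $allowedKeys = ['term', 'limit'];
    // Any key outside the allowlist invalidates the whole request.
    if (array_diff(array_keys($query), $allowedKeys) !== []) {
        return null;
    }
    $term = $query['term'] ?? null;
    if (!is_string($term) || trim($term) === '') {
        return null;
    }
    // limit is optional; when present it must be an integer from 1 to 50.
    $limit = filter_var($query['limit'] ?? 10, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 50],
    ]);
    if ($limit === false) {
        return null;
    }
    return ['term' => trim($term), 'limit' => $limit];
}
```

A handler built this way would respond with an error status (for example via http_response_code(400)) when the function returns null, and would never emit a Location header.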
Redirect behavior
app_url_redirect_to_canonical_query compares normalized current and canonical queries.
If different, it sends a Location header and exits.
Implications:
- Logic after the call runs only for canonical request URLs.
- Downstream code may continue reading $_GET; values are already canonicalized by the redirect gate.
- If custom redirect URL construction is needed after POST actions, use app_url_build_internal with a policy-built query.
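The compare-then-redirect decision can be sketched as follows. sketch_canonical_redirect_target is a hypothetical stand-in, not the real app_url_redirect_to_canonical_query or app_url_build_internal. It returns the target URL instead of sending a header so that it can be exercised safely. Sorting keys before comparing and the appRoot-plus-query-string URL layout are both assumptions about how the real helper normalizes and builds URLs.

```php
<?php
// Simplified stand-in for illustration only; NOT the real redirect helper.
// Returns the URL to redirect to, or null when the request is already canonical.
function sketch_canonical_redirect_target(string $appRoot, array $current, array $canonical): ?string
{
    // Sort by key so parameter order alone never triggers a redirect
    // (an assumption about what "normalized" means here).
    ksort($current);
    ksort($canonical);

    // Compare encoded query strings so '42' and 42 count as the same value.
    $currentString = http_build_query($current, '', '&', PHP_QUERY_RFC3986);
    $canonicalString = http_build_query($canonical, '', '&', PHP_QUERY_RFC3986);
    if ($currentString === $canonicalString) {
        return null;
    }

    // Assumed URL layout: application root followed by the canonical query.
    $base = rtrim($appRoot, '/') . '/';
    return $canonicalString === '' ? $base : $base . '?' . $canonicalString;
}

$target = sketch_canonical_redirect_target(
    '/app/',
    ['utm' => 'x', 'page' => 'example'],
    ['page' => 'example']
);
// $target is '/app/?page=example'

$none = sketch_canonical_redirect_target('/app/', ['page' => 'example'], ['page' => 'example']);
// $none is null
```

In a controller, a non-null result would be followed by header('Location: ' . $target) and exit, which is why code after the real helper call only ever sees canonical requests.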
Update checklist for new/edited routes
When changing a route:
- Add/confirm require_once for url_canonicalizer.php.
- Use the standardized defensive guard: $isGetRequest = strtoupper((string)($_SERVER['REQUEST_METHOD'] ?? 'GET')) === 'GET';
- Add/adjust GET canonical policy near route entry.
- Keep existing business logic unchanged unless explicitly requested.
- Add concise inline comment for non-trivial policy/condition blocks.
- Update deployment-facing route documentation used in your environment.
- Run syntax checks and PHPUnit as part of validation cadence.
Deployment notes
Canonicalization coverage is deployment-scoped: the set of enabled route entry points can differ between environments.
When auditing a specific environment:
- verify enabled route entry points use policy-based canonicalization,
- keep non-page API/callback/document/asset endpoints on strict allowlist validation,
- keep local operational/developer documentation updated according to the documentation set available in that installation.