Content Validation
Once incoming data has been deserialized, how and when do you ensure it's valid? And if you determine it's invalid, how do you report that information?
Taking a layered security approach, the sooner you can deliver validation errors, the better. Denial-of-service attacks often send invalid data in order to mire the system in long-running, processor-intensive requests, thus denying service to valid requests.
Types of Validation
You know you need to validate your requests; the question is: how? With a typical HTML web application, you have forms, and server-side logic for validating the forms. What tools do you have for APIs, where the data is usually not of the application/x-www-form-urlencoded media type?
One option is JSON Schema, which both describes the data format and provides a way to validate it. This approach requires server-side tools for transforming the schema into validation rules that you can run against the incoming data.
Another option is to treat the incoming data as form data; deserialize it into an array, and pass it to the same logic you would use to validate a form. This requires that your form validation logic does not operate directly on $_POST or $_GET, but instead allows passing the data set to validate.
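The second option can be sketched in a few lines. The Python below is illustrative only and is not Laminas code: validate_status is a hypothetical function that receives the data set as an argument instead of reading request globals, so the same rule can serve a decoded JSON body and a submitted form alike. The 140-character limit is borrowed from the "Status" example in this chapter; the second error message is made up for the sketch.

```python
import json


def validate_status(data):
    """Validate a data set supplied by the caller; return errors keyed by field.

    Hypothetical helper: it never reads request globals, so the caller
    decides where the data comes from (form fields, JSON body, tests).
    """
    errors = {}
    message = str(data.get("message", "")).strip()
    if not message:
        errors["message"] = "Value is required and can't be empty"
    elif len(message) > 140:
        errors["message"] = "Value must be 140 characters or fewer"
    return errors


# The same function validates a decoded JSON body...
json_errors = validate_status(json.loads('{"message": "   "}'))
# ...and a dict built from form fields.
form_errors = validate_status({"message": "Deploy finished"})
```

Keeping the data source out of the validation logic is the design choice that makes form-style validation reusable for APIs.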
Laminas offers an approach similar to the latter, via the Laminas\InputFilter component. This component allows you to describe and validate data sets of arbitrary complexity. Additionally, it lets you both set custom error messages and retrieve validation error messages in a structured format. API Tools' api-tools-content-validation module provides functionality for mapping Laminas input filters to services, and uses API Problem to return validation error messages to the end user of the API.
If the data provided does not overlap with the set described by the input filter, API Tools will return a 400 Bad Request status code. If any portion of the data set does overlap but is invalid, API Tools will instead return a 422 Unprocessable Entity status with an application/problem+json payload that contains a validation_messages key.
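The status code rule above can be read as a small decision function. The following Python is a sketch of the rule as described here, not the module's implementation; choose_status, known_fields, and messages are hypothetical names, and the sketch assumes the validation messages have already been computed by an input filter.

```python
def choose_status(data, known_fields, messages):
    """Return the HTTP status for a validated payload, or None if it may proceed.

    data:         dict of submitted fields
    known_fields: set of field names the input filter describes
    messages:     dict of validation errors keyed by field (empty if valid)
    """
    if not set(data) & set(known_fields):
        return 400  # nothing submitted matches the input filter
    if messages:
        return 422  # recognised fields were submitted, but some are invalid
    return None  # valid: let the request continue to the service


fields = {"message", "user"}
empty_error = {"message": {"isEmpty": "Value is required and can't be empty"}}

no_overlap = choose_status({"foo": "bar"}, fields, empty_error)
invalid = choose_status({"message": " "}, fields, empty_error)
valid = choose_status({"message": "hi", "user": "someone"}, fields, {})
```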
As an example, consider a "Status" service that accepts two fields, "message" and "user"; the first cannot be empty, and must be less than or equal to 140 characters; the second will be validated against a regular expression of valid users. Let's consider a request that provides an empty message and an invalid user:
POST /status HTTP/1.1
Accept: application/vnd.status.v2+json
Content-Type: application/json

{
    "message": " ",
    "user": "matthew"
}
API Tools will deserialize the data and pass it to the configured input filter, which will then determine that the data is invalid. The following response will be provided:
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json

{
    "detail": "Failed Validation",
    "status": 422,
    "title": "Unprocessable Entity",
    "type": "http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html",
    "validation_messages": {
        "message": {
            "isEmpty": "Value is required and can't be empty"
        },
        "user": {
            "regexNotMatch": "Invalid user supplied."
        }
    }
}
Validation errors from API Tools will always follow this format, thereby providing predictability to consumers of your APIs.
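Because the format is fixed, a client can handle validation failures generically. As a sketch, the hypothetical field_errors helper below (client-side Python, not part of API Tools) flattens the validation_messages member of a 422 response into (field, rule, text) tuples. It assumes only the payload shape shown in the example response.

```python
import json


def field_errors(problem_body):
    """Flatten validation_messages into a list of (field, rule, text) tuples."""
    problem = json.loads(problem_body)
    flat = []
    for field, rules in problem.get("validation_messages", {}).items():
        for rule, text in rules.items():
            flat.append((field, rule, text))
    return flat


body = """
{
  "detail": "Failed Validation",
  "status": 422,
  "title": "Unprocessable Entity",
  "validation_messages": {
    "message": {"isEmpty": "Value is required and can't be empty"},
    "user": {"regexNotMatch": "Invalid user supplied."}
  }
}
"""
errors = field_errors(body)
```

A response without a validation_messages key yields an empty list, so the same helper can be applied to any API Problem payload.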
HTTP Method-Specific Validation
Sometimes the validation rules for a given URI may change based on which HTTP method is being used. As an example, during creation of a user via POST, you may want to require just a name and email. However, during a later operation to update a password via PATCH, you may want to accept only the password. An operation that replaces all details of the user via PUT may need to validate each and every field representing the user.
The api-tools-content-validation module provides granularity beyond just mapping input filters to services; it also allows you to map input filters to specific HTTP methods for a given service. In the case of REST services, it also differentiates between collection and entity URIs, allowing an input filter for each HTTP method for each.
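Conceptually, this amounts to a lookup keyed by URI type and HTTP method. The Python below is a sketch of that idea only; the table layout, the filter names, and the fallback to a service-wide default are assumptions made for illustration and do not reflect the module's actual configuration keys.

```python
# Hypothetical mapping for a "user" service:
# (URI type, HTTP method) -> input filter name.
INPUT_FILTERS = {
    ("collection", "POST"): "CreateUserFilter",   # name and email only
    ("entity", "PATCH"): "UpdatePasswordFilter",  # password only
    ("entity", "PUT"): "ReplaceUserFilter",       # every user field
}
DEFAULT_FILTER = "UserFilter"  # assumed fallback when no specific entry exists


def select_input_filter(method, has_identifier):
    """Pick the input filter name for a request.

    has_identifier: True when the URI names a single entity (e.g. /user/42),
    False when it addresses the collection (e.g. /user).
    """
    uri_type = "entity" if has_identifier else "collection"
    return INPUT_FILTERS.get((uri_type, method.upper()), DEFAULT_FILTER)


create = select_input_filter("POST", has_identifier=False)
patch = select_input_filter("patch", has_identifier=True)
fallback = select_input_filter("PUT", has_identifier=False)
```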
Summary
Laminas provides the ability to short-circuit the request lifecycle at any point by returning a "response" object. API Tools leverages this fact by registering an event listener after content negotiation completes, but before the service itself executes, ensuring we intercept validation errors early.
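The short-circuit behaviour can be pictured with a toy pipeline. This Python sketch is not the Laminas event manager; run_pipeline, the listener functions, and the dict-based request and response are hypothetical stand-ins that show only the ordering: a validation listener that returns a response prevents the service from running.

```python
def run_pipeline(request, listeners, service):
    """Run listeners in order; the first one that returns a response wins."""
    for listener in listeners:
        response = listener(request)
        if response is not None:
            return response  # short-circuit: the service never runs
    return service(request)


def content_negotiation(request):
    # Stand-in for deserialization; here the body is already a dict.
    return None


def content_validation(request):
    # Reject an empty or whitespace-only message before the service runs.
    if not request["body"].get("message", "").strip():
        return {
            "status": 422,
            "validation_messages": {
                "message": {"isEmpty": "Value is required and can't be empty"}
            },
        }
    return None


calls = []  # records every request that reaches the service


def status_service(request):
    calls.append(request)
    return {"status": 201}


listeners = [content_negotiation, content_validation]
rejected = run_pipeline({"body": {"message": " "}}, listeners, status_service)
accepted = run_pipeline({"body": {"message": "hello"}}, listeners, status_service)
```

Only the second request reaches status_service; the first is answered by the validation listener alone.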
Read the content validation chapter for more details.