Web Notes
2016.08.20
Using Liquid in Jekyll - Live with Demos
Liquid is a simple template language that Jekyll uses to process pages for your site. With Liquid you can output complex contents without additional plugins.
Notes taken from:
Nginx variables that are pre-defined by either the Nginx core or Nginx modules are called “built-in variables”.
The built-in variable $uri, provided by ngx_http_core, is used to fetch the (decoded) URI of the current request, excluding any query string arguments. Another is the $request_uri variable provided by the same module, which is used to fetch the raw, non-decoded form of the URI, including any query string. Let’s look at the following example:
location /test {
    echo "uri = $uri";
    echo "request_uri = $request_uri";
}
The outer server configuration block is omitted for brevity. Below is the result of testing this /test interface with different requests:
$ curl 'http://localhost:8080/test'
uri = /test
request_uri = /test
$ curl 'http://localhost:8080/test?a=3&b=4'
uri = /test
request_uri = /test?a=3&b=4
$ curl 'http://localhost:8080/test/hello%20world?a=3&b=4'
uri = /test/hello world
request_uri = /test/hello%20world?a=3&b=4
## Variables with infinite names
There’s another very common built-in variable that does not have a fixed variable name. Instead, it has infinite variations: every variable whose name has the prefix arg_, like $arg_foo and $arg_bar, belongs to this group. The $arg_name variable evaluates to the value of the name URI argument of the current request. The argument value obtained here is not decoded yet, so it may still contain %XX sequences. Let’s check out a complete example:
location /test {
    echo "name: $arg_name";
    echo "class: $arg_class";
}
Then we test this interface with various URI argument combinations:
$ curl 'http://localhost:8080/test'
name:
class:
$ curl 'http://localhost:8080/test?name=Tom&class=3'
name: Tom
class: 3
$ curl 'http://localhost:8080/test?name=hello%20world&class=9'
name: hello%20world
class: 9
In fact, $arg_name matches not only the name argument but also NAME or even Name; letter case does not matter here. Behind the scenes, Nginx just converts the URI argument names into pure lower-case form before matching them against the name specified by $arg_xxx.
If you want to decode special sequences like %20 in the URI argument values, you could use the set_unescape_uri directive provided by the 3rd-party module ngx_set_misc.
location /test {
    set_unescape_uri $name $arg_name;
    set_unescape_uri $class $arg_class;
    echo "name: $name";
    echo "class: $class";
}
$ curl 'http://localhost:8080/test?name=hello%20world&class=9'
name: hello world
class: 9
The space has indeed been decoded!
Another thing that we can observe is that the set_unescape_uri directive can also implicitly create Nginx user-defined variables.
The Nginx core offers a lot of such built-in variables in addition to $arg_xxx, like the $cookie_xxx variable group for fetching HTTP cookie values, the $http_xxx variable group for fetching request headers, as well as the $sent_http_xxx variable group for retrieving response headers. Refer to the official documentation for the ngx_http_core module.
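As a minimal sketch of two of these variable groups, the location below echoes one request header and one cookie. It assumes the ngx_echo module used throughout these notes is available; the cookie name user is hypothetical and chosen only for illustration.

```nginx
location /test {
    # $http_xxx: the "User-Agent" request header
    # (header name lower-cased, dashes turned into underscores)
    echo "user agent: $http_user_agent";

    # $cookie_xxx: value of the request cookie named "user"
    # (hypothetical cookie name, not from the original notes)
    echo "cookie user: $cookie_user";
}
```

Requesting this location with something like curl --cookie 'user=Tom' 'http://localhost:8080/test' should print the client's user agent string on the first line and "cookie user: Tom" on the second.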
All the user-defined variables are writable. However, most of the built-in variables are effectively read-only, like the $uri and $request_uri variables that we just introduced. Assignments to such read-only variables must always be avoided.
Attempts to write to some other read-only built-in variables, like $arg_xxx, can even crash the server in some particular Nginx versions.
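For illustration only, here is a sketch of the kind of configuration to avoid. The location name /bad is hypothetical. With stock Nginx, an assignment like this is expected to be rejected while the configuration is loaded (with an [emerg] error complaining about a duplicate "uri" variable) rather than at request time, though the exact behaviour may differ between versions.

```nginx
# Do NOT do this: $uri is a read-only built-in variable.
location /bad {
    set $uri /blah;   # expected to make Nginx refuse to load the configuration
    echo $uri;
}
```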
$args
Some built-in variables are writable as well. For instance, when reading the built-in variable $args, we get the URL query string of the current request, but when writing to it, we’re effectively modifying the query string:
location /test {
    set $orig_args $args;
    set $args "a=3&b=4";
    echo "original args: $orig_args";
    echo "args: $args";
}
$ curl 'http://localhost:8080/test?a=0&b=1&c=2'
original args: a=0&b=1&c=2
args: a=3&b=4
It should be noted that when reading $args, Nginx executes a special piece of code that fetches data from the particular place where the Nginx core stores the URL query string of the current request. On the other hand, when we overwrite $args, Nginx executes another special piece of code that stores the new value into the same place in the core. Other parts of Nginx also read from the same place whenever the query string is needed, so our modification to $args immediately affects everything that uses the query string later on.
Below is an example to demonstrate that assignments to $args affect the HTTP proxy module ngx_proxy:
server {
    listen 8080;
    location /test {
        set $args "foo=1&bar=2";
        proxy_pass http://localhost:8081/args;
    }
}
server {
    listen 8081;
    location /args {
        echo "args: $args";
    }
}
$ curl 'http://localhost:8080/test?blah=7'
args: foo=1&bar=2
In the previous section, we learned that when reading the built-in variable $args, Nginx executes a special piece of code to obtain a value on the fly, and when writing to this variable, Nginx executes another special piece of code to propagate the change. In Nginx’s terminology, the special code executed for reading the variable is called the “get handler” and the code for writing to the variable is called the “set handler”.
When a variable is created at “configure time”, the creating Nginx module must decide whether to allocate a value container for it and whether to attach a custom “get handler” and/or “set handler” to it. Variables that own a value container are called “indexed variables” in Nginx. Otherwise, they are said to be not indexed.
We already know that variable groups like $arg_xxx, discussed in earlier sections, do not have a value container and thus are not indexed. When $arg_xxx is read, its “get handler” is at work: it scans the current URL query string on the fly and extracts the value of the specified URL argument. Nginx never tries to parse all the URL arguments beforehand. Instead, the “get handler” scans the whole URL query string for a particular argument every time that argument is requested by reading the corresponding $arg_xxx variable.
Some Nginx variables choose to use their value containers as a data cache when the “get handler” is configured. In this setting, the “get handler” is run only once, i.e., at the first time the variable is read, which reduces overhead when the variable is read multiple times during its lifetime.
map $args $foo {
    default 0;
    debug 1;
}
server {
    listen 8080;
    location /test {
        set $orig_foo $foo;
        set $args debug;
        echo "original foo: $orig_foo";
        echo "foo: $foo";
    }
}
Nginx’s map directive is used to define a “mapping” relationship between two Nginx variables, or in other words, a “function relationship”. In this example, we use the map directive to define the mapping between the user variable $foo and the built-in variable $args. In mathematical function notation, y = f(x), our $args variable is effectively the “independent variable”, x, while $foo is the “dependent variable”, y. That is, the value of $foo depends on the value of $args, or rather, we map the value of $args onto the $foo variable (in some way). Therefore, we obtain the following complete mapping rule in this example: if the value of $args is debug, the variable $foo gets the value 1; otherwise $foo gets the value 0. So essentially, this is a conditional assignment to the variable $foo.
$ curl 'http://localhost:8080/test'
original foo: 0
foo: 0
The first output line indicates that the value of $orig_foo is 0, which is exactly what we expected: the original request does not carry a URL query string, so the initial value of $args is empty, leading to the initial value 0 for $foo, according to the “default” condition in the mapping rule.
But surprisingly, the second output line indicates that the final value of $foo is still 0, even after we overwrite $args with the value debug. The reason is pretty simple: the first time the variable $foo is read, its value computed by ngx_map’s “get handler” is cached in its value container. We already learned earlier that Nginx modules may choose to use the value container of a variable they create as a data cache for its “get handler”. Evidently, the ngx_map module considers the mapping computation between variables expensive enough to cache the result automatically, so that the next time the same variable is read within the lifetime of the current request, Nginx can just return the cached result without invoking the “get handler” again. We can verify this with:
$ curl 'http://localhost:8080/test?debug'
original foo: 1
foo: 1
The map directive is actually a unique example, because it not only registers a “get handler” for the user variable, but also allows the user to define the computing rule of that “get handler” directly in the Nginx configuration file. Meanwhile, it must be made clear that not all variables using a “get handler” cache the result. For instance, the $arg_xxx variable does not use its value container at all.
Side note for use contexts of directives
We should note that the map directive is put outside the server configuration block, that is, it is defined directly within the outermost http configuration block. Every configuration directive has a pre-defined set of use contexts in the configuration file. When in doubt, always refer to the corresponding documentation for the exact use contexts of a particular directive.
We have learned how the map directive works. It is the “get handler” that performs the value computation and the related assignment. And the “get handler” will not run at all unless the corresponding user variable is actually being read. Therefore, for requests that never access that variable, no useless computation is involved.
The technique of postponing the value computation until the point where the value is actually needed is called “lazy evaluation” in the computing world. In contrast, it is much more common to see “eager evaluation”, where a value is computed as soon as it is defined.
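The lazy behaviour described above can be sketched with two locations that share one map definition. The location names /lazy-read and /lazy-skip are hypothetical, and the fragment is assumed to sit inside the http block of a configuration where the ngx_echo module is available.

```nginx
# Fragment of the http block: map must live here, outside any server block.
map $args $foo {
    default 0;
    debug   1;
}

server {
    listen 8080;

    # Reads $foo, so the "get handler" registered by map runs for this request.
    location /lazy-read {
        echo "foo: $foo";
    }

    # Never reads $foo, so no mapping work is done for this request.
    location /lazy-skip {
        echo "no map lookup here";
    }
}
```

Only requests that reach /lazy-read pay for the mapping computation; requests to /lazy-skip never trigger it, however complex the map block is.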
You might have assumed that the “requests” in that context are just the HTTP requests initiated from the client side. In fact, there are two kinds of “requests” in the Nginx world. One is called the main request, and the other is called the subrequest.
Main requests are those initiated externally by HTTP clients, whereas subrequests are a special kind of request initiated from within the Nginx core. But please do not confuse subrequests with the HTTP requests created by the ngx_proxy module!
Subrequests may look very much like HTTP requests in appearance, but their implementation has nothing to do with either the HTTP protocol or any kind of socket communication. A subrequest is an abstract invocation for decomposing the task of the main request into smaller “internal requests” that can be served independently by multiple different location blocks, either in series or in parallel. Subrequests can also be recursive: any subrequest can initiate more subrequests, targeting other location blocks or even the current location itself. According to Nginx’s terminology, if request A initiates a subrequest B, then A is called the “parent request” of B.
location /main {
    echo_location /foo;
    echo_location /bar;
}
location /foo {
    echo foo;
}
location /bar {
    echo bar;
}
$ curl 'http://localhost:8080/main'
foo
bar
The subrequests initiated by echo_location always run sequentially, in their literal order in the configuration file. The response bodies of these two subrequests are concatenated in their running order to form the final response body of their parent request.
It should be noted that communication between location blocks via subrequests is limited to the same server block (i.e., the same virtual server configuration).
Variables with the same name in a parent request and a subrequest will generally not interfere with each other: the main request and its subrequests each own a separate copy of the variable containers.
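A minimal sketch of this isolation, in the same style as the other examples here and assuming the ngx_echo module: each location assigns its own value to a variable named $var, and the parent echoes its copy after both subrequests have run.

```nginx
location /main {
    set $var main;
    echo_location /foo;     # subrequest with its own copy of $var
    echo_location /bar;     # another subrequest, again with its own copy
    echo "main: $var";      # still sees the parent's value
}

location /foo {
    set $var foo;
    echo "foo: $var";
}

location /bar {
    set $var bar;
    echo "bar: $var";
}
```

Requesting /main on this server should print "foo: foo", "bar: bar" and finally "main: main": the assignments made in the two subrequests do not leak back into the parent request.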
Subrequests initiated by certain Nginx modules, however, do share variable containers with their parent requests, like those initiated by the 3rd-party module ngx_auth_request.
location /main {
    set $var main;
    auth_request /sub;
    echo "main: $var";
}
location /sub {
    set $var sub;
    echo "sub: $var";
}
$ curl "http://localhost:8080/main"
main: sub
Obviously, the value change of $var in the subrequest to /sub does affect the main request to /main. Thus the variable container of $var is indeed shared between the main request and the subrequest created by the ngx_auth_request module. The auth_request directive discards the response body of the subrequest it manages and only checks the subrequest’s response status code. When the status code looks good, like 200, auth_request just allows Nginx to continue processing the main request; otherwise it immediately aborts the main request by returning a 403 error page. In this example, the subrequest to /sub just returns a 200 response implicitly created by the echo directive in /sub.
Even though sharing variable containers among the main request and all its subrequests could make bidirectional data exchange easier, it could also lead to unexpected subtle issues that are hard to debug in real-world configurations. Because users often forget that a variable with the same name is actually used in some deeply embedded subrequest and just use it for something else in the main request, this variable could get unexpectedly modified during processing.
Such bad side effects make many 3rd-party modules like ngx_echo, ngx_lua and ngx_srcache choose to disable the variable sharing behaviour for subrequests by default.
When reading $args in a subrequest, its “get handler” should naturally return the query string for the subrequest.
location /main {
    echo "main args: $args";
    echo_location /sub "a=1&b=2";
}
location /sub {
    echo "sub args: $args";
}
$ curl "http://localhost:8080/main?c=3"
main args: c=3
sub args: a=1&b=2
It is clear that when $args is read in the main request (to /main), its value is the URL query string of the main request, whereas in the subrequest (to /sub), it is the query string of the subrequest. This behaviour indeed matches our intuition.
Unfortunately, not all built-in variables are sensitive to the context of subrequests. Several built-in variables always act on the main request even when they are used in a subrequest. The built-in variable $request_method is one such exception.
Whenever $request_method is read, we always get the request method name (such as GET or POST) of the main request, no matter whether the current request is a subrequest or not.
location /main {
    echo "main method: $request_method";
    echo_location /sub;
}
location /sub {
    echo "sub method: $request_method";
}
Now, let’s do a POST request to /main:
$ curl --data hello "http://localhost:8080/main"
main method: POST
sub method: POST
Here we use the --data option of the curl utility to specify our POST request body; this option also makes curl use the POST method for the request. The result is as expected: the variable $request_method evaluates to the main request’s method name, POST, despite its use in a GET subrequest.
For such cases, we can turn to the built-in variable $echo_request_method provided by the ngx_echo module to get the actual method name of the current request.
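As a sketch, the previous example can be rewritten with $echo_request_method in place of $request_method; this assumes the ngx_echo module is compiled in.

```nginx
location /main {
    # method of the request currently being processed (here: the main request)
    echo "main method: $echo_request_method";
    echo_location /sub;
}

location /sub {
    # inside the subrequest this reports the subrequest's own method
    echo "sub method: $echo_request_method";
}
```

Sending the same POST request to /main should now print "main method: POST" followed by "sub method: GET", because echo_location issues a GET subrequest.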
map $uri $tag {
    default 0;
    /main 1;
    /sub 2;
}
...
location /main {
    auth_request /sub;
    echo "main tag: $tag";
}
location /sub {
    echo "sub tag: $tag";
}
$ curl "http://localhost:8080/main"
main tag: 2
It works like this: the $tag variable was first read in the subrequest to /sub, where the “get handler” registered by map computed the value 2 for $tag, and that value was cached in the value container of $tag from then on. Because the parent request shares the same container with the subrequest created by auth_request, when the parent request read $tag later, the cached value 2 was returned directly.
From this example, we can conclude that it is hardly ever a good idea to enable variable container sharing in subrequests.
Frank Lin