Essential .htaccess Tips and Tricks

Anyone on a good web host should be able to use a .htaccess file: a plain text file holding configuration rules that the Apache web server applies on a per-directory basis (the directory containing the file and everything below it).
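As a rough illustration of the per-directory idea, here is a minimal sketch. It assumes the host allows overrides for the directory; the path /var/www/example and the error page /errors/404.html are hypothetical names used only for this example.

```apache
# --- Main server config (httpd.conf or a vhost file), NOT .htaccess ---
# .htaccess files are only read where AllowOverride permits it.
# "/var/www/example" is a hypothetical document root.
<Directory "/var/www/example">
    AllowOverride All
</Directory>

# --- /var/www/example/.htaccess ---
# These rules apply to this directory and its subdirectories.
# Turn off auto-generated directory listings.
Options -Indexes
# Serve a custom page for 404 errors (hypothetical path).
ErrorDocument 404 /errors/404.html
```

On shared hosting the first part is normally set by the host already, so only the .htaccess part is yours to edit.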

This is a good starting .htaccess file that beefs up your web server’s security, with rules from a post on 0x000000.com and from SigSiu.net:

# Disable server signature
ServerSignature Off 
# hide every file from any auto-generated folder listing
IndexIgnore *
# deny directory browsing
Options -Indexes
# enable symbolic links
Options +FollowSymLinks
# enable basic rewriting
RewriteEngine on
# Prevent use of specified methods in HTTP Request 
RewriteCond %{REQUEST_METHOD} ^(HEAD|TRACE|DELETE|TRACK) [NC,OR] 
# Block out use of illegal or unsafe characters in the HTTP Request 
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR] 
# Block out use of illegal or unsafe characters in the Referer Variable of the HTTP Request 
RewriteCond %{HTTP_REFERER} ^(.*)(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR] 
# Block out use of illegal or unsafe characters in any cookie associated with the HTTP Request 
RewriteCond %{HTTP_COOKIE} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR] 
# Block out use of illegal characters in URI or use of malformed URI 
RewriteCond %{REQUEST_URI} ^/(,|;|:|<|>|">|"<|/|\\\.\.\\).{0,9999}.* [NC,OR] 
# Block out use of empty User Agent Strings
# NOTE - disable this rule if your site is integrated with Payment Gateways such as PayPal 
RewriteCond %{HTTP_USER_AGENT} ^$ [OR] 
# Block out use of illegal or unsafe characters in the User Agent variable
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR] 
# Measures to block out SQL injection attacks
RewriteCond %{QUERY_STRING} ^.*(;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|cast|set|declare|drop|update|md5|benchmark).* [NC,OR] 
# Block out reference to localhost/loopback/127.0.0.1 in the Query String
RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).* [NC,OR] 
# Block out use of illegal or unsafe characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR] 
# proc/self/environ? no way!
# (last condition in the chain, so it carries no OR flag)
RewriteCond %{QUERY_STRING} proc/self/environ [NC]
RewriteRule .* - [F]
########## Begin - File injection protection, by SigSiu.net
RewriteCond %{REQUEST_METHOD} GET
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http:// [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http%3A%2F%2F [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC]
RewriteRule .* - [F]
########## End - File injection protection
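The long block above follows one pattern: a chain of RewriteCond lines joined with [OR], closed by a single RewriteRule that refuses the request. A trimmed-down sketch of that pattern, using three of the conditions from the listing, might look like this. The point to watch is that the final condition should not carry [OR], since a dangling [OR] leaves the chain open-ended and can make the rule fire on requests that match none of the conditions.

```apache
RewriteEngine On

# Conditions are ANDed by default; [OR] joins a condition to the next one.
RewriteCond %{QUERY_STRING} proc/self/environ [NC,OR]
RewriteCond %{QUERY_STRING} (localhost|loopback|127\.0\.0\.1) [NC,OR]
# Last condition in the chain: no OR flag here.
RewriteCond %{HTTP_USER_AGENT} ^$
# "-" means "no substitution"; F answers with 403 Forbidden.
RewriteRule .* - [F]
```

It is worth testing a change like this on a staging copy first, because one bad flag can lock every visitor out.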

Regex Character Definitions For htaccess

In addition, I found the following write-up useful for understanding the flags in the square brackets after each rule, and the regex characters used in the patterns (from StopMalvertising).

#
the # instructs the server to ignore the line. used for including comments. each line of comments requires its own #. when including comments, it is good practice to use only letters, numbers, dashes, and underscores. this practice will help eliminate/avoid potential server parsing errors.

[F]
Forbidden: instructs the server to return a 403 Forbidden to the client.

[L]
Last rule: instructs the server to stop processing further rewrite rules in the current pass once this rule has matched.

[N]
Next: instructs Apache to restart the rewriting process from the first rule, using the URL as rewritten so far. Use with care, since it can loop forever.

[G]
Gone: instructs the server to return a 410 Gone (no longer exists) status to the client.

[P]
Proxy: instructs server to handle requests by mod_proxy

[C]
Chain: instructs the server to chain the current rule with the next rule. If the current rule does not match, the rules chained to it are skipped.

[R]
Redirect: instructs Apache to issue a redirect, causing the browser to request the rewritten/modified URL.

[NC]
No Case: defines any associated argument as case-insensitive. i.e., “NC” = “No Case”.

[PT]
Pass Through: instructs mod_rewrite to pass the rewritten URL back to Apache for further processing.

[OR]
Or: specifies a logical “or” that ties two expressions together such that either one proving true will cause the associated rule to be applied.

[NE]
No Escape: instructs the server to parse output without escaping characters.

[NS]
No Subrequest: instructs the server to skip the directive if internal sub-request.

[QSA]
Append Query String: directs the server to append the original request’s query string to the query string of the rewritten URL, instead of replacing it.

[S=x]
Skip: instructs the server to skip the next “x” number of rules if a match is detected.

[E=variable:value]
Environment Variable: instructs the server to set the environment variable “variable” to “value”.

[T=MIME-type]
Mime Type: declares the mime type of the target resource.
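To see several of these flags side by side, here is a short sketch written for a .htaccess file. All file and script names (old-page.html, new-page.html, item.php, the retired/ folder) and the "badbot" user agent are hypothetical placeholders, not taken from the rules above.

```apache
RewriteEngine On

# R=301 sends a permanent redirect; L stops further rules in this pass.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# G answers 410 Gone for anything under a removed folder.
RewriteRule ^retired/ - [G]

# QSA keeps the visitor's own query string and appends it to ?id=...
RewriteRule ^item/([0-9]+)$ /item.php?id=$1 [QSA,L]

# NC makes the condition case-insensitive; F answers 403 Forbidden.
RewriteCond %{HTTP_USER_AGENT} badbot [NC]
RewriteRule .* - [F]
```

Flags are combined with commas inside one pair of brackets, as in [R=301,L].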

[]
specifies a character class, in which any character within the brackets will be a match. e.g., [xyz] will match either an x, y, or z.

[]+
character class in which any combination of items within the brackets will be a match. e.g., [xyz]+ will match one or more x’s, y’s, z’s, or any combination of these characters.

[^]
specifies not within a character class. e.g., [^xyz] will match any character that is neither x, y, nor z.

[a-z]
a dash (-) between two characters within a character class ([]) denotes the range of characters between them. e.g., [a-zA-Z] matches all lowercase and uppercase letters from a to z.

a{n}
specifies an exact number, n, of the preceding character. e.g., x{3} matches exactly three x’s.

a{n,}
specifies n or more of the preceding character. e.g., x{3,} matches three or more x’s.

a{n,m}
specifies a range of numbers, between n and m, of the preceding character. e.g., x{3,7} matches three, four, five, six, or seven x’s.

()
used to group characters together, thereby considering them as a single unit. e.g., (perishable)?press will match press, with or without the perishable prefix.

^
denotes the beginning of a regex (regex = regular expression) test string. i.e., begin argument with the following character.

$
denotes the end of a regex (regex = regular expression) test string. i.e., end argument with the previous character.

?
declares as optional the preceding character. e.g., monzas? will match monza or monzas, while mon(za)? will match either mon or monza. i.e., x? matches zero or one of x.

!
declares negation. e.g., “!string” matches everything except “string”.

.
a dot (or period) indicates any single arbitrary character.


-
instructs “not to” rewrite the URL, as in “…domain.com.* - [F]”.

+
matches one or more of the preceding character. e.g., G+ matches one or more G’s, while “.+” will match one or more characters of any kind.

*
matches zero or more of the preceding character. e.g., use “.*” as a wildcard.

|
declares a logical “or” operator. for example, (x|y) matches x or y.

\
escapes special characters ( ^ $ ! . * | ). e.g., use “\.” to indicate/escape a literal dot.

\.
indicates a literal dot (escaped).

/*
zero or more slashes.

.*
zero or more arbitrary characters.

^$
defines an empty string.

^.*$
the standard pattern for matching everything.

[^/.]
defines one character that is neither a slash nor a dot.

[^/.]+
defines one or more characters, none of which is a slash or a dot.

http://
this is a literal statement — in this case, the literal character string, “http://”.

^domain.*
defines a string that begins with the term “domain”, which may then be followed by any number of any characters.

^domain\.com$
defines the exact string “domain.com”.
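A few of these regex pieces in action, as a hedged sketch for a .htaccess file. The host example.com and the scripts index.php and archive.php are placeholders chosen for illustration only.

```apache
RewriteEngine On

# ^ and $ anchor the whole host name; \. is a literal dot.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# [0-9]{4} is exactly four digits, [0-9]{2} exactly two; /? is an optional slash.
RewriteRule ^archive/([0-9]{4})/([0-9]{2})/?$ /archive.php?year=$1&month=$2 [L]

# [^/.]+ is one or more characters that are neither slash nor dot,
# so "about" matches but "about.html" and "a/b" do not.
RewriteRule ^([^/.]+)/?$ /index.php?page=$1 [L]

# ! negates a pattern: block image requests whose referer is
# neither empty nor this site. (gif|jpe?g|png) uses | and ?.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F]
```

The parentheses do double duty here: they group alternatives and also capture text that the substitution reuses as $1 and $2.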

-d
tests if string is an existing directory

-f
tests if string is an existing file

-s
tests if string is an existing file with a non-zero size
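The -d, -f and -s tests are used in RewriteCond lines, usually against %{REQUEST_FILENAME}. The sketch below shows two common uses; maintenance.flag, maintenance.html and index.php are hypothetical names assumed for the example.

```apache
RewriteEngine On

# -s: redirect to a maintenance page only while a non-empty
# flag file exists in the document root.
RewriteCond %{DOCUMENT_ROOT}/maintenance.flag -s
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule ^ /maintenance.html [R=302,L]

# !-f and !-d: if the request is neither an existing file nor an
# existing directory, hand it to a front controller script.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /index.php [L]
```

Prefixing a test with ! inverts it, so !-f reads as "is not an existing file".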

October 23rd, 2014 | Programming & Code

About the Author:

Alvin Poh lives in Singapore, and is interested in marketing, techy stuff, and likes to just figure out how the two can work with each other. He can also be found on Google+.
