As we've repeatedly stated throughout this book, the same attributes that make HTTP so difficult to control also make it difficult to secure. Now that you know how web applications dance their way around the statelessness of HTTP, you can probably guess that there are some security ramifications lurking around the corner. For example, if someone steals my browser's session id, does that mean they can log in as me? Or if I'm visiting a random website, can it peek into my Reddit or Facebook cookies, where my session id for those sites is stored? We'll spend this chapter discussing some common security issues that crop up with HTTP. This is by no means an exhaustive list; it covers only the common issues that any web developer would be expected to know.
As the client and server send requests and responses to each other, all information in both is sent as plain text. If a malicious hacker were attached to the same network, they could employ packet sniffing techniques to read the messages being sent back and forth. As we learned previously, requests can contain the session id, which uniquely identifies you to the server. If someone copied this session id, they could craft a request to the server, pose as your client, and be logged in automatically without ever having access to your username or password.
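To make the risk concrete, here is a minimal sketch in Python of what a sniffer could do with a captured plain HTTP request. The host, path, and session id value are made up for illustration, and the helper name `extract_cookie_header` is hypothetical.

```python
from typing import Optional

# A plain HTTP request is just readable text on the wire. The host, path,
# and session id below are hypothetical values chosen for illustration.
raw_request = (
    b"GET /account HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Cookie: session_id=abc123\r\n"
    b"\r\n"
)


def extract_cookie_header(captured: bytes) -> Optional[str]:
    """Return the Cookie header value from captured request bytes, if any."""
    # Everything before the first blank line is the request line plus headers.
    head = captured.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "cookie":
            return value.strip()
    return None


print(extract_cookie_header(raw_request))  # session_id=abc123
```

Nothing here requires breaking any protection: the session id is simply there to be read by anyone who can see the bytes.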
This is where Secure HTTP, or HTTPS, helps. The URL of a resource accessed over HTTPS starts with https:// instead of http://, and most browsers display a lock icon next to it.
With HTTPS, every request and response is encrypted before being transported over the network. If a malicious hacker sniffed the traffic, the information would be encrypted and useless to them.
HTTPS encrypts messages using a cryptographic protocol called TLS (Transport Layer Security). Earlier versions of HTTPS used SSL, or Secure Sockets Layer, until TLS was developed. These cryptographic protocols use certificates to communicate with remote servers and exchange security keys before data encryption happens. You can inspect these certificates by clicking on the padlock icon that appears before the URL in the browser's address bar.
Although most modern browsers do some high level check on a website's certificate on your behalf, viewing the certificate sometimes serves as an extra security step before interacting with the website.
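The certificate check that a browser performs can also be approximated from code. The following is a rough sketch using Python's standard `ssl` module; the function names `build_verifying_context` and `fetch_certificate` are hypothetical, and the second one needs network access to do anything useful.

```python
import socket
import ssl


def build_verifying_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def fetch_certificate(host: str, port: int = 443) -> dict:
    """Connect over TLS and return the server's certificate as a dict.

    Needs network access. The handshake raises
    ssl.SSLCertVerificationError if the certificate does not validate
    for `host`.
    """
    context = build_verifying_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `fetch_certificate` with a host name of your choice returns a dictionary with fields such as `subject`, `issuer`, and `notAfter`, which is roughly the information a browser shows when you click the padlock.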
The same-origin policy permits unrestricted interaction between resources originating from the same origin, but restricts certain interactions between resources originating from different origins. By origin, we mean the combination of the scheme, host, and port. So http://mysite.com/doc1 has the same origin as http://mysite.com/doc2, but a different origin from https://mysite.com/doc1 (different scheme), http://mysite.com:4000/doc1 (different port), and http://anothersite.com/doc1 (different host).
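The origin comparison described above can be sketched in a few lines of Python. This is an illustration rather than how any particular browser implements it; the helper names `origin` and `same_origin` are hypothetical, and the default ports are filled in only for the http and https schemes.

```python
from urllib.parse import urlsplit

# Ports assumed when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)


print(same_origin("http://mysite.com/doc1", "http://mysite.com/doc2"))       # True
print(same_origin("http://mysite.com/doc1", "https://mysite.com/doc1"))      # False
print(same_origin("http://mysite.com/doc1", "http://mysite.com:4000/doc1"))  # False
print(same_origin("http://mysite.com/doc1", "http://anothersite.com/doc1"))  # False
```

Note that the path plays no part in the comparison, which is why doc1 and doc2 on the same site share an origin.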
The same-origin policy doesn't restrict all cross-origin requests. Links, redirects, and form submissions to different origins are typically allowed. Embedding resources from other origins, such as scripts, CSS stylesheets, images and other media, fonts, and iframes, is also typically allowed. What is typically restricted are cross-origin requests where resources are accessed programmatically using APIs such as fetch (the details of which are beyond the scope of this book).
If your browser downloads a script from the example.com domain, for instance, and the downloaded script subsequently asks for a resource in the something.com domain, then Cross-Origin Resource Sharing (CORS) gets involved. Your browser implements this by sending an Origin header identifying example.com to something.com when it requests the resource (in practice the value includes the scheme, such as Origin: https://example.com). If something.com wants to allow cross-origin access to example.com, then it must include the appropriate Access-Control-Allow-Origin header in its response. If the header is present and grants access, the browser accepts the response and processes it; if the header is omitted or denies access, the browser won't accept or process the response. Don't worry about memorizing these details right now. We'll cover them in much greater detail later in the Core Curriculum.
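The browser's decision can be sketched as a small Python function. This is a deliberately simplified model: it only handles an exact origin match or the `*` wildcard, and it ignores preflight requests, credentials, and the other CORS headers. The function name `browser_accepts_response` and the https scheme in the sample origins are assumptions made for illustration.

```python
def browser_accepts_response(request_origin: str, response_headers: dict) -> bool:
    """Simplified CORS check: allow on an exact origin match or "*"."""
    allowed = None
    for name, value in response_headers.items():
        # Header names are case-insensitive.
        if name.lower() == "access-control-allow-origin":
            allowed = value.strip()
    if allowed is None:
        return False  # header omitted: the response is not handed to the script
    return allowed == "*" or allowed == request_origin


# The script came from example.com and asked something.com for a resource.
print(browser_accepts_response(
    "https://example.com",
    {"Access-Control-Allow-Origin": "https://example.com"},
))  # True
print(browser_accepts_response("https://example.com", {}))  # False
```

The important point is that the server states its policy, but the browser is the party that enforces it.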
The same-origin policy is an important guard against session hijacking attacks (see the next section) and serves as a cornerstone of web application security. Let's look at some HTTP security threats and their countermeasures.
We've already seen that the session plays an important role in making HTTP appear stateful. We also know that a session id serves as the unique token used to identify each session. Usually, the session id is implemented as a random string and comes in the form of a cookie stored on the computer. With the session id in place on the client side, every request sent to the server includes this data, which the server uses to identify the session. In fact, this is what many web applications with authentication systems do. When a user's username and password match, the session id is stored in their browser so that they won't have to re-authenticate on the next request.
Unfortunately, if an attacker gets hold of the session id, the attacker and the user share the same session and both can access the web application. In session hijacking, the user won't even know that an attacker is accessing their session, and the attacker never needs to know the username or password.
One popular way of solving session hijacking is by resetting sessions. With authentication systems, this means a successful login must render the old session id invalid and create a new one. With this in place, the user is required to authenticate again on a later request. At that point the session id changes, the copy the attacker holds no longer identifies a valid session, and the attacker loses access. Most websites implement this technique by making sure users authenticate when entering any potentially sensitive area, such as charging a credit card or deleting the account.
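Resetting a session might look something like the following sketch. The `SessionStore` class is a hypothetical in-memory stand-in for whatever session storage a real framework provides, and the cart and user data are invented for the example.

```python
import secrets


class SessionStore:
    """In-memory session store; a hypothetical stand-in for a real one."""

    def __init__(self) -> None:
        self._sessions = {}

    def create(self, data=None) -> str:
        """Store a copy of `data` under a new random session id."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = dict(data or {})
        return session_id

    def get(self, session_id: str):
        """Return the session data, or None if the id is not valid."""
        return self._sessions.get(session_id)

    def regenerate(self, old_id: str, **updates) -> str:
        """Invalidate old_id and move its data to a fresh session id."""
        data = self._sessions.pop(old_id, {})
        data.update(updates)
        return self.create(data)


store = SessionStore()
old_id = store.create({"cart": ["book"]})
# After a successful login, swap the id so a copied one stops working.
new_id = store.regenerate(old_id, user="alice")

print(store.get(old_id))  # None
print(store.get(new_id))  # {'cart': ['book'], 'user': 'alice'}
```

An attacker who copied the old id before the login now holds a token that maps to nothing.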
Another useful solution is setting an expiration time on sessions. Sessions that do not expire give an attacker an infinite amount of time to pose as the real user. Expiring sessions after, say, 30 minutes gives the attacker a far narrower window to access the app.
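An expiration check can be as small as the sketch below. It assumes the server records a creation timestamp for each session; the names `SESSION_TTL_SECONDS` and `is_expired` are hypothetical, and the 30-minute value simply mirrors the figure used in the text.

```python
import time
from typing import Optional

SESSION_TTL_SECONDS = 30 * 60  # the 30-minute window used as an example


def is_expired(created_at: float,
               now: Optional[float] = None,
               ttl: float = SESSION_TTL_SECONDS) -> bool:
    """Return True once `ttl` seconds have passed since `created_at`."""
    if now is None:
        now = time.time()
    return now - created_at >= ttl


print(is_expired(created_at=0, now=29 * 60))  # False
print(is_expired(created_at=0, now=30 * 60))  # True
```

A server would run a check like this on every request and discard the session, forcing a new login, whenever it returns True.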
Finally, as we have already covered, another approach is to use HTTPS across the entire app to minimize the chance that an attacker can get to the session id.
For example, the form below allows you to add comments, which will then be displayed on the site.
Because it's just a normal HTML
We mention the term "escaping" above. To escape a character means to replace an HTML character with a combination of ASCII characters, which tells the client to display that character as is, and to not process it; this helps prevent malicious code from running on a page. These combinations of ASCII characters are called HTML entities.
Consider the following HTML: <p>Hello World!</p>. Let's say we don't want the browser to read this as HTML. To accomplish this, we can escape the special characters that the browser uses to detect where HTML tags start and end, namely < and >, with the HTML entities &lt; and &gt;. If we write the following: &lt;p&gt;Hello World!&lt;/p&gt;, then that HTML will be displayed as plain text instead.
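In practice you rarely escape by hand. As a small illustration, Python's standard library provides `html.escape` for this; the sample strings below are the paragraph from the example (written with a standard closing tag) and a made-up comment containing a script.

```python
import html

paragraph = "<p>Hello World!</p>"
print(html.escape(paragraph))
# &lt;p&gt;Hello World!&lt;/p&gt;

# User input containing a script is turned into harmless text as well.
comment = '<script>alert("hi")</script>'
print(html.escape(comment))
# &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

Most web frameworks and template engines offer an equivalent, and many apply it to user-supplied data by default.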
In this section, we covered various aspects of security in web applications. Needless to say, this is a huge topic, and we've only scratched the surface of a few common issues. The point of this chapter is to reveal how fragile and problematic developing and securing a web application can be, mostly because it is built on HTTP.