Clients and Servers

client-server computing
The interaction between two programs when they communicate across a network. A program at one site sends a request to a program at another site and awaits a response. The requesting program is called a client; the program satisfying the request is called the server. (definition from The Internet Book, 2nd edition by Douglas E. Comer)

Client-Server Computing


Simple HTTP Server Overview

  1. Listen for request
  2. Receive HTTP Request from client
  3. Return HTTP Response and resource to client
  4. Go to Step 1

It can get much more complex than this...
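
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Apache is implemented: the function names build_response and serve, the port 8080, and the plain-text reply are all hypothetical choices made for the sketch.

```python
import socket


def build_response(request_line: str) -> bytes:
    """Build a minimal HTTP/1.0 response for one request line."""
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "GET":
        status, body = "400 Bad Request", b"Bad request\n"
    else:
        # A real server would map parts[1] to a file; here we just echo it.
        status = "200 OK"
        body = f"You asked for {parts[1]}\n".encode("ascii", "replace")
    head = (
        f"HTTP/1.0 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the listen / receive / respond loop forever (hypothetical port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen()                                    # Step 1
        while True:                                          # Step 4
            conn, _addr = listener.accept()
            with conn:
                data = conn.recv(4096)                       # Step 2
                request_line = data.decode("latin-1").split("\r\n", 1)[0]
                conn.sendall(build_response(request_line))   # Step 3
```

Calling serve() starts the loop; it handles one client at a time, which is one of the many ways a real server is "more complex than this".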

HTTP Server Resources


Apache Web Server Resources


Apache Configuration Overview


.htaccess File Example

filename: .htaccess
location: /home/c/s/cscie12/public_html/apache/example/.htaccess
contents:
ErrorDocument 404 /~cscie12/status404.html
filename: status404.html
location: /home/c/s/cscie12/public_html/status404.html
contents:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML lang="en">
  <HEAD>
  <TITLE>
      CSCIE12: 404 Not Found
  </TITLE>
  <BASE href="http://www.courses.fas.harvard.edu/~cscie12/">
  </HEAD>
  <BODY bgcolor="#ffffff" link="#cc3333" vlink="#996633"
  background="images/background.gif">
    <H1>404 Not Found</H1>
    <H2>CSCIE12: Introduction to Web Site Development</H2>
      The resource you requested, <br>
      <strong><!--#echo var="REQUEST_URI"--></strong><br>
      cannot be found.
    <HR>
    The main areas of the site are:<p>
    <!--#include virtual="inc/nav.html"-->
    <HR>
    <!--#include virtual="inc/footer.html"-->
    <HR>
  </BODY>
</HTML>

.htaccess: Scope

Directives within .htaccess files apply to the directory that contains the .htaccess file and all its descendants.

Directives within the file,
/home/c/s/cscie12/public_html/.htaccess
would apply to all files within and "under" the public_html directory for the user cscie12.

Directives within the file,
/home/c/s/cscie12/public_html/assignments/.htaccess
would apply to all files within and "under" the public_html/assignments directory for the user cscie12.
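
The scope rule can be sketched as a small Python function that lists the .htaccess files whose directives would apply to a requested file. The function name htaccess_candidates is hypothetical, and the sketch assumes the lookup starts at public_html; a real server may also consult directories above it, depending on its configuration.

```python
from pathlib import PurePosixPath


def htaccess_candidates(doc_root: str, requested_file: str) -> list[str]:
    """List .htaccess paths from doc_root down to the file's directory."""
    root = PurePosixPath(doc_root)
    # relative_to raises ValueError if the file is not under doc_root.
    relative = PurePosixPath(requested_file).relative_to(root)
    candidates = [str(root / ".htaccess")]
    current = root
    for part in relative.parts[:-1]:   # every directory, but not the file name
        current = current / part
        candidates.append(str(current / ".htaccess"))
    return candidates


print(htaccess_candidates(
    "/home/c/s/cscie12/public_html",
    "/home/c/s/cscie12/public_html/assignments/index.html",
))
```

The list is ordered from the top directory down, so an .htaccess file deeper in the tree is read after the ones above it.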


.htaccess files: Legal Directives I
Context

Certain Apache directives are legal within .htaccess files. Some are not.
See the Apache documentation for details. Specifically, look at the Context line that is given for the directive in question. The following is an excerpt from the Apache HTTP Server Version 1.3 documentation:

ErrorDocument directive

Syntax: ErrorDocument error-code document
Context: server config, virtual host, directory, .htaccess
Status: core
Override: FileInfo
Compatibility: The directory and .htaccess contexts are only available in Apache 1.1 and later.

Also, the "a" indicator on the Apache Quick Reference Card indicates that the directive is valid within an .htaccess file.


.htaccess files: Legal Directives II
AllowOverride

Users are allowed to override certain aspects of the main server configuration.
The main server configuration file (httpd.conf) contains an AllowOverride directive that determines which directives within .htaccess files Apache will process. The Override line that is given for each directive in the Apache documentation indicates which AllowOverride category must be enabled in order to use that directive within an .htaccess file.

For the FAS system, the main server configuration file has the following directive in place for users' public_html directories:

AllowOverride FileInfo AuthConfig Limit Indexes Options

The following is an excerpt from the Apache HTTP Server Version 1.3 documentation:

ErrorDocument directive

Syntax: ErrorDocument error-code document
Context: server config, virtual host, directory, .htaccess
Status: core
Override: FileInfo
Compatibility: The directory and .htaccess contexts are only available in Apache 1.1 and later.
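
The relationship between a directive's Override line and the AllowOverride setting can be sketched as a lookup. The table below is illustrative and incomplete: only the ErrorDocument entry comes from the excerpt above, the other entries are recalled from the Apache documentation and should be checked against the Override line for your version. The names DIRECTIVE_OVERRIDE and allowed_in_htaccess are hypothetical.

```python
# Directive -> override category (the "Override:" line in the Apache docs).
# Illustrative subset only; verify each entry for your Apache version.
DIRECTIVE_OVERRIDE = {
    "ErrorDocument": "FileInfo",   # from the excerpt above
    "Redirect": "FileInfo",        # assumption, recalled from mod_alias docs
    "AuthType": "AuthConfig",      # assumption, recalled from core docs
    "DirectoryIndex": "Indexes",   # assumption, recalled from mod_dir docs
}


def allowed_in_htaccess(directive: str, allow_override: str) -> bool:
    """Return True if the AllowOverride value enables the directive."""
    enabled = set(allow_override.split())
    needed = DIRECTIVE_OVERRIDE[directive]   # KeyError for unknown directives
    return "All" in enabled or needed in enabled


fas_setting = "FileInfo AuthConfig Limit Indexes Options"
print(allowed_in_htaccess("ErrorDocument", fas_setting))
```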

.htaccess files: Legal Directives III
Apache Modules

Apache is distributed with several modules. These modules may or may not be active within the Apache server with which you are working. The Core features will always be available.

For example, if the Rewrite Module (mod_rewrite) has not been activated, none of the Rewrite directives will be available to use.

Refer to the Status and Module lines in the documentation for each directive and to the documentation for the specific Apache installation you are using.


Apache Modules

On the FAS Web servers, the following Apache modules are active:
mod_access
mod_actions
mod_alias
mod_asis
mod_auth
mod_auth_dbm
mod_autoindex
mod_cgi
mod_dir
mod_env
mod_expires
mod_headers
mod_imap
mod_include
mod_log_config
mod_mime
mod_negotiation
mod_perl
mod_rewrite
mod_setenvif
mod_so
mod_status
mod_unique_id
mod_userdir
mod_usertrack
raven_ssl

Problems You Will Encounter When Using .htaccess Files

500 Internal Server Error
If you begin seeing 500 Internal Server Error responses from the server after you have created or edited an .htaccess file, the most likely cause is incorrect file permissions, an error in the directive syntax, or both. The transcript below shows the permissions fix: the file must be readable by others so that the Web server can read it.
fas% pwd
/home/j/h/jharvard/public_html
is03:~% ls -l .htaccess
-rw-------   1 jharvard  founder         349 Nov 27 00:03 .htaccess
is03:~% chmod o+r .htaccess
is03:~% ls -l ~/public_html/.htaccess
-rw----r--   1 jharvard  founder         349 Nov 27 00:03 .htaccess
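
The same permission check can be done from a script. The sketch below tests the "other read" bit that chmod o+r sets; the function names world_readable and demo are hypothetical, and the demo works on a temporary file on a Unix-like system rather than on a real .htaccess file.

```python
import os
import stat
import tempfile


def world_readable(path: str) -> bool:
    """Return True if the 'other' read bit (the r in -rw----r--) is set."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def demo() -> tuple[bool, bool]:
    """Recreate the transcript: check, chmod o+r, check again."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, ".htaccess")
        with open(path, "w") as handle:
            handle.write("ErrorDocument 404 /~cscie12/status404.html\n")
        os.chmod(path, 0o600)            # -rw-------
        before = world_readable(path)
        os.chmod(path, 0o604)            # -rw----r--
        after = world_readable(path)
        return before, after


print(demo())
```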

Problems You Will Encounter When Using .htaccess Files

You can't "see" your .htaccess file. Files whose names begin with a dot are hidden from a plain ls; use ls -a to list them.
fas% ls
assignments
cgi-bin
faq
images
inc
index.html
instructors
lecture
schedule.html
section
syllabus

fas% ls -a
.
..
.htaccess
assignments
cgi-bin
faq
images
inc
index.html
instructors
lecture
schedule.html
section
syllabus

Apache Configuration Sections

Configuration directives can be limited in scope by using "sections". Note that only Files and FilesMatch can be used within .htaccess files.

Examples:

# deny access to the .htaccess file itself
<Files .htaccess>
    Order allow,deny
    Deny from all
</Files>

# deny access to any tilde backup files
<Files *~>
    Order allow,deny
    Deny from all
</Files>
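
To see which file names the two Files sections above would cover, the wildcard matching can be approximated in Python with fnmatch. This is a sketch only: fnmatch is assumed to be close enough to Apache's wildcard rules for these two patterns, and the names DENIED_PATTERNS and is_denied are hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase

# The two patterns used in the <Files> examples above.
DENIED_PATTERNS = [".htaccess", "*~"]


def is_denied(path: str) -> bool:
    """Return True if the file name (not the directory) matches a pattern."""
    filename = posixpath.basename(path)
    return any(fnmatchcase(filename, pattern) for pattern in DENIED_PATTERNS)


for candidate in ["public_html/.htaccess", "lecture/index.html~", "index.html"]:
    print(candidate, is_denied(candidate))
```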

Configuring Apache with .htaccess files


Custom Error Documents

.htaccess file:
ErrorDocument 404 /~cscie12/status404.html

Redirecting Requests

HTTP Status Codes:
301 Moved permanently
302 Moved temporarily

Redirecting client requests can be very useful.

Note: redirection may also be achieved in some browsers by using the http-equiv attribute of the <META> element. More information and examples are provided at http://www.fas.harvard.edu/~web/tutorial/meta/refresh/. Redirecting at the server level is the recommended method.

Redirect

.htaccess file:
Redirect 302 /~cscie12/dce.html      http://www.dce.harvard.edu/
Redirect 301 /~cscie12/presentation  http://www.courses.fas.harvard.edu/~cscie12/lecture
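
How these two Redirect lines behave can be sketched as prefix matching in Python. The sketch assumes that a request matches when its path equals the old path or continues it after a slash, and that anything after the matched prefix is appended to the new URL; the names REDIRECTS and redirect_for are hypothetical.

```python
from typing import Optional

# (status, old URL path, new URL), copied from the .htaccess lines above.
REDIRECTS = [
    (302, "/~cscie12/dce.html", "http://www.dce.harvard.edu/"),
    (301, "/~cscie12/presentation",
     "http://www.courses.fas.harvard.edu/~cscie12/lecture"),
]


def redirect_for(path: str) -> Optional[tuple[int, str]]:
    """Return (status, Location) for the first matching rule, else None."""
    for status, prefix, target in REDIRECTS:
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            # Keep whatever followed the matched prefix.
            return status, target + path[len(prefix):]
    return None


print(redirect_for("/~cscie12/presentation/week1.html"))
```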

Rewrite

From the Apache documentation on mod_rewrite:

Summary

``The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail.''
-- Brian Behlendorf
Apache Group
`` Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo. ''
-- Brian Moore
bem@news.cmc.net
Welcome to mod_rewrite, the Swiss Army Knife of URL manipulation!

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule to provide a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests, for instance server variables, environment variables, HTTP headers, time stamps and even external database lookups in various formats can be used to achieve a really granular URL matching.

This module operates on the full URLs (including the path-info part) both in per-server context (httpd.conf) and per-directory context (.htaccess) and even can generate query-string parts on result. The rewritten result can lead to internal sub-processing, external request redirection or even to an internal proxy throughput.

But all this functionality and flexibility has its drawback: complexity. So don't expect to understand this module in it's whole in just one day.

This module was invented and originally written in April 1996
and gifted exclusively to the The Apache Group in July 1997 by

Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com

Examples of Rewrite Uses

Provide a standard mechanism to access course Web sites within Harvard College.

For example, Chemistry 5 has a catalog number of 5118, so the course Web site can be reached through a standard URL based on that catalog number, which differs from the "real" location of the site.

HASCS Site Restructure

Many rewrite directives were put in place when the HASCS site was restructured so that links to documents within the previous site would get redirected to the appropriate page in the new site.

Rewrite: Text-only sites

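RewriteEngine On
RewriteBase /~cscie12
# For Lynx user agents, rewrite requests for the directory root
# or index.html to the text/ directory
RewriteCond %{HTTP_USER_AGENT} ^Lynx
RewriteRule ^(index\.html)?$ text/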
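
The logic of this rule can be sketched in Python with the re module. The sketch assumes that, in per-directory context, the pattern is matched against the part of the path after the RewriteBase prefix, and it escapes the dot in index.html so that it matches only a literal dot. The function name rewrite and the sample user-agent strings are hypothetical.

```python
import re

REWRITE_BASE = "/~cscie12"
RULE_PATTERN = re.compile(r"^(index\.html)?$")   # root or index.html
AGENT_PATTERN = re.compile(r"^Lynx")             # the RewriteCond test


def rewrite(url_path: str, user_agent: str) -> str:
    """Return the rewritten path, or the original path if no rule applies."""
    prefix = REWRITE_BASE + "/"
    if not url_path.startswith(prefix):
        return url_path
    local = url_path[len(prefix):]               # path relative to the base
    if AGENT_PATTERN.search(user_agent) and RULE_PATTERN.search(local):
        return prefix + "text/"
    return url_path


print(rewrite("/~cscie12/index.html", "Lynx/2.8.2"))
print(rewrite("/~cscie12/index.html", "Mozilla/4.7"))
```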

Text-only sites: LINK

Meta-information can be used to describe alternate content.
In ~cscie12/public_html/index2.html
<LINK title="Text-only version"
         rel="alternate"
         href="http://www.courses.fas.harvard.edu/text/index.html"              
         media="aural, braille, tty">
Lynx view of index2.html provides the text-only version as a link:
                                Introduction to Web Site Development (p1 of 3)

   #text-only version

                          Harvard University, DCE 
                                 Fall 1999

                                  CSCIE12

                    Introduction to Web Site Development

   David P. Heitmeyer
     _________________________________________________________________

   Thanksgiving Holiday: Sections for Wed, 24-Nov and Sat, 27-Nov will
   not meet.

   Lecture 8: Multimedia and HTTP lecture notes and video are available.
   Nov 24, 4:15 PM

   Lecture 7: JavaScript lecture notes and video are available.
   Nov 17, 3:15 PM

   Assignment 5 is available. Due 29-Nov
   Nov 15, 4:45 PM

-- press space for next page --
  Arrow keys: Up and Down to move.  Right to follow a link; Left to go back.
 H)elp O)ptions P)rint G)o M)ain screen Q)uit /=search [delete]=history list

Directory Index and Listings

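Note: Remember the difference between a directory having rwx-----x and rwx---r-x permissions? With rwx-----x, others (including the Web server) can access files in the directory by name but cannot list its contents; with rwx---r-x, they can also list the directory.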

Automatic Indexing Done Right

To make Web site maintenance easier for course staff, the www.courses.fas.harvard.edu virtual host has the ability to generate nicely formatted index pages automatically.

httpd.conf for www.courses.fas.harvard.edu:

DirectoryIndex index.html /cgi-bin/sfindexer.pl
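
The effect of listing two resources in DirectoryIndex can be sketched as "use the first one that is available". The sketch assumes that an entry beginning with a slash is a server-relative URL that is always available, which is how the indexer script acts as a fallback here; the name pick_index and the injectable exists check are hypothetical.

```python
import os.path
from typing import Callable, Optional, Sequence


def pick_index(
    directory: str,
    candidates: Sequence[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the first usable DirectoryIndex entry for a directory."""
    for name in candidates:
        if name.startswith("/"):
            # Assumption: a server-relative URL such as a CGI script
            # is always available, so it ends the search.
            return name
        if exists(os.path.join(directory, name)):
            return name
    return None


entries = ["index.html", "/cgi-bin/sfindexer.pl"]
print(pick_index("/nonexistent/directory", entries))
```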

Setting HTTP Headers


Expires

.htaccess file:
ExpiresActive On
# HTML expires 1 hour after access
ExpiresByType text/html   A3600
# GIF, JPEG, and PNG images expire 30 days after access
ExpiresByType image/gif   A2592000
ExpiresByType image/jpeg  A2592000
ExpiresByType image/png   A2592000
# types not specified expire in 1 day
ExpiresDefault "now plus 1 day"

From the Apache mod_expires documentation:

This module controls the setting of the Expires HTTP header in server responses. The expiration date can set to be relative to either the time the source file was last modified, or to the time of the client access.

The Expires HTTP header is an instruction to the client about the document's validity and persistence. If cached, the document may be fetched from the cache rather than from the source until this time has passed. After that, the cache copy is considered "expired" and invalid, and a new copy must be obtained from the source.
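
The arithmetic behind a code such as A3600 can be sketched in Python: the letter selects the base time (A for the time of client access, M for the time the file was last modified) and the number is an offset in seconds. The function name expires_header and the sample access time are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(code: str, access: datetime, modified: datetime) -> str:
    """Compute an Expires header value from a code such as 'A3600'."""
    base = {"A": access, "M": modified}[code[0]]   # A = access, M = modified
    expiry = base + timedelta(seconds=int(code[1:]))
    # usegmt=True requires a timezone-aware datetime in UTC.
    return format_datetime(expiry, usegmt=True)


# Hypothetical request time: 24 Nov 1999, 16:15 UTC.
access_time = datetime(1999, 11, 24, 16, 15, tzinfo=timezone.utc)
modified_time = datetime(1999, 11, 1, 9, 0, tzinfo=timezone.utc)
print(expires_header("A3600", access_time, modified_time))
print(expires_header("A2592000", access_time, modified_time))
```

Note that 2592000 seconds is 30 x 24 x 3600, which is where the 30-day lifetime for images comes from.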


Headers

The optional headers module allows for the customization of HTTP response headers. Headers can be merged, replaced or removed. The server will always add a "Server" and "Date" header to the HTTP response.

asis Documents

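Purpose: send a document that contains its own HTTP headers "as is", without the server adding most of the usual headers.

Example: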
fas% ls -l sendasiam.html.asis
-rw----r--   1 cscie12  courses       344 Nov 28 23:25 sendasiam.html.asis
sendasiam.html.asis file:
Status: 301 Now where did I leave that URL 
Location: http://www.joe.com/
Content-type: text/html 

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML> 
<HEAD> 
<TITLE>Lame excuses'R'us</TITLE> 
</HEAD> 
<BODY> 
<H1>Fred's exceptionally wonderful page has moved to 
<A HREF="http://www.joe.com/">Joe's</A> site. 
</H1> 
</BODY> 
</HTML> 
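
The layout of an asis file (header lines, a blank line, then the body) can be sketched with a small parser. This only illustrates the file format; it is not how mod_asis is implemented, and the name parse_asis and the shortened sample document are hypothetical.

```python
def parse_asis(text: str) -> tuple[dict[str, str], str]:
    """Split an asis document into its header fields and its body."""
    head, _, body = text.partition("\n\n")     # first blank line ends headers
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")   # split at the first colon only
        headers[name.strip()] = value.strip()
    return headers, body


sample = (
    "Status: 301 Now where did I leave that URL\n"
    "Location: http://www.joe.com/\n"
    "Content-type: text/html\n"
    "\n"
    "<HTML><BODY>Moved.</BODY></HTML>\n"
)
headers, body = parse_asis(sample)
print(headers["Status"].split()[0], headers["Location"])
```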

Access Control