Approaching Your Final Project, and particularly the "Extraordinary Distinction"

Harvard Extension School  
Fall 2024

Course Web Site: https://cscie12.dce.harvard.edu/

Topics

  1. Extraordinary Distinction - "Extras" for your site
  2. Search and Search Engine Optimization (SEO)
  3. Webpage and Website Optimization
  4. Hosting Your Site
  5. Apache HTTP Server
  6. Caching - Don't deliver content unnecessarily
  7. Minify and Compress Content
  8. Friendly Errors
  9. Friendly Ways to Get There
  10. Getting JSON from HTTP request

Presentation contains 41 slides

Extraordinary Distinction - "Extras" for your site

Do work in one area that goes beyond the "core" requirement of three implemented pages. This can be accomplished in a variety of ways, some of which are listed below.

The key point is to go deeper into an area or explore a new area entirely!
This will distinguish the final project as more than just work from the assignments, metaphorically stapled together.

Please identify in your report the work in this category!

The topics below also fit into the category of "things you should know about" coming out of a fundamentals course about website development. This is a good reason to dive into a topic that is of particular interest to you.

Search and Search Engine Optimization (SEO)


Content: meta tags

meta tags (CSS Tricks, focusing on social media) and Metadata Guidelines (W3 EOWG)

meta elements from Harvard University:

<title>Harvard University</title>
<meta name="description" content="Harvard University is devoted to excellence in teaching, learning, and research, and to developing leaders who make a difference globally." />
<link rel="canonical" href="https://www.harvard.edu/" />
<meta property="og:locale" content="en_US" />
<meta property="og:type" content="website" />
<meta property="og:title" content="Harvard University" />
<meta property="og:description" content="Harvard University is devoted to excellence in teaching, learning, and research, and to developing leaders who make a difference globally." />
<meta property="og:url" content="https://www.harvard.edu/" />
<meta property="og:site_name" content="Harvard University" />
<meta property="article:modified_time" content="2022-11-14T18:32:14+00:00" />
<meta property="og:image" content="https://www.harvard.edu/wp-content/uploads/2021/03/100408_Yard_045-1200x630.jpg" />
<meta property="og:image:width" content="1200" />
<meta property="og:image:height" content="630" />
<meta property="og:image:type" content="image/jpeg" />
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Harvard University" />
<meta name="twitter:description" content="Harvard University is devoted to excellence in teaching, learning, and research, and to developing leaders who make a difference globally." />
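Many of these tags repeat the same title and description. As a rough sketch of one way to keep them in sync, the snippet below generates the description, Open Graph, and Twitter card tags from a single object. The function names and the shape of the page object are assumptions made for illustration; they are not part of any library or of Harvard's site.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build the social media meta elements for one page
// so the title and description are written only once.
function buildSocialMetaTags(page) {
  const tags = [
    ['name', 'description', page.description],
    ['property', 'og:type', 'website'],
    ['property', 'og:title', page.title],
    ['property', 'og:description', page.description],
    ['property', 'og:url', page.url],
    ['name', 'twitter:card', 'summary_large_image'],
    ['name', 'twitter:title', page.title],
    ['name', 'twitter:description', page.description],
  ];
  return tags.map(
    ([attr, key, value]) =>
      `<meta ${attr}="${key}" content="${escapeAttribute(value)}" />`
  );
}

// Example page object (made-up values).
const exampleTags = buildSocialMetaTags({
  title: 'My Final Project',
  description: 'A three-page site about national parks.',
  url: 'https://example.com/',
});
console.log(exampleTags.join('\n'));
```

The output can be pasted into the head of a page, or the same idea can be applied in whatever templating approach your site already uses.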

HTML5 Boilerplate - Louis Lazaris

A Basic HTML5 Template by Louis Lazaris

<!doctype html>

<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">

  <title>A Basic HTML5 Template</title>
  <meta name="description" content="A simple HTML5 Template for new projects.">
  <meta name="author" content="SitePoint">

  <meta property="og:title" content="A Basic HTML5 Template">
  <meta property="og:type" content="website">
  <meta property="og:url" content="https://www.sitepoint.com/a-basic-html5-template/">
  <meta property="og:description" content="A simple HTML5 Template for new projects.">
  <meta property="og:image" content="image.png">

  <link rel="icon" href="/favicon.ico">
  <link rel="icon" href="/favicon.svg" type="image/svg+xml">
  <link rel="apple-touch-icon" href="/apple-touch-icon.png">

  <link rel="stylesheet" href="css/styles.css?v=1.0">

</head>

<body>
  <!-- your content here... -->
  <script src="js/scripts.js"></script>
</body>
</html>

SEO - Start with...

Webpage and Website Optimization

lighthouse test results

webpage test results

Core Web Vitals (CWV)

User-centered

Largest Contentful Paint (LCP) threshold recommendations

Interaction to Next Paint (INP) threshold recommendations

Cumulative Layout Shift (CLS) threshold recommendations
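To make the three metrics concrete, here is a small sketch that rates a measurement as "good", "needs improvement", or "poor". The numbers are the thresholds Google publishes for Core Web Vitals as I understand them (LCP 2.5 s / 4.0 s, INP 200 ms / 500 ms, CLS 0.1 / 0.25); the function and constant names are made up for this example.

```javascript
// Core Web Vitals thresholds:
// [upper bound for "good", upper bound for "needs improvement"].
const THRESHOLDS = {
  LCP: [2500, 4000], // milliseconds
  INP: [200, 500],   // milliseconds
  CLS: [0.1, 0.25],  // unitless layout shift score
};

// Hypothetical helper: rate one measurement against the thresholds above.
function rateMetric(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) {
    throw new Error(`Unknown metric: ${name}`);
  }
  const [good, needsImprovement] = limits;
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs improvement';
  return 'poor';
}

console.log(rateMetric('LCP', 1800)); // good
console.log(rateMetric('INP', 350));  // needs improvement
console.log(rateMetric('CLS', 0.3));  // poor
```

You would get the actual measurements from a tool such as Lighthouse or WebPageTest; this only shows how the published thresholds are applied.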

Hosting Your Site

So, you want to graduate from the course web hosting server to your own domain? Yay!

It isn't hard, but there are some details....

Web Browser and Web Server

Domain Name System

Computers connect by IP address (number); Humans like names (e.g. www.harvard.edu).
Domain Name System (DNS) resolves names to IP addresses (and the other way too)
www.harvard.edu → 151.101.210.133

Domain Names: Top Level Domains (TLD)

TLDs are managed by the Internet Assigned Numbers Authority (IANA)

Generic: .com, .org, .edu, .gov, etc.

Country codes: .ch, .cn, .de, .uk, .us, etc.

Full listing of TLDs

Getting Your Own Domain and Hosting

Often domain name registration and hosting will be set up together through the same company, but keep in mind that they are distinct and separate things!

  1. Domain Name
    • Buy the domain through a "registrar"
    • Provide name servers
    • About $10/yr
  2. Hosting
    • Shared ($7-15/mo)
    • Private / Cloud

A very short list of hosting companies as a place to start.

My playground domain: cs12.net

I registered "cs12.net", and from there I can control the subdomains. For example, natureofamerica.cs12.net, hello.cs12.net, wptest.cs12.net, etc.

Getting Setup with Dreamhost

Web Server Software

HyperText Transfer Protocol

GET

United States National Archives
www.archives.gov

HTTP/2 200
content-type: text/html; charset=utf-8
content-length: 24627
date: Wed, 16 Nov 2022 23:28:11 GMT
content-language: en
permissions-policy: interest-cohort=()
set-cookie: UUID=2e5ae12d-5a81-c5c4-49e7-8ffb91a120e9; expires=Thu, 16-Nov-2023 22:59:51 GMT; Max-Age=31536000; path=/; domain=.www.archives.gov; httponly
last-modified: Wed, 16 Nov 2022 22:59:51 GMT
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
etag: W/"1668639591-0-gzip"
v-ttl: 1899
cache-control: public, max-age=60, s-maxage=180
v-cache-ttl: 1899
x-frame-options: SAMEORIGIN
accept-ranges: bytes
vary: Cookie,Accept-Encoding
x-cache: Hit from cloudfront
via: 1.1 d5b8ff1568ca9900eb00feb643d95cd4.cloudfront.net (CloudFront)
x-amz-cf-pop: BOS50-P1
x-amz-cf-id: AAonOXKNDKqSml1g9p-sfF5H0zvxNSq1iypUW-fseaFZXdtns9IyAw==
age: 34

<!doctype html>
<html lang="en" dir="ltr" prefix="fb: //www.facebook.com/2008/fbml">
<head>
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">  <!-- truncated for example -->
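The response above is just text: a status line, then "name: value" header lines, then a blank line and the body. As an illustration of that structure (not how browsers actually implement it), the sketch below parses the status line and headers into an object. The function name is hypothetical.

```javascript
// Hypothetical helper: parse the status line and headers of an HTTP response.
// Repeated headers (for example several set-cookie lines) overwrite each
// other here; a real parser would keep all of them.
function parseResponseHead(head) {
  const lines = head.trim().split(/\r?\n/);
  const [version, statusCode] = lines[0].split(' ');
  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip anything that is not "name: value"
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { version, status: Number(statusCode), headers };
}

// A few lines taken from the www.archives.gov response shown above.
const sample = `HTTP/2 200
content-type: text/html; charset=utf-8
date: Wed, 16 Nov 2022 23:28:11 GMT
cache-control: public, max-age=60, s-maxage=180
age: 34`;

const parsed = parseResponseHead(sample);
console.log(parsed.status);                   // 200
console.log(parsed.headers['cache-control']); // public, max-age=60, s-maxage=180
```

Splitting on the first colon only matters for headers such as date, whose value contains colons of its own.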

Looking at HTTP Under the Hood

Use your browser developer tools!
screenshot of http headers in browser dev tools

Apache HTTP Server

apache httpd

Apache Configuration Overview

Scope of .htaccess files

Directives within .htaccess files apply to the directory that contains the .htaccess file and all its descendants.

Directives within the file,
/home/dh_xyz45/mycoolsite.com/.htaccess
would apply to all files within and "under" the mycoolsite.com directory for the user dh_xyz45.

Directives within the file,
/home/dh_xyz45/mycoolsite.com/products/.htaccess
would apply to all files within and "under" the mycoolsite.com/products directory for the user dh_xyz45.
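One way to picture the scope rule: for any requested file, the server checks each directory from the top of the site down to the file's own directory for an .htaccess file, and directives found deeper in the tree can override those found higher up. The sketch below lists the candidate files for a given path. It is a simplified model that assumes the server allows overrides everywhere inside the site directory; the function name and the widgets path are made up.

```javascript
// Hypothetical helper: list the .htaccess files that could affect filePath,
// ordered from the site root down to the file's own directory.
function htaccessCandidates(siteRoot, filePath) {
  const root = siteRoot.replace(/\/+$/, '');
  if (filePath !== root && !filePath.startsWith(root + '/')) {
    throw new Error('filePath must be inside siteRoot');
  }
  // Directory names between the site root and the requested file.
  const relativeDirs = filePath
    .slice(root.length)
    .split('/')
    .filter(Boolean)
    .slice(0, -1);
  const candidates = [`${root}/.htaccess`];
  let current = root;
  for (const dir of relativeDirs) {
    current += '/' + dir;
    candidates.push(`${current}/.htaccess`);
  }
  return candidates;
}

console.log(
  htaccessCandidates(
    '/home/dh_xyz45/mycoolsite.com',
    '/home/dh_xyz45/mycoolsite.com/products/widgets/index.html'
  )
);
```

For the path above this lists the .htaccess files in mycoolsite.com, mycoolsite.com/products, and mycoolsite.com/products/widgets, in that order.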

Problems You Will Have with .htaccess files

500 Internal Server Error

500 Internal Server Error

:(

Problems You will encounter when using .htaccess files (Internal Server Error 500)

500 Internal Server Error
If you begin seeing 500 Internal Server Error responses from the server after you have created or edited an .htaccess file, the most likely cause is incorrect file permissions, an error in the directive syntax, or both.
% pwd
/home/dh_xyz45/mycoolsite.com
% ls -l .htaccess
-rw-------   1 dh_xyz45  www         349 Nov 27 00:03 .htaccess
% chmod o+r .htaccess
% ls -l ~/mycoolsite.com/.htaccess
-rw----r--   1 dh_xyz45  www         349 Nov 27 00:03 .htaccess
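What chmod o+r does is turn on a single permission bit: "read" for "others", which is what lets the web server process read the file. The sketch below models that with octal file modes. It is only an illustration of the bit arithmetic; the function names are made up and nothing here touches a real file.

```javascript
// Permission bit for "others may read" (the last r in rw----r--).
const OTHERS_READ = 0o004;

// Equivalent of "chmod o+r": turn the bit on, leave everything else alone.
function addOthersRead(mode) {
  return mode | OTHERS_READ;
}

// Render the nine permission bits the way "ls -l" shows them
// (without the leading file-type character).
function toSymbolic(mode) {
  const flags = 'rwxrwxrwx';
  return [...flags]
    .map((flag, i) => (mode & (1 << (8 - i)) ? flag : '-'))
    .join('');
}

const before = 0o600;
const after = addOthersRead(before);
console.log(toSymbolic(before)); // rw-------
console.log(toSymbolic(after));  // rw----r--
```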

Problems You will encounter when using .htaccess files (Can't see the .htaccess file)

You can't "see" your .htaccess file.

Caching - Don't deliver content unnecessarily

Caching Related Headers

Expires HTTP Header

.htaccess
ExpiresActive On

ExpiresByType text/html   A3600
# HTML expires in 1 hour

ExpiresByType image/gif   A2592000
# GIF  expires in 30 days

ExpiresByType image/jpeg  A2592000
# JPEG expires in 30 days

ExpiresByType image/png   A2592000
# PNG  expires in 30 days

# types not specified
ExpiresDefault "now plus 1 day"
#  expires in 1 day

Or, expire based upon modification time of document:

ExpiresActive On
ExpiresByType text/html   M86400
# HTML expires 1 day after it was last modified
ExpiresDefault M86400
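The codes above read as a base letter plus a number of seconds: A counts from the time of the request (access), M counts from the file's last modification. Here is a rough sketch of that arithmetic, with a made-up function name and made-up timestamps, to show why the two bases behave so differently.

```javascript
// Hypothetical helper: compute the expiry time for a code such as
// "A3600" (access time plus 3600 s) or "M86400" (modification time plus 86400 s).
function computeExpires(code, accessTime, modifiedTime) {
  const match = /^([AM])(\d+)$/.exec(code);
  if (!match) {
    throw new Error(`Unsupported expiry code: ${code}`);
  }
  const base = match[1] === 'A' ? accessTime : modifiedTime;
  return new Date(base.getTime() + Number(match[2]) * 1000);
}

// Example: a file last modified exactly one day before it is requested.
const accessed = new Date('2024-11-01T12:00:00Z');
const modified = new Date('2024-10-31T12:00:00Z');

console.log(computeExpires('A3600', accessed, modified).toISOString());
// 2024-11-01T13:00:00.000Z (one hour after the request)
console.log(computeExpires('M86400', accessed, modified).toISOString());
// 2024-11-01T12:00:00.000Z (already at its expiry when requested)
```

With the M base, a file that has not changed for longer than the interval is already expired whenever it is served, so M only makes sense for content that is updated on a regular schedule.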

Do not cache

If you do not want your page cached, set these HTTP response headers:

Cache-Control: no-cache
Pragma: no-cache
Expires: <set to now>

In .htaccess in Apache, this would translate to:

ExpiresActive On
ExpiresDefault "now"
Header set Cache-Control "no-cache"
Header set Pragma "no-cache"

Typical Expiration / Cache Directives for Websites

Expire static content a week or more into the future.

In .htaccess

# Turn on the module.
ExpiresActive on
# Set the default expiry times.
ExpiresDefault "now"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/svg+xml "access 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
ExpiresByType text/javascript "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"
ExpiresByType image/ico "access plus 1 month"
ExpiresByType text/html "access plus 600 seconds"

What about site updates?

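Caching and expiration are keyed on the full URL, so you can reflect the "version" within the URL, either as part of the path or as part of the query string. When the version changes, the browser sees a URL it has not cached and fetches the new file.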
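As a small sketch of the query-string approach, the function below adds or replaces a version parameter on a URL. The parameter name v and the function name are assumptions (v matches the css/styles.css?v=1.0 link in the HTML5 template earlier), and the function expects an absolute URL.

```javascript
// Hypothetical helper: put a version marker in the query string so a new
// release is requested under a URL the browser has not cached yet.
function withVersion(url, version) {
  const parsed = new URL(url); // requires an absolute URL
  parsed.searchParams.set('v', version);
  return parsed.toString();
}

console.log(withVersion('https://example.com/css/styles.css', '1.0'));
// https://example.com/css/styles.css?v=1.0
console.log(withVersion('https://example.com/css/styles.css?v=1.0', '1.1'));
// https://example.com/css/styles.css?v=1.1
```

For a hand-built site, simply editing the ?v= value in the link or script element when you change the file accomplishes the same thing.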

Minify and Compress Content

Minify Content

Minified files can be 25% to 35% smaller!
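To show what minifying actually removes, here is a deliberately naive CSS minifier: it strips comments and whitespace that the browser does not need. This is a teaching sketch with a made-up function name, not a tool to use on a real site; it would mangle CSS where whitespace or punctuation is significant (for example inside quoted strings), so use an established minifier for real work.

```javascript
// Naive illustration of CSS minification. Not safe for production CSS.
function minifyCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, '')  // drop /* comments */
    .replace(/\s+/g, ' ')              // collapse runs of whitespace
    .replace(/\s*([{}:;,])\s*/g, '$1') // remove spaces around punctuation
    .replace(/;}/g, '}')               // drop the final semicolon in a block
    .trim();
}

const original = `/* page heading */
h1 {
  color: red;
  margin: 0 auto;
}`;

const minified = minifyCss(original);
console.log(minified); // h1{color:red;margin:0 auto}
console.log(`${original.length} characters -> ${minified.length} characters`);
```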

Compress Content

mod_deflate compresses content before sending it to the web browser.

Simple use:

AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE application/javascript

Does Compressing Help?

This can make a noticeable difference in the total page weight!

Friendly Errors

Apache Default "Not Found" 404 document:
404

"Not Found" 404 for Harvard
404 Not Found for Harvard University

"404" for my project site
404

Custom Error Documents

.htaccess
ErrorDocument 401 /error/status401.html
ErrorDocument 403 /error/status403.html
ErrorDocument 404 /error/status404.html
ErrorDocument 500 /error/status500.html 

Another fine way to approach this would be to have all of the different error status codes (401, 403, 404, 500) point to a single error.html document.

ErrorDocument 401 /error.html
ErrorDocument 403 /error.html
ErrorDocument 404 /error.html
ErrorDocument 500 /error.html

Friendly Ways to Get There

HTTP Redirect

Ways to Achieve this

Redirecting Requests

HTTP Status Codes:
301 Moved Permanently
302 Found (moved temporarily)

Redirecting client requests can be very useful:

Redirect

For cscie12.dce.harvard.edu the .htaccess file contains:


RewriteRule ^/assignments https://canvas.harvard.edu/courses/132j062/assignments [R=302]
RewriteRule ^/syllabus$ https://canvas.harvard.edu/courses/132j062/assignments/syllabus  [R=302]
RewriteRule ^/schedule$ https://canvas.harvard.edu/courses/132j062/assignments/syllabus [R=302]
RewriteRule ^/textbooks /harvardcoop_textbooks.html [R=302]
RewriteRule ^/sshclients https://canvas.harvard.edu/courses/132j062/pages/ssh-clients-remote-login [R=302]
RewriteRule ^/sftpclients https://canvas.harvard.edu/courses/132j062/pages/file-transfer-sftp [R=302]
RewriteRule ^/sections https://canvas.harvard.edu/courses/132j062/pages/section-meetings-csci-e-12-15078 [R=302]

Redirect 302 /syllabus    https://canvas.harvard.edu/courses/132j062/assignments/syllabus
Try it:

Rewrite

mod_rewrite uses regular expressions to match on a pattern and rewrite incoming URLs to a new URL location.


Using mod_rewrite from within .htaccess

If you use RewriteRule from within an .htaccess file, you will usually need the RewriteBase directive as well. It sets the URL prefix that is used when the target of a rule is a relative path.
See: http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritebase

Example - Make Simple Links Instead of Complex Ones

Context: a Parks and Recreation class is being offered, and I want an easy way to link directly to that class.

Park and Rec system:
https://webtrac.littletonrec.com/wbwsc/webtrac.wsc/wbsearch.html

Link I can use with Rewrite rule
http://littletontrack.org/lpr-303107

RewriteEngine On
RewriteBase /
RewriteRule ^lpr-(.*)$ https://webtrac.littletonrec.com/wbwsc/webtrac.wsc/wbsearch.html?per=10&xxsearch=yes&xxdispmap=no+&xxmulti-list=&xxmulti-lbls=&xxrowid=&xxmod=ar&xxactivitynumber=$1&xxage=&xxgrade=&xxkeyword=&xxkeywordoption=N&xxtype=&xxcategory=&xxsortoption=ActivityNumber&xxdisplayoption=D&xxsubmit=Search
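The same pattern-and-capture idea can be tried out in JavaScript, which is a handy way to test a regular expression before putting it in .htaccess. In the sketch below the target URL is a shortened, hypothetical stand-in (example.com) for the long search URL above, and the function name is made up. Note that in an .htaccess file at the site root, the pattern is matched against the path without its leading slash, so a request for /lpr-303107 is matched as lpr-303107.

```javascript
// Hypothetical stand-in for the RewriteRule above:
// capture everything after "lpr-" and place it into the target URL.
function rewriteShortLink(path) {
  const match = /^lpr-(.*)$/.exec(path);
  if (!match) {
    return null; // the rule does not apply; the request is handled normally
  }
  const activityNumber = encodeURIComponent(match[1]);
  return `https://example.com/search?activity=${activityNumber}`;
}

console.log(rewriteShortLink('lpr-303107')); // https://example.com/search?activity=303107
console.log(rewriteShortLink('about.html')); // null
```

The (.*) group in the pattern plays the role of $1 in the RewriteRule: whatever it captures is substituted into the target.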

Example: Create Links that can always point to the correct place

Road Race Registration is done through a 3rd party service, SignMeUp

Redirect  /registration https://www.signmeup.com/site/reg/register.aspx?fid=B42VRH7

Redirect /map http://maps.google.com/maps/ms?ie=UTF8&hl=en&msa=0&msid=101999702593116464805.00046f1a27a9feb5aacaf&ll=42.52946,-71.485934&spn=0.018975,0.018239&z=15

My Example Project - .htaccess setting to improve Webpagetest scores!

Nature of America - My Example Project Site

.htaccess file:


# default to index.html
DirectoryIndex index.html

# BEGIN Expire headers
<IfModule mod_expires.c>
  # Turn on the module.
  ExpiresActive on
  # Set the default expiry times.
  ExpiresDefault "now"
  ExpiresByType image/jpg "access plus 1 month"
  ExpiresByType image/svg+xml "access 1 month"
  ExpiresByType image/gif "access plus 1 month"
  ExpiresByType image/jpeg "access plus 1 month"
  ExpiresByType image/png "access plus 1 month"
  ExpiresByType text/css "access plus 1 month"
  ExpiresByType text/javascript "access plus 1 month"
  ExpiresByType application/javascript "access plus 1 month"
  ExpiresByType image/ico "access plus 1 month"
  ExpiresByType image/x-icon "access plus 1 month"
  ExpiresByType text/html "access plus 600 seconds"
</IfModule>
# END Expire headers

# Security Policy that determines domains that resources can load from
<IfModule mod_headers.c>
  Header set Strict-Transport-Security "max-age=2592000; includeSubDomains; preload"
  Header set Content-Security-Policy: "default-src 'self'; img-src 'self' cdn.jsdelivr.net *.openstreetmap.org cdnjs.cloudflare.com; script-src 'self' 'unsafe-eval' code.jquery.com cdn.jsdelivr.net *.cloudflare.com; style-src 'self' 'unsafe-inline' *.jsdelivr.net *.cloudflare.com fonts.gstatic.com fonts.googleapis.com; font-src 'self' fonts.gstatic.com fonts.googleapis.com"
  Header set X-Frame-Options: DENY
</IfModule>

# compress (DEFLATE) files that are text
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/css text/javascript application/javascript application/json
</IfModule>
Options -Indexes

# All errors will go to a common error file
ErrorDocument 404 /underconstruction.html
ErrorDocument 403 /underconstruction.html
ErrorDocument 500 /underconstruction.html

# Shouldn't publish from a git checkout anyway,
#   but just in case, send requests trying to access .git to 404
RedirectMatch 404 /\.git

Getting JSON from HTTP request

JavaScript functions are often asynchronous -- the script continues to run before a step has completed!

You can use async with await or you can chain steps in sequence with fetch.then().then()

JavaScript "fetch"

async/await

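async function fetchData() {
  try {
    // Wait for the HTTP response to arrive
    const response = await fetch('https://api.example.com/data');
    // fetch only rejects on network failure, so check for 4xx/5xx ourselves
    if (!response.ok) {
      throw new Error(`HTTP status ${response.status}`);
    }
    // Parse the response body as JSON
    const data = await response.json();
    console.log(data);
  } catch (error) {
    console.error('Error:', error);
  }
}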

fetch().then().then()

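fetch('https://api.example.com/data')
  .then((response) => {
    // fetch only rejects on network failure, so check for 4xx/5xx ourselves
    if (!response.ok) {
      throw new Error(`HTTP status ${response.status}`);
    }
    // Parse the response body as JSON; this returns another promise
    return response.json();
  })
  .then((data) => {
    console.log(data);
  })
  .catch((error) => {
    console.error('Error:', error);
  });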

Let's Try It Out!

There are others you can experiment with, such as the National Park Service API and the MLB API.


Note: You may find a "JSON Viewer" plugin or extension for your browser useful if you work with JSON