Session 01 - The Internet, World Wide Web, Web Browsers, Web Sites, and HTML

Harvard Extension School  
Fall 2020

Course Web Site: https://cscie12.dce.harvard.edu/

Topics

  1. The Internet and the World Wide Web
  2. A Web Site over Time
  3. Components of the Web
  4. Client-side Web Parts: Markup, Style, Function
  5. HTML Introduction
  6. Markup Evolution and Standards
  7. HTML/SGML/XML — What's the Difference?
  8. HTML5
  9. File Management
  10. Relative URLs
  11. URL to Filename Mapping

Presentation contains 33 slides

The Internet and the World Wide Web

The Internet

Image from the Opte Project, used under the Creative Commons Attribution-NonCommercial 4.0 International License.

The Internet: Schematic

Internet

Types of Traffic on the Internet

Internet

Tim Berners-Lee on The World Wide Web

Suppose all the information stored on computers everywhere were linked. Suppose I could program my computer to create a space in which everything could be linked to everything.

Tim Berners-Lee
Web at 25

Today, and throughout this year, we should celebrate the Web’s first 25 years. But though the mood is upbeat, we also know we are not done. We have much to do for the Web to reach its full potential. We must continue to defend its core principles and tackle some key challenges.

Tim Berners-Lee in Welcome to the Web's 25th Anniversary
Long Live the Web

The Web evolved into a powerful, ubiquitous tool because it was built on egalitarian principles and because thousands of individuals, universities and companies have worked, both independently and together as part of the World Wide Web Consortium, to expand its capabilities based on those principles.

Tim Berners-Lee in Long Live the Web (Scientific American, Nov/Dec 2010)
Weaving the Web

The irony is that in all its various guises -- commerce, research, and surfing -- the Web is already so much a part of our lives that familiarity has clouded our perception of the Web itself.

Tim Berners-Lee in Weaving the Web (1999)

Features of the World Wide Web

Approaching the Web

A Web Site over Time

The White House Site (www.whitehouse.gov)

1996
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 1996

1997
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 1997

1998
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 1998

1999
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 1999

2001
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 2001

2002
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 2002

2007
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 2007

2009
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 2009

2011
whitehouse.gov
Internet Archive Wayback Machine - whitehouse.gov 2011

2015 - Early
whitehouse.gov

2015 - Late
whitehouse.gov

2017 - August
whitehouse.gov

2018 - January
whitehouse.gov

As of August 2020, the design remains essentially the same as in January 2018.

A Web Address - URLs (and URIs)

URL/URI
https://www.archives.gov/historical-docs/voting-rights-act
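
A minimal sketch, in Python, of how the example address above breaks down into its parts. It relies on the standard library's urllib.parse module; the helper name url_parts is made up for this illustration and is not part of the course materials.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return the main components of a URL as a dictionary."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # protocol used to fetch the resource
        "host": parts.hostname,   # name of the server
        "port": parts.port,       # None when the URL does not name a port
        "path": parts.path,       # location of the resource on that server
    }

example = "https://www.archives.gov/historical-docs/voting-rights-act"
for name, value in url_parts(example).items():
    print(f"{name}: {value}")
```

When no port is written in the URL, the browser falls back to the default for the scheme (443 for https, 80 for http).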

Aside: Names and Locations: URLs, URIs, and URNs

URI, URN, URL

A book example ("Leadership in Turbulent Times" by Doris Kearns Goodwin).

Both are URIs: one is a URN and the other is a URL.

Components of the Web

web parts

1. HTTP Client

http client

2. HTTP Server

server-side
server

3. Network

Network connecting HTTP client with server.

Client-side Web Parts: Markup, Style, Function

web parts

Our Solar System: Markup

solarsystem-markup.html
markup

Our Solar System: Markup + Style

solarsystem-style.html
markup + style

Our Solar System: Markup + Style + Function

solarsystem.html
markup + style + function

Files:

HTML Introduction

HTML5

Markup - HTML

The Code

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>My Schools</title>
  </head>
  <body>
    <h1>My Schools</h1>
    <ul>
      <li>
        <a href="https://www.harvard.edu/">Harvard University</a><br/>
        <img src="images/harvard-shield.png" alt="Harvard Shield" />
      </li>
      <li>
        <a href="https://www.ku.edu/">University of Kansas</a><br/>
        <img src="images/kansas-jayhawk.png" alt="University of Kansas Jayhawk" />
      </li>
    </ul>
  </body>
</html>

How a Browser Displays It

web page

How Your Browser Thinks About It

dom tree
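
To get a rough feel for the tree a browser builds, the sketch below prints the elements of a shortened version of the "My Schools" page, indented by nesting depth. It uses Python's standard html.parser module, which is far simpler than a real browser parser; the names OutlineParser and element_outline are hypothetical, and only element nodes are listed (text nodes are left out).

```python
from html.parser import HTMLParser

# Elements that never have content or an end tag (a partial list).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}

class OutlineParser(HTMLParser):
    """Collect element names, indented by how deeply they are nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in VOID_ELEMENTS:
            self.depth += 1   # children of this element are one level deeper

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:
            self.depth -= 1   # back up to the parent's level

def element_outline(markup):
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.lines

PAGE = """<!DOCTYPE html>
<html lang="en">
  <head>
    <title>My Schools</title>
  </head>
  <body>
    <h1>My Schools</h1>
    <ul>
      <li>
        <a href="https://www.harvard.edu/">Harvard University</a><br/>
        <img src="images/harvard-shield.png" alt="Harvard Shield" />
      </li>
    </ul>
  </body>
</html>"""

print("\n".join(element_outline(PAGE)))
```

The printed outline mirrors the parent/child relationships in the markup: html contains head and body, body contains h1 and ul, and so on.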

Essential HTML5 Document Structure

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Document Title</title>
  </head>
  <body>
    <!-- content goes here -->
  </body>
</html>

html5 skeleton nodes


Components of HTML Elements

A Hypertext Link

Markup for a Hypertext link:

<a href="http://www.harvard.edu/">Harvard</a>

How it would render in a web browser:
Harvard Link in a web browser


element anatomy


Start Tag
<a href="http://www.harvard.edu/">

Element Name
a

Attribute
href="http://www.harvard.edu/"

Attribute Value
http://www.harvard.edu/

Content
Harvard

End Tag
</a>
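
The same parts can be picked out programmatically. This is only an illustrative sketch using Python's standard html.parser module; the class name AnatomyParser and the function anatomy are invented for the example.

```python
from html.parser import HTMLParser

class AnatomyParser(HTMLParser):
    """Record the parts of each element in the order they are encountered."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(("element name", tag))
        for name, value in attrs:
            self.parts.append(("attribute", name))
            self.parts.append(("attribute value", value))

    def handle_data(self, data):
        self.parts.append(("content", data))

    def handle_endtag(self, tag):
        self.parts.append(("end tag", tag))

def anatomy(markup):
    parser = AnatomyParser()
    parser.feed(markup)
    parser.close()
    return parser.parts

for label, text in anatomy('<a href="http://www.harvard.edu/">Harvard</a>'):
    print(f"{label}: {text}")
```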

Elements, Start Tags, Attributes and values, End Tags, Content

element anatomy

Element Names

Attributes and Values

Content

End Tags

Markup Evolution and Standards

markup evolution

Markup Standards

Benefits of Web Standards

Standards for:
Benefits:
"Be liberal in what you accept, and
conservative in what you send"

"Postel's Law" or the "Robustness Principle"

HTML/SGML/XML — What's the Difference?

Main differences between HTML/SGML and XML:

1. HTML: End tags can be "implied"
Closing elements that have implied end tags

<img src="images/drink.jpg" alt="Lake" >

<ul>
    <li>coffee
    <li>tea
</ul>

XML: End tags always required
(even for "empty" elements)

<img src="images/drink.jpg" alt="Lake" />

<ul>
    <li>coffee</li>
    <li>tea</li>
</ul>

2. HTML: Start tags can be "implied"

<!DOCTYPE html>
<head>
<title>My Document</title>
<body>
<h1>My Document</h1>

XML: Start tags always required

<!DOCTYPE html>
<html>
<head>
<title>My Document</title>
</head>
<body>
<h1>My Document</h1>
</body>
</html>

3. HTML: Element and attribute names are not case-sensitive

<IMG SrC="images/lake.jpg" aLT="Lake" >

XML: Element and attribute names are case-sensitive

<img src="images/lake.jpg" alt="Lake" />

4. HTML: Attribute values do not need to be in quotes if the values contain alpha-numeric characters only

<img src=images/lake.jpg alt=Lake >

XML: Attribute values must always be in quotes

<img src="images/lake.jpg" alt="Lake" />
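
One way to see the stricter XML rules in action is to hand both styles of markup to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed_xml is made up for this illustration. Markup that a browser's HTML parser would accept without complaint is rejected here when it omits end tags, mixes case between start and end tags, or leaves attribute values unquoted.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

samples = [
    "<ul><li>coffee<li>tea</ul>",                  # implied end tags
    "<ul><li>coffee</li><li>tea</li></ul>",        # all end tags present
    "<img src=images/lake.jpg alt=Lake />",        # unquoted attribute values
    '<img src="images/lake.jpg" alt="Lake" />',    # quoted attribute values
    "<UL><li>coffee</li></ul>",                    # start/end tag case mismatch
]

for markup in samples:
    verdict = "well-formed" if is_well_formed_xml(markup) else "NOT well-formed"
    print(f"{verdict}: {markup}")
```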

Best Practices for Starting Out

A Tale of Two Documents

XML Syntax

<!DOCTYPE html>
<html lang="en">
<head>
    <title>My Document</title>
    <meta charset="utf-8" />
</head>
<body>
    <h1>My Document</h1>
    <ul>
        <li>coffee</li>
        <li>tea</li>
    </ul>
    <img src="images/mug.jpg" alt="mug" />
</body>
</html>

SGML/HTML Syntax

<!DOCTYPE html>
<HEAD>
    <TITLE>My Document</TITLE>
    <META CHARSET=utf-8 >
<BODY>
    <H1>My Document</H1>
    <UL>
        <LI>coffee
        <LI>tea
    </UL>
    <IMG SRC=images/mug.jpg ALT=Mug >

Cleaner version of SGML/HTML Syntax

Of course, you can use the SGML/HTML syntax and still write HTML that looks better. Just because the syntax allows you to shorten things and leave things out doesn't mean you have to.
Like this:

<!DOCTYPE html>
<html lang="en">
<head>
    <title>My Document</title>
    <meta charset="utf-8" >
</head>
<body>
    <h1>My Document</h1>
    <ul>
        <li>coffee
        <li>tea
    </ul>
    <img src="images/mug.jpg" alt="Mug" >
</body>
</html>

HTML5

116 elements defined in HTML5

More information: HTML Living Standard from the WHATWG. Section 4 contains the List of elements in HTML.

I've highlighted the 23 elements that you will use and/or see most commonly.

Most commonly used or seen elements

Start with these 24: these are elements you will use in most of your web pages, or that you'll find in a majority of web pages.

How to find out more about them? Two places that I would start are:

Page Structure - header, main, footer

First, recall the basic document structure:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Document Title</title>
  </head>
  <body>
    <!-- content goes here -->
  </body>
</html>

header, main, footer

MDN HTML elements reference: header, main, footer.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Document Title</title>
  </head>
  <body>
    <header> <!-- page header --> </header>
    <main> <!-- main content goes here --> </main>
    <footer> <!-- page footer --> </footer>
  </body>
</html>

HTML5 Document Template

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Document Title</title>
  </head>
  <body>
    <header> <!-- page header --> </header>
    <main> <!-- main content goes here --> </main>
    <footer> <!-- page footer --> </footer>
  </body>
</html>

File Management

For Class

For Web Sites

Relative URLs

URL
https://www.archives.gov/historical-docs/voting-rights-act

Absolute and Relative Locations

Relative locations (URLs) are resolved according to the location (URL) of the containing (starting) document!

Absolute or Fully Qualified URLs

Absolute, or fully-qualified, URLs specify the complete information (scheme, host, port, path).

https://news.harvard.edu/gazette/story/2020/07/public-health-experts-unite-to-bring-clarity-to-coronavirus-response/

Relative or Partial URLs

Relative, or partial, URLs specify only part of the information. The information not provided is resolved from the current location.

<a href="slide2.html">Slide 2</a>

Relative to Server Root

Is this relative or absolute? The scheme, host, and port are resolved from the current location, but the path is absolute.

<a href="/copyright.html">copyright information</a>
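
A small sketch of how these three kinds of links resolve, using urljoin from Python's standard urllib.parse module, which follows the same resolution rules a browser applies. The containing page used here (https://www.madeupschool.edu/museums/index.html) is an assumption chosen for the illustration, and the helper name resolve is hypothetical.

```python
from urllib.parse import urljoin

# Assumed location of the document that contains the links.
CONTAINING_PAGE = "https://www.madeupschool.edu/museums/index.html"

def resolve(href, base=CONTAINING_PAGE):
    """Resolve an href value against the URL of the containing document."""
    return urljoin(base, href)

links = [
    "https://news.harvard.edu/gazette/",  # absolute: used exactly as written
    "slide2.html",                        # relative: same directory as the page
    "/copyright.html",                    # relative to server root
]

for href in links:
    print(f"{href}  ->  {resolve(href)}")
```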

Relative Paths to Parent Locations

Location:
https://www.madeupschool.edu/museums/index.html
Relative URL                     Resolved URL
../index.html                    https://www.madeupschool.edu/index.html
../arts/index.html               https://www.madeupschool.edu/arts/index.html
../images/museum_building.jpg    https://www.madeupschool.edu/images/museum_building.jpg

Relative links are "transportable":

Containing Page:
https://stage.madeupschool.edu/museums/index.html
Relative Link                    Document
../index.html                    https://stage.madeupschool.edu/index.html
../arts/index.html               https://stage.madeupschool.edu/arts/index.html
../images/museum_building.jpg    https://stage.madeupschool.edu/images/museum_building.jpg
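
The "transportable" property can be checked with a short sketch: the same relative links are resolved against the production page and the staging page, again with urljoin from Python's standard library. The function name resolve_all is made up for this example.

```python
from urllib.parse import urljoin

RELATIVE_LINKS = [
    "../index.html",
    "../arts/index.html",
    "../images/museum_building.jpg",
]

def resolve_all(containing_page, links=RELATIVE_LINKS):
    """Resolve each relative link against the containing page's URL."""
    return [urljoin(containing_page, link) for link in links]

for page in (
    "https://www.madeupschool.edu/museums/index.html",
    "https://stage.madeupschool.edu/museums/index.html",
):
    print(page)
    for link, resolved in zip(RELATIVE_LINKS, resolve_all(page)):
        print(f"  {link}  ->  {resolved}")
```

The markup does not change when the site moves from one host to the other; only the resolved URLs do.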

URL to Filename Mapping

User directories in a shared environment

Web documents for each user are kept in the user's home directory, in a directory typically named public_html. As an example, for the user jharvard, whose home directory is /home/courses/j/h/jharvard:

URI   https://cs12students.dce.harvard.edu/~jharvard/index.html
File  /home/courses/j/h/jharvard/public_html/index.html
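
The mapping for user directories can be sketched as follows. This is only an illustration of the idea, not how any particular web server is configured: the function name user_url_to_file and the HOME_DIRS lookup table are hypothetical, and a real server finds home directories through the operating system.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical lookup table from user name to home directory.
HOME_DIRS = {"jharvard": "/home/courses/j/h/jharvard"}

def user_url_to_file(url, home_dirs=HOME_DIRS, public_dir="public_html"):
    """Map a /~user/... URL to a file under that user's public_html directory."""
    path = urlsplit(url).path                  # e.g. "/~jharvard/index.html"
    if not path.startswith("/~"):
        raise ValueError("not a user-directory URL")
    user, _, rest = path[2:].partition("/")    # "jharvard", "index.html"
    return posixpath.join(home_dirs[user], public_dir, rest)

print(user_url_to_file("https://cs12students.dce.harvard.edu/~jharvard/index.html"))
```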

Document Root

The Web documents are typically kept under a single directory, traditionally named htdocs. The full path to this directory is called the "document root" of the Web server, for example, /www/htdocs.

URI   https://www.unicorns-r-us.com/jobs/index.html
File  /www/unicorns-r-us.com//jobs/index.html

Directory Requests and "index.html"

Some URL paths map to a directory rather than to a file. For example, a request for http://www.madeupschool.edu/museums/ would return the index.html page in the museums directory.
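
A simplified sketch of document-root mapping, including the index.html rule for directory requests. It assumes the document root /www/htdocs mentioned above and treats any path ending in "/" as a directory request; a real server checks the file system and applies its own configuration, so treat the function url_to_filename as a hypothetical illustration only.

```python
import posixpath
from urllib.parse import urlsplit

def url_to_filename(url, document_root="/www/htdocs", index_name="index.html"):
    """Map a URL path to a file path under the server's document root."""
    path = urlsplit(url).path or "/"    # an empty path means the site root
    if path.endswith("/"):
        path += index_name              # directory request: serve the index page
    return posixpath.join(document_root, path.lstrip("/"))

for url in (
    "http://www.madeupschool.edu/museums/",
    "http://www.madeupschool.edu/museums/index.html",
):
    print(f"{url}  ->  {url_to_filename(url)}")
```

Both requests end up at the same file, which is why links to a directory and links to its index.html show the same page.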