Media types
An Internet media type is, generally speaking, a property of a
data set, describing both the general type of data (such as
"text" or "image" or "application";
the last one refers to program-specific internal data formats)
and, as a subtype, a specific format for the data. The concept
was originally defined as "MIME content types".
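The type/subtype structure described above can be illustrated with a small sketch. The function name split_media_type is hypothetical, and the handling of parameters after a semicolon (such as a charset) is an assumption added for illustration, not something this document specifies.

```python
def split_media_type(media_type):
    """Split a media type such as 'text/html' into (type, subtype)."""
    # Assumption: anything after ';' is a parameter and is ignored here.
    essence = media_type.split(";", 1)[0].strip()
    top_level, sep, subtype = essence.partition("/")
    if not sep or not top_level or not subtype:
        raise ValueError(f"not a type/subtype pair: {media_type!r}")
    # Media type names are case-insensitive, so normalize to lowercase.
    return top_level.lower(), subtype.lower()


for example in ("text/html", "image/jpeg", "application/zip"):
    general_type, specific_format = split_media_type(example)
    print(f"{example}: type={general_type}, subtype={specific_format}")
```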
Media types relate to HTML as follows:
- When a Web server sends an HTML document, it should
specify the correct media type (text/html) in the HTTP
headers it sends along with the document. Normally
servers are configured to do this by default when the
file name ends with .html or .htm (depending on the
system; please consult local documentation).
- In a FORM element, the value of
the ENCTYPE attribute specifies
the media type to be used when encoding and sending the
content of the form.
- When referring to various resources, such as embedding
images using IMG elements or
linking to binary files using an A
element, there is no way to tell the media type in HTML.
Things must be handled in the server. Typically, a Web
server uses some mapping table to map file name
extensions to media types (e.g., mapping the extension
.zip to the media type application/zip), and it may
provide users with some tools for overriding such mappings or
otherwise specifying the media type to be associated with
a file or set of files. The description of the A element
contains some additional notes related to audio and video and binary files in general.
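The kind of extension-to-media-type mapping table mentioned in the last item can be sketched with Python's standard mimetypes module. This is only an illustration of the idea, not how any particular Web server is implemented; the function name media_type_for, the fallback type, and the .log override are assumptions made for the example.

```python
import mimetypes

# A private table seeded with Python's built-in defaults, so that the
# override below does not affect the process-wide table.
table = mimetypes.MimeTypes()


def media_type_for(filename, default="application/octet-stream"):
    """Return the media type mapped to the file name's extension."""
    media_type, _encoding = table.guess_type(filename)
    # Assumption: fall back to a generic binary type for unknown extensions.
    return media_type or default


# Hypothetical local override, similar in spirit to the tools a server
# may offer for changing or adding mappings.
table.add_type("text/plain", ".log")

for name in ("archive.zip", "index.htm", "server.log"):
    print(name, "->", media_type_for(name))
```

A real server keeps its table in its own configuration, so the mappings it applies may differ from Python's defaults.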
The HTML 3.2 Reference Specification refers to RFC 1521, but
that specification was superseded by RFC 2046 (in
November 1996). The procedure for registering types is given in RFC 2048;
according to it, the registry is kept at ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
For less authoritative but more readably presented
information, see the document MIME Types by Chris Herborth.
In addition to standardized media types, there are nonstandard
media types which are nevertheless supported in practice by
popular servers and browsers. Appendix B of Special Edition
Using CGI lists many of them.
You can check the media type information sent by a server
as follows. Assuming we are interested in the media type of
the document at URL http://host/path, establish a Telnet
connection to host, using the port number in the URL if
present, port 80 otherwise. Then give the command
HEAD /path HTTP/1.0
followed by an empty line. Example (where the Telnet connection
is established by starting the telnet program from the Unix
command level):
beta ~ 51 % telnet www.hut.fi 80
Trying 130.233.224.28...
Connected to info-e.hut.fi.
Escape character is '^]'.
HEAD /home/jkorpela/perhe.jpg HTTP/1.0
HTTP/1.1 200 OK
Date: Tue, 23 Sep 1997 12:37:05 GMT
Server: Apache/1.2.4
Last-Modified: Tue, 08 Aug 1995 08:29:53 GMT
ETag: "16391-9232-30272081"
Content-Length: 37426
Accept-Ranges: bytes
Connection: close
Content-Type: image/jpeg
Connection closed by foreign host.
beta ~ 52 % exit
Here the Content-Type: field shows that the media
type is image/jpeg.
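The same check can be scripted instead of typed into Telnet. The following is a minimal sketch using Python's standard http.client module; the function names head_target and content_type_of are hypothetical, the sketch handles only http:// URLs, and http.client sends the request as HTTP/1.1 rather than the HTTP/1.0 used in the session above.

```python
import http.client
from urllib.parse import urlsplit


def head_target(url):
    """Return (host, port, path) to use for a HEAD request on url."""
    parts = urlsplit(url)
    # Use the port number in the URL if present, port 80 otherwise.
    port = parts.port if parts.port is not None else 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, port, path


def content_type_of(url, timeout=10.0):
    """Send a HEAD request and return the Content-Type value, or None."""
    host, port, path = head_target(url)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        # A HEAD response carries headers only, so there is no body to read.
        return conn.getresponse().getheader("Content-Type")
    finally:
        conn.close()
```

Calling content_type_of with the URL of a document returns whatever value the server currently sends in its Content-Type field, or None if the field is absent.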