The Document Object Model (DOM) represents XML or HTML documents as a tree of nodes. Using DOM methods and properties, you can access any element on the page, modify or delete elements, and add new ones. DOM is a language-independent Application Programming Interface (API) that can be implemented not only in JavaScript, but also in any other programming language. For example, you can generate pages on the server side using the PHP implementation of DOM (php.net/dom).
Any HTML document can be represented as a DOM tree, where every node except the root has a parent and may have children. Each node in this tree is an object with its own properties and methods. Whitespace between tags and comments are also represented as nodes in the DOM.
For all examples, we will use a simple HTML document, shown below. A separate page, dom_example.html, has been created for viewing it and working with the console.
<!DOCTYPE html>
<html>
<head>
<title>My page</title>
</head>
<body>
<p class="opener">first paragraph</p>
<p><em>second</em> paragraph</p>
<p id="closer">final</p>
<!-- and that's about it -->
</body>
</html>
The document node represents the entire HTML or XML document and is the root of the DOM tree. It is also the starting point for accessing any element or node in the document.
You can access the document node using the document object, which is a global variable in web browsers.
To investigate this node, execute the command console.dir(document). The console.dir command will display all properties and methods of this node.
All nodes (including the document node, text nodes, element nodes, and attribute nodes) have nodeType, nodeName, and nodeValue properties.
> document.nodeType;
9
There are 12 node types, each represented by an integer. As you can see, the document node is represented by the number 9. The most commonly encountered types are 1 (element), 2 (attribute), and 3 (text).
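To make the numeric codes easier to read while experimenting, a small lookup can translate them into names. This is only a sketch: the helper name nodeTypeName and the table NODE_TYPE_NAMES are assumptions introduced here, and the table lists only the types most likely to appear in an HTML page.

```javascript
// Names of the node types most often met in HTML documents.
const NODE_TYPE_NAMES = {
  1: 'ELEMENT_NODE',
  2: 'ATTRIBUTE_NODE',
  3: 'TEXT_NODE',
  8: 'COMMENT_NODE',
  9: 'DOCUMENT_NODE',
  10: 'DOCUMENT_TYPE_NODE',
  11: 'DOCUMENT_FRAGMENT_NODE'
};

// Hypothetical helper: returns a readable name for any object with a nodeType.
function nodeTypeName(node) {
  return NODE_TYPE_NAMES[node.nodeType] || 'UNKNOWN(' + node.nodeType + ')';
}
```

In the browser console, nodeTypeName(document) should report DOCUMENT_NODE, matching the value 9 shown above.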
All nodes also have names. For HTML tags, the node name is the tag name (the tagName property). Text nodes have the name #text, and the document node is named #document:
> document.nodeName;
"#document"
Nodes can also have a value associated with them. For example, for text nodes, the value is the text content itself. The document node doesn't have any value:
> document.nodeValue;
null
The documentElement property of the document object represents the root element of the document. This means that documentElement refers to the <html> element in an HTML document.
The root element is the ancestor of all other elements in the document: every other element is nested somewhere inside it.
> document.documentElement;
<html>…</html>
Its nodeType equals 1, which corresponds to an element node:
> document.documentElement.nodeType;
1
For element nodes, the nodeName and tagName properties contain the name of the tag itself:
> document.documentElement.nodeName;
"HTML"
> document.documentElement.tagName;
"HTML"
To check if a node has child nodes, you can call the hasChildNodes() method:
> document.documentElement.hasChildNodes();
true
The html element has three child nodes: head, body, and a whitespace text node between them (most, but not all, browsers create these whitespace nodes). You can access the child nodes using the childNodes property:
> document.documentElement.childNodes.length;
3
> document.documentElement.childNodes[0];
<head>…</head>
> document.documentElement.childNodes[1];
#text
> document.documentElement.childNodes[2];
<body>…</body>
Each child node has a parentNode property that can be used to access its parent node:
> document.documentElement.childNodes[1].parentNode;
<html>…</html>
Let's save a reference to the body node:
> const bd = document.documentElement.childNodes[2];
Let's see how many child nodes the body has:
> bd.childNodes.length;
9
Let's recall what's inside the body tag:
<body>
<p class="opener">first paragraph</p>
<p><em>second</em> paragraph</p>
<p id="closer">final</p>
<!-- and that's about it -->
</body>
But why does the body node contain 9 child nodes? The 3 p nodes and 1 comment node make up 4 child nodes. The whitespace between these 4 nodes gives us an additional 3 text nodes, for a total of 7. The whitespace between the <body> tag and the first <p> tag is the eighth child node, and the whitespace between the comment and the closing </body> tag is the ninth. To check these claims, you can run the command bd.childNodes in the console.
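The count described above can also be checked programmatically by tallying the children by node type. The following is a minimal sketch; the helper name countChildTypes is an assumption introduced for illustration and is not part of the DOM API.

```javascript
// Hypothetical helper: tallies the direct children of a node by nodeType.
function countChildTypes(node) {
  const counts = {};
  for (let i = 0; i < node.childNodes.length; i++) {
    const type = node.childNodes[i].nodeType;
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}
```

On the sample page, countChildTypes(bd) should report 3 nodes of type 1 (the paragraphs), 5 of type 3 (the whitespace text nodes), and 1 of type 8 (the comment), which adds up to 9.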
Since the first child node of the body node is a whitespace node, the second node (index 1) is the first paragraph in our HTML document:
> bd.childNodes[1];
<p class="opener">first paragraph</p>
To check whether an element has attributes, use the hasAttributes() method:
> bd.childNodes[1].hasAttributes();
true
How many attributes? In our example, there is 1 attribute, class:
> bd.childNodes[1].attributes.length;
1
You can access attributes both by index and by name. You can also get the value of an attribute using the getAttribute() method:
> bd.childNodes[1].attributes[0].nodeName;
"class"
> bd.childNodes[1].attributes[0].nodeValue;
"opener"
> bd.childNodes[1].attributes['class'].nodeValue;
"opener"
> bd.childNodes[1].getAttribute('class');
"opener"
Let's take a look at the first paragraph of our document:
> bd.childNodes[1].nodeName;
"P"
You can retrieve the text content inside a paragraph by using the textContent property. The textContent property does not exist in older versions of IE, but another property, innerText, returns the same value in this example:
> bd.childNodes[1].textContent;
"first paragraph"
> bd.childNodes[1].innerText;
"first paragraph"
There is also the innerHTML property, which returns (or sets) the HTML code contained within the node. This behavior somewhat contradicts the DOM model, which represents the document as a tree of nodes rather than a string of tags. However, the innerHTML property has proven so convenient that it is widely used.
> bd.childNodes[1].innerHTML;
"first paragraph"
The first paragraph contains only text, so both innerHTML and textContent (innerText in IE) will return the same value. However, the second paragraph contains an em node, so we can observe the differences in the properties:
> bd.childNodes[3].innerHTML;
"<em>second</em> paragraph"
> bd.childNodes[3].textContent;
"second paragraph"
Another way to retrieve the text within the first paragraph is to use the nodeValue property of the text node contained within the p element:
> bd.childNodes[1].childNodes.length;
1
> bd.childNodes[1].childNodes[0].nodeName;
"#text"
> bd.childNodes[1].childNodes[0].nodeValue;
"first paragraph"
By using the childNodes, parentNode, nodeName, nodeValue, and attributes properties, you can traverse the DOM tree and perform various operations on document nodes. However, the fact that whitespace is also represented as text nodes makes this traversal method unreliable: if the page structure changes, your script may no longer work correctly. Additionally, reaching a deeply nested element this way means stepping through every level above it first. That's where the fast access methods come into play, namely getElementsByTagName(), getElementsByName(), and getElementById().
getElementsByTagName() takes a tag name (element node name) as an argument and returns a collection (array-like object) of nodes that match the tag name. For example, the following script will count the number of paragraphs (the <p> tag) in the document:
> document.getElementsByTagName('p').length;
3
You can access each element of the collection using square brackets or the item() method, specifying the index of the desired element (0 for the first element). For example:
> document.getElementsByTagName('p')[0];
<p class="opener">first paragraph</p>
> document.getElementsByTagName('p').item(0);
<p class="opener">first paragraph</p>
To retrieve the content of the first <p> tag, you can use the innerHTML property:
> document.getElementsByTagName('p')[0].innerHTML;
"first paragraph"
To access the last <p> tag:
> document.getElementsByTagName('p').item( document.getElementsByTagName('p').length - 1 );
<p id="closer">final</p>
To access the attributes of an element, you can use the attributes collection or the getAttribute() method as shown before. However, a shorter way is to use the attribute name as a property of the element you're working with. For example, to get the value of the id attribute, you can write:
> document.getElementsByTagName('p')[2].id;
"closer"
However, this approach will not work for the class attribute, because class is a reserved word in ECMAScript. Use the className property instead:
> document.getElementsByTagName('p')[0].className;
"opener"
By passing '*' to the getElementsByTagName() method, you can retrieve an array-like collection of all elements on the page:
> document.getElementsByTagName('*').length;
8
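To see which elements make up that total, the collection can be tallied by tag name. This is a minimal sketch; the helper name tallyTagNames is an assumption introduced for illustration, and it only relies on the length and tagName properties described earlier.

```javascript
// Hypothetical helper: counts how many times each tag name occurs
// in an array-like collection of elements.
function tallyTagNames(elements) {
  const tally = {};
  for (let i = 0; i < elements.length; i++) {
    const name = elements[i].tagName;
    tally[name] = (tally[name] || 0) + 1;
  }
  return tally;
}
```

On the sample page, tallyTagNames(document.getElementsByTagName('*')) should list HTML, HEAD, TITLE, BODY, and EM once each and P three times, which accounts for the 8 elements.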
getElementById() is the most common method for accessing elements. You simply assign an id attribute to the elements you intend to work with later, and then access them using the following approach:
> document.getElementById('closer');
<p id="closer">final</p>
Additional methods for fast access, introduced in modern browsers:
getElementsByClassName(): searches for elements by the value of the class attribute.
querySelector(): searches for an element based on a specified CSS selector.
querySelectorAll(): similar to the previous method, except that it returns all matching elements, not just the first one.
nextSibling and previousSibling are two convenient properties for navigating the DOM tree when you already have a reference to a specific element.
nextSibling refers to the next sibling node, which is the next element or node at the same level in the DOM hierarchy.
previousSibling refers to the previous sibling node, which is the previous element or node at the same level in the DOM hierarchy.
These properties allow you to traverse the DOM tree horizontally, moving to the next or previous sibling element or node from the current reference point.
> var para = document.getElementById('closer');
> para.nextSibling;
#text
> para.previousSibling;
#text
> para.previousSibling.previousSibling;
<p>…</p>
> para.previousSibling.previousSibling.previousSibling;
#text
> para.previousSibling.previousSibling.nextSibling.nextSibling;
<p id="closer">final</p>
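As the output shows, whitespace text nodes sit between the paragraphs, so reaching the neighbouring element takes two steps. A minimal sketch of helpers that skip non-element siblings follows; the names nextElement and previousElement are assumptions introduced here. Browsers also provide the built-in nextElementSibling and previousElementSibling properties for the same purpose.

```javascript
const ELEMENT_NODE = 1;

// Hypothetical helper: returns the closest following sibling that is an
// element, or null when there is none.
function nextElement(node) {
  let sibling = node.nextSibling;
  while (sibling && sibling.nodeType !== ELEMENT_NODE) {
    sibling = sibling.nextSibling;
  }
  return sibling || null;
}

// Hypothetical helper: same idea, moving backwards through the siblings.
function previousElement(node) {
  let sibling = node.previousSibling;
  while (sibling && sibling.nodeType !== ELEMENT_NODE) {
    sibling = sibling.previousSibling;
  }
  return sibling || null;
}
```

With para referring to the last paragraph, previousElement(para) should return the second paragraph directly, without the intermediate #text node.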
document.body is a property that represents the <body> element of an HTML document. It provides access to the content within the <body> tag, which is the main area of the webpage visible to the user.
> document.body;
<body>…</body>
> document.body.nextSibling;
null
> document.body.previousSibling.previousSibling;
<head>…</head>
firstChild and lastChild are properties of a DOM node that provide access to its first and last child nodes, respectively. firstChild is equivalent to childNodes[0], and lastChild is equivalent to childNodes[childNodes.length - 1]:
> document.body.firstChild;
#text
> document.body.lastChild;
#text
> document.body.lastChild.previousSibling;
<!-- and that's about it -->
> document.body.lastChild.previousSibling.nodeValue;
" and that's about it "
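Finally, here is a function that takes any DOM node and traverses the DOM tree starting from that node, visiting the node itself, its descendants, and any siblings that follow it:
function walkDOM(n) {
    do {
        console.log(n);
        // Recurse into the children before moving on to the next sibling
        if (n.hasChildNodes()) {
            walkDOM(n.firstChild);
        }
    } while (n = n.nextSibling);
}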
Usage example:
> walkDOM(document.documentElement);
> walkDOM(document.body);
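The walkDOM() function above continues to the siblings that follow the starting node, which does not matter for document.documentElement or document.body on the sample page because neither has a following sibling. As a minimal sketch of a variant that stays inside the starting node's subtree and hands each node to a callback, one could write the following; the name walkSubtree and the visit parameter are assumptions introduced for illustration.

```javascript
// Hypothetical variant: visits `root` and all of its descendants in
// document order, but never the siblings of `root` itself.
function walkSubtree(root, visit) {
  visit(root);
  for (let child = root.firstChild; child; child = child.nextSibling) {
    walkSubtree(child, visit);
  }
}
```

For example, walkSubtree(document.body, function (n) { console.log(n.nodeName); }) would print the name of every node inside the body.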
Let's save a reference to the last <p> tag in a variable (remember that we are using a separate test page for all the examples):
> var my = document.getElementById('closer');
By changing the value of the innerHTML property, we modify the content of the <p> tag:
> my.innerHTML = 'final!!!';
"final!!!"
Since innerHTML accepts an HTML-formatted string, you can create a new DOM element node as follows:
> my.innerHTML = '<em>my</em> final';
"<em>my</em> final"
The new em node becomes part of the DOM tree:
> my.firstChild;
<em>my</em>
> my.firstChild.firstChild;
#text "my"
Another way to change the text inside a tag is to directly access the text node and modify its nodeValue property:
> my.firstChild.firstChild.nodeValue = 'your';
"your"
More often, we need to change the presentation of elements rather than their content. All elements have a style property, which in turn contains properties that correspond to CSS properties. Here's an example of how you can change the style of a paragraph by adding a red border to it:
> my.style.border = "1px solid red";
"1px solid red"
CSS property names are often written with hyphens, which are not allowed in JavaScript identifiers. In such cases, omit the hyphen and convert the following letter to uppercase. Thus, the CSS property padding-top becomes paddingTop, margin-left becomes marginLeft, and so on:
> my.style.fontWeight = 'bold';
"bold"
There is also the cssText property, which allows you to work with styles as a string:
> my.style.cssText;
"border: 1px solid red; font-weight: bold;"
To modify the style this way, you modify the string:
> my.style.cssText += " border-style: dashed;";
"border: 1px solid red; font-weight: bold; border-style: dashed;"
To create new nodes in the Document Object Model, use the createElement() and createTextNode() methods. Once you have created a new node, you can add it to the DOM tree using appendChild(), insertBefore(), or replaceChild().
Creating a new p element node and setting its text content:
> var myp = document.createElement('p');
> myp.innerHTML = 'yet another';
"yet another"
The new element automatically inherits all default properties, including the style property, which you can modify:
> myp.style;
CSSStyleDeclaration
> myp.style.border = '2px dotted blue';
"2px dotted blue"
By using the appendChild() method, you can add the new node to the DOM tree. Calling this method on the document.body node appends the new node after the last child of body. In our case, the new p element will be added to the end of the page.
> document.body.appendChild(myp);
<p style="border: 2px dotted blue;">yet another</p>
Using the appendChild() method, you can only add a new child node to the end of the selected element. To specify the exact position for the new node, you can use the insertBefore() method. It works like appendChild(), but takes an additional parameter that indicates where (before which node) the new node should be placed.
For example, the following code will insert a text node at the end of the BODY:
> document.body.appendChild(document.createTextNode('boo!'));
The following code creates another text node and inserts it as the first child node of the BODY node:
> document.body.insertBefore(
document.createTextNode('first boo!'),
document.body.firstChild
);
Using innerHTML makes it easier to create new nodes compared to using pure DOM methods. When creating element nodes exclusively with DOM methods, you need to follow several steps: create the element node, create a text node for its content, append the text node to the element, and append the element to the document.
With this approach, you can create any number of nodes and organize their nesting as desired. For example, let's say you need to add the following HTML code to the end of the body tag:
<p>one more paragraph <strong>bold</strong></p>
The hierarchy of nodes will look as follows:
P (paragraph) element node
    Text node with the value "one more paragraph "
    STRONG element node
        Text node with the value "bold"
Thus, the code to create and insert these new elements into the document looks as follows:
// Create a P element node
var myp = document.createElement('p');
// Create a text node and append it to the P element node
var myt = document.createTextNode('one more paragraph ');
myp.appendChild(myt);
// Create a STRONG element node and append a text node to it
var str = document.createElement('strong');
str.appendChild(document.createTextNode('bold'));
// Append the STRONG element node to the P element node
myp.appendChild(str);
// Append the P element node to the BODY
document.body.appendChild(myp);
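When many nested nodes have to be built this way, the create-and-append steps can be wrapped in a small helper. The following is a sketch under stated assumptions: the names build and makeParagraph are hypothetical, and the document is passed in as the doc parameter so that the helper does not depend on a global.

```javascript
// Hypothetical helper: creates <tag> in `doc` and appends each child to it.
// Strings become text nodes; anything else is assumed to be a node already.
function build(doc, tag, children) {
  const element = doc.createElement(tag);
  children.forEach(function (child) {
    element.appendChild(
      typeof child === 'string' ? doc.createTextNode(child) : child
    );
  });
  return element;
}

// Builds the same structure as the step-by-step code above:
// a P node holding a text node and a STRONG node.
function makeParagraph(doc) {
  return build(doc, 'p', [
    'one more paragraph ',
    build(doc, 'strong', ['bold'])
  ]);
}
```

In the browser, document.body.appendChild(makeParagraph(document)) would then add the paragraph to the end of the page.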
Another way to create new nodes is by copying (or cloning) an existing node. The cloneNode() method is used for this purpose and accepts a boolean parameter (true for deep cloning, including all child nodes; false for shallow cloning of only the current node). Let's use this method.
Let's save a reference to the element we want to copy in a variable:
> var el = document.getElementsByTagName('p')[1];
Now the variable el refers to the second paragraph, which looks like this:
<p><em>second</em> paragraph</p>
Let's perform a shallow copy of this element and insert it into the body:
> document.body.appendChild(el.cloneNode(false));
You won't see any changes on the page because a shallow copy duplicates only the <p> element itself, without its child nodes. This means that the content of the paragraph (its em element and text nodes) will not be copied.
Because this paragraph has no attributes, the executed code is equivalent to the following:
> document.body.appendChild(document.createElement('p'));
If deep copying is performed, the entire DOM tree starting from the P element will be copied, including the text nodes and the EM element. The following code will fully copy the second paragraph to the end of the document:
> document.body.appendChild(el.cloneNode(true));
If you prefer, you can copy only the EM node:
> document.body.appendChild(el.firstChild.cloneNode(true));
<em>second</em>
... or only the text node with the value 'second':
> document.body.appendChild(el.firstChild.firstChild.cloneNode(false));
"second"
To remove nodes from the DOM tree, use the removeChild() method.
Here's how you can remove the second paragraph (remember that we are using a separate test page for all the examples):
> var myp = document.getElementsByTagName('p')[1];
> var removed = document.body.removeChild(myp);
The removeChild() method returns the removed node, in case you need to use it further. You can still use all DOM methods on the removed node, even though it is no longer part of the DOM tree.
For example:
> removed;
<p>…</p>
> removed.firstChild;
<em>second</em>
There is also a method called replaceChild() that removes a node and inserts a new one in its place.
Here's how you can replace the second paragraph with the one stored in the removed variable:
> var p = document.getElementsByTagName('p')[1];
> var replaced = document.body.replaceChild(removed, p);
Just like removeChild(), replaceChild() also returns a reference to the node that was removed from the DOM tree.
> replaced;
<p id="closer">final</p>
A quick way to remove all content inside an element is to assign an empty string to its innerHTML property. The following code will remove all the children of the BODY tag:
> document.body.innerHTML = '';
""
To check that the BODY tag no longer has any children, you can use the following code:
> document.body.firstChild;
null
To remove nodes using only DOM methods, you need to remove each child of a given node individually; removing a child also takes all of its descendants with it. Here's a small function that removes all child nodes of the provided node:
function removeAll(n) {
    // Keep removing the first child until none are left
    while (n.firstChild) {
        n.removeChild(n.firstChild);
    }
}
You can call this function with the desired node as an argument to remove all its descendants. For example, to remove all nodes within the BODY tag, you can use:
> removeAll(document.body);
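When only some of the children should go, the same removeChild() call can be combined with a test on each child. The sketch below is one possible approach; the name removeMatching and the shouldRemove callback are assumptions introduced for illustration.

```javascript
// Hypothetical helper: removes the direct children of `parent` for which
// `shouldRemove(child)` returns true, and returns the removed nodes.
function removeMatching(parent, shouldRemove) {
  const removed = [];
  // Iterate backwards: childNodes is a live collection, so removing a node
  // shifts the indexes of the nodes that come after it.
  for (let i = parent.childNodes.length - 1; i >= 0; i--) {
    const child = parent.childNodes[i];
    if (shouldRemove(child)) {
      removed.push(parent.removeChild(child));
    }
  }
  return removed;
}
```

For example, removeMatching(document.body, function (n) { return n.nodeType === 3; }) would strip the text nodes that are direct children of the body while leaving the paragraphs in place.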