Java DOM Parser

What is DOM?

DOM stands for Document Object Model, and it’s essentially the common convention for representing an XML and HTML files. However, for this specific tutorial, we will only be focusing on XML files.

The structure consists of elements, also known as nodes, which may have more elements inside of them.
For example, consider this list of players on a team:

<?xml version="1.0" encoding="UTF-8"?>
<team>
  <player>
    <first_name>Henry</first_name>
    <last_name>Dang</last_name>
    <ranking>1</ranking>
  </player>

  <player>
    <first_name>John</first_name>
    <last_name>Smith</last_name>
    <ranking>3</ranking>
  </player>

  <player>
    <first_name>Kelly</first_name>
    <last_name>Doe</last_name>
    <ranking>2</ranking>
  </player>
</team>

In this case, “team” is known as the root, but it is still an element.
The “team” element has three “player” elements inside of it. Each “player” element also has three distinct elements: “first_name”, “last_name”, and “ranking”.

The “first_name”, “last_name”, and “ranking” elements all have values inside, which are known as the element content. If an element has an element content (not an attribute), then it does not have any child elements.

How to Parse the XML

First, you need to create an XML file with the above elements pasted inside.

Next, we need to place the XML file inside the project.

Parsing the actual XML is simple and straight-forward.

public void parseFile() {
  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

  try {
    DocumentBuilder db = dbf.newDocumentBuilder();
    String path = new File("data.xml").getAbsolutePath();
    Document doc = db.parse(path);

    // Puts document into the conventional normal form
    doc.normalize();

    // Make a node list with all the "player" elements
    NodeList list = doc.getElementsByTagName("player");

    System.out.println("------------------------------");

    for (int i = 0; i < list.getLength(); i++) {

      // Grab the i'th "player" tag
      Node node = list.item(i);

      if(node.getNodeType() == Node.ELEMENT_NODE) {
        // If it's an element, we can safely cast the node to an element
        Element element = (Element) node;

        //Now we can print out the value for all
        //child nodes of the player element.
        System.out.println("First name : " +
          element.getElementsByTagName("first_name")
              .item(0)
              .getTextContent());

        System.out.println("Last name : " +
          element.getElementsByTagName("last_name")
              .item(0)
              .getTextContent());

        System.out.println("Ranking : " +
          element.getElementsByTagName("ranking")
              .item(0)
              .getTextContent());

      System.out.println("------------------------------");
      }
    }
  } catch (ParserConfigurationException | SAXException | IOException e) {
  e.printStackTrace();
}

And the output :

——————————
First name : Henry
Last name : Dang
Ranking : 1
——————————
First name : John
Last name : Smith
Ranking : 3
——————————
First name : Kelly
Last name : Doe
Ranking : 2
——————————

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s