Christer Sandberg edited this page Apr 27, 2015 · 4 revisions

This page covers version 1.x

Overview

This project is a Java implementation of the W3C Selectors Level 3 specification. Several JavaScript implementations exist, and WebKit supports selectors natively, but I couldn't find a Java implementation, so I decided to create one.

I couldn't have done this without peeking at the other implementations, so thanks to all the developers who have inspired me.

Motivation

Well... You can always use XPath, but I think that CSS selectors are easier to understand and more beautiful.

Implementation details

When I started thinking about this implementation, the Ragel State Machine Compiler was an obvious choice for me. I've been playing around with it for some time and even used it in one project, where I created a CSS-inspired format for styling subtitles. I think it's an amazing tool.

Supported selectors

  • * any element
  • E an element of type E
  • E[foo] an E element with a "foo" attribute
  • E[foo="bar"] an E element whose "foo" attribute value is exactly equal to "bar"
  • E[foo~="bar"] an E element whose "foo" attribute value is a list of whitespace-separated values, one of which is exactly equal to "bar"
  • E[foo^="bar"] an E element whose "foo" attribute value begins exactly with the string "bar"
  • E[foo$="bar"] an E element whose "foo" attribute value ends exactly with the string "bar"
  • E[foo*="bar"] an E element whose "foo" attribute value contains the substring "bar"
  • E[foo|="en"] an E element whose "foo" attribute has a hyphen-separated list of values beginning (from the left) with "en"
  • E:root an E element, root of the document
  • E:nth-child(n) an E element, the n-th child of its parent
  • E:nth-last-child(n) an E element, the n-th child of its parent, counting from the last one
  • E:nth-of-type(n) an E element, the n-th sibling of its type
  • E:nth-last-of-type(n) an E element, the n-th sibling of its type, counting from the last one
  • E:first-child an E element, first child of its parent
  • E:last-child an E element, last child of its parent
  • E:first-of-type an E element, first sibling of its type
  • E:last-of-type an E element, last sibling of its type
  • E:only-child an E element, only child of its parent
  • E:only-of-type an E element, only sibling of its type
  • E:empty an E element that has no children (including text nodes)
  • E:contains(text) an E element containing the specified text
  • E#myid an E element with ID equal to "myid"
  • E:not(s) an E element that does not match simple selector s
  • E F an F element descendant of an E element
  • E > F an F element child of an E element
  • E + F an F element immediately preceded by an E element
  • E ~ F an F element preceded by an E element
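
The attribute operators and the an+b argument accepted by the :nth-* pseudo-classes are the entries in this list that are easiest to get wrong, so here is a minimal sketch of their matching rules as I read them in the Selectors Level 3 specification. The class SelectorSemantics and its two methods are made up for this illustration. They are not part of this project's API and say nothing about how the Ragel-based scanner works internally.

```java
import java.util.Arrays;

// Illustrative only: restates the Selectors Level 3 matching rules in plain Java.
// SelectorSemantics is a hypothetical helper, not a class from this project.
final class SelectorSemantics {

    // True if the 1-based sibling position equals a*n + b for some integer n >= 0.
    // Example: "2n" is (a=2, b=0), "2n+1" is (a=2, b=1), "-n+3" is (a=-1, b=3).
    static boolean matchesNth(int a, int b, int position) {
        int diff = position - b;
        if (a == 0) {
            return diff == 0; // plain index such as :nth-child(3)
        }
        // n = diff / a must be a whole, non-negative number.
        return diff % a == 0 && diff / a >= 0;
    }

    // Evaluates one attribute operator; actual == null means the attribute is absent.
    // Use op "" for the bare presence test E[foo].
    static boolean matchesAttribute(String op, String actual, String expected) {
        if (actual == null) {
            return false;
        }
        switch (op) {
            case "":   return true;
            case "=":  return actual.equals(expected);
            case "~=": return !expected.isEmpty()
                    && Arrays.asList(actual.trim().split("\\s+")).contains(expected);
            case "^=": return !expected.isEmpty() && actual.startsWith(expected);
            case "$=": return !expected.isEmpty() && actual.endsWith(expected);
            case "*=": return !expected.isEmpty() && actual.contains(expected);
            case "|=": return actual.equals(expected) || actual.startsWith(expected + "-");
            default:   throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }
}
```

For instance, matchesNth(2, 0, 4) is true because the fourth child satisfies 2n with n = 2, and matchesAttribute("|=", "en-US", "en") is true because the value starts with "en-".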

TODO

  • Namespace support
  • Better error reporting on scanner errors.
  • ...

Example usage

Suppose you have a DOM document in a variable named document that you'd like to query using CSS selectors:

NodeSelector selector = new DOMNodeSelector(document);
Set<Node> result = selector.querySelectorAll("div:nth-child(2n)");
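
The snippet above assumes that document already exists. If you are starting from markup in a string, one way to get a DOM document is the parser that ships with the JDK. This is only a sketch: the class name DocumentSetup and the sample markup are made up for the example, and the input must be well-formed XML (or XHTML).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for this example; it uses only classes from the JDK.
final class DocumentSetup {

    // Parses a well-formed XML string into a W3C DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document document = parse("<html><body><div>one</div><div>two</div></body></html>");
        // document can now be passed to new DOMNodeSelector(document) as shown above.
        System.out.println(document.getDocumentElement().getTagName()); // prints "html"
    }
}
```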

Note that querySelectorAll returns a Set rather than a NodeList, which differs from the NodeSelector interface defined in the W3C Selectors API specification. The motivation is to make it easier for other implementations that don't use the DOM to build on this project's NodeSelector interface, which uses generic types.
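
To make the point about generic types a bit more concrete, here is a hypothetical sketch of a non-DOM tree sitting behind a generically typed selector interface. The names GenericNodeSelector, TreeNode and TagNameSelector are invented for this example and are not this project's actual declarations. The toy implementation only understands "*" and bare type selectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface: the node type is a type parameter instead of org.w3c.dom.Node.
interface GenericNodeSelector<T> {
    Set<T> querySelectorAll(String selectors);
}

// Hypothetical non-DOM node type.
final class TreeNode {
    final String name;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) {
        this.name = name;
    }

    TreeNode add(TreeNode child) {
        children.add(child);
        return this;
    }
}

// Toy implementation: matches "*" or a bare element name, nothing else.
final class TagNameSelector implements GenericNodeSelector<TreeNode> {
    private final TreeNode root;

    TagNameSelector(TreeNode root) {
        this.root = root;
    }

    @Override
    public Set<TreeNode> querySelectorAll(String selectors) {
        // LinkedHashSet keeps the matches in the order they were visited (depth-first).
        Set<TreeNode> result = new LinkedHashSet<>();
        collect(root, selectors.trim(), result);
        return result;
    }

    private static void collect(TreeNode node, String type, Set<TreeNode> out) {
        if (type.equals("*") || node.name.equals(type)) {
            out.add(node);
        }
        for (TreeNode child : node.children) {
            collect(child, type, out);
        }
    }
}
```

Because the result is a plain Set<TreeNode>, the caller never touches NodeList or any other DOM type.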

Please see the Javadoc or source code for more information.

Build instructions

This project is built using Maven.