Validator.Nu on GCJ Update
ruby test/fonts.rb test/google.html
size => -1
size => -1
size => -2
size => -2
Notes:
Another gcj bug slowed me down for a bit.
HtmlDocumentBuilder has a private method to lazily initialize a driver, but that method was never called. I added a call — I hope Henri agrees.
I now pass the raw bytes from Ruby to Java (as a ByteArrayInputStream). Future iterations can replace this read and memcpy with a BufferedReader so that it won’t be necessary to have two entire copies of the file being parsed in memory.
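A minimal sketch of the Java side of that hand-off, under stated assumptions: the stock JAXP DocumentBuilder stands in for HtmlDocumentBuilder (so the sample input has to be well-formed XML), and the class name ParseFromBytes is made up for illustration. It only shows the shape of wrapping an in-memory byte array in a ByteArrayInputStream and handing it to a parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Validator.nu or the Ruby binding.
public class ParseFromBytes {
    // Parse a complete in-memory copy of the input, as it would arrive from the Ruby side.
    public static Document parse(byte[] raw) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        try (InputStream in = new ByteArrayInputStream(raw)) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = "<html><body><p>hi</p></body></html>".getBytes(StandardCharsets.UTF_8);
        Document doc = parse(raw);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the whole byte array exists before parsing starts, the caller's copy and the array passed to the parser are both live at once, which is the duplication a streaming reader would avoid.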
Java objects are wrapped with Ruby objects, and the Ruby objects are linked so that attributes point to elements, which in turn point to the owning document. The Java document object will be released when the Ruby document object is no longer referenced.
Individual methods are very straightforward to implement: CNI makes native access to Java objects fairly transparent, and Ruby’s native API is also easy to pick up and use. Take a look at jaxp_element_attributes if you would like to see a good example:
- Unwrap the Ruby Element to find the Java Element
- Create a new Ruby Hash
- Get the attributes from the Java Element, and iterate over them:
  - Retrieve Java Attribute object from the Java NamedNodeMap
  - Wrap the Java Attribute object in a Ruby Attribute object
  - Set @element in the Ruby Attribute object to point to the Ruby element
  - Add the Ruby Attribute object to the hash with the attribute name as the key
- Return the Ruby hash
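The Java half of these steps can be sketched in plain Java. This is not the actual jaxp_element_attributes, which is CNI C++ and also does the Ruby wrapping; the class name ElementAttributes and the sample markup are assumptions made for illustration. A LinkedHashMap plays the role of the Ruby Hash, and Attr.getOwnerElement() plays roughly the role that @element plays on the Ruby side.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical illustration of the attribute walk; the Ruby wrapping is omitted.
public class ElementAttributes {
    // Collect an element's attributes into a map keyed by attribute name.
    public static Map<String, Attr> attributes(Element element) {
        Map<String, Attr> result = new LinkedHashMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);   // retrieve the attribute from the NamedNodeMap
            result.put(attr.getName(), attr);   // key it by attribute name
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = "<font size=\"-1\" face=\"arial\">x</font>".getBytes(StandardCharsets.UTF_8);
        Element font = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(raw)).getDocumentElement();
        Map<String, Attr> attrs = attributes(font);
        System.out.println("size => " + attrs.get("size").getValue());
        // Each attribute can find its way back to the element that owns it.
        System.out.println(attrs.get("size").getOwnerElement().isSameNode(font));
    }
}
```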
It doesn’t look like FFI will be as easy, since CNI is inherently C++. Of course, JRuby can call directly into Java, but there will still be a need for a more Rubyish API.
JAXP’s implementation of XPath is fully namespace-aware, which makes it a bit cumbersome to use. If CSS selectors are to be implemented by mapping the selectors to XPath expressions, this is something that will need to be accommodated.
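To make the namespace issue concrete, here is a small sketch using the standard javax.xml.xpath API. It assumes the parsed elements live in the XHTML namespace; the prefix "h", the class name NamespacedXPath, and the sample document are arbitrary choices for illustration. An unprefixed name test such as //p only matches elements in no namespace, so a selector like p would have to be translated to something like //h:p, with a NamespaceContext supplying the prefix binding.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical illustration; not part of the binding described in this post.
public class NamespacedXPath {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Binds the arbitrary prefix "h" to the XHTML namespace.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "h".equals(prefix) ? XHTML : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return XHTML.equals(uri) ? "h" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return XHTML.equals(uri)
                    ? Collections.singletonList("h").iterator()
                    : Collections.<String>emptyIterator();
        }
    };

    // Parse with namespace awareness switched on (it is off by default in JAXP).
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Count the nodes selected by an XPath expression.
    public static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(CONTEXT);
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html xmlns=\"" + XHTML + "\"><body><p>one</p><p>two</p></body></html>");
        System.out.println(count(doc, "//p"));   // prints 0: unprefixed names mean "no namespace"
        System.out.println(count(doc, "//h:p")); // prints 2: prefix resolved through CONTEXT
    }
}
```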
Conversion of strings (Java uses UTF-16, Ruby UTF-8) is only done when necessary.
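For a sense of what each conversion involves, here is a tiny sketch using only java.lang.String and StandardCharsets. The class and method names are hypothetical, and the real binding does this across the CNI boundary rather than in Java.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of converting between Java strings and UTF-8 bytes.
public class StringConversion {
    // Java String (UTF-16 code units) to the UTF-8 bytes a Ruby string would hold.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // UTF-8 bytes back to a Java String.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "caf\u00e9";
        System.out.println(word.length());        // 4 UTF-16 code units
        System.out.println(toUtf8(word).length);  // 5 UTF-8 bytes
        System.out.println(fromUtf8(toUtf8(word)).equals(word));
    }
}
```

Every conversion allocates and copies, which is a reason to defer it until a string is actually requested.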
At the moment, I’ve only implemented data access and traversal methods, but there is no reason that these methods can’t construct new Java objects and modify the DOM.