It’s just data

WHATWG URL vs IETF URI

I’ve been looking into differences between the WHATWG URL Living Standard and the combination of RFC 3986 and RFC 3987. I’ve come up with an indirect but effective way to identify the differences. To start with, I downloaded urltestdata.txt and urltestparser. I then wrote a small script to convert the test data into JSON.

I then wrote another script to take this data and pass it through what is advertised as a closely conforming implementation of the relevant RFCs.

Looking at the results, the first set of issues related to the stripping of leading and trailing whitespace, so I updated the script to do that to focus on the remaining differences.  Similarly, the URL parsing definition includes the leading ? and # in the query and fragment values respectively, so I eliminated those differences in the cases where the values were non-empty.

The resulting script produces this output.
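The two adjustments described above (stripping surrounding whitespace, and dropping the leading ? and # from non-empty query and fragment values) can be sketched roughly as follows. This is not the script used for the results; it is a minimal Python illustration in which urllib.parse stands in for the RFC-oriented library, and the field names "search" and "hash" are assumed for the expected values in the JSON test data.

```python
from urllib.parse import urlsplit


def strip_delimiter(value, delimiter):
    """Drop a leading '?' or '#' from a non-empty expected value."""
    if value and value.startswith(delimiter):
        return value[1:]
    return value


def comparable_expected(expected):
    """Rewrite expected fields so they line up with RFC-style components.

    The keys "search" and "hash" are assumptions about the JSON layout.
    """
    return {
        "query": strip_delimiter(expected.get("search", ""), "?"),
        "fragment": strip_delimiter(expected.get("hash", ""), "#"),
    }


def rfc_style_parse(raw_input):
    """Strip leading and trailing whitespace, then split the reference."""
    parts = urlsplit(raw_input.strip())
    return {"query": parts.query, "fragment": parts.fragment}


def differs(raw_input, expected):
    """True when a difference remains after both adjustments."""
    return rfc_style_parse(raw_input) != comparable_expected(expected)


if __name__ == "__main__":
    sample = {"search": "?b=c", "hash": "#d"}
    print(differs("  http://example.com/a?b=c#d \n", sample))
```

With these adjustments in place, a test case is only reported when something other than whitespace handling or the delimiter convention differs.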

The next set of differences concerns canonicalization, so I ran tests using Addressable’s normalize method. Note that this method is non-standard. Updated output including normalization.
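To give a sense of the kind of difference canonicalization removes, here is a small Python sketch of one step: lowercasing the scheme and host, which RFC 3986 treats as case-insensitive. It is an illustration only and does not reproduce what Addressable’s normalize method does; the function name normalize_case is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url):
    """Lowercase the scheme and host; leave every other component untouched.

    Hypothetical helper for illustration, not Addressable's algorithm.
    """
    parts = urlsplit(url)
    # Keep userinfo as written: only the host (and port) after the last
    # '@' is case-insensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(normalize_case("HTTP://User@EXAMPLE.com:80/Path?Q#F"))
```

Two references that differ only in the case of the scheme or host compare equal after this step, so they no longer show up as differences.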


Based on what you have as output for e.g. “http://0Xc0.0250.01” it seems these tests might not actually match the specification in all cases. (Though mostly it looks familiar and correct.)

Posted by Anne van Kesteren at

url test results by browser.

Updates to the test data should be sent as pull requests to w3c/web-platform-tests.

See a user agent that should be included in the results? Visit urltest and leave a comment with the user agent and hex code that the web page reports.

Posted by Sam Ruby at

urltest is JS only. Does it make sense to test things like httpie, curl, modules and libraries from ruby, python, php and so on?

Posted by karl at

Opera/9.80 (Macintosh; Intel Mac OS X 10.9.5) Presto/2.12.388 Version/12.16
256b53c71f5140f5276307c0158fa175

Posted by zcorpan at

urltest is JS only. Does it make sense to test things like httpie, curl, modules and libraries from ruby, python, php and so on?

Sure!  I’ll note that the ‘IETF’ rows actually represent data captured by a Ruby library.  My personal preference is to focus on modern, actively maintained or spec compliant applications.  A counter-example would be Java.

Opera/9.80 (Macintosh; Intel Mac OS X 10.9.5) Presto/2.12.388 Version/12.16

Added.  Thanks!

Posted by Sam Ruby at

test case review

Posted by Sam Ruby at

To address a problem Anne found, I updated urltesttojson.js, and then updated the urltestdata.json, captured new results for each browser (thanks, Simon!), and produced new output.

Colors on the initial page triage results:

Clicking through to an individual result, lack of convergence is represented by an entire column in gold.  Exceptions thrown are shown in pale violet red (#D87093).

Posted by Sam Ruby at

I’ve updated the colors to split out “no convergence” (Pale Red) from “convergence doesn’t match WHATWG” (Hot Pink - #FF69B4).

Posted by Sam Ruby at

PLH ran these tests using the following user agent:

Mozilla/5.0 (iPad; CPU OS 8_0_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12A405 Safari/600.1.4

I compared these results to those obtained from Version/7.1 Safari/537.85.10 on Intel Mac OS X 10_9_5.

Not a single result changed.

Posted by Sam Ruby at
