pegurl.js
pegurl.js is the result of two days’ worth of work. While it is undoubtedly buggy and incomplete, it does pass 255 out of 256 tests, and that last test is wrong. For comparison: results from other user agents.
Current work products:
- Source: API, grammar; the latter based on PEG.js
- LiveViewer. Differences mean that one or both of the following are true: (a) pegurl.js doesn’t match the URL Standard or (b) the URL Standard doesn’t match your browser.
- Grammar expressed in the form of railroad diagrams. Produced using Gunther Rademacher’s converter.
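A field-by-field comparison of the kind described for the LiveViewer can be sketched as follows. This is not the LiveViewer's code: `platformParse`, `diffResults`, and `naiveParse` are hypothetical names made up for illustration, and the only real API assumed is the `URL` constructor found in browsers and Node.js.

```javascript
// Fields to compare; the names follow the platform's URL API.
const FIELDS = ["href", "protocol", "username", "password",
                "hostname", "port", "pathname", "search", "hash"];

// Parse with the platform's built-in URL class; return null when it rejects the input.
function platformParse(input, base) {
  try {
    const url = new URL(input, base);
    const out = {};
    for (const field of FIELDS) out[field] = url[field];
    return out;
  } catch (e) {
    return null;
  }
}

// List every field on which two parse results disagree.
function diffResults(expected, actual) {
  if (expected === null || actual === null) {
    return expected === actual ? [] : [{ field: "(failure)", expected, actual }];
  }
  return FIELDS
    .filter((field) => expected[field] !== actual[field])
    .map((field) => ({ field, expected: expected[field], actual: actual[field] }));
}

// Hypothetical stand-in for a parser under test: it deliberately
// skips host lowercasing so that the comparison has something to report.
function naiveParse(input, base) {
  const parsed = platformParse(input, base);
  return parsed && { ...parsed, hostname: input.split("/")[2] };
}

const input = "HTTP://Example.COM/a/../b";
const differences = diffResults(platformParse(input), naiveParse(input));
console.log(differences); // one entry, for the hostname field
```

A difference reported this way only says that the two parsers disagree. Deciding which of them matches the standard still takes a look at the specification and the test in question.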
Future work:
- The implementation is incomplete; in particular, much of the character encoding logic and IP address parsing is just roughed in at this point.
- I’d like to propose a number of changes to the test results, mostly to match existing browser behavior more closely, and where possible to make the implementation logic less convoluted. Meanwhile, I felt that it was important to have a faithful baseline implemented so that I could experiment with changes and see if there were any unintended consequences to those changes.
- More tests! There’s no such thing as too many tests.
- Rewrite URL parser. I suspect that the railroad diagrams (converted to bikeshed?) plus the parts of the grammar contained in curly braces expressed in prose would be more comprehensible and maintainable than the current state machine approach.
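To give a feel for what a grammar rule with an action in curly braces amounts to, here is a minimal hand-rolled sketch in plain JavaScript. It is not taken from the pegurl.js grammar and does not use PEG.js. The helpers `regex`, `seq`, and `action` are hypothetical names, and the single rule shown covers only the scheme syntax from RFC 3986 (a letter followed by letters, digits, `+`, `-`, or `.`, then a colon).

```javascript
// Each rule is a function (input, pos) -> { value, pos }, or null when it does not match.

// Match a regular expression exactly at the current position (sticky flag).
const regex = (re) => (input, pos) => {
  const sticky = new RegExp(re.source, "y");
  sticky.lastIndex = pos;
  const m = sticky.exec(input);
  return m ? { value: m[0], pos: sticky.lastIndex } : null;
};

// Match several rules one after another, collecting their values.
const seq = (...rules) => (input, pos) => {
  const values = [];
  for (const rule of rules) {
    const r = rule(input, pos);
    if (r === null) return null;
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

// Attach a semantic action to a rule: the role played by the
// curly-brace block that follows a rule in a PEG-style grammar.
const action = (rule, fn) => (input, pos) => {
  const r = rule(input, pos);
  return r === null ? null : { value: fn(r.value), pos: r.pos };
};

// scheme = [A-Za-z] [A-Za-z0-9+.-]* ":"   { lowercase the name }
const scheme = action(
  seq(regex(/[A-Za-z][A-Za-z0-9+.-]*/), regex(/:/)),
  ([name]) => name.toLowerCase()
);

const result = scheme("HTTP://example.com/", 0);
console.log(result); // { value: 'http', pos: 5 }
```

The matching part reads much like a railroad diagram, while the action is the only place where behavior such as lowercasing lives. That separation is what would let the actions be restated in prose alongside the diagrams.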