It’s just data

Interoperability and XSS Mitigation

Rob Sayre: Come to think of it, we might want to standardize similar policies for restricted HTML parsing. There’s even a W3C mailing list working on this stuff. Turns out mail clients have the same issues that feed readers do. And Google Reader is just one example of a website that has this problem. Why can’t browsers borrow this policy from email clients and feed readers, and allow site authors to activate it? That way, sites wouldn’t get burned by faulty markup sanitization.

I’ve created Sanitization Rules. As it is a wiki page, free-form additions and refactorings are welcome.


What about data URIs? They don’t work in IE but otherwise seem harmless, and are useful for things like sparklines.

Posted by Michael R. Bernstein at

XSS Follow-Ups

Sam Ruby started a wiki page on the matter. I love low-overhead standardization. Joe Walker suggested a SameRefererOnly cookie field. In the bug, Jonas has suggested that we back the proposed Content Restriction header with an implementation that...

Excerpt from Rob Sayre's Mozilla Blog at

Michael: it’s a wiki!  Don’t be shy.  data: does seem safe.

Posted by Sam Ruby at

Well, I added data and Rob pulled it back out.  I’ve started a discussion.

Posted by James Snell at

The combination of namespace and content makes data URIs hard to reason about. That’s enough to eliminate them from a whitelist, as far as I’m concerned. But if you have experience that makes you believe they’re safe, you can change it back. I’ll just refer to

[link]

if I need a reference. I love version tracking.
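
To make the disagreement concrete, here is a minimal sketch of a URI scheme whitelist check. The set of schemes and the function name `is_allowed_uri` are assumptions for illustration, not the contents of the wiki page; `data` is left out here only to mirror the current state of the discussion.

```python
from urllib.parse import urlsplit

# Hypothetical scheme whitelist, for illustration only.
# "data" is deliberately absent; adding it is a one-line change.
ALLOWED_SCHEMES = {"http", "https", "mailto", "ftp"}

def is_allowed_uri(uri):
    """Return True if the URI is relative or uses a whitelisted scheme."""
    scheme = urlsplit(uri.strip()).scheme.lower()
    # An empty scheme means a relative reference, which inherits the
    # scheme of the page it appears on.
    return scheme == "" or scheme in ALLOWED_SCHEMES
```

Anything not on the list is rejected by default, so an unfamiliar scheme never has to be recognized as dangerous in order to be blocked.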

Posted by Robert Sayre at

To minimise these sorts of arguments, it might be useful to include an annotated blacklist to complement the whitelist in each category. Not necessarily comprehensive, and certainly not intended to be used as a blacklist, but it would help to know why something isn’t included in the whitelist.

For example, I noticed that BDO is missing from the element list. Presumably that’s because it’s terribly unsafe under certain conditions, but I can’t imagine what those conditions might be. Without a documented reason for its omission, I might be tempted to add it.

On a separate note, I find the concept of a generic attribute whitelist somewhat disturbing. Is it not possible that an attribute that is safe in one element could potentially be unsafe when used with another element? Or is that being overly paranoid?
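
One way to address that concern is to key the attribute whitelist by element, with only a small generic set shared by all. The following is a rough sketch under that assumption; the element and attribute names listed are placeholders, not the lists from the wiki page.

```python
# Hypothetical whitelists, for illustration only.
ALLOWED_ELEMENTS = {"a", "img", "p", "em", "strong"}
GENERIC_ATTRS = {"title", "lang"}           # allowed on every element
ELEMENT_ATTRS = {                           # allowed only on the named element
    "a": {"href"},
    "img": {"src", "alt", "width", "height"},
}

def filter_attributes(tag, attrs):
    """Return the attributes allowed on this tag, or None if the tag is not allowed.

    `attrs` is a list of (name, value) pairs.
    """
    tag = tag.lower()
    if tag not in ALLOWED_ELEMENTS:
        return None
    allowed = GENERIC_ATTRS | ELEMENT_ATTRS.get(tag, set())
    return [(name, value) for name, value in attrs if name.lower() in allowed]
```

With this shape, `href` survives on `a` but is dropped from `p`, so an attribute judged safe on one element is not automatically accepted on another.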

Posted by James Holderness at

Sanitation

It’s amazing how issues float to the top of multiple minds independently. I’ve been spending a lot of time thinking about how to sanitize to-be-published data. Then Rob Sayre wrote Interoperability and XSS Mitigation; XSS stands for “cross-site...

Excerpt from ongoing at

It would be nice if someone could write up a whitelist for CSS, not just a regex of things to strip.

Posted by Paul Querna at

It would be nice if someone could write up a whitelist for CSS, not just a regex of things to strip.

You mean, something like a whitelist of CSS style properties, and a list of CSS style property values?  Or do you mean something else?

All: if you find things that aren’t clear or need to be added: remember, it is a wiki!

Posted by Sam Ruby at

D’oh.  I can’t read.

One thing that doesn’t seem clear is that the properties whitelist includes things like ‘height’, but the values whitelist has no expressions like \d+px or the like. Is it safe to have something like height: 30px; or are there issues around that?

Also, has anyone looked at doing a CSS cleaner using Libcroco?
[link]

Posted by Paul Querna at

the properties whitelist includes things like ‘height’, but the values whitelist has no expressions like \d+px or the like

Good catch.  Added.
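
For anyone following along, a minimal sketch of how a property whitelist and a value whitelist might work together on a style attribute. The property names, keywords, and the length pattern below are assumptions for illustration and are not copied from the wiki page.

```python
import re

# Hypothetical whitelists, for illustration only.
ALLOWED_PROPERTIES = {"color", "height", "width", "font-weight", "text-align"}
ALLOWED_KEYWORDS = {"auto", "bold", "normal", "left", "right", "center",
                    "red", "blue", "black"}
# Hex colors, or a number with an optional unit (the \d+px case).
SAFE_VALUE = re.compile(r"#[0-9a-f]{3}|#[0-9a-f]{6}|\d+(\.\d+)?(px|em|pt|%)?",
                        re.IGNORECASE)

def sanitize_style(style):
    """Keep only declarations whose property and every value token are whitelisted."""
    kept = []
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue
        prop, value = declaration.split(":", 1)
        prop = prop.strip().lower()
        if prop not in ALLOWED_PROPERTIES:
            continue
        tokens = value.split()
        if tokens and all(
            tok.lower() in ALLOWED_KEYWORDS or SAFE_VALUE.fullmatch(tok)
            for tok in tokens
        ):
            kept.append(f"{prop}: {' '.join(tokens)}")
    return "; ".join(kept)
```

A declaration is dropped whole if any token in its value fails the check, so `height: 30px` passes while `width: expression(...)` and `behavior: url(...)` do not.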

Posted by Sam Ruby at
