It’s just data

Creating well-formed XHTML

Sjoerd Visscher: We have a simple client-side clean-up script that extracts well-formed XHTML from the WYSIWYG editor. It even handles pasted HTML from Word rather well. I discussed it with my colleagues today, and we are willing to make that script available as open source if people are interested.

Excellent!


I've been running IE's WYSIWYG output through HTML Tidy for about a month now, and it's been working quite well. I'd be curious to know what advantages Sjoerd's script has over Tidy.

Posted by Roger Benningfield at

Regarding well-formed XHTML

Isn't it a bit ironic that the Scripting News RSS file of July 4th (isn't that a special day) has an invalid character on line 30, position 392?

It is not showing up in IE6 and gives an error.
(encoding problem)
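
A minimal sketch of how such an error can be located, assuming Python's standard expat bindings rather than whatever parser IE6 uses. The function name `find_wellformedness_error` and the sample feed are hypothetical; the sample simulates a Windows-1252 curly apostrophe (byte 0x92) in a file declared as UTF-8, which is one common form of this kind of encoding problem.

```python
import xml.parsers.expat


def find_wellformedness_error(xml_bytes):
    """Return (line, column, message) for the first error, or None if well-formed.

    The line is 1-based and the column is 0-based, as reported by expat.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


# Hypothetical feed: declared UTF-8, but byte 0x92 is not valid UTF-8 on its own.
SAMPLE_FEED = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b"<rss><title>It\x92s just data</title></rss>"
)

if __name__ == "__main__":
    print(find_wellformedness_error(SAMPLE_FEED))
```

A strict XML parser stops at the first such byte, which is why a single stray character is enough to make the whole feed unreadable.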

Cybarber

Posted by cybarber at

That was a quick refresh. Let's hope RSS2 will get an even quicker refresh soon.

By the way, I have been revitalising the IE4 Channel Definition Format (CDF) under Windows XP this week (based on Yasser Shohoud's ASPX files at learnxmlws.com).

I will give some more info later, but here is an MSXML demo conversion page with RSS/RDF conversion to CDF and (Sjoerd Visscher's XSLT) conversion of RSS to RDF.

http://cybarber.ath.cx/TestingwithSources.htm

Cybarber

Posted by cybarber at

"I wonder if the Movable Type people have figured out how to create an [wysiwyg] editor that real people will use that produces XHTML. The most popular editor among Radio users on Windows produces perfectly horrible HTML, which we encode and put in the RSS feeds that all aggregators handle perfectly well. We can't change the editor because it's baked into the browser. Do you think users would understand if we told them they had to use a much worse editor and enter the tags themselves because that made more sense to Ben Trott?"
- ScriptingNews

I guess this (potentially) takes care of that, not that it was much of an obstacle to begin with.

Posted by Tomas at

That's right. In fact, I use Radio, so I was specifically thinking of their editor when it was decided that XHTML should be the preferred format.

Posted by Sjoerd Visscher at

This reminds me of an issue that would make a good use case to think through when considering the api.

A lot of blog servers accept tags and text that mean something special to the specific blog engine. One example is LiveJournal's "lj" tags. These have special meaning and are converted on the server on their way to consumption.

Another example is the comments for this blog.  Like Word, it converts asterisks to "b" or "em" tags before consumption.

Then there's what wikis do with ThingsLikeThis.  In each of these cases, what the editor submits is different than what is eventually disseminated.

This pattern is prevalent enough that it's something that should probably be supported; otherwise the API will have limited usefulness.

To support it, there should be a method in the API intended for editing that gets the source of the entry (not the to-be-consumed version).
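
A minimal sketch of that idea, not a proposal for the actual API: the store keeps what the editor submitted and derives the published form on request. The class `EntryStore`, its method names, and the asterisk-to-"em" rule in `render` are all hypothetical, chosen only to illustrate the split between source and to-be-consumed content.

```python
import html
import re


def render(source):
    """Hypothetical server-side conversion: *text* becomes <em>text</em>."""
    # Escape markup characters first so the output stays well-formed.
    escaped = html.escape(source, quote=False)
    return re.sub(r"\*([^*]+)\*", r"<em>\1</em>", escaped)


class EntryStore:
    """Keeps what the editor submitted; derives the published form on demand."""

    def __init__(self):
        self._sources = {}

    def put_entry(self, entry_id, source):
        """Store the entry exactly as the editing client sent it."""
        self._sources[entry_id] = source

    def get_entry_source(self, entry_id):
        """For editing clients: the text exactly as submitted."""
        return self._sources[entry_id]

    def get_entry_content(self, entry_id):
        """For readers and aggregators: the to-be-consumed version."""
        return render(self._sources[entry_id])


if __name__ == "__main__":
    store = EntryStore()
    store.put_entry("1", "This is *important* & new")
    print(store.get_entry_source("1"))
    print(store.get_entry_content("1"))
```

Because only the source is stored, an editor that fetches it, changes it, and resubmits never has to reverse the engine-specific conversion.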

(goes to write this on the wiki...)

Posted by Chris Wilper at

UPDATED demo page

with NECHO feed conversion to CDF (Channel Definition Format) for several participants (Pilgrim, Spolsky, etc.)
(see also necho 0.1 group)

http://cybarber.ath.cx/TestingwithSources.htm

Cybarber

Posted by cybarber at

A well-formed name for necho:

S(ynchronized) A(ggregated) M(essaging)

or, for short,

SAM

a tribute to...
Cybarber

Posted by cybarber at

Creating well-formed XHTML. Sjoerd Visscher: We have a simple client-side clean-up script that extracts well-formed XHTML from the WYSIWYG editor. It even handles pasted HTML from Word rather well. I discussed it with my colleagues today, and we are...

Excerpt from André Venter: Dev at

SAM (formerly known as ECHO or NECHO)

now stands for

S(yndication) A(ggregation) M(arkup)

My SAM2CDF (and soon SAM2RSS, SAM2RDF, RSS2SAM, and RDF2SAM) XSLT conversion demo page is up.

Cybarber

Posted by cybarber@home.nl at

About time! 

At this point, I do not use any application to clean up my XHTML because of obvious concerns.  Most of my sites are in table-less <div> layouts, and the applications are simply not smart enough to handle the cleaning. 

I look forward to the release of this script, including the export of code written in Word to an editor. 

Right now, whenever I have an article to publish, I must change the typeface to Courier New (fewer whitespace problems with this font for some reason), add in the HTML for publishing, and then paste it into Notepad to nix any of MS-Word's <sarcasm> wonderful auto-formatting </sarcasm off>.

Then, <sigh> I paste it into a WYSIWYG editor (usually DMX).  What a hassle this is. 

I even do a find and replace for all the auto-formatted quotes that end up with invalid XHTML markup. 
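
That find-and-replace step can be sketched as follows, assuming the offending characters are Word's curly quotes and that plain ASCII quotes are an acceptable substitute. The names `SMART_QUOTES` and `straighten_quotes` are hypothetical, and this is not the script discussed in the post.

```python
# Map Word's auto-formatted ("smart") quotes to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(SMART_QUOTES)


def straighten_quotes(text):
    """Replace curly quotes with straight ones; leave everything else untouched."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(straighten_quotes("\u201cIt\u2019s just data\u201d"))
```
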

So, in a nutshell, I'll be a buyer of your script.  :)

Joshua

Posted by Joshua at

Christian Romney

As much as I like Sam [Ruby], I hate the name SAM, S(yndication) A(ggregation) M(arkup). I would bet Sam [Ruby] doesn't care much for it either. It clouds the issue.

Message from Christian Romney at
