November 2004 Archives

RelaxngInference


Yesterday I started writing RELAX NG grammar inference. I hope this design won't ** you.

document("")


Just voted my first 5 on this very important bug, which demonstrates a W3C standard conformance breakage.

Such an XPathNavigator instance would need to be kept in memory only for a stylesheet that contains document("") (this could be detected by static analysis). So the "by design" reasoning does not make sense.

Real developers could simply provide a standard-conformant implementation the easy way, instead of resorting to casuistry about whether it is conformant or not, which only results in imposing annoyance on real users.

Let's make System.Xml 2.0 not suck


On the suggestion to "infer elements always globally", I am getting a positive feeling from the Microsoft XML guys via the feedback center.

On the other hand, I am getting a negative response to the suggestion for XmlSchemaSimpleType.ParseValue(), which would validate a string taking facets into account. But I believe that XML Schema based developers would absolutely appreciate that feature. For example, it will be mandatory for the XQP project, which must support the user-defined type constructors defined in section 5 of the W3C XQuery Functions and Operators specification. The Microsoft guys might want to help your development.
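
For illustration, here is a minimal sketch of how the suggested call might look. ParseValue() is the proposed API, not an existing member of XmlSchemaSimpleType, and the schema file ("types.xsd"), the type name ("ZipCode") and the namespace ("urn:example") are all made-up examples:

using System;
using System.Xml;
using System.Xml.Schema;

class ParseValueSketch
{
    static void Main ()
    {
        XmlSchemaSet set = new XmlSchemaSet ();
        set.Add (null, "types.xsd"); // made-up schema file
        set.Compile ();
        XmlSchemaSimpleType type = (XmlSchemaSimpleType)
            set.GlobalTypes [new XmlQualifiedName ("ZipCode", "urn:example")];
        // The suggested method: parse the string and validate it against
        // the type's facets (e.g. fail when it violates a length facet).
        // This member does not exist today; it is the suggestion itself.
        object value = type.ParseValue ("1234567");
        Console.WriteLine (value);
    }
}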

It depends on you, XML developers, whether Microsoft will improve their library or not. We could provide our own advantages, but it would be still better that your advanced code will run on MS.NET too.

(FYI: You can "vote" for the suggestions ;-)

Useful codeblock


using QName = System.Xml.XmlQualifiedName;
using Form = System.Xml.Schema.XmlSchemaForm;
using Use = System.Xml.Schema.XmlSchemaUse;
using SOMList = System.Xml.Schema.XmlSchemaObjectCollection;
using SOMObject = System.Xml.Schema.XmlSchemaObject;
using Element = System.Xml.Schema.XmlSchemaElement;
using Attr = System.Xml.Schema.XmlSchemaAttribute;
using AttrGroup = System.Xml.Schema.XmlSchemaAttributeGroup;
using AttrGroupRef = System.Xml.Schema.XmlSchemaAttributeGroupRef;
using SimpleType = System.Xml.Schema.XmlSchemaSimpleType;
using ComplexType = System.Xml.Schema.XmlSchemaComplexType;
using SimpleModel = System.Xml.Schema.XmlSchemaSimpleContent;
using SimpleExt = System.Xml.Schema.XmlSchemaSimpleContentExtension;
using SimpleRst = System.Xml.Schema.XmlSchemaSimpleContentRestriction;
using ComplexModel = System.Xml.Schema.XmlSchemaComplexContent;
using ComplexExt = System.Xml.Schema.XmlSchemaComplexContentExtension;
using ComplexRst = System.Xml.Schema.XmlSchemaComplexContentRestriction;
using SimpleTypeRst = System.Xml.Schema.XmlSchemaSimpleTypeRestriction;
using SimpleList = System.Xml.Schema.XmlSchemaSimpleTypeList;
using SimpleUnion = System.Xml.Schema.XmlSchemaSimpleTypeUnion;
using SchemaFacet = System.Xml.Schema.XmlSchemaFacet;
using LengthFacet = System.Xml.Schema.XmlSchemaLengthFacet;
using MinLengthFacet = System.Xml.Schema.XmlSchemaMinLengthFacet;
using Particle = System.Xml.Schema.XmlSchemaParticle;
using Sequence = System.Xml.Schema.XmlSchemaSequence;
using Choice = System.Xml.Schema.XmlSchemaChoice;
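
As a small example of why the aliases help, here is a fragment that assumes the aliases above are in effect in the same file; "product" is just a made-up element name:

class AliasDemo
{
    static Particle BuildProductSequence ()
    {
        // Builds a sequence containing one string-typed "product" element.
        Element el = new Element ();
        el.Name = "product";
        el.SchemaTypeName = new QName ("string", "http://www.w3.org/2001/XMLSchema");
        Sequence seq = new Sequence ();
        seq.Items.Add (el);
        return seq;
    }
}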

I've 90% finished XmlSchemaInference. I implemented it only because .NET 2.0 contains it.

XmlSchemaInference is very useful. For example, if you have a document like:

<products>
  <category>
    <category>
      <product name="foo" />
      <product name="bar" />
      <product name="baz" />
    </category>
    <product name="hoge" />
    <product name="fuga" />
  </category>
</products>

It creates two different definitions of the "product" element. Here are the inferred schema and the generated serializable class.
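
For reference, a minimal sketch of driving the inference over such a document, assuming the .NET 2.0 API shape (XmlSchemaInference.InferSchema returning an XmlSchemaSet); "products.xml" stands for the document above:

using System;
using System.Xml;
using System.Xml.Schema;

class InferenceSketch
{
    static void Main ()
    {
        XmlSchemaInference inference = new XmlSchemaInference ();
        XmlSchemaSet schemas;
        using (XmlReader reader = XmlReader.Create ("products.xml"))
            schemas = inference.InferSchema (reader);
        schemas.Compile ();
        // Dump every inferred schema to the console.
        foreach (XmlSchema schema in schemas.Schemas ())
            schema.Write (Console.Out);
    }
}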

So now I wonder whether I had better port the same feature to Commons.Xml.Relaxng. RELAX NG is not as sucky as XML Schema, so I might be able to provide a better XML structure inference engine. But XML structure inference itself is not that much fun.

... after some thought, I decided to enter a new suggestion in the MS feedback center, which seems to be working again recently.

xml:id and canonical XML


I found that the Last Call working draft of xml:id is out. But I think xml:id will be incompatible with Canonical XML (xml-c14n). Below is an excerpt from section 2.4 Document Subsets in the xml-c14n W3C REC:

The processing of an element node E MUST be modified slightly when an XPath node-set is given as input and the element's parent is omitted from the node-set. The method for processing the attribute axis of an element E in the node-set is enhanced. All element nodes along E's ancestor axis are examined for nearest occurrences of attributes in the xml namespace, such as xml:lang and xml:space (whether or not they are in the node-set). From this list of attributes, remove any that are in E's attribute axis (whether or not they are in the node-set). Then, lexicographically merge this attribute list with the nodes of E's attribute axis that are in the node-set. The result of visiting the attribute axis is computed by processing the attribute nodes in this merged attribute list.

Well, I don't think xml:id is wrong here. It is xml-c14n that is based on the uncommitted premise that all xml:* attributes must be inherited (yes, xml:lang, xml:space and xml:base were). Anyway, don't worry about that incompatibility: Canonical XML is already incompatible with the XML Infoset with regard to namespace information items.

A change of seasons


Many of my friends have been saying that they feel sorry for Rupert, since he does not have any more clothes in this cold season. Today I was hanging around Shibuya (a central Tokyo area) with my friends, and they were so kind as to buy a new one for him (from my budget). Now he looks younger than before.

XmlSchemaInference

I was escaping from the /doc stuff and looking into the xsd inference task (I cannot stand working only on that annoying task). I wrote some notes, but they are incomplete. Apparently the most difficult area is particle inference, but right now I don't have many ideas. My current idea is to support a non-XmlSchema language.

the latest /doc patch


I've finally hacked up the whole /doc support feature, including the related warnings. I am now working on the testing side; I've already written the warning tests, but I still need to compare the results. Here are the latest patch and the set of compiler sources.

monodoc-aspx on windows

For those who are interested, here are my local changes to monodoc that make monodoc-aspx runnable on Windows.

ResolveEntity()


XmlReader.ResolveEntity() is one of the biggest problems for custom XmlReader implementors, since it never provides an XmlParserContext. Well, most of those implementations just take another XmlReader as a constructor argument. In such cases, they can simply invoke that XmlReader's ResolveEntity(). If you have the DTD information (name / public ID / system ID / internal subset), then you're still lucky, because you can create an XmlDocumentType node and XmlEntityReference nodes that hold ChildNodes. That's how our XmlNodeReader is implemented now.
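
To illustrate the "lucky" case, here is a small self-contained example; the entity name and the internal subset are made up. Because the DOM keeps the document type information, XmlNodeReader has entity reference children to expand:

using System;
using System.Xml;

class EntitySketch
{
    static void Main ()
    {
        // The internal subset declares the entity, so the XmlDocument
        // gets an XmlDocumentType node and an XmlEntityReference whose
        // ChildNodes hold the expansion.
        string xml =
            "<!DOCTYPE root [<!ENTITY foo 'foo-value'>]><root>&foo;</root>";
        XmlDocument doc = new XmlDocument ();
        doc.LoadXml (xml);

        XmlNodeReader reader = new XmlNodeReader (doc);
        while (reader.Read ()) {
            if (reader.NodeType == XmlNodeType.EntityReference)
                reader.ResolveEntity (); // expands into the child nodes
            else if (reader.NodeType == XmlNodeType.Text)
                Console.WriteLine (reader.Value); // prints "foo-value"
        }
    }
}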

BTW, XmlTextReader in System.Xml 2.0 can resolve entities. That means XmlTextReader now always has to check whether there is an entity reader working inside the class. Actually, a similar situation exists in XmlNodeReader and in the DTD validating reader. They are mostly the same (still different, though). For example, I wrote a new XmlNodeReader and a new XmlTextReader based on the old implementations, and their entity handling came out very close. So I think the handling for an entity-resolvable XmlReader could possibly be extracted into one (abstract) class; I haven't tried, though (since I cannot change the class hierarchy of XmlNodeReader or that of XmlTextReader).

A similar problem, and a possible solution, lies under the post-validation information providers (such as the DTD validator and the XSD validator, which handle default values). But I won't provide a common solution for it, because people should never use something like PSVI, which makes documents inconsistent (well, entities do too, but that is an already-happened disaster).

The United States of Absurdity


I am disappointed in American citizens.

Anyways, XmlTextReader got 20% faster yesterday, on my box and in my testing. With my pending patch it even gets 33% faster (in total), but I haven't committed that yet because it is nasty, manually expanding some functions inline.

XQueryCommand is dropped. How about XQueryConvert?

From the revision log of the latest W3C XQuery 1.0 working draft:

A value of type xs:QName is now defined to consist of a "triple": a namespace prefix, a namespace URI, and a local name. Including the prefix as part of the QName value makes it possible to cast any QName into a string when needed.

xs:QName is mapped to XmlQualifiedName, which contains only a local name and a namespace URI, so the prefix part of the new "triple" has nowhere to go.
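
The mismatch is easy to see in code; these are the only name parts XmlQualifiedName carries (the name and namespace here are made-up examples):

using System;
using System.Xml;

class QNameSketch
{
    static void Main ()
    {
        XmlQualifiedName name = new XmlQualifiedName ("item", "urn:example");
        Console.WriteLine (name.Name);      // "item" (local name)
        Console.WriteLine (name.Namespace); // "urn:example" (namespace URI)
        // There is no Prefix member, so the third component of the new
        // xs:QName "triple" cannot be represented here.
    }
}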

CS1587


I decided to postpone the /doc patch checkin (well, requesting approval and checking in) for a while (maybe two weeks or so), since there are many dangerous factors (many changes and potential breakage) in committing patches right now. Instead, I decided to create the complete patch for /doc (which implies that no more significant changes are intended).

But that also means I have to add parser/tokenizer annoyance for CS1587, "XML comment is placed on an invalid language element which can not accept it." ... it is really annoying. Just to support that warning, my small patch grew 1.5 times larger than before. And since a doc comment is not the kind of token that can be recognized by the parser (it is not an "error" to have those comments in improper code blocks), the task is mostly done by the tokenizer, and I need to keep track of token transitions as they happen. Having tokenization control code both in the parser and in the tokenizer is a bad idea, but there was no better way (there was a similar annoyance in the XQuery parser).
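
For example, this is the kind of placement CS1587 has to catch: the doc comment sits on a plain statement, which is not a documentable language element (the class name is just illustrative):

public class Sample
{
    public static void Main ()
    {
        // The next line triggers CS1587 when compiled with /doc:
        /// <summary>misplaced doc comment</summary>
        System.Console.WriteLine ("hello");
    }
}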

Anyways, that CS1587 task is mostly done (I hope so). I still have to handle the 'cref'-related warnings, which are also an annoyance.

It is still a mystery that a person who dislikes the /doc feature is the one working on it (ah, but it was ditto for XmlSchema ;-).

[19:30] I noticed that I had to make significant changes to keep up with the latest changes in mcs, which make my /doc patch mostly useless. Here's the latest patch.

Japanese Monkeyguide Translation


Recently one Japanese person has been contributing monkeyguide translations (well, we know that the monkeyguide is old). I put those translations here, though it is more of a repository than a set of good-to-browse pages. The files are still raw (because the old monkeyguide did not have an index.html).

Some Japanese people complain that there are no Japanese resources. We have them. Please navigate from http://www.mono-project.com to "Resources" >> "Related Sites" >> "Mono Japanese Translation". The page style is old, but a large part of the translations is up to date (well, the web pages have changed slightly since the 1.0 release).