Below are emails that Madpole sent me describing
his experiences with XML, which I am keen to
share with you. This is funny, but is aimed
at techies.
- Eadon
Madpole on < XML />:
Yes, Java is kool
because I am nicely developing our new system
without any hiccups or problems and without...
knowing any Java hehehe... it all works and
"glues" together... jar and blah... it is amazing
in a way...
But the amount of technologies thrown at me...
well... I start to question all this... it is
perhaps "early ages" but everybody has gone
totally XML mad it seems to me... I have been
spending days now typing rather than programming...
because it is easier to copy and paste some
text (even from an XML file) rather than create
hundreds of nodes and complex tree structures
for no reason at all... I am lately thinking
of inventing yet another tool hehehe (there
are millions of them already) - a tool which would
take my text in text format... substitute variables
and parse that through an XML engine so I don't
have to create Nodes manually.. Mad Mad Mad
long story....
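The tool Madpole is dreaming of could look roughly like the sketch below. It is written in Python rather than the Java of the original project, purely for brevity. The template text, the element names and the `render_xml` helper are all made up for illustration; only the idea (write text, substitute variables, let the XML engine build the nodes) comes from the email.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical template: plain text with $placeholders instead of hand-built nodes.
TRADE_TEMPLATE = Template(
    "<trade>"
    "<party>$party</party>"
    '<amount currency="$currency">$amount</amount>'
    "</trade>"
)


def render_xml(template, **values):
    """Substitute variables into the text, then let the XML parser build the tree."""
    # Escape the values so that characters like & or < cannot break the markup.
    safe = {
        name: escape(str(value), {'"': "&quot;"})
        for name, value in values.items()
    }
    return ET.fromstring(template.substitute(safe))


root = render_xml(TRADE_TEMPLATE, party="Smith & Sons", currency="GBP", amount=1500)
print(root.find("party").text)
print(root.find("amount").get("currency"))
```

The escaping step is the only part that needs care: without it, a value containing `&` or `<` would produce text the parser rejects.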
But there is good news... all those technologies
are so new... and there are so many of them...
I haven't made any progress in my "Urgent" project...
one finds oneself clicking and searching and
reading.. and a simple task suddenly becomes 1
week of investigations after which one just
gives up and writes some "Mickey Mouse" code
to cope with the problem...
Over the last 2 weeks I had to deal with DOM,
SAX, FOP, XML, XSL, XSLT, LT, XPath, JAXP, javax,
Xalan, Xerces... my head is spinning hehehe...
and I tried to get some data from an XML file ..
(which I discovered is in fact an fPmL file)...
and I could not... and now I discovered that
the reason is that they used XPointer and xlinkto.com
and what have You... This is not all... the
link below demonstrates it best.... once one dives
into XML... one will get very little work done
hehehe:
http://www.oasis-open.org/cover/xml.html#applications
I start to suspect that all those technologies
create "circular dependencies"... XML files
produce XML files by using XSL (which is XML
too of course)... I predict that in 1 or 2 years'
time people will start to say "Hmmm... I heard
there is this new software called 'Database'
and You can just get data out of this 'Database'
by writing simple queries... Magic!!!" HAHAHAHAHHAHAHAHAHAHAHA.
XML Part II
XML stands for "Extensible Markup Language".
That means that it can be "extended". My XML
experience suggests to me that in fact it can
ONLY BE EXTENDED. Everybody doing any work with
XML seems to be in need of EXTENDING it, rather
than just getting on with it. XML seems to be
this pure concept which is useless to anybody,
this "abstract generic class" which, however
nice in theory, has very little practical value.
So everybody is into "extending"; "extending"
in a way is a synonym for "getting it to work"
in XML lingo. So nothing new here: everybody
is doing their own work, in their own way, twisting
and interpreting the concepts to their liking.
"But at least there are standards" people tell
me, "at least it is a standard way of doing things".
Yes... there are standards... thousands of standards,
thousands of specifications, thousands of ways
of interpreting data in an XML file. So what is
new? Who has time to read those documents? There
never was and never will be a "standard" way
to write Java or C++ code, but at least I can
look at this code and understand it. I am not
a lawyer or Technical Engineer, my job is to
... err .. get the job done. I work for Business,
not the Academic community; I cannot afford or justify
time spent reading thousands of pages of documents
just to retrieve one data item from an XML file.
Case Study:
Department A generates an "fpml" file, which, yes,
you guessed it, is "extended XML for financial
products". The "fpml" specification is HUGE for
it encapsulates hundreds of different trading
concepts and business logic units. Of course,
this format does not cover all the "optional"
data items that Department A needs to put into
the file. So Department A ends up producing
Extended "fpml" - the actual trading information
is probably 10% of the file - the rest of the
data are "optional" items needed for our particular
business. Department B enhances this file
with additional items, extending the extended
"fpml" even further. Department C (my department)
is the 3rd in the chain. But the difference is
that whereas Departments A and B only produce
the file - adding extra items to it - I am the
first person in the chain who ACTUALLY has to read
it and interpret the data. This turns into a
nightmare because there are not only the XML specifications
to consider, but also the "fpml" specification and
other bits and pieces which various departments
added along the way. After 2 weeks of deliberations,
various discussions with XML experts, searching
and scanning and reading the Internet - I am no
closer to a solution for how to retrieve data from
this "Extended fpml" file in some standard,
organised and structured way. XPath, XPointer,
XThis, XThat would all resolve some of my problems.
But there is no sign of a "Unified Solution" and
therefore I find myself writing a new technology:
"XMadPath" - with its own way of resolving references
in the supplied XML file.
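For the curious, the core of something like "XMadPath" might boil down to the following. This is only a guess at the shape of the problem, sketched in Python rather than Java: it assumes that references look like `href="#some-id"` and point at elements carrying a matching `id` attribute. The sample document and the helper names are invented; the real file's conventions are not shown in the emails.

```python
import xml.etree.ElementTree as ET

# Made-up sample: each party is stored once and referenced via href="#id".
SAMPLE = """
<trade>
  <buyer href="#party1"/>
  <seller href="#party2"/>
  <party id="party1"><name>Bank A</name></party>
  <party id="party2"><name>Bank B</name></party>
</trade>
"""


def build_id_index(root):
    """Walk the tree once and remember every element that carries an id."""
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve(element, index):
    """Follow an href="#id" link if present, otherwise return the element itself."""
    href = element.get("href")
    if href is None:
        return element
    return index[href.lstrip("#")]


root = ET.fromstring(SAMPLE)
index = build_id_index(root)
buyer = resolve(root.find("buyer"), index)
print(buyer.find("name").text)
```

Building the index once means each reference is resolved with a dictionary lookup instead of another walk through the tree.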
Case Study Two
The XML file supplied to me contains about 130
lines. At the same time, this XML file represents
a relational database with probably about 20 different
tables. Somebody clever made sure that there
is no data repetition in the XML file by introducing
"links" to data items. Sometimes the "href" attribute
is used to achieve that, sometimes the "type" attribute.
I think the main aim of this "structure" was
to ensure that as few Nodes as possible have a Unique
name - i.e. to prevent anybody from getting to
the data items directly. So one has to iterate
endlessly through various nodes over and over
again only to retrieve the reference value which
then can be used in other iterations to obtain
the data item. That is quite a lot of processing
to do in order to read 130 lines of text, and
my latest idea is to push all the nodes - chevrons
and all - into a hash table or any other flat
structure so I can access data directly.
Let me analyse a few of my doubts about the whole
thing: some references to data are actually
larger than the data itself, and nodes with the
same name have got DIFFERENT data items as children,
which defeats the whole purpose. So far it has taken
me 1000 lines of code to retrieve 10 data
items from this file. In other words - if I
processed this file sequentially - line by line
- using "if line = "<NODE>" then jump
to this function" - the code would be much
more readable, simple and efficient during execution.
But thanks to XML we don't read the file sequentially
- we "load it" into a DOM tree (which means that
the file has already been read once sequentially)
- then typically we process it Node by Node (which
means that we are reading this file sequentially
again) or we use XPath or XPointer or any such
tool (which does the reading of the nodes sequentially
and the jumping up and down the branches for us).
I would really like to see statistics of how
many times the tree is scanned forwards and backwards
just to retrieve one data item... Blah blah
blah enough!
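The "push everything into a hash table" idea can be sketched in a few lines. Again this is Python rather than Java, and only an illustration: the sample document is invented, and the key format (a path with a position number after each node name, plus `@name` for attributes) is one arbitrary choice among many.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample with two nodes of the same name.
SAMPLE = """
<trade>
  <party><name>Bank A</name></party>
  <party><name>Bank B</name></party>
  <amount currency="GBP">1500</amount>
</trade>
"""


def flatten(element, path, table):
    """Copy every text value and attribute into a flat dict keyed by path."""
    text = (element.text or "").strip()
    if text:
        table[path] = text
    for name, value in element.attrib.items():
        table[f"{path}/@{name}"] = value
    # Number the children per tag so that same-named siblings get distinct keys.
    seen = Counter()
    for child in element:
        seen[child.tag] += 1
        flatten(child, f"{path}/{child.tag}[{seen[child.tag]}]", table)
    return table


root = ET.fromstring(SAMPLE)
table = flatten(root, root.tag, {})
print(table["trade/party[2]/name[1]"])
print(table["trade/amount[1]/@currency"])
```

The tree is walked exactly once; after that every data item is a single dictionary lookup away, which is the direct access the email is asking for.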
XML - A structured way to pass the bug...
Department A extends fpml, which is extended
XML, Department B extends it even further...
and I am the poor bastard who has to make some
sense of it... now... let's call me Department
C... Now... I am also Department D - the person
who is supposed to produce documents from XML
... if I were Department C and not D - then I
would gladly extend the XML supplied by Department
B even further hehehehe....
The whole project (i.e. my project) was simple
- I am supposed to produce PDF documents from
XML... it was supposed to be a simple case of mapping
data in XSL and hey presto - Bob is ... But
due to extensive IT experience hehehe... I invented
an XML to XML stage.... just in case.... I thought...
we will map most stuff but I need to have a means
of tweaking a few things if needed... the whole
idea was that I just practically copy the same
XML and have a hook for quick fixes which XSL
would not handle... I was even going to abandon
this idea due to time constraints... but now
... XML to XML became my MAJOR job hehehe...
There is no way we could produce PDF using XSL
from what I am getting... I am basically working
on "flattening" all those complex structures
which kind people produced for us in order to
keep us in a job I presume hehehe.... I am flattening
all the structures and making all "non-unique"
node names unique so when we get to XSL - we
will be able to bloody just retrieve the data
without writing thousands of lines of code in order
to print somebody's name or address lol enough...
see.. I prefer to write all this bollox rather
than do the work which involves writing about
30 lines of the same code for each Node in the XML
file hehehe.
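A toy version of that XML to XML stage, for illustration only: the sketch below (Python, not the Java actually used) renames same-named sibling nodes to `name_1`, `name_2` and so on, then writes the XML back out for a later XSL step. The sample document, the naming scheme and the `make_names_unique` helper are assumptions, not the real project's code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up input in which two sibling nodes share the name "party".
SAMPLE = (
    "<trade>"
    "<party><name>Bank A</name></party>"
    "<party><name>Bank B</name></party>"
    "</trade>"
)


def make_names_unique(element):
    """Rename children that share a tag with a sibling to tag_1, tag_2, ... in place."""
    totals = Counter(child.tag for child in element)
    seen = Counter()
    for child in element:
        make_names_unique(child)
        tag = child.tag
        if totals[tag] > 1:
            seen[tag] += 1
            # Assumes no sibling is already called e.g. "party_1".
            child.tag = f"{tag}_{seen[tag]}"


root = ET.fromstring(SAMPLE)
make_names_unique(root)
flat_xml = ET.tostring(root, encoding="unicode")
print(flat_xml)
```

After this pass a stylesheet can address `party_2/name` directly instead of looping over every `party` node to find the right one.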
From: MadPole, 2004-04-15 16:27:06