Discussion:
[mdr-users] MDR usage as primary storage/cache
William Will
2002-09-27 17:55:45 UTC
Hi there,

according to my understanding MDR 1.0 will only try to achieve usage as a
caching mechanism for metadata, and not as a form of primary storage. I'm
trying to understand the reasoning behind this, since it puts considerable
constraints on MDR's usability.

Essentially any application that derives metadata must dump this information
to XMI or some other format at some reasonably short time interval in order
to prevent data loss. I can also see the lack of primary storage
functionality as being a problem for any type of modeling tool that is
actually used to input mappings/metadata between different metamodels.
According to my understanding the ability to model and operate on multiple
levels of models was to be one of the main features of the MOF/MDR
combination...

So, are there any plans to make MDR with the current BTree storage a
reliable primary storage mechanism? I think some of the main promise of MDR
lies in this area of functionality.

Cheers,
William Will
Holger Krug
2002-09-27 18:50:00 UTC
Post by William Will
according to my understanding MDR 1.0 will only try to achieve usage as a
caching mechanism for metadata, and not as a form of primary storage. I'm
trying to understand the reasoning behind this, since it puts considerable
constraints on MDR's usability.
Essentially any application that derives metadata must dump this information
to XMI or some other format at some reasonably short time interval in order
to prevent data loss. I can also see the lack of primary storage
functionality as being a problem for any type of modeling tool that is
actually used to input mappings/metadata between different metamodels.
According to my understanding the ability to model and operate on multiple
levels of models was to be one of the main features of the MOF/MDR
combination...
Hi, Will.

The restriction to use MDR only as a data cache relieves the MDR
implementors from paying attention to compatibility of storage formats
(between different versions of MDR). I don't understand that restriction
to mean that I have to fear data loss or destruction during one session.
The only restriction is that at session end I have to export the data,
i.e. I have to require my user to push the Save button. That's normal
for modeling tools, isn't it? Using MDR the following way:
- XMI as the data format visible to the user
- MDR as the data format during the session
should be 100% acceptable for any user.
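
To make the two-format arrangement concrete, here is a minimal sketch. It assumes a hypothetical ModelSession class in which a plain map stands in for the repository and another map stands in for the XMI document; none of these names come from the MDR or JMI APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the MDR or JMI API.
class ModelSession {
    // Stands in for the repository that holds the model during the session.
    private final Map<String, String> workingCopy = new LinkedHashMap<>();
    private boolean dirty = false;

    // Session start: load the user-visible exchange document into the working store.
    void open(Map<String, String> exchangeDocument) {
        workingCopy.clear();
        workingCopy.putAll(exchangeDocument);
        dirty = false;
    }

    // Edits during the session touch only the working store.
    void set(String elementId, String value) {
        workingCopy.put(elementId, value);
        dirty = true;
    }

    boolean hasUnsavedChanges() {
        return dirty;
    }

    // "Save" button: export a snapshot back to the exchange format.
    Map<String, String> save() {
        dirty = false;
        return new LinkedHashMap<>(workingCopy);
    }
}
```

In this sketch the exchange document is only touched in open() and save(), which is the point of contention in the thread: for very large documents those two steps are where the waiting happens.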
--
Holger Krug
***@rationalizer.com
William Will
2002-09-27 19:31:13 UTC
Holger,

the question arises when you have large volumes of data. We intend to dump
large files, 50-500 MB in size into the MDR, and look at the MDR almost like
an 'object database', based on open standards for model definition and
exchange. It seems strange that there are transaction/rollback features, but
no persistence. I know the NetBeans XML guys are thinking of similar
approaches.

Let's take the work going on in the Java project (anyone want to comment?)
as an example. The entire Java SDK will be parsed into the MDR, perhaps even
with multiple versions... This parsing no doubt takes hours, just as our
files at 500 MB are in the range of many minutes. Once the file is in the
MDR, there should be no need to 'export it', unless I'm dealing with
multiple MDR versions. A user should be able to just start MDR and continue
working on the extent as before.

I understand that a dump/reload via XMI with some special handling will be
necessary if a model or MDR version changes. I look at XMI as an
export/exchange and not as a storage format.

So I should reformulate my question: Will MDR storage in MDR 1.0 be stable
enough within one version of MDR to allow persistence across sessions of
usage? The answer to this question will really determine usability in a
number of scenarios, since data corruption would not be tolerable from
session to session. It would also have to drive the QA process on MDR.

Cheers,
William Will


Charles T. Betz
2002-09-29 16:12:08 UTC
This is a recurrent topic on this list. I am dying for an RDBMS or ODBMS
persistence layer for MDR. This would enable me to have an end-to-end
solution that addresses the more traditional meaning/context of
"metadata" -- e.g. data dictionary type functionality (leveraging CWM).

Why don't I jump in and contribute myself? Because the company I work for
won't permit it.

-ctb
Svata Dedic
2002-09-30 08:10:34 UTC
Post by Charles T. Betz
This is a recurrent topic on this list. I am dying for an RDBMS or ODBMS
persistence layer for MDR. This would enable me to have an end-to-end
solution that addresses the more traditional meaning/context of
"metadata" -- e.g. data dictionary type functionality (leveraging CWM).
Why don't I jump in and contribute myself? Because the company I work for
won't permit it.
Well, this is part of the answer to why MDR isn't an RDBMS or ODBMS :-\
The NetBeans IDE group does not need that functionality. Others may need
it, but they do not contribute the functionality to the project for one
reason or another, waiting for - what? SUN? Someone else in the community?
Why should SUN NetBeans developers spend considerable man-hours
implementing functionality they don't need? Just as your employer
would not permit contributions to other projects, I suppose that SUN
would not put resources into something which does not have much effect
on the rest of the IDE.
Note that Martin is not discouraging anyone from building such a layer; he
is only saying that he will not do it.

-Svata
--
Svatopluk Dedic <mailto:***@sun.com>
NetBeans, Java/Repository <http://java.netbeans.org>
Charles T. Betz
2002-09-30 13:15:41 UTC
I apologize if it seems like I am pestering the NetBeans team to develop
something they do not need; this is not my intention. I am hoping that
someone under less commercial restriction than I will also see the need and
perhaps start a new open-source initiative (not to implement an ODBMS/RDBMS!
just to map MDR to an existing one). If they do so I will be able to at
least contribute a real-world alpha site.

Charlie
Svata Dedic
2002-09-30 07:55:08 UTC
Post by William Will
Holger,
Let's take the work going on in the Java project (anyone want to comment?)
as an example. The entire Java SDK will be parsed into the MDR, perhaps even
with multiple versions... This parsing no doubt takes hours, just as our
Parsing the JDK, without using MDR to store the data, takes about 10 minutes;
about 5 minutes if a hacked parser is used that does not perform the
semantic phase on method bodies. Again, this is *not* counting the overhead
of an MDR-based solution.

-Svata
--
Svatopluk Dedic <mailto:***@sun.com>
NetBeans, Java/Repository <http://java.netbeans.org>
Brian Smith
2002-09-27 19:34:33 UTC
William,

I think it is just a different usage model than what you expect. For the
use cases I have seen (the Java module and my SMN module), it is
actually better for it to be a cache than primary storage. The reason is
that Java already has a primary storage (.java files and .class files)
and so does SMN (the .smn files and eventually ".smn-group" files). My
understanding is that the Java and SMN modules will use MDR only to
cache metadata to avoid re-parsing the input files. Similarly, the
NetBeans XML module will apparently use MDR to cache metadata for XML
documents.

If you look at my SMN module, you will see that pretty much all of the
pre-existing MDR explorer functionality (except "Generate JMI
Interfaces" and "Instantiate") is available on the context menus of the
SMN documents. I will add "Generate JMI Interfaces" soon. Also, soon the
nodes for the SMN documents will be able to display the contents of the
document like the Java module displays fields/methods/etc. Once that is
done, there really won't be any reason for the SMN module user to use
MDR Explorer to manipulate their metamodels anymore. And, that is the
effect I am going for: The MDR repository should be transparent to the
normal user, even the MOF metamodel developer. That is why, for example,
the "Import SMN..." and "Instantiate" actions aren't in the context menu
for the documents themselves (they are only in MDR Explorer's context
menu), because they are specific to the idea of having some kind of
repository. I could change the module to work purely on the SMN
concrete-syntax-trees without using JMI at all and the user interface
would be the same.

Now, if your application is a "pure JMI" application that requires only
XMI documents for storage, it might seem like you want the
repository to be the primary storage and the XMI to be just an export
format. But, I think that doesn't fit well with typical usage models for
MDR that I see. If you are developing a repository-centric tool then you
will probably need some mechanism for allowing multiple users to access
a single repository and work together. And, then you'd need some kind of
undo/redo and versioning support in the repository as well. And, I think
that kind of stuff is really stretching MDR too far in the direction of
being an ODBMS instead of being a metadata management tool for NetBeans
modules. There is already another JMI-compatible metadata repository
(Adaptive Repository) that seems to be tuned to those types of use
cases. But, I don't think that Pete is going to give you the source code
for it.

In Project XEMO's case it seems there are already existing file formats
available for storing compositions (e.g. MusicXML). Why don't you want
to use them as the primary storage?

- Brian
William Will
2002-09-27 20:15:56 UTC
Brian,

I think the scenario I'm working on is in-between MDR as a pure metadata
cache and MDR as a persistent form of storage. We do have a number of
existing file formats that can be used as primary storage, one of them being
MusicXML. The problem is the size and parsing of these files in a desktop
application, with the user sitting and waiting a long time to open/save a
file. Many XML formats suffer exactly from this problem... large and
verbose.

Ideally we would drop the XML into MDR, perform our conversions as needed,
edit the instance data, e.g. using a notation editor, and close the session
without having to go to primary storage, i.e. back to MusicXML. Upon
starting a session again, MDR would be accessed to continue editing. Only at
some later point of time would we dump back to MusicXML for exchange
purposes. The undo/redo support you allude to is inherently already there
with the notion of transactions/rollback as provided in the MDR API.
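
As a rough illustration of how a transaction with rollback can serve as a one-step undo, consider the sketch below. TransactionalStore and its methods are hypothetical names invented for this example, not the MDR API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; a map of element id -> value stands in for the repository contents.
class TransactionalStore {
    private Map<String, String> committed = new HashMap<>();
    private Map<String, String> pending = null; // non-null while a transaction is open

    void begin() {
        if (pending != null) throw new IllegalStateException("transaction already open");
        pending = new HashMap<>(committed); // work on a private copy
    }

    void put(String key, String value) {
        if (pending == null) throw new IllegalStateException("no open transaction");
        pending.put(key, value);
    }

    // end(true) discards the pending changes; end(false) makes them the committed state.
    void end(boolean rollback) {
        if (pending == null) throw new IllegalStateException("no open transaction");
        if (!rollback) committed = pending;
        pending = null;
    }

    // Reads see the pending state inside a transaction, the committed state otherwise.
    String get(String key) {
        return (pending != null ? pending : committed).get(key);
    }
}
```

Note that in this sketch rollback only discards the changes of the transaction that is still open; undoing an edit that was already committed, or redoing one, would need additional bookkeeping on top.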

I think many applications will have similar problems. While we don't need
multi-user support, we don't want to have to reparse/generate large exchange
files constantly.
MDR could easily be used to address some of those performance issues. All
kinds of XML data used in the NetBeans IDE could be persisted in the MDR,
avoiding the need to reparse on startup - and greatly improving performance
of the IDE.
Martin Matula
2002-09-27 20:55:45 UTC
Hi William,
BTree storage is reliable and stable for the kind of usage you describe,
if you use one version of MDR. It is quite robust - it can recover from
unexpected failures - even if you turn off your computer during the
write operation it will work fine and nothing but the last transaction
will be lost (if the harddisk wasn't corrupted, of course). Storage
format can change from one version to another so you will have to be
careful when upgrading. But that was not the reason to state that MDR is
not intended to be used as a primary storage. The main reason for that
was that otherwise people would come with requirements for versioning
mechanisms, etc. We wanted to make it clear that we are not building a
full-featured object database, but rather a lightweight metadata
integration facility. For our use cases it is much more desirable to use
the source files as the primary data and version them, i.e. MDR does not
provide any support for versioning the storage and the storage files are
binary files which can hardly be versioned by existing versioning systems.
If this is not a problem for you - i.e. you accept that to version your
data you will have to flush it to XML - then MDR fits your use case.
Martin
Charles T. Betz
2002-09-29 16:15:23 UTC
-There is already another JMI-compatible metadata repository
(Adaptive Repository) that seems to be tuned to those types of use
cases. But, I don't think that Pete is going to give you the source code
for it.
That's exactly the problem. DSTC also has something proprietary, as does
MetaMatrix. But I CAN NOT make a business case for these expensive products
yet! If there WERE an open-source implementation, I think it would FURTHER
the interests of these vendors. I could get a skunkworks implementation up
to demonstrate the utility of these solutions in a Fortune 100 shop, and
when the higher ups started asking about support, then I'd have the business
case to bring in a vendor. I'll be blunt -- people are being too greedy, and
they risk having 100% of nothing rather than 5% of something.

-Charlie
P***@ubsw.com
2002-09-30 08:47:39 UTC
Post by Charles T. Betz
This is a recurrent topic on this list. I am dying for an RDBMS or ODBMS
persistence layer for MDR. This would enable me to have an end-to-end
solution that addresses the more traditional meaning/context of
"metadata" -- e.g. data dictionary type functionality (leveraging CWM).
Why don't I jump in and contribute myself? Because the company I work for
won't permit it.
A JDBC persistence layer would probably be necessary for the application
we have in mind, because the metadata needs to be managed as a shared
resource. Does the MDR architecture allow for something such as this
to be plugged in reasonably neatly?

Is anyone thinking of developing such a layer?

Martin Matula
2002-09-30 18:41:14 UTC
Hi Philip,
Post by P***@ubsw.com
Post by Charles T. Betz
This is a recurrent topic on this list. I am dying for an RDBMS or ODBMS
persistence layer for MDR. This would enable me to have an end-to-end
solution that addresses the more traditional meaning/context of
"metadata" -- e.g. data dictionary type functionality (leveraging CWM).
Why don't I jump in and contribute myself? Because the company I work for
won't permit it.
A JDBC persistence layer would probably be necessary for the application
we have in mind, because the metadata needs to be managed as a shared
resource. Does the MDR architecture allow for something such as this
to be plugged in reasonably neatly?
Currently not. As Tomas already wrote, currently you can provide your
own storage layer by implementing the Storage SPI. However, the SPI is too
low-level to enable you to take advantage of an RDBMS. It is doable to
provide an RDBMS implementation, but it would still need to be restricted
to single-user usage, otherwise several things - mostly the event
notifications - would not work.
We are planning on making the storage model layer pluggable by
separating it out behind a set of APIs, but that will take some time (it is
not even scheduled yet, so I don't think we will start working on it before
the end of this year).
Post by P***@ubsw.com
Is anyone thinking of developing such a layer?
I am not aware of anyone.
Martin
Lombart Vincent
2002-09-30 14:41:20 UTC
Hello Charlie

I think there are several of us here who have come to the Netbeans MDR
for the same reason: metadata management is useful but the "Return
On Investment" for commercial (expensive) tools is hard to demonstrate.

The MDR is an open-source jewel that allows us to build a small-scale
repository and maybe support our demand for a large-scale, commercial
repository later on. But the lack of an RDBMS persistence layer is a
problem that severely limits the usefulness of even a small-scale
repository in an enterprise environment.

Unfortunately, the companies that do not want to pay for a commercial
repository usually do not want to pay for an open-source development.
And I understand very well that SUN does not want to pay for it either.

Does anybody know how much work is needed to build that persistence layer?

Vincent Lombart

Holger Krug
2002-09-30 15:11:44 UTC
Post by Lombart Vincent
I think there are several of us here who have come to the Netbeans MDR
for the same reason: metadata management is useful but the "Return
On Investment" for commercial (expensive) tools is hard to demonstrate.
The MDR is an open-source jewel that allows us to build a small-scale
repository and maybe support our demand for a large-scale, commercial
repository later on. But the lack of an RDBMS persistence layer is a
problem that severely limits the usefulness of even a small-scale
repository in an enterprise environment.
Unfortunately, the companies that do not want to pay for a commercial
repository usually do not want to pay for an open-source development.
And I understand very well that SUN does not want to pay for it either.
Does anybody know how much work is needed to build that persistence layer?
Before speaking about the amount of work I would recommend doing some
requirements analysis first. Maybe we can use this thread to collect the
requirements for such a persistence layer. This would definitely help
anybody who wants to implement it later on, and it is not very
much work for any of us.

Let me start:

1) Version management in a CVS like manner (different branches etc.)
2) Semantic diffs and merges (not based on a textual diff of some
textual representation, e.g. XMI, but based on the meta-model)
3) Merges also restricted to parts of the model (i.e. merge only the
changes I've made inside this containment hierarchy with the main trunk,
but not all the other changes I've made)
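
To illustrate what requirement 2 could mean in practice, here is a minimal sketch of a diff computed on model elements rather than on text. The representation (a map from element id to attribute values) and the ModelDiff class are assumptions made for this example only; a real implementation would work against the meta-model.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical representation: each model maps an element id to (attribute name -> value).
class ModelDiff {
    // Returns element id -> "added", "removed" or "modified"; unchanged elements are omitted.
    static Map<String, String> diff(Map<String, Map<String, String>> base,
                                    Map<String, Map<String, String>> changed) {
        Map<String, String> result = new TreeMap<>();
        for (String id : base.keySet()) {
            if (!changed.containsKey(id)) {
                result.put(id, "removed");
            } else if (!Objects.equals(base.get(id), changed.get(id))) {
                result.put(id, "modified");
            }
        }
        for (String id : changed.keySet()) {
            if (!base.containsKey(id)) {
                result.put(id, "added");
            }
        }
        return result;
    }
}
```

Because elements are matched by id, reordering them in a serialized file would not show up as a change here, which is the main difference from a textual diff of an XMI export.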
--
Holger Krug
***@rationalizer.com
Tomas Zezula
2002-09-30 17:40:41 UTC
Post by Holger Krug
Post by Lombart Vincent
I think there are several of us here who have come to the Netbeans MDR
for the same reason: metadata management is useful but the "Return
On Investment" for commercial (expensive) tools is hard to demonstrate.
The MDR is an open-source jewel that allows us to build a small-scale
repository and maybe support our demand for a large-scale, commercial
repository later on. But the lack of an RDBMS persistence layer is a
problem that severely limits the usefulness of even a small-scale
repository in an enterprise environment.
Unfortunately, the companies that do not want to pay for a commercial
repository usually do not want to pay for an open-source development.
And I understand very well that SUN does not want to pay for it either.
Does anybody know how much work is needed to build that persistence layer?
Before speaking about the amount of work I would recommend doing some
requirements analysis first. Maybe we can use this thread to collect the
requirements for such a persistence layer. This would definitely help
anybody who wants to implement it later on, and it is not very
much work for any of us.
1) Version management in a CVS like manner (different branches etc.)
2) Semantic diffs and merges (not based on a textual diff of some
textual representation, e.g. XMI, but based on the meta-model)
3) Merges also restricted to parts of the model (i.e. merge only the
changes I've made inside this containment hierarchy with the main trunk,
but not all the other changes I've made)
The MDR architecture provides a Storage SPI that storage vendors can use to
create new persistent storages for MDR (currently, only MemoryStorage and
BTreeStorage are available). The simplest way to build an RDBMS persistence
layer is to use this SPI. This approach is quite simple and does not require
any MDR changes, but RDBMS advantages such as indexing of StorableObjects'
attribute values could not be used, because the SPI treats a StorableObject
as an array of bytes.
In my opinion the requirements stated above should be implemented on a
much higher layer than the storage.
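
The byte-array limitation can be illustrated with a small sketch. The OpaqueStorage interface below is a hypothetical stand-in invented for this example and is not the actual Storage SPI; the "name=...;" encoding is likewise made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the storage sees only an id and an opaque byte array.
interface OpaqueStorage {
    void store(String id, byte[] serialized);
    byte[] load(String id);
    Iterable<String> ids();
}

// In an RDBMS-backed variant this map would be a two-column table (id, BLOB).
class InMemoryOpaqueStorage implements OpaqueStorage {
    private final Map<String, byte[]> rows = new HashMap<>();

    public void store(String id, byte[] serialized) {
        rows.put(id, serialized.clone());
    }

    public byte[] load(String id) {
        byte[] bytes = rows.get(id);
        return bytes == null ? null : bytes.clone();
    }

    public Iterable<String> ids() {
        return new ArrayList<>(rows.keySet());
    }
}

class AttributeQuery {
    // The storage cannot index attribute values it cannot see, so a lookup
    // by attribute has to load and decode every stored object.
    static List<String> findByName(OpaqueStorage storage, String name) {
        List<String> hits = new ArrayList<>();
        for (String id : storage.ids()) {
            String decoded = new String(storage.load(id), StandardCharsets.UTF_8);
            if (decoded.contains("name=" + name + ";")) {
                hits.add(id);
            }
        }
        return hits;
    }
}
```

With such an interface, swapping the in-memory map for a database table changes where the bytes live but not how queries work: the database would still hold opaque blobs and could offer little beyond lookup by id.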

Tomas
Holger Krug
2002-09-30 16:34:59 UTC
Post by Tomas Zezula
Post by Holger Krug
Post by Lombart Vincent
Does anybody know how much work is needed to build that persistence layer?
Before speaking about the amount of work, I would recommend doing some
requirements analysis first. Maybe we can use this thread to collect the
requirements for such a persistence layer. This would definitely be a great
help to anybody who wants to implement it later on, and it is not much
work for any of us.
1) Version management in a CVS-like manner (different branches etc.)
2) Semantic diffs and merges (not based on a textual diff of some
textual representation, e.g. XMI, but based on the meta-model)
3) Merges also restricted to parts of the model (i.e. merge only the
changes I've made inside this containment hierarchy with the main trunk,
but not all the other changes I've made)
The MDR architecture provides a Storage SPI that lets storage vendors create
new persistent storages for the MDR (currently, only the MemoryStorage and
the BTreeStorage are available). The simplest way to build the RDBMS
persistence layer is to use this SPI. This approach is quite simple and does
not require any MDR changes, but RDBMS advantages such as indexing of
StorableObjects' attribute values could not be used, because the SPI treats
the StorableObject as an array of bytes.
In my opinion the requirements stated above should be implemented on a
much higher layer than the storage.
Surely, on a higher layer. The reason why I started to collect
requirements is that I am quite sure it won't help anybody simply
to implement an RDBMS-based persistence layer. What good does it do to say
"OK, simply implement the MDR persistence layer for RDBMS, that will
take x weeks" if afterwards the person who did this realizes
that it does not meet her requirements?

One question regarding your proposal: are you sure that you can use an
RDBMS-based implementation of the MDR persistence layer in a multi-user
environment (i.e. several users working on the same model)? IMHO MDR
caching will make this quite useless and will produce strange results.
Am I wrong?
--
Holger Krug
***@rationalizer.com
Tomas Zezula
2002-09-30 18:48:47 UTC
Post by Holger Krug
Post by Tomas Zezula
Post by Holger Krug
Post by Lombart Vincent
Does anybody know how much work is needed to build that persistence layer?
Before speaking about the amount of work, I would recommend doing some
requirements analysis first. Maybe we can use this thread to collect the
requirements for such a persistence layer. This would definitely be a great
help to anybody who wants to implement it later on, and it is not much
work for any of us.
1) Version management in a CVS-like manner (different branches etc.)
2) Semantic diffs and merges (not based on a textual diff of some
textual representation, e.g. XMI, but based on the meta-model)
3) Merges also restricted to parts of the model (i.e. merge only the
changes I've made inside this containment hierarchy with the main trunk,
but not all the other changes I've made)
The MDR architecture provides a Storage SPI that lets storage vendors create
new persistent storages for the MDR (currently, only the MemoryStorage and
the BTreeStorage are available). The simplest way to build the RDBMS
persistence layer is to use this SPI. This approach is quite simple and does
not require any MDR changes, but RDBMS advantages such as indexing of
StorableObjects' attribute values could not be used, because the SPI treats
the StorableObject as an array of bytes.
In my opinion the requirements stated above should be implemented on a
much higher layer than the storage.
Surely, on a higher layer. The reason why I started to collect
requirements is that I am quite sure it won't help anybody simply
to implement an RDBMS-based persistence layer. What good does it do to say
"OK, simply implement the MDR persistence layer for RDBMS, that will
take x weeks" if afterwards the person who did this realizes
that it does not meet her requirements?
One question regarding your proposal: are you sure that you can use an
RDBMS-based implementation of the MDR persistence layer in a multi-user
environment (i.e. several users working on the same model)? IMHO MDR
caching will make this quite useless and will produce strange results.
Am I wrong?
No, you are right.
The problem is in the caches in the StorableObjects layer, but I think
it can be overcome. Caching at the Storage level is no problem. But don't
take it as a requirement. As I said above, with this way of implementing it
you lose all the advantages of an RDBMS. A better way would be to create a
layer between the Storables and the Persistence SPI, where the Storables are
still treated as objects with attributes and can be reasonably mapped to
database tables.
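
As a rough illustration of such a mapping layer, the sketch below (Python with SQLite, not MDR code) keeps one row per object and one row per attribute value, so the database can index attribute values. The class `AttributeMappedStorage`, the table layout and the index name are assumptions made for this example; a real mapping would be derived from the metamodel.

```python
import sqlite3


class AttributeMappedStorage:
    """Hypothetical mapping layer: attributes are visible to the database."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(
            """
            CREATE TABLE objects (mof_id TEXT PRIMARY KEY, type_name TEXT);
            CREATE TABLE attributes (
                mof_id TEXT REFERENCES objects(mof_id),
                name   TEXT,
                value  TEXT,
                PRIMARY KEY (mof_id, name)
            );
            CREATE INDEX attr_lookup ON attributes (name, value);
            """
        )

    def put(self, mof_id, type_name, attrs):
        with self.db:  # one transaction per stored object
            self.db.execute(
                "INSERT OR REPLACE INTO objects VALUES (?, ?)",
                (mof_id, type_name),
            )
            self.db.execute("DELETE FROM attributes WHERE mof_id = ?", (mof_id,))
            self.db.executemany(
                "INSERT INTO attributes VALUES (?, ?, ?)",
                [(mof_id, name, str(value)) for name, value in attrs.items()],
            )

    def find_by_attribute(self, name, value):
        # The WHERE clause matches the columns of the attr_lookup index,
        # so the database does not have to decode any object.
        rows = self.db.execute(
            "SELECT mof_id FROM attributes "
            "WHERE name = ? AND value = ? ORDER BY mof_id",
            (name, str(value)),
        )
        return [row[0] for row in rows]


mapped = AttributeMappedStorage()
mapped.put("id-1", "Table", {"name": "Customer"})
mapped.put("id-2", "Table", {"name": "Order"})
```

The price of this design is that the layer has to know the attributes of each Storable, which is exactly the knowledge a byte-array SPI hides.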

Tomas
Charles T. Betz
2002-09-30 17:03:34 UTC
Holger,

These are excellent requirements, and start to move the MDR in the direction
of a generalized CASE tool (metaCASE, actually).

However, I would like to lay out another use case: using the MDR APIs for
read-only system inventory work. This is more of a reverse
engineering/program understanding/enterprise application portfolio
management use case, in which versioning becomes somewhat less important.

An example use case would be:

1. Run a scanner against an RDBMS system catalog. Translate the proprietary
system catalog to standard CWM/XMI (Relational package).

2. Write the extracted system catalog to the repository.

3. Repeat for all RDBMSes in the shop.

4. User logs into Web site, queries repository. Repository emits XMI that
can be presented to user via XSLT transformation. This is where multi-user
architecture becomes important, not on the input side.
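
Steps 1 to 3 above can be sketched in a few lines. This is only a toy, written in Python against SQLite's own catalog: the function `scan_catalog`, the dictionary layout and the `repository` dict are hypothetical stand-ins, and the output is a plain data structure rather than real CWM/XMI.

```python
import sqlite3


def scan_catalog(conn, catalog_name):
    """Read SQLite's system catalog and return a neutral schema description."""
    table_names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    tables = []
    for table in table_names:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [
            {"name": col[1], "type": col[2], "nullable": not col[3]}
            for col in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        tables.append({"name": table, "columns": columns})
    return {"catalog": catalog_name, "tables": tables}


# Step 1: scan a source database (an in-memory one stands in for a real RDBMS).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customer (id INTEGER NOT NULL, name TEXT)")

# Step 2: write the extract to the repository; a full refresh simply
# overwrites the previous extract for that catalog.
repository = {}
repository["crm"] = scan_catalog(source, "crm")
```

Step 3 is a loop over all databases in the shop, each writing under its own key. Because every run replaces the whole extract, no delta processing is needed for the basic inventory.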

The reason I say that versioning is less important to this use case is that
the extracts can simply be completely refreshed, which simplifies a number
of issues tremendously. Delta processing is still useful for identifying new
elements for the inventory and enabling related administrative processes,
but great value can still be delivered without this.

The above use case is RDBMS-centric, but simply replace the RDBMS with

1. UML diagrams (UML metamodel)
2. J2EE servers (JMI -> EDOC metamodel?)
3. Enterprise messaging architectures (UML Profile for EAI)
4. ETL applications such as Informatica (CWM)
5. E/R modeling tools such as Erwin (CWM)

Etc.

This probably seems like alien stuff, given the development tools focus of
MDR. But I can see *so much* potential in that generalized MOF API and XMI
for my problem domain, which is managing enterprise system inventories at
scale. Sure, Adaptive already does this -- but again, I am in a
bootstrapping problem with limited resources, and need the open source
implementations to prove my case. If people had been charging half a million
dollars for TCP/IP twenty years ago, it never would have gone anywhere.

If anyone else is interested in these types of use cases, please let me
know. Perhaps we can start an interest group list. There is also the
potential of leveraging some of our local software engineering students who
are always looking for projects.

Thanks,

Charlie

P.S. W/r/t CASE: These types of requirements are also nontrivial, as the
history of 1st generation CASE tools shows. The issue of object lineage and
identity is a killer.

Even Erwin, with its limited domain of use (entity/relationship
diagramming), has had tremendous challenges precisely in the area of model
merge/delta with respect to the ModelMart RDBMS-based repository. (I worked
for Target, the large American retailer; when I left there a few months ago
there were about 35 outstanding issues we were pestering CA on, and this was
with respect to Erwin/ModelMart 3.5, supposedly mature technology)
Post by Holger Krug
Before speaking about the amount of work, I would recommend doing some
requirements analysis first. Maybe we can use this thread to collect the
requirements for such a persistence layer. This would definitely be a great
help to anybody who wants to implement it later on, and it is not much
work for any of us.
1) Version management in a CVS-like manner (different branches etc.)
2) Semantic diffs and merges (not based on a textual diff of some
textual representation, e.g. XMI, but based on the meta-model)
3) Merges also restricted to parts of the model (i.e. merge only the
changes I've made inside this containment hierarchy with the main trunk,
but not all the other changes I've made)
--
Holger Krug
Holger Krug
2002-09-30 17:20:39 UTC
Post by Charles T. Betz
Holger,
These are excellent requirements, and start to move the MDR in the direction
of a generalized CASE tool (metaCASE, actually).
Oh, excuse me. I didn't want to speak about MDR as such. These are quite
heavy requirements, certainly not for MDR. I meant them to be
requirements for a database based persistence tool which can *cooperate*
with MDR. I think we should let MDR be what it currently is: a nice,
fine and beautiful meta-data based repository for the client side.
Post by Charles T. Betz
However, I would like to lay out another use case: using the MDR APIs for
read-only system inventory work. This is more of a reverse
engineering/program understanding/enterprise application portfolio
management use case, in which versioning becomes somewhat less important.
1. Run a scanner against an RDBMS system catalog. Translate the proprietary
system catalog to standard CWM/XMI (Relational package).
2. Write the extracted system catalog to the repository.
3. Repeat for all RDBMSes in the shop.
4. User logs into Web site, queries repository. Repository emits XMI that
can be presented to user via XSLT transformation. This is where multi-user
architecture becomes important, not on the input side.
IMHO all this can be done with the current version of MDR. Take it,
do it and enjoy! Simply build a layer on top of MDR which handles user
requests, e.g. integrate MDR into a web server. There should be no
problem at all (at least as far as MDR is concerned). Or did I miss anything?
Post by Charles T. Betz
This probably seems like alien stuff, given the development tools focus of
MDR. But I can see *so much* potential in that generalized MOF API and XMI
for my problem domain, which is managing enterprise system inventories at
scale. Sure, Adaptive already does this -- but again, I am in a
bootstrapping problem with limited resources, and need the open source
implementations to prove my case. If people had been charging half a million
dollars for TCP/IP twenty years ago, it never would have gone anywhere.
I suppose they did charge, and the US government paid. Even government
people do not work without money.
--
Holger Krug
***@rationalizer.com
Charles T. Betz
2002-09-30 17:41:56 UTC
Post by Holger Krug
IMHO all this can be done with the current version of MDR. Take it,
do it and enjoy! Simply build a layer on top of MDR which handles user
requests, e.g. integrate MDR into a web server. There should be no
problem at all (at least as far as MDR is concerned). Or did I miss anything?
Scalability of B-tree? Any concurrency issues for multiple read-only access
via XMI Writer? Eventually the load process might need transactionality, so
we could multithread it. Perhaps you're right though. I've been so tied to
my RDBMS-centric worldview that I haven't asked whether this could be
achieved with a shared B-tree file and an application server.

In this case, open source work presents itself in (for example) writing a
scanner that can translate JDBC into CWM XMI (Relational package). Has this
been done to anyone's knowledge?
Post by Holger Krug
Post by Charles T. Betz
If people had been charging half a million
dollars for TCP/IP twenty years ago, it never would have gone anywhere.
I suppose they did charge, and the US government paid. Even government
people do not work without money.
Point well taken.

-ctb
Martin Matula
2002-09-30 20:28:55 UTC
Hi Charles,
Post by Charles T. Betz
In this case, open source work presents itself in (for example) writing a
scanner that can translate JDBC into CWM XMI (Relational package). Has this
been done to anyone's knowledge?
We are doing it, but in the unreleased version of our database explorer
module (it has not been released yet because, due to a lack of resources, we
had to suspend its development for several months). I will try to find
out whether the JDBC->CWM mapping can be released separately.
Martin
Pete Rivett
2002-10-03 15:19:19 UTC
FYI The MetaIntegration toolset does this, as well as bridging most data
modeling tools to/from (CWM) XMI; for details and an evaluation version see
www.metaintegration.net.

Pete

Pete Rivett (***@adaptive.com)
Dean Park House, 8-10 Dean Park Crescent, Bournemouth, BH1 1HL, UK
Tel: +44 (0)1202 449419 Fax: +44 (0)1202 449448
http://www.adaptive.com
Holger Krug
2002-10-01 06:25:16 UTC
Post by Charles T. Betz
Post by Holger Krug
IMHO all this can be done with the current version of MDR. Take it,
do it and enjoy! Simply build a layer on top of MDR which handles user
requests, e.g. integrate MDR into a web server. There should be no
problem at all (at least as far as MDR is concerned). Or did I miss anything?
Scalability of B-tree?
You have read-only access. Why should an RDBMS be any better than a B-tree?
Post by Charles T. Betz
Any concurrency issues for multiple read-only access
via XMI Writer?
Multiple read-only access is explicitly allowed.
Post by Charles T. Betz
Eventually the load process might need transactionality, so
we could multithread it.
Use different repositories for read and write and switch after the write
is completed.
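
One way to picture this switch is the small Python sketch below. It does not use any MDR API: `SwitchingRepository`, `read` and `reload` are hypothetical names, and plain dictionaries stand in for two repositories. Readers always see a complete snapshot, because the new one is built off to the side and only then swapped in.

```python
import threading


class SwitchingRepository:
    """Serve reads from one snapshot while the next one is being loaded."""

    def __init__(self, initial):
        self._active = initial
        self._load_lock = threading.Lock()

    def read(self, key):
        # Readers only touch the active snapshot, which is never mutated.
        return self._active.get(key)

    def reload(self, loader):
        # Only one load runs at a time; readers are not blocked by it.
        with self._load_lock:
            fresh = loader()       # build the write-side repository
            self._active = fresh   # switch once the write is complete


repo = SwitchingRepository({"crm": "extract-1"})
repo.reload(lambda: {"crm": "extract-2", "billing": "extract-1"})
```

With this arrangement the load process needs no transactions against the repository that readers are using; a failed load simply never gets switched in.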
Post by Charles T. Betz
Perhaps you're right though. I've been so tied to
my RDBMS-centric worldview that I haven't asked whether this could be
achieved with a shared B-tree file and an application server.
;-)

Good luck in starting your project !
--
Holger Krug
***@rationalizer.com
Martin Matula
2002-09-30 19:34:03 UTC
Hi Charles,
Post by Charles T. Betz
1. Run a scanner against an RDBMS system catalog. Translate the proprietary
system catalog to standard CWM/XMI (Relational package).
2. Write the extracted system catalog to the repository.
3. Repeat for all RDBMSes in the shop.
4. User logs into Web site, queries repository. Repository emits XMI that
can be presented to user via XSLT transformation. This is where multi-user
architecture becomes important, not on the input side.
I don't see why you need an RDBMS storage layer for this. It is easily
doable with the current implementation of MDR.
Post by Charles T. Betz
The above use case is RDBMS-centric, but simply replace the RDBMS with
1. UML diagrams (UML metamodel)
2. J2EE servers (JMI -> EDOC metamodel?)
3. Enterprise messaging architectures (UML Profile for EAI)
4. ETL applications such as Informatica (CWM)
5. E/R modeling tools such as Erwin (CWM)
Etc.
Sure. We want to use the MDR in the same way. The only difference is
that the interface will not be a web browser but the JMI API, and the users
will not be people but NetBeans modules. MDR will serve as a universal
integration point for various modules plugging in support for various
languages and types of metadata.
I still don't see where the requirement for an RDBMS storage comes from.
The above does not imply it at all.
Post by Charles T. Betz
This probably seems like alien stuff, given the development tools focus of
MDR.
Not at all. But we need to keep in mind that the reason we got resources
for doing MDR was NetBeans, and thus it has to be our first priority.
That's where the development tools focus of MDR comes from. But it is
not limited to development tools usage, and I hope that our users
realize that and that in the future we will see many different uses of
MDR. Then it could be interesting for Sun to even start sponsoring MDR
as a separate open-source project independent of NetBeans, if there is
enough interest in that.
Martin