Home » Forums » CMUD Beta Forum
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Mon Jun 07, 2010 7:38 pm

Backwards compatibility is such a pain
 
When working on the issue of existing DB variables not being loaded into the new CMUD 3.18d when they contain nested string lists, I've run into a bunch of problems. This issue of backwards compatibility is really causing some huge headaches. There are many days when I wish I didn't have to worry about backwards compatibility since it seems like I spend the majority of my time on it.

The issue is that a table in CMUD is stored internally as a complex data structure. In order to store this data structure to the database, or to export it to XML, the complex data structure needs to be converted to a simple string value. This is commonly called "serializing the data" in other languages.

In the past, CMUD has used a string format for stringlists and database variables like this:
Code:
stringlist:  item1|item2|item3...
dbvar: key1=value1|key2=value2|key3=value3...

Note that a dbvar is a special case of a string list. When using a nested list within a db variable, old CMUD/zMUD used quotes like this:
Code:
key1="item1|item2|item3"|key2=value...

The new version of CMUD has an improved internal structure for handling tables. A specific case to look at is how it stores a value into a table that contains the "|" character. Old CMUD/zMUD could not handle this. Any string containing "|" would be treated as a string list. But using a JSON string format, here are two different tables:
Code:
var1 = {"key1":["item1","item2","item3"]}
var2 = {"key1":"item1|item2|item3"}

With "var1" we see that we have a nested array containing 3 items within an object with a key of "key1". In "var2" we see that we have a single string value with a key of "key1".

However, when you convert these to the older CMUD/zMUD string format, you get the same result for both variables:
Code:
key1="item1|item2|item3"

So the old string format cannot properly maintain the *true* structure of the internal table.
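The collision can be reproduced with a minimal Python sketch. The `to_old_format` function here is a hypothetical stand-in for the old zMUD-style serializer, written only for illustration; it is not CMUD's actual code, and it assumes every value is either a string or a list of strings.

```python
# Hypothetical serializer for the old zMUD-style string format (illustration only).
def to_old_format(table):
    parts = []
    for key, value in table.items():
        if isinstance(value, list):
            # A nested list is joined with "|" and wrapped in quotes.
            text = '"' + "|".join(value) + '"'
        elif "|" in value:
            # A plain string containing "|" also has to be quoted.
            text = '"' + value + '"'
        else:
            text = value
        parts.append(f"{key}={text}")
    return "|".join(parts)


var1 = {"key1": ["item1", "item2", "item3"]}  # nested list of three items
var2 = {"key1": "item1|item2|item3"}          # one string that contains "|"

print(to_old_format(var1))
print(to_old_format(var2))
print(to_old_format(var1) == to_old_format(var2))
```

Both tables produce the same text, so nothing reading the old format back can tell which structure was intended.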

When doing the XML Export/Import in the new CMUD, I added a flag called "json=true" and then output the JSON string format instead of the zMUD string format. As somebody pointed out to me, the problem with this is that you can't take the XML output from the new version and use it to load into the older CMUD version. The older CMUD version doesn't understand the "json=true" flag and tries to read the value as a normal variable, and ends up making a complete mess. So I can't do it this way.

Also, as shown above, I cannot use the old string format when storing the value to the database because it doesn't preserve the proper internal table structure. But if I change the database to use the JSON string format, then older versions of CMUD will not be able to read your new *.PKG files properly.

Some people might just say to not worry about this. But this would mean that somebody couldn't "test" the new 3.x public version without it rewriting all of their *.PKG files making them incompatible with the 2.x public version. That's not a very nice thing to do and I'll get flooded with support mail about this.

One way to handle this would be to add an extra field to the database and the XML export giving the JSON string format. When the new CMUD reads the database or imports XML, it could look for this new field and use it instead of the old string value to create the internal table. However, this would double the data that CMUD writes out. There would be the old "compatible" format (with the known problems with nested tables), and the new json field. For large database variables, this could significantly increase the size of the *.PKG file and the XML export file, and also slow down the save routine since it would need to save both formats of the table.
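The extra-field idea could look roughly like the following Python sketch. The record layout and the field names `value` and `json` are assumptions made for illustration, not the real *.PKG schema, and the example covers only a flat string list.

```python
import json


def list_to_old(items):
    # Old-style string list: items joined with "|".
    return "|".join(items)


def old_to_list(text):
    # Old-style reader: every "|" is treated as an item separator.
    return text.split("|") if text else []


def save_record(items):
    # Write both representations: "value" for older clients, "json" for the new one.
    return {
        "value": list_to_old(items),
        "json": json.dumps(items, separators=(",", ":")),
    }


def load_record(record):
    # A newer reader prefers the JSON field and falls back to the old string.
    if record.get("json") is not None:
        return json.loads(record["json"])
    return old_to_list(record["value"])


items = ["plain", "has|pipe"]
record = save_record(items)

print(load_record(record))                      # new reader, JSON field present
print(load_record({"value": record["value"]}))  # old-format field only
```

With the JSON field present, the two-item list survives the round trip. With only the old field, the item containing "|" is split in two, which is the structure loss described above.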

I can't really think of any other way to do it at this point, so I'll probably just go with this double output format and not worry about the size and performance issue. But if anybody has other ideas, let me know soon.

In any case, this has really "thrown a wrench into the works" and slowed down the next release while I figure this all out and try to find the best way to handle it.
GeneralStonewall
Magician


Joined: 02 Feb 2004
Posts: 364
Location: USA

Posted: Mon Jun 07, 2010 8:08 pm
 
How about an 'export as old format' menu item or some such?
Rahab
Wizard


Joined: 22 Mar 2007
Posts: 2320

Posted: Mon Jun 07, 2010 8:10 pm
 
Yikes. Quite a kettle of fish there. When I looked at the new format, I did realize that converting back to 3.17 would not work. I wouldn't mind terribly if that continued to be true--for now I'm using 3.18 only for testing, and once it works well enough for my scripts I'll probably switch to it for gaming. But I imagine it would bother a lot of other folks.

I can only think of two other ways to do it. The first violates your policy of not branching your versions: make a new 2.xx version (and maybe a new 3.17 version) that can convert from the 3.18 format. You have valid reasons for your policy, so this isn't viable. The other is to write an external conversion script to go from 3.18 to earlier variable formats. But that's rather awkward.

You could do it with the double format for a while, then set a drop-dead date (well after the public release) after which a next version no longer supports (and actively removes) the duplicate format. But it doesn't reduce your work any (actually adds to it a bit).

Wish I could think of an alternative.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Mon Jun 07, 2010 8:46 pm
 
Quote:
How about an 'export as old format' menu item or some such?

The problem with that is that non-expert users wouldn't know about it. They would just export from 3.xx, a friend would import it into 2.xx, and all of the string lists and database variables would be empty. Also, this only helps with the XML format. It doesn't help with the *.PKG database format. If the *.PKG file doesn't have the old string format and you load it into the old version of CMUD, all of your string lists and database variables would be empty. The old version would then SAVE the package, potentially erasing your variables for good.

Quote:
The other is to write an external conversion script to go from 3.18 to earlier variable formats

If I use a separate database field to store the new json format, then if you loaded the *.PKG into 2.xx and then *changed* a database variable, the new field would not be updated, so when the package is loaded back into 3.x all of your changes to the variable made in 2.x would be gone. Again, the end-user would just see it as data loss. So if the end-user didn't know about the conversion program or just forgot to use it, they could cause data loss by just loading their new package into the old version.

Quote:
make a new 2.xx version

Yeah, that's not going to happen. My version control screwed up a while back so I am really unable to produce an updated 2.x version at this point.

Quote:
You could do it with the double format for a while, then set a drop-dead date (well after the public release) after which a next version no longer supports (and actively removes) the duplicate format

That's probably what I'll end up doing eventually, but as you said, it doesn't really help for now.
GeneralStonewall
Magician


Joined: 02 Feb 2004
Posts: 364
Location: USA

Posted: Mon Jun 07, 2010 9:06 pm
 
I suppose another option might be to change the extension of package files. When an old one is loaded, the new version leaves it untouched and creates one with the new extension. This would effectively make a backup without the possibility of it being overwritten with the new format. The older CMUDs would only show the .mud and .pkg extensions, so they couldn't be mixed up. The issue is that they wouldn't be able to use any changes they made in the new package (unless, once again, you could export to the old format).
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Mon Jun 07, 2010 11:05 pm
 
OK, I've thought a lot about this today and here is what I'm going to do:

My #1 priority is to not mess up the majority of CMUD users. This means I need to preserve the old "string format" for tables and lists. I will add the json format as a separate field to both the database and the XML output.

For most users, this will have very little effect, so it gives the least side effect for the largest number of users. The only users who will notice a side effect are those with very large database or string list variables. For those users, the XML output and the PKG output will grow to contain two copies of the data (the old string format plus the new json format). Saving a package will take slightly longer as the json format is added to the package database.

In reality, the effect is very minimal. Even dumping a huge database variable to both string format and json format just takes milliseconds. I can test this using my benchmark tests. When I run the "makedb" alias with a count of 10000 and then do a #VAR when it's done, the output of #VAR shows both the string format and the json format. And yet the output of #VAR is displayed almost instantly. So the performance decrease is minimal.

As for the size increase of the package file and XML file, disk space is cheap these days and I shouldn't be spending this kind of time trying to reduce a file size by just a few KB. The biggest annoyance will be for users with huge string lists who want to look at or edit their XML output, since they'll need to deal with the two copies of the data.

So, to summarize, this change only has a small effect on a very small number of users. Even though this might be some vocal power-users, I still think it's the right way to go. Dealing with multiple pkg versions, or pkg conversions, or some new options would just be confusing to the majority of users. Power users should be able to deal with this change.

The downside is that this will be yet another big change to the low-level code in CMUD. So this is going to put CMUD 3.19 back into Alpha stage for a few days until people can fully test it again.

What makes this stuff SO HARD is that in addition to the compatibility issues, there are big performance issues. CMUD needs to minimize the number of times the internal table is converted to and from a string format. For example, when you do this:
Code:
#VAR list %additem('item',@list)

it is important for CMUD to optimize this and *not* actually convert @list to a string value before calling %additem. Even though the arguments for %additem are string values, CMUD passes the internal hash table to the %additem function, which then adds the new "item" to this internal table. When #VAR sees this result, it directly assigns this internal table to the @list variable. But CMUD never actually computes the string or json string format of this internal table.

It's not until you do a #SHOW @List, or a #VAR command that CMUD converts the internal table to a readable string output value.

When the PKG database needs to be saved, the background thread checks each variable with an associated hash table to determine if the hash table has been changed since the string value was last computed. If the hash table has been changed, the new string value is computed and stored to the database.

So you can see the potential problems with this kind of optimization. If CMUD isn't very careful to flag changes to the underlying hash table, then the new string value might not be computed for saving to the PKG file, resulting in data loss. CMUD has to do a lot of work to keep the string value and the underlying hash table "in sync" while minimizing the number of conversions to and from a string value. It's quite complicated and that is why you sometimes will find bugs where something only fails if the variable doesn't exist yet, or it only fails if the variable already exists, or somehow changes to a database variable don't get saved or loaded from the pkg correctly.
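The "flag changes, convert lazily" pattern described above can be sketched in Python as follows. The class and attribute names (`TableVar`, `_dirty`, `conversions`) are hypothetical and only illustrate the general technique; CMUD's real internals are not shown in this thread.

```python
class TableVar:
    """A variable that keeps an internal table plus a lazily computed string value."""

    def __init__(self):
        self._table = []      # internal structure (a simple list here)
        self._cached = ""     # last computed string value
        self._dirty = False   # True when _table changed since _cached was computed
        self.conversions = 0  # counts table-to-string conversions, for illustration

    def add_item(self, item):
        # Modify the internal table directly; no string conversion happens here.
        self._table.append(item)
        # Forgetting this flag is the kind of bug that leads to data loss on save.
        self._dirty = True

    def string_value(self):
        # Convert only when the table changed since the last conversion.
        if self._dirty:
            self._cached = "|".join(self._table)
            self._dirty = False
            self.conversions += 1
        return self._cached

    def save(self, db, name):
        # The save routine stores the (possibly refreshed) string value.
        db[name] = self.string_value()


var = TableVar()
for item in ["a", "b", "c"]:
    var.add_item(item)
print(var.conversions)  # nothing has been converted yet

db = {}
var.save(db, "list")
var.save(db, "list")    # second save reuses the cached string
print(db["list"], var.conversions)
```

Three additions cost no conversions, and two saves cost one. The trade-off is that every code path that mutates the table has to remember to set the flag.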

If I weren't worried about compatibility and performance, then writing a MUD client would be a lot easier (which is what a lot of people learn when they try to create their own client).
mr_kent
Enchanter


Joined: 10 Oct 2000
Posts: 698

Posted: Tue Jun 08, 2010 7:47 am
 
I think I understand your concerns. I have a few thoughts/questions about this decision. Presumably, at some point CMUD v#.x will become obsolete and unsupported.
How far into the future will you be required to maintain backward-compatibility as it pertains to this issue? That is, will this data need to be duplicated indefinitely? My initial thought is yes but then, I'm sure that's not true at all.

As I understand it, this is an issue that I would want to view as a hurdle to jump or work through and be done with, rather than an ongoing concern. Once a stable version of CMUD is released with all the bugs ironed out relating to the new json object implementation, will we still need to (or need to be able to) create XML and PKG files that are compatible with earlier versions?

I'm likely over-thinking this but, might it be better or easier to create duplicate files by adding back the export/save code from a previous version, with the idea of eventually eliminating the duplication once you're confident that the new data structures work correctly? Do you need to create bridges extending beyond the beta versions?

Finally, are you able to force older versions of CMUD to ignore the extra data in the files? If so, perhaps just a field could be added pointing to the new-format file for subsequent versions to load.

If what I'm asking makes no sense then just say so. I might be even more clue-deficient than I know. Thanks for a wonderful product.
Rahab
Wizard


Joined: 22 Mar 2007
Posts: 2320

Posted: Tue Jun 08, 2010 1:08 pm
 
I believe what you are describing is exactly what Zugg has said he would probably be doing.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Tue Jun 08, 2010 5:00 pm
 
Yes, older versions will ignore any extra data in the XML output and any extra fields in the database (or extra options in the existing Options freeform field). So that's fine.

As far as supporting old versions...in the past, upgrades were free so I could just tell people they had to upgrade. Now that upgrades are paid, I plan to support the previous major version, at least as far as compatibility and support. The old version is frozen, so no bug fixes to it though.

I'm going with a "combo approach" on this. I'm going to add a preference option to "Output compatible format for string lists and database variables" as suggested above, but I'm going to make this Enabled by default. Power-users who don't want to see the duplicated old string format data in their packages can turn this off, with the understanding that then it cannot be used by players with an older version of CMUD. And TeSSH would have this option turned off because there is no sense adding this kind of compatibility mess to TeSSH.

But CMUD would keep this option enabled by default for probably a year. Then I'd just change the default value of this option to phase it out.
GeneralStonewall
Magician


Joined: 02 Feb 2004
Posts: 364
Location: USA

Posted: Tue Jun 08, 2010 8:43 pm
 
So what's the plan from here, then? Do you need to make those low-level code adjustments that you mentioned earlier, resulting in a new alpha, or is 3.18 still the focus for now? I ask because I want to know whether I should continue fumbling around with 3.18, or wait for the next release.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Tue Jun 08, 2010 8:46 pm
 
Yes, I still need to make the low-level changes, so you might as well wait for the next release. It might be stable enough to call a Beta, but it's going to be another "private" beta just for people in this forum. Should be out today or tomorrow.
GeneralStonewall
Magician


Joined: 02 Feb 2004
Posts: 364
Location: USA

Posted: Wed Jun 09, 2010 4:37 am
 
Just to be clear: This was only an issue with nested databases and lists?
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 09, 2010 4:50 pm
 
No, it's an issue with *any* string list or database variable. These variables are now stored internally as json tables and saved/exported via the json string format rather than the old zMUD string format. If the new option is enabled, then *both* string formats will be saved, otherwise only the new json string format will be saved.

Example:
Code:
dbvar=""
#addkey dbvar name zugg
#addkey dbvar level 20
list = ""
#additem list zugg
#additem list chiara

With the above variables, you have the following different string formats:
Code:
dbvar zmud: name=zugg|level=20
dbvar json: {"name":"zugg","level":20}

list zmud: zugg|chiara
list json: ["zugg","chiara"]

Note that this only affects how the data is saved to the pkg database and exported/imported via XML. When you access the "string value" of a database variable or list within your scripts, you still get the zMUD string format. You only get the json string format in your script if you specifically use the new %json function.
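For readers who want to reproduce the two formats from the example outside of CMUD, here is a rough Python equivalent. The helpers `zmud_dbvar`, `zmud_list`, and `to_json` are hypothetical names for this sketch; `to_json` uses Python's standard `json` module and is not CMUD's %json function.

```python
import json

# Same data as the #addkey / #additem example.
dbvar = {"name": "zugg", "level": 20}
names = ["zugg", "chiara"]


def zmud_dbvar(table):
    # Old format for a database variable: key=value pairs joined with "|".
    return "|".join(f"{key}={value}" for key, value in table.items())


def zmud_list(items):
    # Old format for a string list: items joined with "|".
    return "|".join(items)


def to_json(value):
    # Compact JSON with no spaces after separators.
    return json.dumps(value, separators=(",", ":"))


print("dbvar zmud:", zmud_dbvar(dbvar))
print("dbvar json:", to_json(dbvar))
print("list zmud:", zmud_list(names))
print("list json:", to_json(names))
```

Note that the JSON form keeps `level` as a number, while the old string format flattens everything to text.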
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 09, 2010 10:29 pm
 
Still fixing more low-level bugs in this stuff. I keep finding new bugs (some bugs caused by the new changes, other bugs that were still in 3.18d). Making good progress but still not ready to release a new beta yet. Hopefully tomorrow.
All times are GMT
© 2009 Zugg Software. Hosted by Wolfpaw.net