Register to post in forums, or Log in to your existing account
 

Play RetroMUD
Post new topic  Reply to topic     Home » Forums » CMUD General Discussion Goto page 1, 2  Next
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 1:02 am   

Feature request: Relevancy to searches in CMUD Documentation
 
In the CMUD Documentation viewer, you can search for something, but sometimes dozens of results match the command. For example, say you want to find the #SHOW documentation. You put in #SHOW and click Search. #SHOW is the 19th result. Can we make (i) exact match on the title of article appear higher up then (ii) matches within the title (where the title includes other text) and both of those higher up then (iii) matches within the body of the article? Or some other technique for getting relevancy. For example, with (iii) we could sort this by the number of times the term is found within the article.

Right now, I find I'm using the zMUD help file more than the CMUD help file because of the index and bookmarks in the zMUD help file - I can find anything very quickly because it is either well indexed or I have already bookmarked it (only have 9 bookmarks in it). I know you already ruled out an index for CMUD, but search really needs to be improved!
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Thu Oct 18, 2007 1:14 am   
 
Unfortunately, there is no way to do that. The local Help database is an SQLite database, and SQLite doesn't have any sort of relevancy or full text searching functions. Sorry. You'll have to use the Knowledge Base search in the Site menu on the web site for that. What it is doing right now is using the "x LIKE y" SQL query, which is about all I can do.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 2:04 am   
 
What are x and y? I find it impossible to believe that nothing can be done to improve the situation. I can think of some SQL queries that would help, but I need more to work with here.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 2:27 am   
 
BTW, you should really consider upgrading SQLite (at least for the Documentation initially) as the version you are using is over two and a half years old and many features (like full-text searching), fixes and performance improvements have been made since.
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Thu Oct 18, 2007 4:14 am   
 
It's doing this right now:
Code:
...WHERE (Name LIKE '%Search%') OR (Body LIKE '%Search%')

where "Search" is whatever you enter into the help search field.

Quote:
you should really consider upgrading SQLite

It's on my list. But it requires upgrading the ZeosLib database components that I use for Delphi and back when I got them they only supported v2.8. They their web site went away and since the components worked, I didn't bother with it. I think they are back and I *think* they now support v3. But it's been low on my to-do list since the ZeosLib components are one of the few 3rd party components I haven't had *any* trouble yet and I hate to "fix something thats not broken".
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 10:57 am   
 
OK, how about this then:
Code:
SELECT My, Current, List, Of, Columns, 10 AS MyOrder, LEN(Name)
FROM MyHelpDB
WHERE (Name = 'Search')
UNION
SELECT My, Current, List, Of, Columns, 20 AS MyOrder, LEN(Name)
FROM MyHelpDB
WHERE (Name LIKE '%Search%')
UNION
SELECT My, Current, List, Of, Columns, 30 AS MyOrder, LEN(Name)
FROM MyHelpDB
WHERE (Body LIKE '%Search%')
ORDER BY MyOrder, LEN(Name)

This will order them as listed in my first post - exact matches first, then partial name matches, then partial body matches (but without the frequency part, plus giving preference to articles with short names). It won't be quite as quick as your original SQL, but I reckon we can spare a few milliseconds. Actually, there are a couple of other issues with this query: you may not want to return the extra columns I added, and it may contain duplicates.

Using a self-join is better for performance (only one pass through the data is required), but it requires that the table has a numerical column where the values start at 1 and are guaranteed to be sequential. I don't know if you have such a column in this table. If you don't, you can use a pivot table to join against instead of joining the table on itself. Both of these should be faster (if indexed) than the first query, but they still have the same issues where you have extra columns and possible duplicates. The solution to these issues is to either put my query into a SUBSELECT (which I don't think is going to be possible) or to use a temporary table to remove the extra columns and duplicates. Insert the whole result into a temporary table, then SELECT DISTINCT MyColumn FROM MyTempTable. (Or filter the result in Delphi.) Putting my query into a VIEW may also be possible.
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Thu Oct 18, 2007 5:05 pm   
 
Quote:
it requires that the table has a numerical column where the values start at 1 and are guaranteed to be sequential

Close, but not quite. There is an "article_id" field that is numeric and sequential. But it doesn't start at 1 and there are gaps in the sequence. That's because it's the ID of all articles in the knowledge base, and the CMUD help file only contains those articles that apply to the CMUD Help. But it *is* a unique value (and is actually the primary key for the table).

SQLite uses LENGTH instead of LEN (yeah, so much for SQL standards). But it did seem to work pretty well. Now it's just a matter of removing the duplicates. I'm no SQL expert, so any suggestions are welcome.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 6:42 pm   
 
Which query are you using? The UNION query I posted?

Whichever, I suggest you create a VIEW:
Code:
CREATE VIEW IF NOT EXISTS [database-name.] view-name AS select-statement

e.g. Assuming that currently you only select the Article's name and article_id:-
Code:
CREATE VIEW IF NOT EXISTS SearchResultsView AS
   SELECT Name, article_id, 10 AS MyOrder, LENGTH(Name)
   FROM MyHelpDB
   WHERE (Name = 'Search')
   UNION
   SELECT Name, article_id, 20 AS MyOrder, LENGTH(Name)
   FROM MyHelpDB
   WHERE (Name LIKE '%Search%')
   UNION
   SELECT Name, article_id, 30 AS MyOrder, LENGTH(Name)
   FROM MyHelpDB
   WHERE (Body LIKE '%Search%')
   ORDER BY MyOrder, LENGTH(Name), Name

(I added Name to the order in case it makes sense for articles to be sorted by name as 3rd choice - might make it easier to find stuff in the results.)

Then:
Code:
SELECT DISTINCT Name, article_id FROM SearchResultsView

I've not tested any of this, and I don't know if your version of SQLite will accept a UNION in a VIEW (although I suspect it will, and the latest version certainly does). If it doesn't, you'll have to go with one of my other suggestions, both of which use an INNER JOIN instead. Unfortunately, they are pretty complicated to get right (not INNER JOINs in general - just these cases) so I'm not going to attempt this now. Also, with article_id - if it doesn't start at 1, it is probably still possible to use it instead of a pivot table, as long as the first article_id is constant then there can be a consistent subtraction, but it will be more complicated and conceptually harder to grasp.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 10:39 pm   
 
Heh, I just realised this is only going to work if you want to search for "Search". i.e. it's hard coded. Back to the drawing board!
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 10:49 pm   
 
I got the SUBSELECT I mentioned before to work - I thought this is how it would work but I was testing against SQL Server and it was baulking on it. So I downloaded SQLite Administrator and tried this:
Code:
SELECT DISTINCT article_title FROM
   (SELECT article_title, article_id, 10 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title = '#SHOW')
   UNION
   SELECT article_title, article_id, 20 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title LIKE '%#SHOW%')
   UNION
   SELECT article_title, article_id, 30 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_body LIKE '%#SHOW%')
   ORDER BY MyOrder, LENGTH(article_title), article_title)

And it worked fine! As there is no hardcoded view needed, CMUD can substitute whatever it wants for #SHOW. Nice! It works nicely on 'trigger' too. #TRIGGER is the first result! Very Happy (Runs 'within 50 ms' too, so I reckon we're OK!)
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Thu Oct 18, 2007 11:08 pm   
 
Cool! Thanks for helping with this. I'll try to add this new SQL statement to one of the next versions.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 11:13 pm   
 
Oh, you probably want the first line as:
Code:
SELECT DISTINCT article_title, article_id FROM

since you probably use the article_id invisibly to know which article to get when a user clicks on the article_title. I added it to the SUBSELECT, but forgot to add it to the outer SELECT in my last post.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Thu Oct 18, 2007 11:56 pm   
 
Seb wrote:
It works nicely on 'trigger' too. #TRIGGER is the first result!

Wow, I can't get over how much better this is! If you do a search for 'trigger' in v2.06 documentation, the articles that really deal with triggers are around 100 down from the top! Very Happy

OK, it looks like you are doing some stuff with the tree in v2.06 after doing a search, so maybe the SQL will need to be a bit more complicated, or maybe you are doing that in Delphi... But you should really try and get this in soon!
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 1:37 am   
 
Zugg wrote:
It requires upgrading the ZeosLib database components that I use for Delphi and back when I got them they only supported v2.8. They their web site went away and since the components worked, I didn't bother with it. I think they are back and I *think* they now support v3. But it's been low on my to-do list since the ZeosLib components are one of the few 3rd party components I haven't had *any* trouble yet and I hate to "fix something thats not broken".

I saw this on ZeosLib's Announcement forum (SQLite Administrator is what I installed earlier):
Quote:
The SQLite Administrator now supports SQLite 2.8 and SQLite 3.0 databases (using a CVS snapshot of ZeosLib's DBOs, Version 6.5.1).

They're currently on 6.6.1. SQLite 3.x has been around for about a year. However, I would be aware of a possible memory leak.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 1:43 am   
 
Ooh, SQLite Administrator even has a "Copy as Delphi Source" command (right-click on your SQL query).
Reply with quote
Tech
GURU


Joined: 18 Oct 2000
Posts: 2733
Location: Atlanta, USA

PostPosted: Fri Oct 19, 2007 5:21 am   
 
I'd like to piggyback on this request a bit.

I would like to be able to do a Ctrl-F search on the panel that actually displays the current documentation (i.e. the large panel to the right in the CMUD Documentation app).

Additionally it would be nice to have search terms highlighted (but this is more a nice to have.) The reason being is if I search for 'gauge' for example I know that any text that comes up references the term somewhere, but it can be difficult to quickly find exactly where my search term was referenced.
_________________
Asati di tempari!
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 9:10 am   
 
I agree with both points, although I think I mentioned this before and Zugg said the problem was that the Ctrl-F was being captured already by another control. Some programs use F3 (Windows Explorer) and F5 (Textpad) for searching, so those are other options. Or Ctrl-Shift-F, perhaps.
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Fri Oct 19, 2007 5:40 pm   
 
Are you talking about the "quick reference" panel? Unfortunately, there is no way to do any useful searches of that data...it's not a searchable database, it's just the hash table used for performing the low-level lookup and parsing of commands. You can only search the full helpfile database.

Highlighted search terms might be possible, but I'd have to think about how to really implement that. The current HTML component that is used to display the body of the text message is rather complicated and I'd somehow have to intercept the HTML rendering to highlight something. Not something that quick or easy (or very high on the priority list), sorry.
Reply with quote
Tech
GURU


Joined: 18 Oct 2000
Posts: 2733
Location: Atlanta, USA

PostPosted: Fri Oct 19, 2007 6:59 pm   
 
I didn't mean the Quick Reference panel, I meant the actual help documentation (i.e. the one that comes as a separate app). On the left side is the Topics list (table of contents) and on the right is where the topic is actually displayed.

I totally understand it not being high on the priority list, that's why it's a request. Very Happy
_________________
Asati di tempari!
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 10:36 pm   
 
You mean "the one that *also* comes as a separate app". It would be nice to be able search the right-hand panel. And in terms of the highlighting - something like Firefox does (and is also available for IE7 with IE7Pro or Find As You Type add-ons). In fact, if the CMUD Documentation viewer was able to send the html direct to your browser, rather than (or in addition to) opening the URL from zuggsoft.com, then we could use the Find in Page features with Next, Previous and Highlight all that is available in Firefox, etc. What's the component that is displaying the HTML in the CMUD Documentation viewer at the moment?

Heh, managed to create a 100% CPU lockup while experimenting with Ctrl-F in the CMUD Documentation viewer, but I'll make a separate thread for that. (It's actually a CMUD main window bug).
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Fri Oct 19, 2007 11:16 pm   
 
Oh, I see what you are talking about...searching the actual body of the help topic being shown. Ok, I'll add that to the wish list.

Seb: The CMUD Documentation Viewer hasn't been recompiled in a while and might be incompatible with the current help database. It's on my list to rebuild it, but I've had higher priority stuff to do.

In my current SQL, I actually do a "SELECT * FROM articles WHERE..." because it creates the dataset that all of the data-aware components on the form use. So I need more than just the article_title and article_id fields.

I think I'm going to use the current SQL without the UNIQUE as a basic improvement to the search, even though it includes duplicates. If you can figure out another way to do a query that will get rid of duplicates, but still give all of the fields in the result, let me know.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 11:30 pm   
 
The lockup wasn't in the separate CMUD Documentation Viewer exe - that doesn't have the problem - it's a main CMUD program problem and I've created a thread in the Beta forum.

The CMUD Documentation Viewer didn't work run when I tried it, but I managed to get it to work with a bit of playing around (copy the cmudhelp.db from latest CMUD) and copy sqlite.dll while renaming it to sqlite28.dll.

SQL to follow.
Reply with quote
Seb
Wizard


Joined: 14 Aug 2004
Posts: 1269

PostPosted: Fri Oct 19, 2007 11:38 pm   
 
SQL (just add the other columns to all the selects):
Code:
SELECT DISTINCT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body FROM
(
   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 10 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title = 'trigger')
   UNION
   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 20 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title LIKE '%trigger%')
   UNION
   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 30 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_body LIKE '%trigger%')
   ORDER BY MyOrder, LENGTH(article_title), article_title
)

Edit: You could also use SELECT *, XX AS MyOrder, LENGTH(article_title) FROM ...
in all of the inner selects, but it's generally better practice to explicitly list the columns as then your dependent code doesn't break if columns are reordered or added to the database.
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Fri Oct 19, 2007 11:53 pm   
 
Hmm, I could have sworn that when I tried that it still returned the duplicate entries. Well, I'll give it a try in 2.08 next week (just a bit too late for 2.07)
Reply with quote
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

PostPosted: Sat Oct 20, 2007 2:06 am   
 
OK, when I try the above SQL code, then it returns all of the fields as STRING types instead of their proper database field types. It's something about the outer SELECT, since even removing the DISTINCT keyword still has the problem. But if I just use:
Code:

   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 10 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title = 'trigger')
   UNION
   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 20 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_title LIKE '%trigger%')
   UNION
   SELECT article_id, article_sort, article_date, article_title, article_parent, article_index, article_ref, article_body, 30 AS MyOrder, LENGTH(article_title)
   FROM articles
   WHERE (article_body LIKE '%trigger%')
   ORDER BY MyOrder, LENGTH(article_title), article_title

the it works (except it has the duplicates). Any idea why the field types are being changed by the outer SELECT?
Reply with quote
Display posts from previous:   
Post new topic   Reply to topic     Home » Forums » CMUD General Discussion All times are GMT
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

© 2009 Zugg Software. Hosted by Wolfpaw.net