source: branches/2.20_branch/NewView/Design.txt@ 461

Last change on this file since 461 was 18, checked in by RBRi, 19 years ago

+ newview source

1NewView Design Notes
2--------------------
3
4Like all good hobbyist apps, NewView didn't really get an overall design.
5Overall it is largely based on Windows HTMLHelp, although arguably that is a fairly
6obvious design these days.
7
8Some aspects of it are somewhat non-obvious.
9
10NewView.exe is a Speedsoft Sibyl application. The original Sibyl libraries
11have been significantly bug-fixed, and slightly enhanced for performance etc.
12They were originally Sibyl Fixpack 3, as I had difficulties with Fixpack 4
13(FP4 introduced Linux support, so changed many things and broke some).
14
15The new HelpMgr.dll is written using Open Watcom C++
16http://openwatcom.org
17Actually it's straight C; for some reason I did not use C++...
18
19===============================================================================
20
21Sibyl Problems
22--------------
23
24- NewView is near a limit on the total number of Units that can be linked.
25I've removed unneeded dependencies; we must be down to about 6 remaining...
26
27The maximum I could use was 28 files in the project plus 108 dependencies.
28
29- As always you can't use optimisation. Most of it probably works but
30especially the bitmap decompression was going wrong when I accidentally
31turned optimisation on.
32
33- I/O checking is a funny thing. The original Sibyl code turned it off
34explicitly in system.pas so it was useless. I changed that.
35
36- Assembler: Sibyl doesn't recognise some valid 386 instructions
37and certainly can't do any Pentium/SSE/SSE2/3dNow instructions :)
38
39===============================================================================
40
41Strings
42-------
43
44Unfortunately Sibyl is still stuck in the Delphi 1.0 (?) days when strings
45were at most 255 bytes. This makes them useless for a lot of the stuff NewView
46does (such as decoding help topics). Sibyl has AnsiStrings, which are large,
47reference-counted strings. I think they do work, but unfortunately most of
48Sibyl's libraries do not use them, and without optimisation there is no
49efficient way to concatenate onto ansistrings(!).
50
51As a result of all this I use pchars (zero terminated strings), or
52my TAString class. This is incredibly tedious compared to automatically managed
53strings, but no worse than any other objects in Object Pascal, and these strings
54are pretty efficient for concatenation, and have no length limits. I made
55many conversion functions such as AsString to make it easier to combine with
56other types of strings.
57
58===============================================================================
59
60Performance
61-----------
62
63Currently, TAString has a cunning piece of code that explicitly checks for
64invalid object references in *every method*. While very cunning, this is kind of
65ridiculous performance-wise, because it sets up exception handling even for addition
66of a single character. It should probably be compiled out in release mode.
67
68Sibyl's compiler is unfortunately completely broken in optimising mode. How a
69commercial compiler could be released like this is an interesting question. Suffice
70to say, Sibyl gives you fast-ish compiled code, but it is probably less than half
71as fast as a properly optimised C/C++ program, maybe much worse. This is primarily
72because it does not make much use of registers, instead using stack for nearly
73everything.
74
75In any case, I have done a few key things:
76
77- const string parameters wherever possible.
78 This avoids the large overhead of copying the entire string each call
79
80- assembly for a few key things
81 Sibyl already has some assembly for certain string ops, I added a few more.
82 More could be used, but there are diminishing returns. Sibyl does have
83 a highly optimised Move (memory copy) so that is used e.g. in TAString
84
85- Minimising string operations
86 Some bits of the help topic decoding are a bit longer to avoid so much
87 copying of strings.
88
89Of course, algorithms have been optimised carefully, especially in the area
90of file I/O: I changed the help file loader over from loading the entire file
91to just loading pieces as needed.
92
93On the whole this achieves decent performance. Unfortunately, the SPCC component
94library seems somewhat sluggish - not to the level of interpreted code, but
95definitely much less snappy than compiled C apps. Here, I think optimisation is
96desperately needed, to speed up all the repetitive loops in e.g. loading forms
97from SCU (resource) data. Short of re-writing the whole thing in C/C++ this cannot
98be improved much.
99
100===============================================================================
101
102Multi-lingual Support
103---------------------
104
105The goals of the language code are:
106
107a) Load languages at runtime, reasonably fast
108b) Be human editable, so that people can contribute without
109 any tools required or even contacting the author
110c) Be able to update an existing language file with missing
111 items to avoid manual tracking of which strings have
112 been translated
113d) Minimal specific code
114
115b and c are to encourage open source contributions.
116
117Resources or msg files would satisfy a, but neither is human
118editable, and both are difficult to update at runtime.
119
120Instead, I implemented the following design
121
122Language Files
123
124These files are simply text files, each line consisting of:
125
126 <item name> "<item value>"
127
128<item name> is the name of various text strings from the application.
129The names are defined by the application code.
130<item value> is the string that should be used for this language.
131Strings can be DBCS for e.g. Japanese.
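
A minimal sketch of reading one such line, written in Python rather than Sibyl Pascal to keep it short. The function name is made up for illustration, and the handling of quotes inside a value is an assumption, since the format description above does not cover it.

```python
def parse_language_line(line):
    """Split one '<item name> "<item value>"' line into (name, value).

    Returns None for blank or malformed lines. Assumption: the value is
    everything between the first and the last double quote on the line.
    """
    line = line.strip()
    first = line.find('"')
    last = line.rfind('"')
    # Need a non-empty name before the first quote and a closing quote after it.
    if first <= 0 or last <= first:
        return None
    name = line[:first].strip()
    if not name:
        return None
    return name, line[first + 1:last]
```

A line such as `MainForm.OKButton.Caption "OK"` (a hypothetical item name) would come back as the pair of name and value.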
132
133Forms
134
135When created, each form registers a method of itself for language events,
136which looks like this:
137
138 Procedure OnLanguageEvent( Language: TLanguageFile;
139 const Apply: boolean );
140
141The registration is usually done in OnCreate
142
143 RegisterForLanguages( OnLanguageEvent );
144
145This callback is called in one of two cases:
146 1) Apply = true: A new language is being loaded, or
147 2) Apply = false: A language file is being updated/saved. (more later).
148
149In either case, the form is expected to load (or pretend to load) all its strings from the language file it is given. Specifically, it should:
150
151 1) Call Language.LoadComponentLanguage( self, Apply )
152
153 This loads strings for any static control elements e.g.
154 menu and button captions and hints. Parts of controls that
155 are normally variable e.g. listbox items are not loaded.
156
157 2) Load any strings to be used in code ("code strings")
158
159 It should do this by calling Language.LL:
160 Language.LL( Apply, <string var>, '<name of string>', '<default>' );
161 for each string to be found.
162 LL will look up the given name in the language file,
163 then store the value into the string variable, or <default> if not found.
164
165When the form first registers itself, the language callback will be
166called immediately, so the form can load the current language.
167
168Note: Even if the default language is being used (i.e. no language file is used,
169so the text is English) this still happens so that default values can be loaded for
170code strings (e.g. prompts).
171
172If Apply is false then no strings will actually be loaded but the references (names) will be recorded.
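
The behaviour of LL for the two values of Apply can be sketched roughly as follows. This is Python, not the real TLanguageFile: the class and method names are stand-ins, Python has no var parameters so the new value is returned instead of stored, and recording missing items only when Apply is false is an assumption.

```python
class LanguageFileSketch:
    """Toy stand-in for TLanguageFile; not the real class."""

    def __init__(self, items):
        self.items = dict(items)  # item name -> translated value
        self.missing = []         # (name, default) pairs not found in the file

    def ll(self, apply, current, name, default):
        """Mimic Language.LL and return the new content of the string variable."""
        found = name in self.items
        if not apply:
            # Update/save pass: note what the application needs, change nothing.
            if not found:
                self.missing.append((name, default))
            return current
        # Apply pass: use the translation, or the default if there is none.
        return self.items[name] if found else default
```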
173
174Strings Outside of Forms
175
176Code strings that are needed but aren't related to a form can be loaded by
177registering a procedure at unit initialization:
178
179 Initialization
180 RegisterProcForLanguages( OnLanguageEvent );
181
182These language events look the same as for forms but are not methods:
183
184 Procedure OnLanguageEvent( Language: TLanguageFile;
185 const Apply: boolean );
186
187This procedure should:
188
189 1) Set the prefix: Language.Prefix := '<Category>.';
190
191 This is mostly a convenience, allowing codestrings to
192 be grouped together. <Category> could be unit name, for instance.
193
194 NOTE: LoadComponentLanguage sets prefix automatically to the form
195 name being loaded, which allows related codestrings to automatically
196 appear under the form name. However, this prefix remains in effect
197 until changed or another form is loaded, so you MUST set it if loading
198 strings outside a form.
199
200 2) Call LL for each needed string, as before
201
202Loading Languages
203
204 var
205 Language: TLanguageFile;
206
207 try
208 Language := TLanguageFile.Create( Filename );
209 except
210 ...
211 end;
212
213 ApplyLanguage( Language );
214
215This will call all registered methods and procedures telling them that
216a new language is to be loaded.
217
218Updating Language Files
219
220 try
221 Language := TLanguageFile.Create( Filename );
222
223 UpdateLanguage( Language );
224
225 Language.AppendMissingItems;
226 except
227 ...
228 end;
229
230 Language.Destroy;
231
232UpdateLanguage calls all registered methods and procedures, but with
233Apply set to false so that required items are searched for, but not
234actually applied to components or code string variables.
235
236After UpdateLanguage, the TLanguageFile contains a list of missing items
237and their default values. AppendMissingItems writes this list to the
238end of the file called filename.
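
To make the two passes concrete, here is a rough Python sketch of the dispatch. The function names mirror the Pascal ones but are otherwise hypothetical, the immediate callback on registration is left out, and the output format of the appended lines is assumed to be the same as the language file format.

```python
_callbacks = []  # every registered language event handler


def register_for_languages(callback):
    """Remember a handler taking (language, apply)."""
    _callbacks.append(callback)


def apply_language(language):
    """A new language is being loaded: handlers really load their strings."""
    for callback in _callbacks:
        callback(language, True)


def update_language(language):
    """A language file is being updated: handlers only report what they need."""
    for callback in _callbacks:
        callback(language, False)


def format_missing_items(missing):
    """Turn (name, default) pairs into lines ready to append to the file."""
    return ['%s "%s"' % (name, default) for name, default in missing]
```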
239
240When a new language file is being saved, none of the required items
241will be found, so all of them will be stored. Note that the strings
242saved to the file will be the default values for code strings, but the
243last loaded language strings for forms.
244
245Dynamically Loaded Forms
246
247In order that language files can include all required strings,
248all forms must be loaded before updating a language file. This can be
249semi-automated by registering an "ensure loaded" procedure for each
250form:
251
252 procedure EnsureMyFormLoaded;
253 begin
254 if MyForm = nil then
255 MyForm := TMyForm.Create( nil );
256 end;
257
258then registering this procedure in the UNIT initialization:
259
260 Initialization
261 ...
262 RegisterUpdateProcForLanguages( EnsureMyFormLoaded );
263
264UpdateLanguage will call all these procedures before accessing the
265language file, ensuring that all forms are loaded.
266
267Performance of String Lookups
268
269Requirement (a) included the desire to be "reasonably fast". Well, looking up
270names in a string list is not the fastest thing to do. I've made this use
271a binary search, so each lookup is O( log N ) and loading all N strings is
272O( N log N ), which is generally considered acceptable. I haven't done any timings though.
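
For reference, a binary search lookup over a sorted name list looks roughly like this. It is a Python sketch using the standard bisect module; the function names are invented, and whether NewView compares names case-sensitively is not stated here, so exact comparison is assumed.

```python
import bisect


def build_index(items):
    """Sort (name, value) pairs once so that lookups can use binary search."""
    pairs = sorted(items)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    return names, values


def lookup(names, values, name):
    """Return the value stored for name, or None if it is not present."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return values[i]
    return None
```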
273
274Testing
275
276... there are currently 443 strings required for NewView ...
277
278In practice it seems to take no detectable time to load a language. That's
279on my Duron 1.1G, 256MB RAM.
280
281Multilingual stuff could be further optimised:
282- Optimise for sequential lookup. Most of the time, the next item will be
283 immediately after the previous item, so long as the language file
284 has not been rearranged. If not, fall back to normal search.
285 This is probably the simplest and most beneficial.
286
287- change to pstrings: instead of copying strings from the language file
288 (codestrings only), store pointers to them.
289 This would save time (since not copying so many strings) and memory
290 (since there would not be two copies of all strings).
291 The time is probably not significant because there is not a lot of data.
292 The memory would be nice to save, but a lot of the strings are component
293 properties which are generally allocated dynamically anyway, and we
294 cannot pass a pstring to component string properties.
295
296- could free strings from the language file after using them.
297 This would save memory, but might have side effects, and would be no faster.
298 OTOH it would save a lot of memory since even component properties would
299 not be duplicated.
300
301- lookup strings hierarchically (e.g. first search for formname in a small list)
302 This would be a huge improvement in speed for random access, but is probably
303 not needed if sequential access is optimised (see above)
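
The first idea (optimise for sequential lookup) might look something like the following Python sketch. Nothing like this exists in NewView yet; the class name and the fast_hits counter are purely illustrative.

```python
import bisect


class SequentialLookup:
    """Try the entry after the previous hit first, else fall back to a search."""

    def __init__(self, entries):
        self.entries = list(entries)  # (name, value) pairs in file order
        # (name, position) pairs sorted by name, used by the fallback search.
        self.by_name = sorted((name, i) for i, (name, _) in enumerate(self.entries))
        self.next_pos = 0
        self.fast_hits = 0  # how often the sequential guess was right

    def find(self, name):
        pos = self.next_pos
        if pos < len(self.entries) and self.entries[pos][0] == name:
            self.fast_hits += 1
        else:
            # Fallback: binary search for the first entry with this name.
            k = bisect.bisect_left(self.by_name, (name, -1))
            if k == len(self.by_name) or self.by_name[k][0] != name:
                return None
            pos = self.by_name[k][1]
        self.next_pos = pos + 1
        return self.entries[pos][1]
```

As long as the application asks for strings in the order they appear in the file, every lookup after the first is a single comparison.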
304
305===============================================================================
306
307Exception Logging
308-----------------
309
310I added code to the Sibyl libraries to store a callstack into global variables.
311Performance wise this is outrageously inefficient since it happens on every
312exception, but then exceptions should not occur in normal processing.
313
314This global callstack can be accessed in a method specified in Application.OnException.
315
316This code doesn't currently work properly from other threads, i.e. the search thread.
317
318
319===============================================================================
320
321Help Manager
322------------
323
324Replacing HELPMGR.DLL (c:\os2\dll\helpmgr.dll).
325
326HlpMgr2.pas contains a DLL implementation with the same interface.
327
328Use DLLRNAME target.exe HELPMGR=HLPMGR2 for testing.
329
330Initially, I was going to put all the code in HLPMGR2, loading mainform etc.
331
332The problem is that programs using help manager may have very little stack space, e.g.
333ICONEDIT.EXE has 24kB. NewView overflows this and crashes.
334
335Also, as a principle, I feel uncomfortable loading all my unreliable code
336into an existing, possibly very robust application's process space, potentially
337crashing it if anything goes wrong.*
338
339The third problem was the need to implement 16-bit entry points for certain
340older applications, such as PMCHKDsk.
341
342Therefore, instead we launch a NewView session (and relaunch if needed).
343The filename is passed as normal.
344
345In addition these parameters are passed:
346/hm:xxx
347indicating "helpmanager mode"
348xxx is the HelpManager window, for NewView to send messages to.
349
350/owner:yyy
351yyy is the application window using helpmanager
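
As an illustration only, picking these parameters out of a command line could look like this in Python. The function name is hypothetical, the window handles are kept as raw text because their number format is not specified here, and treating the switch names case-insensitively is an assumption.

```python
def parse_helpmgr_args(args):
    """Separate /hm: and /owner: switches from the remaining arguments."""
    result = {"hm": None, "owner": None, "files": []}
    for arg in args:
        lowered = arg.lower()
        if lowered.startswith("/hm:"):
            result["hm"] = arg[len("/hm:"):]
        elif lowered.startswith("/owner:"):
            result["owner"] = arg[len("/owner:"):]
        else:
            result["files"].append(arg)
    return result
```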
352
353
354* HelpMgr is small, tight C code with lots of validation and no GUI code. It's much
355simpler than NewView itself.
356
357Who Owns the Help Window
358------------------------
359
360In original helpmgr, the help window is set to be owned by EITHER
361a) the active top-level window when help requested (by whatever means)
362OR, if set:
363b) the "relative window" set by a HM_SET_ACTIVE_WINDOW message.
364
365In NewView I've currently disabled the ownership, because it's usually more annoying than helpful.
366
367Associations
368------------
369
370One or more windows can be associated with a help instance.
371
372I suspect the original HelpMgr stores a window's associated help instance somewhere in window words, but this is not documented.
373
374Instead, I store a list of associated windows with each help instance.
375
376?? Is this section out of date?
377
378===============================================================================
379
380ViewStub
381--------
382
383ViewStub opens up shared memory and examines a linked list there that specifies
384all files that any instance of NewView has open. If it finds the same set of files
385as is being specified in its parameters, already open in one NewView instance, then
386it just activates that instance (WinSetFocus; the list contains the main HWND).
387
388It then passes any important parameters (e.g. search text) to the existing instance
389with window messages referring to shared memory, as for help manager.
390
391If not found, then it runs a new instance of NewView (WinStartApp), passing all the
392parameters unchanged.
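
The "same set of files" test can be sketched as below, in Python for brevity. The shared list is modelled as plain (hwnd, files) pairs, the function name is invented, and comparing names without regard to case or order is an assumption.

```python
def find_instance(requested_files, instances):
    """Return the HWND of an instance with exactly these files open, else None."""
    wanted = {name.lower() for name in requested_files}
    for hwnd, files in instances:
        if {name.lower() for name in files} == wanted:
            return hwnd
    return None  # caller would start a new NewView instance
```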
393
394On eCS 1.1 we have the problem that there is already a copy of newview.exe in x:\ecs\bin.
395So the installer knows about that and a full install replaces that copy. Similarly the
396help file in x:\ecs\book.
397
398===============================================================================
399
400Searching
401---------
402
403The IPF compiler creates a search index table of all the "words" in the text of help topics.
404This does not include the titles (displayed in contents) of topics, nor does it include index words.
405
406Each entry in the search table is either a series of alphanumeric characters, or a single non-alphanumeric symbol. So for example if the text of a topic includes __OS2__ then the search table will have entries for "_" and "OS2".
407
408However it is often useful to be able to search for "words" containing symbols, so the program must look for consecutive strings of search table words.
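
A small Python sketch of splitting text into such "words". The function name is made up, and two details are assumptions: only ASCII letters and digits count as alphanumeric, and whitespace separates words without being stored.

```python
import re

# A run of alphanumerics, or a single non-alphanumeric, non-space symbol.
_IPF_WORD = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]")


def split_ipf_words(text):
    """Split text the way the search table is described as being built."""
    return _IPF_WORD.findall(text)
```

With this, `__OS2__` splits into two underscores, `OS2`, and two more underscores, which is why a search for `_os2_` has to look for consecutive table words.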
409
410NewView does this as follows:
411
412For each search term (TSearchTerm):
413 1. Separate into IPF words (symbols and alphanumeric strings) (TSearchTerm.Parts)
414 2. Search file dictionary
415 for starting words, search for finishing matches.
416 e.g. searching for "if(" we would look for words ending in "if"
417 for middle words, search for exact matches (case insensitive).
418 e.g. searching for "_os2_" we would look for words matching "os2".
419 There could only be "os2" "Os2" "OS2" and "oS2".
420 for end words, search for beginning matches.
421 e.g. searching for "if(" we would look for words starting with "("
422 3. Matched topics are where there is one or more matches for all words
423 (This is obtained by looking up matching topics for each word and
424 ANDing the results together.)
425 AND a search through the actual topic text shows one or more occurrences of
426 a matching sequence. e.g. we don't want _os2_ to match a topic
427 that just contains an underscore somewhere and "os2" elsewhere.
428 This is complicated by the fact that each word can match more than 1 dictionary
429 word.
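
Step 2 can be sketched in Python as building one mask over the dictionary per term part. The function name is hypothetical, all three kinds of match are assumed to be case insensitive (the notes only say so for middle words), and single-part terms are left out because the rules above do not cover them.

```python
def part_masks(parts, dictionary):
    """For each part of a multi-part term, flag the dictionary words it can match."""
    if len(parts) < 2:
        raise ValueError("this sketch only covers multi-part terms")
    words = [word.lower() for word in dictionary]
    last = len(parts) - 1
    masks = []
    for i, part in enumerate(parts):
        p = part.lower()
        if i == 0:
            mask = [w.endswith(p) for w in words]    # starting word: finishing match
        elif i == last:
            mask = [w.startswith(p) for w in words]  # end word: beginning match
        else:
            mask = [w == p for w in words]           # middle word: exact match
        masks.append(mask)
    return masks
```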
430
431
432
433Search logic is:
434
435topic match if:
436 ( ( matches all required terms ) or ( there are no required terms) )
437 and ( ( matches any optional term ) or ( there are no optional terms ) )
438 and ( ( doesn't match any excluded term ) or ( there are no excluded terms ) )
439
440Algorithmically this can be done sequentially:
441
442if no terms
443 nothing matches, abort
444
445if ( any optional terms)
446 match each term, or results
447else
448 all topics match by default
449
450if ( any required terms )
451 match each term, and results
452
453if ( any excluded terms )
454 match each term, and ( not results )
455
456... this has the benefit of not requiring keeping 2 flags per topic
457but who cares... that's not a big cost
458
459Keep an "allowed" (= not excluded) flag and a "matched" flag
460
461set allowed to true for all
462set matched to false for all
463
464if optional term
465 or results -> matched flags
466if required term
467 and results -> allowed flags
468 or results -> matched flags
469if excluded term
470 and not results -> allowed flags
471
472In practice, for speed this is implemented with arrays of integers acting
473as flags, one for each topic. This is very amenable to optimisation, even with e.g. SSE,
474but I have not done any since it seems pretty fast as is.
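
The two-flag scheme can be sketched in Python like this, with lists of booleans standing in for the integer arrays. The function name is invented, and the final step (a topic is a hit when it is both allowed and matched) is an assumption, since the notes above do not say how the two flags are combined. Taken literally, the flag description also means that a search with only excluded terms marks nothing as matched.

```python
def combine_flags(topic_count, optional, required, excluded):
    """Each of optional/required/excluded is a list of per-term result lists,
    one boolean per topic. Returns one boolean per topic."""
    allowed = [True] * topic_count
    matched = [False] * topic_count
    for results in optional:
        matched = [m or r for m, r in zip(matched, results)]
    for results in required:
        allowed = [a and r for a, r in zip(allowed, results)]
        matched = [m or r for m, r in zip(matched, results)]
    for results in excluded:
        allowed = [a and not r for a, r in zip(allowed, results)]
    # Assumed final step: a hit must be both allowed and matched.
    return [a and m for a, m in zip(allowed, matched)]
```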
475
476
477Highlighting matching search terms.
478
479To handle multi-part search terms, the results are stored as something
480called a "word sequence". Each step of the sequence is a mask
481for the entire dictionary, indicating which words can match at
482that step. The steps are arrays and the sequence is a TList.
483
484To handle multi-term searches, we then store each of these word sequences
485in another list.
486
487Finally, because each file has its own dictionary and search table, we store
488a final list (AllFilesWordSequences) of the results for each file.
489
490A list of results for open files
491 For each file, a list of word sequences
492 For each term, a word sequence
493 For each term part, a mask of matching words
494
495When displaying a topic, we first select a list of word sequences,
496based on the file the topic is from.
497
498Then, as we go through the topic, at each word we look to see if
499we have started one of the word sequences, by looking at each
500sequence in turn and seeing if the word is allowed in the first step.
501
502If we find a match for the start of one of the sequences, we look
503ahead to see if it is a complete match. If it is, we start highlighting,
504then continue decoding and counting down until the sequence is finished.
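
A simplified Python sketch of that scan. The topic is modelled as a list of dictionary indices and each word sequence as a list of masks (one list of booleans per step); the function name is hypothetical, and the real code highlights while decoding rather than returning a list of flags.

```python
def find_highlights(topic_words, sequences):
    """Return one flag per topic word saying whether it should be highlighted."""
    highlighted = [False] * len(topic_words)
    i = 0
    while i < len(topic_words):
        length = 0
        for sequence in sequences:
            end = i + len(sequence)
            # Look ahead: every step must allow the word at that position.
            if end <= len(topic_words) and all(
                step[topic_words[i + k]] for k, step in enumerate(sequence)
            ):
                length = len(sequence)
                break
        if length:
            for k in range(length):
                highlighted[i + k] = True
            i += length
        else:
            i += 1
    return highlighted
```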
505
506
507Startup Topic Search
508--------------------
509
510When running View <file> <topic>, View does some arbitrary, undocumented search.
511It seems to match, in order:
512 - topic title starts with
513 - index entry starts with
514 - topic title contains
515 - index contains
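
That guessed order can be written down as a Python sketch. This only reflects the observations in these notes, not documented View behaviour: the function name is invented, matching is assumed to be case insensitive, and within each pass the first hit in file order is assumed to win.

```python
def find_startup_topic(query, titles, index_entries):
    """titles: topic titles in file order.
    index_entries: (entry text, topic title) pairs.
    Returns the title of the topic to show, or None if nothing matches."""
    q = query.lower()
    for title in titles:                 # 1. topic title starts with
        if title.lower().startswith(q):
            return title
    for entry, title in index_entries:   # 2. index entry starts with
        if entry.lower().startswith(q):
            return title
    for title in titles:                 # 3. topic title contains
        if q in title.lower():
            return title
    for entry, title in index_entries:   # 4. index contains
        if q in entry.lower():
            return title
    return None
```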
516
517Test case - view cmdref <topic>
518
519topic        match?  topic shown
520-----------------------------------------------------------------
521dir          y       DIR - Displays Files in a Directory
522device       y       DEVICE (OPTICAL.DMD) - Installs Optical Device Driver
523optical      y       DEVICE (OPTICAL.DMD) - Installs Optical Device Driver
524drivers      y       BASEDEV - Installs Base Device Drivers
525syntax       y       Syntax and Parameters [about TRSPOOL]
526saves        y       BACKUP - Saves Files
527determine    y       CHKDSK - Checks File System Structure
528deter        y       Problem Determination
529determines   y       BUFFERS - Determines Number of Disk Buffers
530divided      y       "topic not found" [note: divided occurs in the text of the first topic]
531expiration   y       NET ACCOUNTS - Administers Network Accounts
532expir        y       NET ACCOUNTS - Administers Network Accounts
533
534
535Help On Help
536============
537
538
5391. Standalone; invoke help
540 old helpmgr: shows the topic itself
541 new helpmgr:
542 first looks for a help window which has marked itself as showing newview help
543 if so activates that
544 else passes [OWNHELP] to NV as filename
545
5462. App help NOT loaded; invoke help on help from app
547
548
5493. App help loaded; invoke help on help from app
550
5514. App help loaded, help on help loaded; invoke help on help from app
552
5535. App help loaded, invoke help from NV
554
5556. App help loaded, help on help loaded, invoke help from NV
556
557
558