1 | # Version 0.05 alpha $Revision: 1.6 $ $Date: 2001/06/24 17:11:26 $
|
---|
2 |
|
---|
3 | =head1 TO DO
|
---|
4 |
|
---|
5 | =over 4
|
---|
6 |
|
---|
7 | =item *
|
---|
8 |
|
---|
9 | LIST_CACHE doesn't work with ties to most DBM implementations, because
|
---|
10 | Memouze tries to save a listref, and DB_File etc. can only store
|
---|
11 | strings. This should at least be documented. Maybe Memoize could
|
---|
12 | detect the problem at TIE time and throw a fatal error.
|
---|
13 |
|
---|
14 | 20010623 This was added sometime prior to 20001025.
|
---|
15 |
|
---|
16 | Try out MLDBM here and document it if it works.
|
---|
17 |
|
---|
18 | =item *
|
---|
19 |
|
---|
20 | We should extend the benchmarking module to allow
|
---|
21 |
|
---|
22 | timethis(main, { MEMOIZED => [ suba, subb ] })
|
---|
23 |
|
---|
24 | What would this do? It would time C<main> three times, once with
|
---|
25 | C<suba> and C<subb> unmemoized, twice with them memoized.
|
---|
26 |
|
---|
27 | Why would you want to do this? By the third set of runs, the memo
|
---|
28 | tables would be fully populated, so all calls by C<main> to C<suba>
|
---|
29 | and C<subb> would return immediately. You would be able to see how
|
---|
30 | much of C<main>'s running time was due to time spent computing in
|
---|
31 | C<suba> and C<subb>. If that was just a little time, you would know
|
---|
32 | that optimizing or improving C<suba> and C<subb> would not have a
|
---|
33 | large effect on the performance of C<main>. But if there was a big
|
---|
34 | difference, you would know that C<suba> or C<subb> was a good
|
---|
35 | candidate for optimization if you needed to make C<main> go faster.
|
---|
36 |
|
---|
37 | Done.
|
---|
38 |
|
---|
39 | =item *
|
---|
40 |
|
---|
41 | Perhaps C<memoize> should return a reference to the original function
|
---|
42 | as well as one to the memoized version? But the programmer could
|
---|
43 | always construct such a reference themselves, so perhaps it's not
|
---|
44 | necessary. We save such a reference anyway, so a new package method
|
---|
45 | could return it on demand even if it wasn't provided by C<memoize>.
|
---|
46 | We could even bless the new function reference so that it could have
|
---|
47 | accessor methods for getting to the original function, the options,
|
---|
48 | the memo table, etc.
|
---|
49 |
|
---|
50 | Naah.
|
---|
51 |
|
---|
52 | =item *
|
---|
53 |
|
---|
54 | The TODISK feature is not ready yet. It will have to be rather
|
---|
55 | complicated, providing options for which disk method to use (GDBM?
|
---|
56 | DB_File? Flat file? Storable? User-supplied?) and which stringizing
|
---|
57 | method to use (FreezeThaw? Marshal? User-supplied?)
|
---|
58 |
|
---|
59 | Done!
|
---|
60 |
|
---|
61 | =item *
|
---|
62 |
|
---|
63 | Maybe an option for automatic expiration of cache values? (`After one
|
---|
64 | day,' `After five uses,' etc.) Also possibly an option to limit the
|
---|
65 | number of active entries with automatic LRU expiration.
|
---|
66 |
|
---|
67 | You have a long note to Mike Cariaso that outlines a good approach
|
---|
68 | that you sent on 9 April 1999.
|
---|
69 |
|
---|
70 | What's the timeout stuff going to look like?
|
---|
71 |
|
---|
72 | EXPIRE_TIME => time_in_sec
|
---|
73 | EXPIRE_USES => num_uses
|
---|
74 | MAXENTRIES => n
|
---|
75 |
|
---|
76 | perhaps? Is EXPIRE_USES actually useful?
|
---|
77 |
|
---|
78 | 19990916: Memoize::Expire does EXPIRE_TIME and EXPIRE_USES.
|
---|
79 | MAXENTRIES can come later as a separate module.
|
---|
80 |
|
---|
81 | =item *
|
---|
82 |
|
---|
83 | Put in a better example than C<fibo>. Show an example of a
|
---|
84 | nonrecursive function that simply takes a long time to run.
|
---|
85 | C<getpwuid> for example? But this exposes the bug that you can't say
|
---|
86 | C<memoize('getpwuid')>, so perhaps it's not a very good example.
|
---|
87 |
|
---|
88 | Well, I did add the ColorToRGB example, but it's still not so good.
|
---|
89 | These examples need a lot of work. C<factorial> might be a better
|
---|
90 | example than C<fibo>.
|
---|
91 |
|
---|
92 | =item *
|
---|
93 |
|
---|
94 | Add more regression tests for normalizers.
|
---|
95 |
|
---|
96 | =item *
|
---|
97 |
|
---|
98 | Maybe resolve normalizer function to code-ref at memoize time instead
|
---|
99 | of at function call time for efficiency? I think there was some
|
---|
100 | reason not to do this, but I can't remember what it was.
|
---|
101 |
|
---|
102 | =item *
|
---|
103 |
|
---|
104 | Add more array value tests to the test suite.
|
---|
105 |
|
---|
106 | Does it need more now?
|
---|
107 |
|
---|
108 | =item *
|
---|
109 |
|
---|
110 | Fix that `Subroutine u redefined ... line 484' message.
|
---|
111 |
|
---|
112 | Fixed, I think.
|
---|
113 |
|
---|
114 | =item *
|
---|
115 |
|
---|
116 | Get rid of any remaining *{$ref}{CODE} or similar magic hashes.
|
---|
117 |
|
---|
118 | =item *
|
---|
119 |
|
---|
120 | There should be an option to dump out the memoized values or to
|
---|
121 | otherwise traverse them.
|
---|
122 |
|
---|
123 | What for?
|
---|
124 |
|
---|
125 | Maybe the tied hash interface taskes care of this anyway?
|
---|
126 |
|
---|
127 | =item *
|
---|
128 |
|
---|
129 | Include an example that caches DNS lookups.
|
---|
130 |
|
---|
131 | =item *
|
---|
132 |
|
---|
133 | Make tie for Storable (Memoize::Storable)
|
---|
134 |
|
---|
135 | A prototype of Memoize::Storable is finished. Test it and add to the
|
---|
136 | test suite.
|
---|
137 |
|
---|
138 | Done.
|
---|
139 |
|
---|
140 | =item *
|
---|
141 |
|
---|
142 | Make tie for DBI (Memoize::DBI)
|
---|
143 |
|
---|
144 | =item *
|
---|
145 |
|
---|
146 | I think there's a bug. See `###BUG'.
|
---|
147 |
|
---|
148 | =item *
|
---|
149 |
|
---|
150 | Storable probably can't be done, because it doesn't allow updating.
|
---|
151 | Maybe a different interface that supports readonly caches fronted by a
|
---|
152 | writable in-memory cache? A generic tied hash maybe?
|
---|
153 |
|
---|
154 | FETCH {
|
---|
155 | if (it's in the memory hash) {
|
---|
156 | return it
|
---|
157 | } elsif (it's in the readonly disk hash) {
|
---|
158 | return it
|
---|
159 | } else {
|
---|
160 | not-there
|
---|
161 | }
|
---|
162 | }
|
---|
163 |
|
---|
164 | STORE {
|
---|
165 | put it into the in-memory hash
|
---|
166 | }
|
---|
167 |
|
---|
168 | Maybe `save' and `restore' methods?
|
---|
169 |
|
---|
170 | It isn't working right because the destructor doesn't get called at
|
---|
171 | the right time.
|
---|
172 |
|
---|
173 | This is fixed. `use strict vars' would have caught it immediately. Duh.
|
---|
174 |
|
---|
175 | =item *
|
---|
176 |
|
---|
177 | Don't forget about generic interface to Storable-like packages
|
---|
178 |
|
---|
179 | 20010627 It would appear that you put this into 0.51.
|
---|
180 |
|
---|
181 | =item *
|
---|
182 |
|
---|
183 | Maybe add in TODISK after all, with TODISK => 'filename' equivalent to
|
---|
184 |
|
---|
185 | SCALAR_CACHE => [TIE, Memoize::SDBM_File, $filename, O_RDWR|O_CREAT, 0666],
|
---|
186 | LIST_CACHE => MERGE
|
---|
187 |
|
---|
188 | =item *
|
---|
189 |
|
---|
190 | Maybe the default for LIST_CACHE should be MERGE anyway.
|
---|
191 |
|
---|
192 | =item *
|
---|
193 |
|
---|
194 | There's some terrible bug probably related to use under threaded perl,
|
---|
195 | possibly connected with line 56:
|
---|
196 |
|
---|
197 | my $wrapper = eval "sub { unshift \@_, qq{$cref}; goto &_memoizer; }";
|
---|
198 |
|
---|
199 | I think becayse C<@_> is lexically scoped in threadperl, the effect of
|
---|
200 | C<unshift> never makes it into C<_memoizer>. That's probably a bug in
|
---|
201 | Perl, but maybe I should work around it. Can anyone provide more
|
---|
202 | information here, or lend me a machine with threaded Perl where I can
|
---|
203 | test this theory? Line 59, currently commented out, may fix the
|
---|
204 | problem.
|
---|
205 |
|
---|
206 | 20010623 Working around this in 0.65, but it still blows.
|
---|
207 |
|
---|
208 | =item *
|
---|
209 |
|
---|
210 | Maybe if the original function has a prototype, the module can use
|
---|
211 | that to select the most appropriate default normalizer. For example,
|
---|
212 | if the prototype was C<($)>, there's no reason to use `join'. If it's
|
---|
213 | C<(\@)> then it can use C<join $;,@$_[0];> instead of C<join $;,@_;>.
|
---|
214 |
|
---|
215 | =item *
|
---|
216 |
|
---|
217 | Ariel Scolnikov suggests using the change counting problem as an
|
---|
218 | example. (How many ways to make change of a dollar?)
|
---|
219 |
|
---|
220 | =item *
|
---|
221 |
|
---|
222 | Jonathan Roy found a use for `unmemoize'. If you're using the
|
---|
223 | Storable glue, and your program gets SIGINT, you find that the cache
|
---|
224 | data is not in the cache, because Perl normally writes it all out at
|
---|
225 | once from a DESTROY method, and signals skip DESTROY processing. So
|
---|
226 | you could add
|
---|
227 |
|
---|
228 | $sig{INT} = sub { unmemoize ... };
|
---|
229 |
|
---|
230 |
|
---|
231 | =item *
|
---|
232 |
|
---|
233 | This means it would be useful to have a method to return references to
|
---|
234 | all the currently-memoized functions so that you could say
|
---|
235 |
|
---|
236 | $sig{INT} = sub { for $f (Memoize->all_memoized) {
|
---|
237 | unmemoize $f;
|
---|
238 | }
|
---|
239 | }
|
---|
240 |
|
---|
241 |
|
---|
242 | =item *
|
---|
243 |
|
---|
244 | 19990917 There should be a call you can make to get back the cache
|
---|
245 | itself. If there were, then you could delete stuff from it to
|
---|
246 | manually expire data items.
|
---|
247 |
|
---|
248 | =item *
|
---|
249 |
|
---|
250 | 19990925 Randal says that the docs for Memoize;:Expire should make it
|
---|
251 | clear that the expired entries are never flushed all at once. He
|
---|
252 | asked if you would need to do that manually. I said:
|
---|
253 |
|
---|
254 | Right, if that's what you want. If you have EXISTS return false,
|
---|
255 | it'll throw away the old cached item and replace it in the cache
|
---|
256 | with a new item. But if you want the cache to actually get smaller,
|
---|
257 | you have to do that yourself.
|
---|
258 |
|
---|
259 | I was planning to build an Expire module that implemented an LRU
|
---|
260 | queue and kept the cache at a constant fixed size, but I didn't get
|
---|
261 | to it yet. It's not clear to me that the automatic exptynig-out
|
---|
262 | behavior is very useful anyway. The whole point of a cache is to
|
---|
263 | trade space for time, so why bother going through the cache to throw
|
---|
264 | away old items before you need to?
|
---|
265 |
|
---|
266 | Randal then pointed out that it could discard expired items at DESTRoY
|
---|
267 | or TIEHASH time, which seemed like a good idea, because if the cache
|
---|
268 | is on disk you might like to keep it as small as possible.
|
---|
269 |
|
---|
270 | =item *
|
---|
271 |
|
---|
272 | 19991219 Philip Gwyn suggests this technique: You have a load_file
|
---|
273 | function that memoizes the file contexts. But then if the file
|
---|
274 | changes you get the old contents. So add a normalizer that does
|
---|
275 |
|
---|
276 | return join $;, (stat($_[0])[9]), $_[0];
|
---|
277 |
|
---|
278 | Now when the modification date changes, the true key returned by the
|
---|
279 | normalizer is different, so you get a cache miss and it loads the new
|
---|
280 | contents. Disadvantage: The old contents are still in the cache. I
|
---|
281 | think it makes more sense to have a special expiration manager for
|
---|
282 | this. Make one up and bundle it.
|
---|
283 |
|
---|
284 | 19991220 I have one written: Memoize::ExpireFile. But how can you
|
---|
285 | make this work when the function might have several arguments, of
|
---|
286 | which some are filenames and some aren't?
|
---|
287 |
|
---|
288 | =item *
|
---|
289 |
|
---|
290 | 19991219 There should be an inheritable TIEHASH method that does the
|
---|
291 | argument processing properly.
|
---|
292 |
|
---|
293 | 19991220 Philip Gwyn contributed a patch for this.
|
---|
294 |
|
---|
295 | 20001231 You should really put this in. Jonathan Roy uncovered a
|
---|
296 | problem that it will be needed to solve. Here's the problem: He has:
|
---|
297 |
|
---|
298 | memoize "get_items",
|
---|
299 | LIST_CACHE => ["TIE", "Memoize::Expire",
|
---|
300 | LIFETIME => 86400,
|
---|
301 | TIE => ["DB_File", "debug.db", O_CREAT|O_RDWR, 0666]
|
---|
302 | ];
|
---|
303 |
|
---|
304 | This won't work, because memoize is trying to store listrefs in a
|
---|
305 | DB_File. He owuld have gotten a fatal error if he had done this:
|
---|
306 |
|
---|
307 | memoize "get_items",
|
---|
308 | LIST_CACHE => ["TIE", "DB_File", "debug.db", O_CREAT|O_RDWR, 0666]'
|
---|
309 |
|
---|
310 |
|
---|
311 | But in this case, he tied the cache to Memoize::Expire, which is *not*
|
---|
312 | scalar-only, and the check for scalar-only ties is missing from
|
---|
313 | Memoize::Expire. The inheritable method can take care of this.
|
---|
314 |
|
---|
315 | 20010623 I decided not to put it in. Instead, we avoid the problem by
|
---|
316 | getting rid of TIE. The HASH option does the same thing, and HASH is
|
---|
317 | so simple to support that a module is superfluous.
|
---|
318 |
|
---|
319 | =item *
|
---|
320 |
|
---|
321 | 20001130 Custom cache manager that checks to make sure the function
|
---|
322 | return values actually match the memoized values.
|
---|
323 |
|
---|
324 | =item *
|
---|
325 |
|
---|
326 | 20001231 Expiration manager that watches cache performance and
|
---|
327 | accumulates statistics. Variation: Have it automatically unmemoize
|
---|
328 | the function if performance is bad.
|
---|
329 |
|
---|
330 | =item *
|
---|
331 |
|
---|
332 | 20010517 Option to have normalizer *modify* @_ for use by memoized
|
---|
333 | function. This would save code and time in cases like the one in the
|
---|
334 | manual under 'NORMALIZER', where both f() and normalize_f() do the
|
---|
335 | same analysis and make the same adjustments to the hash. If the
|
---|
336 | normalizer could make the adjustments and save the changes in @_, you
|
---|
337 | wouldn't have to do it twice.
|
---|
338 |
|
---|
339 | =item *
|
---|
340 | 20010623 Add CLEAR methods to tied hash modules.
|
---|
341 |
|
---|
342 | =item *
|
---|
343 | 20010623 You get a warning if you try to use DB_File as LIST_CACHE,
|
---|
344 | because it won't store lists. But if you use it as the underlying
|
---|
345 | cache with an expiration manager in the middle, no warning---the
|
---|
346 | expiration manager doesn't know it's managing a list cache, and
|
---|
347 | memoize doesn't know that DB_File is underlying. Is this fixable?
|
---|
348 | Probably not, but think about it.
|
---|
349 |
|
---|
350 | =item *
|
---|
351 | There was probably some other stuff that I forgot.
|
---|
352 |
|
---|
353 |
|
---|
354 |
|
---|
355 | =back
|
---|