DOT NET PERLS (Sam Allen)

Blog


    Observations About Icons in Program Design

    Why do desktop applications have icons in their user interfaces? Icons make a program more enjoyable to use, and also easier (those are two sides of the same coin). If a user is tired and wants to go home for the day, and she needs to just do a few more things in the program, having a special symbol for "delete," in addition to the word itself, can be helpful.
     
    My opinion is that a program designer should use icons in the user interface whenever it is practical to do so, with some exceptions. In my journal program, I added icons to the context menu, for the edit options; to the Edit menu itself; and to the File menu. I don't know exactly how much a regular user clicks on "copy" or "save" when using a program, but I do so many times a day (compulsive saving is a good thing). I found that adding icons to these menus improved the usability of the program (at least from my perspective--no scientific testing was done at all).
     
    However, I have some observations about icons. I like icons in menus, the same as toolbars--but I don't think infrequently used commands should have icons. In my program, I have a "gear" icon for the "Options" item. I am going to remove this, because the options dialog should be accessed infrequently, and an infrequently seen icon is probably more confusing than useful. (I think icons are a visual cue we don't want to think about; if they are rarely seen, they are not nearly as useful.)
     
     <Sorry, I lost the image somehow!>
     
    Another observation I have made is that some programs overuse icons, which looks amateurish and is confusing. Text is good--if the user cannot read text, he will be using something other than a visual display to operate the computer. I don't like huge icons, as I don't think the extra detail improves the visual memory aspect.
     
    I am not an icon designer, but I have found one place where you can download nice icons for use in Visual Studio C# 2008. As part of a bonus in registering VS, Microsoft provides access to a set of high-quality icons from IconBuffet. In the IconBuffet package offered, I used the Redmond icon set and it really enhanced the usability and appearance of my program. There aren't a ton of icons available, so it is hard to overdo it. More information about Microsoft's promotion is available.

    Setting "Enabled = false" to All Items in a Menu

    One problem I decided to solve in my latest creation (a custom journal creating program) is the inconvenience of disabling all the menu items inside a certain menu. Now, someone on his first day of coding Windows Forms in C# could do it this way:

    copyToolStripMenuItem1.Enabled = false;
    undoToolStripMenuItem1.Enabled = false; // etc.

    But, there is an easier way of setting Enabled = false on each item in a certain menu. When a menu item has child items, such as the entries in a menu, it can be treated as a ToolStripDropDownItem, which exposes a DropDownItems collection. The top level menu items (File, Edit, View, etc.) are the same type of object as the actual menu entries themselves (Save, Save As, Open): both are ToolStripMenuItem, which derives from ToolStripDropDownItem. So you can use the "as" keyword to treat a top level item as a drop down item, which allows you to iterate over the actual drop down items (Save, Save As, Open, etc.).

    Here's the code that I am using to disable all items in the Edit menu:

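    // "as" yields null instead of throwing if the item is not a drop down item.
    ToolStripDropDownItem d = editToolStripMenuItem1 as ToolStripDropDownItem;
    if (d != null)
    {
        foreach (ToolStripItem t in d.DropDownItems)
        {
            t.Enabled = false; // separators are ToolStripItems too; disabling them is harmless
        }
    }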

    You could specify the values "true" or "false" by a boolean parameter, and then simply call the enclosing function whenever an operation occurs that would need to change the state of the edit menu items.
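
    The boolean-parameter version described above could look something like the following sketch. The names MenuHelper and SetItemsEnabled are made up for illustration; only ToolStripDropDownItem, ToolStripItem, DropDownItems, and Enabled come from Windows Forms.

```csharp
using System.Windows.Forms;

static class MenuHelper
{
    // Enables or disables every entry under one top level menu.
    // "menu" is typically a ToolStripMenuItem such as the Edit menu.
    public static void SetItemsEnabled(ToolStripDropDownItem menu, bool enabled)
    {
        if (menu == null)
        {
            return; // nothing to do for a missing menu
        }
        foreach (ToolStripItem item in menu.DropDownItems)
        {
            item.Enabled = enabled;
        }
    }
}
```

    A call such as MenuHelper.SetItemsEnabled(editToolStripMenuItem1, false) would then replace the loop wherever the state of the Edit menu needs to change.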

    My Awful TextChanged Trick (Updated)

    My best article on this subject is here, on my new site

    Here's the old post for posterity:

    Now, I am not planning to fill this blog with my nasty C# tricks, but this one has come in handy more than a few times for me. I am sure it has repercussions for performance, code readability, elegance, etc. But I don't care because it works, it works well, and it is simple. Here's the trick: the TextChanged event is one of the most useful events around. It is raised anytime the Text property of a control changes. So, the following code will cause a declared TextChanged event handler to run:

    textBox1.Text = "A New String";

    This is of course assuming that textBox1.Text was not already set to "A New String". But what if we want to have an event that is raised every time the Text property is set, not necessarily changed? I like using TextChanged to monitor changes to the control, either programmatic or user-generated. So, to ensure that the previous text property change would invoke the TextChanged event handler, we can do this:

    textBox1.Text = "";

    textBox1.Text = "Any String";

    So, is this trick awful? Well, it certainly isn't elegant. I don't know if there is any simpler way to accomplish this. KeyPress, KeyDown, etc. are good for user input, but they can't catch it when the program changes the Text property directly.

    For now, I will do what works--but I am open to improvements. I don't know whether this trick is more obvious, clever, or awful, but it seems to work well, and perhaps that is more important than anything else.
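
    One possible alternative, offered only as a sketch: compare the old and new text first, and call the handler directly when the assignment would not raise TextChanged. TextHelper and SetTextAndNotify are hypothetical names, and the sketch assumes the handler passed in is the same one already subscribed to the control's TextChanged event.

```csharp
using System;
using System.Windows.Forms;

static class TextHelper
{
    // Sets box.Text and makes sure the handler runs once, even when the
    // new text equals the old text (in which case TextChanged is not raised).
    public static void SetTextAndNotify(TextBox box, string value, EventHandler handler)
    {
        string newText = value ?? string.Empty; // the control stores null as ""
        bool unchanged = box.Text == newText;
        box.Text = newText; // raises TextChanged only if the text differs
        if (unchanged)
        {
            handler(box, EventArgs.Empty); // no event was raised, so call it ourselves
        }
    }
}
```

    Compared with clearing the text first, this avoids running the handler an extra time with an empty string, at the cost of calling the handler outside the normal event path.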

    Strange Label Shifting Issues in Visual C#

    UPDATE: Use TableLayoutPanel controls for spacing and centering controls and labels! This problem is still an issue, but it can be avoided with a TableLayoutPanel.

    I added a label in between some controls (buttons and a password field) at the bottom of my main window in my C# program. I did this because I didn't like how much room the status bar took up, and I didn't need a whole status bar. I really like my new design, with the label in between the buttons. However, there was a strange issue, which could be a Microsoft bug:

    When I had the label in between the two controls (expanding to fit between them, as they are anchored to the left and right edges of the form), it would nicely use the ellipsis as desired, and also resize as required. However, when the ellipsis appeared, the text would scoot up a few pixels--not a huge deal, but also not up to my UI standards. I found that, strangely, setting UseCompatibleTextRendering to true would fix this glitch. However, text is somewhat ugly when that is enabled. Today, I fixed the problem by decreasing the height of the label to 13 (!) pixels. This way it works fine in both Windows Classic mode and Aero. I haven't tested Windows XP, but in all modes in Vista it works properly. Visual C# makes UI much easier, but sometimes things still need fixing and debugging. I guess this is why a professional needs experience with these tools.

    Modifying--MOVED

    UPDATE: A new version of this post is now located at dotnetperls.com! Here is the page. It has no ads, and is faster!

    My Custom Journal Program, Encrypted and Minimal

    It was a good holiday for me here in California (Thanksgiving). I managed to do a bit of programming as well. I can't really believe this but I actually made a program that is useful, and that I am already using. I am a serious journal-writer: I love to organize my thoughts in written words, and I also like to preserve all those words for future reference. Well, now, my newfound skills in .NET and C# gave me the ability to make a program that fits my needs as a journal-writer perfectly. It is a custom made program, and although it is only a few days old, it is very useful to me and maybe someday it will be useful to someone else.

    They always say that big, commercial software packages have to have so many thousands of features because each person only uses 10% of them--but a different 10% each time. Wouldn't it be nicer if you could have a lighter program that only does that 10%? It would result in less clutter, more speed, less memory, less distraction. Well, for the last seven or so years that I have been sporadically writing in my computer journal (note that this is a private one, not a weblog), I have used Microsoft Word, or, briefly, OpenOffice. Here's the best I ever managed to do with Word:

    1. Each day, open a template that includes the date and some formatting rules.
    2. Save the file--this involves choosing a specific filename, and typing the filename into the box, and selecting the proper directory.
    3. Write the file--almost never using any of those stupid toolbars or buttons.

    Now, I even bought Word 2007 in hopes that it would improve my ability to create journals--and it did. Microsoft fixed a bug involving "fields" and filenames. But, in May of this year, circumstances arose and it became too difficult for me to carry through with the Word thing each day. I stopped writing. Well, a few days ago, I decided to attempt something new: not only to fully automate my journal setup process, but also to stop using Word and add some new features.

    So what did I come up with? Well, now, even with the program in its early stages, I only have to do this to write in the journal:

    1. Click on the program icon in the Start menu. The program pops up and starts a new journal entry for the day, or opens a journal entry for today that was created previously.
    2. Type the title, and press return to go to the body text box.
    3. Click "lock" or "save" to save the file. Close the window--that's it.

    But there is more. Although I am not doing anything illegal, I am more comfortable having my journals be private or at least difficult for someone else to read casually. I have always been a private person. So I implemented encryption, using AES. Wow! I guess it is military-grade encryption, as present in the .NET framework. This stuff is the same stuff that professionals use. So, my journals are automatically encrypted on disk. The password is only in my mind--I allow myself to have a hint, but there is nowhere on my PC or elsewhere it is written down. It is secure. Maybe a cryptographer could read my files, but it would be hard (unless there was some kind of social engineering).
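
    For illustration, here is a minimal sketch of password-based AES encryption in .NET. It is not the journal program's actual code. The class name JournalCrypto, the salt size, the iteration count, and the file layout (salt, then IV, then ciphertext) are all assumptions, and the sketch targets a current .NET runtime (6 or later) rather than the framework version available when this post was written.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class JournalCrypto
{
    const int SaltSize = 16;       // assumed value
    const int IvSize = 16;         // AES block size in bytes
    const int Iterations = 100000; // assumed PBKDF2 work factor

    // Stretches the password into a 256-bit AES key.
    static byte[] DeriveKey(string password, byte[] salt)
    {
        return Rfc2898DeriveBytes.Pbkdf2(password, salt, Iterations, HashAlgorithmName.SHA256, 32);
    }

    // Output layout: salt (16 bytes), IV (16 bytes), then the ciphertext.
    public static byte[] Encrypt(string plainText, string password)
    {
        byte[] salt = RandomNumberGenerator.GetBytes(SaltSize);
        using (Aes aes = Aes.Create())
        {
            aes.Key = DeriveKey(password, salt);
            aes.GenerateIV();
            using (var output = new MemoryStream())
            {
                output.Write(salt, 0, salt.Length);
                output.Write(aes.IV, 0, aes.IV.Length);
                using (var crypto = new CryptoStream(output, aes.CreateEncryptor(), CryptoStreamMode.Write))
                {
                    byte[] data = Encoding.UTF8.GetBytes(plainText);
                    crypto.Write(data, 0, data.Length);
                } // disposing the CryptoStream writes the final padded block
                return output.ToArray();
            }
        }
    }

    public static string Decrypt(byte[] fileBytes, string password)
    {
        byte[] salt = new byte[SaltSize];
        byte[] iv = new byte[IvSize];
        Array.Copy(fileBytes, 0, salt, 0, SaltSize);
        Array.Copy(fileBytes, SaltSize, iv, 0, IvSize);
        using (Aes aes = Aes.Create())
        {
            aes.Key = DeriveKey(password, salt);
            aes.IV = iv;
            int offset = SaltSize + IvSize;
            using (var input = new MemoryStream(fileBytes, offset, fileBytes.Length - offset))
            using (var crypto = new CryptoStream(input, aes.CreateDecryptor(), CryptoStreamMode.Read))
            using (var reader = new StreamReader(crypto, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

    Note that this sketch uses AES in its default CBC mode without any integrity check, so a wrong password usually shows up as a padding exception rather than a clean "wrong password" result.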

    The program has the ability to "lock" the screen--hide the content, encrypt the file and write it to disk, and require a password to "wake up." Not only can I lock the program with a single button click, but the program will lock itself after X minutes of idling. So if I am writing a journal at night, and forget to close the window, the program will be locked in the morning. That is a security feature, and it was so simple to implement. (Implementing all these features well will take a bit longer.) I wanted all the files to enforce a "one password" rule--so I had to implement some special code for that.

    The shortcomings of the program are mainly of the editor--it lacks formatting capabilities, it lacks spell-checking and grammar checking, and auto correction. However, I have plans--there is an open-source .NET spell checker--I could use that, or I could write my own, with a limited set of features that I would find useful. And I have a plan to implement a "deeper" undo function, and a "replacer"--an algorithm that replaces certain sets of characters with other sets. (I would want to capitalize the word "I" for example, because I sometimes type the lowercase letter in error.)

    There is a nice button to save or lock the document, and the program puts itself to sleep. There is a handy password form right on the main window--no dialogs (I hate dialogs). In normal usage, there is only one window. To save, I don't have to use a dialog. The program files the journal in the correct place. Now, there is more--I wanted to implement a rudimentary file manager.

    So I made a TreeView that presents a two-level hierarchy of data. Each year has its own node--2000 to 2007, and beyond. (I have no writings from before 2000, really, so I won't implement 1999 and before right now.) I can expand each year node and all the entries will appear. No double-clicking (I hate double-clicking). I made a one-button way to delete an entry, and it disappears from view. I have a one-button way to open any entry that exists in the hierarchy. There is a nice cancel button. (I use a confirm dialog for "delete" because I sometimes click things I shouldn't.)

    Next, there is a "Create New" dialog, which is a simple dialog that allows me to choose any date and the program will insert a journal file there. Note that to save any journal in the same directory as an established journal set, I have to use the same password--a universal, master password. This is for ease of use, because I can't remember endless passwords. The create new function is very simple but functional, and won't be part of the normal daily usage of the program. So simple is good.

    When the program is installed and first run, it requests that you input and validate a password, which it will use for every file. This can't be changed right now (that will need to be fixed). You can also provide a password hint, which I store in Settings.settings. (I like hints, even though they are a big security risk.) This dialog nicely reminds the user that the dialog will only run once. (I need more configuration options, because the journal data may become valuable to a person [for sentimental reasons, most likely].)

    The journal is like email programs in some ways--when you type in a journal title, the title bar automatically updates. The date is nicely inserted, in a succinct format. The program inserts none of its own text, unlike Word, in which I had it insert date content as words into the document. The writing the Journal Console.exe uses is mine, entirely and totally. It is genuine, authentic, and secure.

    I also figured out how to make a program persist its size and location--this is important for a program that is used frequently. I like my journal's window to be smallish in size, so I can see other things on the desktop too. Word would also take up a large part of the screen, which was annoying. There is also a big difference in the two programs: Word uses the page metaphor, and prepares for printing with its formatting. Well, I almost never print my journals, so I don't need pages. And I don't tend to use headings too often, although I might try to make some kind of "heading tool tip guide" like Word has when it scrolls.

    I always thought Word was fast, and it is--remarkably so, for a program with so many thousands of features. It is highly optimized and really amazing for what it is. The problem is that it can't do what I want it to do. My new program can, and it is superior in most ways, excluding formatting and spell-checking and the like. I think these deficiencies are surmountable, although it will take me some time. I need to improve Undo and think it out really carefully. It might be fun to implement.

    Another nice improvement over Word is that the title bar doesn't say "non-commercial use" like Word does (I bought the Home & Student version). And the About box says my name, not Microsoft's. Word 2007 is prettier, except for the endless buttons (even with the ribbon). I want a big save button, and all the encryption. (Word can do encryption, but to navigate the UI is a hassle and doing it every day is prohibitive.) Word 2003 files are about the same size, despite their formatting, because the current system I am using doesn't use compression but it uses encryption. (I have done compression too, but not in this program. That is a future enhancement.)

    There is a lot more I want to do:

    1. Compression
    2. Import/Export
    3. Change password
    4. Expose each entry's title to the tree
    5. Search every journal's title (I don't want full text search because of the security issue)
    6. Navigation--why not have back/forward controls? I like reading things sequentially, and I missed that feature in Word
    7. Automatic text replacement (convert simple errors to correct form)
    8. Live word count (I like this feature in Word, even though it isn't critical to me)
    9. Options and Customize dialogs--maybe offer ability to adjust font size or family
    10. Ability to have multiple documents for a single day
    11. Sometimes, journals have stray pages that don't fit anywhere else--maybe make a "miscellaneous" folder
    12. Make the UI more reactive--clear old status messages after X minutes
    13. Automatic save more frequently, without doing a full lock--why not save 2 minutes after a keystroke, and then every 5 minutes or until the program locks itself

    As always, I have thoroughly tested the program on Windows Standard appearance and Aero. And it looks good. I have tweaked it a bit. Oh, I also made it so that it locks pretty soon after being minimized to the task bar (on the assumption that you aren't using it when it is minimized). There is also an easy way to flip to the current day's entry. If I am running it and midnight comes and passes, it will recognize that it is not on the current day, and it is easy to make a new day's entry.

    This was a fun project and I now have a nice program that does something I couldn't find in any other program (I looked at shareware, freeware, open source, commercial--there might be something, but I didn't find it). I think my own program fits my needs better and more succinctly than any other could. Microsoft OneNote is too busy and complicated as well. The Journal Console.exe is likely more useful to me than Scrabble or an Unabridged Dictionary--although I wonder if I couldn't somehow leverage the latter into the Journal. It might be best to keep them separate.

    I guess I made a great program for the slightly paranoid and lazy habitual writer. (That would be me.) Oh, and this writer is also interested in minimalism, aesthetic and technical. Extra buttons detract from a program, even if they don't waste memory or CPU resources. I wanted this exact program, so I had to make it myself. And it gives me a sense of pride that I use my own software--why wouldn't it? For now, I will just write some journal entries, but I am planning on more improvements soon. (Sometimes I don't know what to do, so I end up coding, and occasionally, I code something interesting.)

    I want to post more things about various code snippets. Too bad I have to use this Windows Live Writer program--I might have to make my own blog posting program. Why not?

    Visual Studio 2008 and Some Improvements

    A few things happened in the programming field today. First, I read on a blog (can't remember which) that Microsoft had released their Visual Studio 2008 suites! I had tried the beta, code-named Orcas, and I used it for a while but eventually gave up with it because it had some graphical glitches on my Vista computer. Well, now that VS 2008 is final, I installed it again. So far it seems to be working nicely and I haven't had any serious flaws. The big test will be how well .NET 3.5 installs on different machines. I had a lot of problems with the installation of .NET framework 3.5--it was agonizingly slow, and I am not sure it worked right. I haven't deployed any of my applications on another computer, but I will try that some time this weekend. I guess I installed Silverlight on my PC, as well--finally. I am not actually opposed to Silverlight, even though it is not totally 'open', because I think it is well-designed (if it is anything like C# and .NET).

    Having installed Visual Studio C# and C++ 2008, I had to update my two main projects I have been working on--my dictionary program and my Scrabble program. They both work well, and while I was dealing with them I made some enhancements to my Scrabble program. As you probably know, Scrabble is a word game where each player takes turns playing crossword-like moves on a grid. I wrote a C library that plays Scrabble quickly. I have been working on various Scrabble algorithms for several years, just considering the theory behind string matching, trees, etc. It is interesting to me. Over the past couple months I have undertaken an effort to write UI code more, and make my programs more accessible.

    I guess my first project in C# was to use its interoperability to call my Scrabble code, which is written in C. I had to learn all about IntPtr, the Marshal class, exports, imports, and so on--cross-library calls. And, of course, 'works' is not good enough for me--I want 'works well, is exceptionally good.' So it took me quite some time to work out the kinks, but I am proud of the results. I will talk more about the internals of my Scrabble engine, but the general idea behind it has been written about in various academic papers several times. (It is actually an interesting challenge in the world of computer algorithms.) Anyway, today my focus with my Scrabble program was the UI and some more simple stuff (and, as you might know, simple stuff is always more complex than it should be).

    I wanted to implement full persistence in my program. Basically, the following data needed to be saved to disk in between sessions:

    1. The current rack (set of letters a player can use to make a word on the board).
    2. The current bag (set of letters that the players will draw from when they use letters).
    3. The current board (the arrangement of the tiles, including blanks, etc.)
    4. History and scores (I haven't dealt with this yet).
    5. Possibly another representation of the board (I can probably eliminate this).

    I decided to use Settings.settings and .NET's XML facilities for this. I added FormClosing and FormClosed event handlers--the first to set all my data, and the second to save it. The challenge with making everything persistent in my Scrabble program was basically that the memory used by the DLL and the memory used by the C# program are different. The one is compiled C/C++ and the other is .NET code. Here's what I did to solve the problem:

    1. Create functions to set every string and integer in the DLL from the C# front-end.
    2. Create functions to get every string from the DLL.

    So, at startup, the program automatically opens the settings XML file, and if there is data there it calls the functions to set the data in the DLL's memory. Then, everything runs as normal. When the FormClosing event is handled, I get all the same variables from the DLL's memory space and set the relevant settings in the C# program. In this way, I implement all the settings in the C# program, and just use the DLL's variable as a cache of sorts for faster access in the Scrabble engine code. All the interoperability is handled with Marshal, PtrToStringAnsi, and IntPtr. The Scrabble DLL has a simple interface that I call in various places.
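
    As a rough sketch of the marshaling described above: the DLL name scrabble.dll and the exports set_rack and get_rack are invented for illustration (the real names and signatures are not given here). IntPtr, DllImport, and Marshal.PtrToStringAnsi are the actual .NET pieces. The RoundTrip method stands in for the DLL so the string conversion can be tried without it.

```csharp
using System;
using System.Runtime.InteropServices;

static class ScrabbleInterop
{
    // Hypothetical export: copies a managed string into the engine's memory.
    [DllImport("scrabble.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    public static extern void set_rack(string letters);

    // Hypothetical export: returns a char* that is owned by the engine.
    [DllImport("scrabble.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr get_rack();

    // Copies a NUL-terminated ANSI string from native memory into a managed string.
    public static string ReadNativeString(IntPtr p)
    {
        return p == IntPtr.Zero ? string.Empty : (Marshal.PtrToStringAnsi(p) ?? string.Empty);
    }

    // Stand-in for the DLL: allocates native memory, then reads it back.
    public static string RoundTrip(string s)
    {
        IntPtr p = Marshal.StringToHGlobalAnsi(s);
        try
        {
            return ReadNativeString(p);
        }
        finally
        {
            Marshal.FreeHGlobal(p); // we allocated it, so we free it
        }
    }
}
```

    With a real engine, saving at FormClosing would amount to something like ReadNativeString(get_rack()) followed by storing the result in the settings, and startup would call set_rack with the stored value.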

    I find the ability for C# to access DLLs like this to be an excellent way to design a more complex program. The C# .NET code can be used simply as a user interface that does simple manipulations of data and validation, and then for complex or processor-intensive tasks it uses C++ code in a custom DLL. This technique allows for uncompromising performance and memory usage, but also quick development of a pleasant and usable UI. I would never have been able to make the GUI I made without Visual C# and .NET, but the core engine would never be as efficient in C# as in C.

    The two programs--the C# UI, and the C algorithm and engine--are dependent on one another, and useless without one another. (Note that the C engine could use a text-based UI, and it did for a long time.) The C# acts as a controller of the C engine, telling the engine what the user wants it to do. There are several layers here, but the system works quite well when the bugs are worked out. It is logical, efficient, and robust.

    I have a lot more to say about algorithm design and efficiency, but I won't go into that right now. C# is not about optimal performance, but about robust code and ease of design, and good but not great code efficiency. With C#, my programs have features they would never have otherwise. C# and C++ complement each other quite a lot.

    A Byte[] of System.IO.Compression

    I didn't do a lot of programming today, but I did experiment with something I have been wondering about. I have seen code in certain desktop applications that compresses data in memory while the program is running to save space. I have wondered how efficient this is--perhaps the information could be decompressed on a different thread. Anyway, I implemented this idea in my dictionary program, using the GZipStream class from .NET's built-in System.IO.Compression namespace.

    The implementation was interesting and more complicated than I had hoped it would be. To decompress into a preallocated byte[] array, you need to know the size of the resulting byte[] array. So I implemented a system where the actual decompressed byte length was stored in a separate file. This added a lot of complexity, and wasn't pretty. I used a few hash tables (well, Dictionary<string, byte[]> actually), and eventually the implementation was complete and was functional. So how did the new solution perform?
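
    For comparison, here is a sketch of a round trip that does not need the decompressed length stored anywhere: the output is copied into a MemoryStream, which grows as needed. ByteCompression is a made-up class name, and Stream.CopyTo assumes .NET 4 or later (on older frameworks a read loop with a small buffer does the same job).

```csharp
using System.IO;
using System.IO.Compression;

static class ByteCompression
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the final compressed block
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output); // reads until the end of the compressed data
            return output.ToArray();
        }
    }
}
```

    This removes the separate length file, but it does nothing about the decompression time itself.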

    Well, first, memory usage--the compression resulted in a different pattern of memory usage. The memory usage would go higher when certain data were decompressed, and then it would remain lower. So, overall, the compression did not improve memory usage very much. I think the self-expiring cache I implemented is a much more efficient and graceful solution. And, the worst problem with my compression scheme was the speed--a query would take up to 5 times longer! On one query, the program with decompression took 300+ ms and the original took 60 ms. The program really felt slow (even though it was perfectly usable still).

    So, due to the marginal or non-existent memory usage decrease, and the drastic decrease in performance, I removed the decompression code. My conclusion here is that GZipStream is really only useful in a program where file size is critical, such as a web application. Halving the amount of data sent over the network is a big gain and would be a performance improvement over sending uncompressed data. However, due to the IO speed of modern desktop PCs, the decompression scheme is a big loss.

    I guess now I know how to decompress, compress, and store byte[] arrays in my programs. I suspect that might come in handy sometime. I just don't think it is the right solution for the program I am currently experimenting with. I have to say that I have never coded a compression system, even at a high-level, before, but have always been interested in it. (One benefit I see is that I can write my own zip utility--although I probably won't.)

    Making a Context Menu MOVED

    Rewritten, improved, added a screenshot, posted to CodeProject. The final copy is at my site right here.

    Timing Interactions, For Fun

    I have discovered a blog called Coding4Fun at Microsoft's developer site. Since I believe the best way to learn to code is indeed for fun, I have been looking at the various projects on the site. I have no interest in writing 3D games, even though doing so could form a good career. Now, when I look at other sites for information about programming, I am always trying to use their approach and their knowledge to improve something I am trying to do.

    The current project on the site is a small application that monitors a user's activity by listening to Windows events. I am thinking about the ideas in this project and some of them strike me as very useful in a wider context. What if all the programs on the PC were smart enough to tell when the user is away and to empty their caches, save data, and release their memory? My theory is that when a user comes back to her PC, she will want to close the old stuff and start new stuff. When I use my PC, and I leave it on overnight and I wake up and resume my session, I am usually not interested in last night's web page!

    So what if the programs could clear themselves out and virtually shut themselves down? This would benefit hibernate as well, as then they wouldn't need to be read in from disk on resume. If programs were to automatically clear their caches and free their memory, while still being open, then the whole system would respond faster on resume. In my opinion, the most important thing that needs to happen with desktops is for programs to cooperate and respect each other more.

    What I want to see and build in the computing world is a sort of perpetual motion machine: an application that never crashes, never slows down, and can dynamically adjust itself over time. This program would clear its memory when the user is idle.

    And this is where the Coding4Fun article got me thinking about this. The example exposes GetLastInputInfo, which is a function exported by a Windows DLL (user32.dll). I am not interested in keeping track of idle time per se, but this technique is very similar to how, in my dictionary program, I periodically scan the caches and remove them if they haven't been used. Time is critical with a program.
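
    A minimal sketch of calling GetLastInputInfo from C# follows. This is not the Coding4Fun code. GetLastInputInfo and the LASTINPUTINFO structure are the real Windows API; the class name IdleTime and the helper Elapsed are assumptions for illustration. The native call only works on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

static class IdleTime
{
    [StructLayout(LayoutKind.Sequential)]
    struct LASTINPUTINFO
    {
        public uint cbSize; // size of this structure in bytes
        public uint dwTime; // tick count at the last input event
    }

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool GetLastInputInfo(ref LASTINPUTINFO plii);

    // Unsigned subtraction stays correct when the tick counter wraps around.
    public static uint Elapsed(uint nowTicks, uint lastInputTicks)
    {
        return unchecked(nowTicks - lastInputTicks);
    }

    // Time since the user last touched the keyboard or mouse.
    public static TimeSpan GetIdleTime()
    {
        var info = new LASTINPUTINFO();
        info.cbSize = (uint)Marshal.SizeOf(typeof(LASTINPUTINFO));
        if (!GetLastInputInfo(ref info))
        {
            return TimeSpan.Zero; // treat a failed call as "not idle"
        }
        uint now = unchecked((uint)Environment.TickCount);
        return TimeSpan.FromMilliseconds(Elapsed(now, info.dwTime));
    }
}
```

    A timer could poll GetIdleTime() and trigger a cache cleanup or a lock once the value passes some threshold.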

    How about this: caching using time is hugely important on a computer. But let's say your computer never releases its caches. The computer doesn't have a way of getting the last user input. The computer will crash as it runs out of memory and virtual memory. So, clearly, computers have been implementing these ideas for a long time. But, I wonder, how can we make the situation even better? We can--it is a primitive sort of artificial intelligence and decision making. An algorithm can decide what to do and when to do it.
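
    To make the idea of time-based caching concrete, here is a small sketch of a cache whose entries expire when they have not been used for a while. It is an illustration only, not the cache from my dictionary program; the name ExpiringCache and its members are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```csharp
using System;
using System.Collections.Generic;

class ExpiringCache<TValue>
{
    class Entry
    {
        public TValue Value;
        public DateTime LastUsed;
    }

    readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    readonly TimeSpan maxIdle;

    public ExpiringCache(TimeSpan maxIdle)
    {
        this.maxIdle = maxIdle;
    }

    public int Count
    {
        get { return entries.Count; }
    }

    public void Set(string key, TValue value, DateTime now)
    {
        entries[key] = new Entry { Value = value, LastUsed = now };
    }

    // A successful lookup counts as a use and refreshes the timestamp.
    public bool TryGet(string key, DateTime now, out TValue value)
    {
        Entry e;
        if (entries.TryGetValue(key, out e))
        {
            e.LastUsed = now;
            value = e.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    // Meant to be called periodically (for example from a timer).
    // Returns how many entries were dropped.
    public int RemoveExpired(DateTime now)
    {
        var stale = new List<string>();
        foreach (KeyValuePair<string, Entry> pair in entries)
        {
            if (now - pair.Value.LastUsed > maxIdle)
            {
                stale.Add(pair.Key); // collect first; removing while iterating throws
            }
        }
        foreach (string key in stale)
        {
            entries.Remove(key);
        }
        return stale.Count;
    }
}
```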

    Byte[] Data Storage and Efficiency

    Version     Memory used at startup   Page faults at startup   Details
    Version 1   19 MB                    9903                     Creates hash for each open file; stores as strings.
    Version 3   17 MB                    9550                     Caches raw string data and does lookups on it when required.
    Version 4   16.5 MB                  10187                    Caches data as single bytes and does lookups when needed, converting each time to string.
    Version 5   10.6 MB                  8198                     Caches data as bytes and does lookups when needed, using a custom parser that scans the bytes.

    The above table provides a brief overview of some of the changes I have made in the last day or two. I have discovered that, in C# .NET, strings are Unicode (UTF-16, two bytes per character), which makes them use much more memory than the char* in C/C++. However, C# provides a nice way to use single byte characters--the byte type. So I converted the program to use byte[] arrays, instead of string or hashes of strings. The savings are quite impressive. One thing to note is that the difference in code between versions 4 and 5 is small--they both store the data as byte[], but version 4 does an on-the-fly conversion of the byte[] array to a string, using the Convert class, while version 5 scans the bytes directly. As I have figured out from C, writing a custom parser to accomplish something can be far more efficient than doing things the easy, obvious way.

    One thing about C# that I appreciate, and that I have read others noting, is that the language provides ways to do things in a more "low-level" way. The language allows you to write user interfaces in a high-level way, but when you need more performance from a back-end call, you can write it in basically the same way as C. The language allows you to use the byte type, or byte[] arrays, even in "safe" code.

    Another important feature is C#'s interoperability features--it can call into regular Windows DLLs. In fact, I wrote a program that does that, and it works quite well. The program I am talking about here is entirely in C#, and now that it uses a custom parser for byte[] data, it is very efficient. What have I learned the last couple days? Well, C# is both a (somewhat) low-level language and a high-level language. Scanning through bytes manually is not really a high-level thing to do these days. But C# provides me with the flexibility both to use the easy and convenient string and StringBuilder, and also to get down and dirty with byte[].

    And, finally, using System.Text.Encoding.ASCII.GetString() on a large byte[] array is very inefficient and wasteful. I am not sure how slow it is, as I haven't benchmarked the running times of the various programs. Generally, anything that needlessly uses a lot of memory is slow, however.

    Testing a Font Before Setting It

    I am very picky about the fonts that a program should use. It is not a critical issue but I hate it when I see jagged fonts on my screen. It is 2007, and I shouldn't have to see the jagged edges. So I will never make a program with jagged fonts. I like ClearType and Apple's font smoothing methods (actually I like the latter more), and fonts are important to me. One of my favorite features of Windows Vista is the new fonts. I like Segoe UI and Cambria particularly. How can I make my programs look beautiful, at least in a Windows-UI way? By enhancing the fonts.

    As I stated, I like Cambria on Windows Vista, but Cambria is not generally available on Windows XP or below. (I think it might come with Office 2007, however.) So how can I have the program assign Cambria to Windows Vista and Times New Roman to Windows XP? Well, I did some experimentation and found a simple and easy way to do it. When creating a new font object, if the specified font is not found, the name of the font object will not equal the name passed to the font object in its constructor. So, all I need to do is pass the string "Cambria" to the font object constructor, and then test to see if the name stuck, and it is still Cambria. Here's my code:

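    string fontName = rm.GetString("DisplayFont"); // "DisplayFont" evaluates to "Cambria"
    Font ft = new Font(fontName, 16.0f, FontStyle.Regular, GraphicsUnit.Pixel);
    if (ft.Name == fontName) // The name only "sticks" if the font is installed
    {
        textBox1.Font = ft; // Only change the font if we have "Cambria"
    }
    else
    {
        ft.Dispose(); // A substitute font was created; keep the existing font
    }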

    How important are little tweaks like this? Well, in the grand scheme of things, not very. But can a little extra polish make the user just a little bit happier? I would think so. As a programmer, making the user happy is important. I like Cambria and in several years, when Vista is more popular, my dictionary program will look nice on many systems. (Times New Roman looks good, too, but not as good.)

    Caching, Theory, and More Laziness

    I have spent some time experimenting with my dictionary program. I guess I made a little bit of progress, but mainly what I have been working on is trying to figure out the "best practices." I really think I need a programming theory book, instead of just the syntax/reference books I have. What I was trying to solve today was the nature of file caching--should I cache the raw file, or should I interpret the file each time it is read? Should there be two steps--one to store the textual data, and another to interpret the data? Should I do these steps separately? Every time I load the text file, I need to read it and look for the relevant word.

    Anyway, what I did initially with my program was read in the definitions file and put all the data in a big hash table (Dictionary<string, string> object). I always wondered if there wasn't a cleaner and faster way of doing the lookup than this. I came to realize that the bigger problem with my program was not its speed, but its memory use. However, it would be very bad for performance not to cache the files at all. Here's my initial design:

    1. Use heuristics to load a certain file (such as "defs_1g.txt" for the word "good").
    2. Read and parse the file completely, using a hash table.
    3. Store the parsed hash table in another hash table for future reference.
    4. Expire the hash table after X seconds of no access.
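
    The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the program's real code: the class name ParsedFileCache, the tab-separated line layout, and the file naming rule in FileFor (guessed from the single "defs_1g.txt" example) are all hypothetical. The file reader is passed in as a delegate so the sketch does not depend on real files.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the "parse everything up front" cache.
public class ParsedFileCache
{
    private class Entry
    {
        public Dictionary<string, string> Words;
        public DateTime LastAccess;
    }

    private readonly Dictionary<string, Entry> files = new Dictionary<string, Entry>();
    private readonly Func<string, string> readFile; // e.g. File.ReadAllText
    private readonly TimeSpan maxIdle;

    public ParsedFileCache(Func<string, string> readFile, TimeSpan maxIdle)
    {
        this.readFile = readFile;
        this.maxIdle = maxIdle;
    }

    public int CachedFileCount { get { return files.Count; } }

    // Step 1: pick a file by first letter (assumed naming rule).
    public static string FileFor(string word)
    {
        return "defs_1" + char.ToLowerInvariant(word[0]) + ".txt";
    }

    public string Lookup(string word, DateTime now)
    {
        string name = FileFor(word);
        Entry entry;
        if (!files.TryGetValue(name, out entry))
        {
            // Steps 2 and 3: parse the whole file and keep the result.
            entry = new Entry();
            entry.Words = Parse(readFile(name));
            files[name] = entry;
        }
        entry.LastAccess = now;
        string definition;
        return entry.Words.TryGetValue(word, out definition) ? definition : null;
    }

    // Step 4: drop every parsed file that has been idle too long.
    public void Expire(DateTime now)
    {
        List<string> stale = new List<string>();
        foreach (KeyValuePair<string, Entry> pair in files)
        {
            if (now - pair.Value.LastAccess > maxIdle) stale.Add(pair.Key);
        }
        foreach (string name in stale) files.Remove(name);
    }

    // Assumed layout: one "word\tdefinition" pair per line.
    private static Dictionary<string, string> Parse(string text)
    {
        Dictionary<string, string> words = new Dictionary<string, string>();
        foreach (string line in text.Split('\n'))
        {
            int tab = line.IndexOf('\t');
            if (tab > 0) words[line.Substring(0, tab)] = line.Substring(tab + 1).TrimEnd('\r');
        }
        return words;
    }
}
```

    Passing the current time in explicitly keeps the expiry rule easy to exercise; a real program would more likely call Expire from a timer.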

    This was relatively efficient, and didn't use too much memory. However, I modified the algorithm today in an effort to be more efficient. Here's what the program now does:

    1. Same as above.
    2. Read the file into memory, but do not interpret or parse it.
    3. Store the text stream (stored as a string) in the hash table as a cache.
    4. Use IndexOf() to find the correct definition in the file, and return that.
    5. Expire the hash table as before.
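
    A rough sketch of this second design follows, again as an illustration rather than the program's actual code. RawTextCache and FindDefinition are hypothetical names, the tab-separated line layout is an assumption, and expiry is left out to keep the example short. The cached value is the unparsed file text, and each lookup scans it with IndexOf().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: cache the raw text and search it on every lookup.
public class RawTextCache
{
    private readonly Dictionary<string, string> rawFiles = new Dictionary<string, string>();
    private readonly Func<string, string> readFile; // e.g. File.ReadAllText

    public RawTextCache(Func<string, string> readFile)
    {
        this.readFile = readFile;
    }

    public string Lookup(string fileName, string word)
    {
        string text;
        if (!rawFiles.TryGetValue(fileName, out text))
        {
            text = readFile(fileName); // stored as-is, no parsing
            rawFiles[fileName] = text;
        }
        return FindDefinition(text, word);
    }

    // Assumed layout: one "word\tdefinition" pair per line.
    public static string FindDefinition(string text, string word)
    {
        string key = word + "\t";
        int start;
        if (text.StartsWith(key, StringComparison.Ordinal))
        {
            start = key.Length; // match on the very first line
        }
        else
        {
            // Require a preceding newline so "good" cannot match "feelgood".
            // Ordinal comparison avoids culture-sensitive matching.
            int hit = text.IndexOf("\n" + key, StringComparison.Ordinal);
            if (hit < 0) return null;
            start = hit + 1 + key.Length;
        }
        int end = text.IndexOf('\n', start);
        if (end < 0) end = text.Length;
        return text.Substring(start, end - start);
    }
}
```

    Only the matched definition is copied out as a new string; everything else stays in the one cached block of text.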

    What I found was that it was very fast to scan the files once they were in memory, certainly fast enough for my application. IndexOf() is really a fast function. I wasn't particularly surprised by the results, and was actually a little disappointed. The savings amounted to more than 10% of memory, possibly 20%. The speed hit is not noticeable. There are still more optimizations to be made.

    So, when should a file be parsed and when should it just be stored in memory? This is a critical question and it is relevant to many software projects. Does a web browser need to store the decoded image data, or just the compressed GIF? What is the best tradeoff of space and speed? Here's what I think--computers' CPUs are becoming increasingly fast, but memory less so. So I think it is best to store files in the cache that are not fully parsed and interpreted. This is the principle of laziness.

    Here's another relevant thought. Microsoft and Sun use JIT compiling for C# and Java. They don't compile the entire program at once. They compile "just in time," meaning that they are very lazy (in a good sense). Storing a data file in its raw form is lazy as well. Why do work up front, when it is not completely positive that the work will be needed? For all my program knows, the cached file will never be needed again. In that case, heavily interpreting it would be a waste.

    The theory here is one of compromise--what is the best compromise? You cannot have it both ways. And, sometimes, when a cache is too aggressive, it slows the program down. Not having a cache is awful for performance, but having too aggressive a cache is nearly as awful. The cache of a program should be as small as possible to be effective. Hypothetically, a cache that is 10 MB might provide 90% of the benefit. The next 10 MB might provide another 8%. And the last 10-20 MB might provide the rest.

    In that case, I think 10 MB is the best number: the first 10 MB buys about 9% of the benefit per megabyte, while the next 10 MB buys less than 1% per megabyte. Cache enhances locality of reference, but cache can also defeat locality of reference. Is the only solution to test your programs in various configurations? Well, is it computer science, or engineering? It is a little bit of both. Computer science provides the theory, and engineering provides the data and evidence.

    For my own future reference, I think it is better to read and save plain text, and then interpret it as needed. Another idea I have had is to compress the plain text file in memory. I wonder what that would do to performance, however. It might be that reading the plain text from disk would be faster than decompressing the text from memory. There are so many options--which is optimal? This stuff fascinates me. How will I ever learn all the answers?
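
    For anyone who wants to try the compression idea, here is a minimal sketch using GZipStream from System.IO.Compression. It is an experiment starter only: the class name CompressedText is hypothetical, and nothing here says whether decompressing is actually faster than rereading the file from disk, which would have to be measured.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: keeps text in memory as gzip-compressed bytes.
public static class CompressedText
{
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream buffer = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            } // Disposing the GZipStream writes out the final compressed block.
            return buffer.ToArray(); // ToArray still works on a closed MemoryStream
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

    The cache would then hold the byte[] returned by Compress and call Decompress on each lookup, trading CPU time on every access for a smaller footprint.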

    Lazy Initialization the Lazy Way

    What's new? I figured out a nice and clean way to do something that has been troubling me for a while. Let me explain lazy initialization. It is a principle that code or data should not be loaded, or accessed, before it is needed. So, for example, a word processor wouldn't load a library that displays images if the present document has no images. It is an optimization, and a very simple form of adapting to what the user is doing. It is part of making a flexible and fast program. But what is the best and most graceful way of doing lazy initialization?

    One of my programs loads a C/C++ DLL and calls functions in it, using the interoperability facilities in C# and .NET. Calling a DLL is not really complex; it just involves matching up the code in each file and compiling and putting the DLL in a specific place. Windows will not load a DLL until it is called in the code. So that is lazy initialization already. But the trick is that your program must not call the DLL and force Windows to load it until necessary. Let's say you have 10 places where you call into the DLL. I would wrap these in a special class whose job it is to interact with the DLL. Let's call this class DllManager. Every function in DllManager that calls a function relies on the DLL having the dictionary file loaded. So, we could repeat this code 10 times:

    if (manager == null) // Repeated throughout the class
    {
        manager = new DllManager();
        manager.LoadDict(fileName);
    }
    manager.CertainFunction();

    However, that is prone to errors--and, even worse, what if you want a different file name? It is not flexible. If you wanted to call LoadDict somewhere else, you would need to deal with the manager object directly. Now, this technique works well and can be done easily. But it is not graceful and it is not beautiful code. It creates unnecessary complexity, makes the code longer, and actually causes code bloat because there is a lot of duplication the compiler can't remove.

    C# includes a special way of using get and set functions, called properties. In C++, programmers will make special get member functions, such as GetDict() or something similar. But in C#, there is a more graceful way. What I have decided is best is to define a special getter function that checks if the dictionary is loaded. Then, in the class that calls the DllManager class, we can keep track of whether we need to load the dictionary. All the details depend on the program's call pattern, but this is what I devised:

    DllManager DllStart // Appears once in class
    {
        get
        {
            if (manager == null)
            {
                manager = new DllManager();
                manager.LoadDict(fileName);
            }
            return manager;
        }
    }

    This getter function ensures that whenever it is called, the DllManager object will not be null, and the dictionary will be loaded. Remember, this is just for convenience, but it is an important thing to do for maximum clarity in the code. Another benefit is that you can insert Debug logging to monitor the program and gain a clear idea of what it is doing if there is a problem (or if you are just curious).

    So, instead of the example in the first block of code, we can call CertainFunction() like this:

    DllStart.CertainFunction(); // Repeated 10 times in class

    A few details are different but the code could easily be made equivalent. You could calculate anything in a get or set function, but it is best to simply do any initialization or create a new object, or even just return a variable. Additionally, get and set are ideal for cross-class variable modifications. They are an interface that can be easily changed and is intuitive.
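
    Here is a self-contained sketch of the same pattern that can be run without a native DLL. FakeDllManager is a hypothetical stand-in for DllManager that only counts how often LoadDict runs, and Client and DoWork are likewise made-up names. It shows that the load happens on first use and only once. The null check is not thread-safe, so this assumes the property is only used from one thread.

```csharp
using System;

// Hypothetical stand-in for DllManager; the real one would call the DLL.
public class FakeDllManager
{
    public static int LoadCount; // how many times LoadDict has run

    public void LoadDict(string fileName)
    {
        LoadCount++;
    }

    public string CertainFunction()
    {
        return "ok";
    }
}

public class Client
{
    private FakeDllManager manager; // stays null until first use
    private readonly string fileName;

    public Client(string fileName)
    {
        this.fileName = fileName;
    }

    // Appears once in the class; every caller goes through it.
    private FakeDllManager DllStart
    {
        get
        {
            if (manager == null)
            {
                manager = new FakeDllManager();
                manager.LoadDict(fileName);
            }
            return manager;
        }
    }

    public string DoWork()
    {
        return DllStart.CertainFunction();
    }
}
```

    Constructing a Client does no loading at all; the first DoWork() call pays the cost, and later calls reuse the same manager.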

    So that's my best idea about how to load resources only when absolutely necessary. Now that my program is lazy, it starts up in 70 ms instead of 130 ms. That adds an extra degree of "snappiness" that a theoretical user might appreciate. The first interaction with the program will be 60 ms slower, but that is not noticeable. This approach spreads out the slowness, which should make it less noticeable. Of course, making the program faster in addition to lazy would be great as well.
