Shearer Software

November 8, 2003
The fastest way to resize images with Panther

To continue on the recent image resizing theme (probably of interest to Python scripters only), I made some changes as a result of upgrading to Panther last week. I wanted to use the new built-in Mac OS X version of Python 2.3 (plus the MacPython Extras from Jack Jansen; thanks, Jack!). But a problem with the initial Package Manager distribution of the Python Imaging Library made me look at a new Panther feature that lets Python scripts use the native Quartz graphics library directly. (The hitch with PIL was that it was built to require a Fink install of libjpeg for full JPEG support. A quick compile of libjpeg and placement of it and its headers into Fink’s preferred locations didn't work, and either installing Fink or compiling PIL from source would have taken a while.)

That was as good a reason as any to explore Panther’s new Quartz scripting feature. So I read what I could find on Quartz, and modified my photo album code to use Quartz if available. It still uses PIL to gather EXIF and size information, which works even without libjpeg, but then it uses Quartz to manipulate the actual image content.

The results were terrific, mostly. In real-world testing on an 800 MHz PowerBook G4, the PIL-only version spat out 8 JPEGs per minute, and the Quartz version spat out 65 JPEGs per minute. That’s a welcome improvement, especially when you multiply my typical batch of 100 photos by 3 sizes apiece.
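As a rough worked example of what those rates mean for a typical batch, the arithmetic can be sketched in a few lines of Python. The function name is made up for illustration; the rates and batch size are the ones quoted above.

```python
def batch_minutes(photos, sizes_per_photo, jpegs_per_minute):
    """Minutes needed to write every resized JPEG in a batch."""
    return photos * sizes_per_photo / float(jpegs_per_minute)

# 100 photos at 3 sizes apiece is 300 output files.
pil_minutes = batch_minutes(100, 3, 8)      # PIL-only rate
quartz_minutes = batch_minutes(100, 3, 65)  # Quartz rate
print("PIL: %.1f min, Quartz: %.1f min" % (pil_minutes, quartz_minutes))
```

That works out to roughly 37.5 minutes for the PIL-only version against under 5 minutes for the Quartz version.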

The one problem is that I don’t yet know how to set the quality level. There’s a parameter that should contain this number, but as far as I can tell it isn’t documented anywhere. All of the supplied examples save as PNG or PDF, rather than JPEG, and the function isn’t documented along with the rest of Quartz because it’s not a real Quartz function—the release notes say that image export is actually handled through QuickTime. (This will be the first public mention in the history of the world, as far as Google is concerned, of the Core Graphics function that the API summary says it calls: CGBitmapContextWriteToFile. The last parameter, vaguely named “params” and defaulting to a zero-length string, is where a data structure including the quality level would obviously go.)

So for now it’s using a default JPEG quality level, which, whatever it is, is noticeably worse than the quality=90 setting I used with PIL, especially on thumbnails. Though I haven't done a controlled side-by-side test, it seemed that the lower quality level in Quartz resulted in some low-frequency blurriness, which looked much less objectionable than the high-frequency ringing (making macroblock boundaries visible) that PIL tended to show at lower settings. PIL's artifacts looked bad enough that I couldn't really run it with anything below quality=90. And because of the lower quality setting, the file sizes on the Quartz side were half those of the PIL versions.

Here’s all the code that deals with Quartz in the new photo album. newImagesInfo holds a list of destination file paths and pre-calculated pixel dimensions.

def resizeImagesQuartz(origFilename, newImagesInfo):
    # newImagesInfo is a list of (newFilename, newWidth, newHeight) tuples
    if not newImagesInfo: return
    # Imported here so the module still loads where Quartz is unavailable
    import CoreGraphics
    # Decode the original JPEG once; it is reused for every output size
    origImage = CoreGraphics.CGImageCreateWithJPEGDataProvider(
        CoreGraphics.CGDataProviderCreateWithFilename(origFilename), [0,1,0,1,0,1],
        1, CoreGraphics.kCGRenderingIntentDefault)
    for newFilename, newWidth, newHeight in newImagesInfo:
        print "Resizing image with Quartz: ", newFilename, newWidth, newHeight
        # Draw the original into a fresh RGB bitmap context of the target size
        cs = CoreGraphics.CGColorSpaceCreateDeviceRGB()
        c = CoreGraphics.CGBitmapContextCreateWithColor(newWidth, newHeight, cs, (0,0,0,0))
        c.setInterpolationQuality(CoreGraphics.kCGInterpolationHigh)
        newRect = CoreGraphics.CGRectMake(0, 0, newWidth, newHeight)
        c.drawImage(newRect, origImage)
        # Export as JPEG; the undocumented params argument keeps its default
        c.writeToFile(newFilename, CoreGraphics.kCGImageFormatJPEG) #, params?)
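
The function above expects the pixel dimensions to be worked out in advance. As a minimal sketch of how such a list might be built (not the album's actual code), the helper below scales an image so its longer edge fits a maximum size, never enlarging it. The function names, the file-naming pattern, and the 128 and 640 pixel sizes are all assumptions made for illustration.

```python
def fit_within(orig_width, orig_height, max_edge):
    """Scale a size so its longer edge is at most max_edge (no upscaling)."""
    scale = min(1.0, float(max_edge) / max(orig_width, orig_height))
    new_width = max(1, int(round(orig_width * scale)))
    new_height = max(1, int(round(orig_height * scale)))
    return new_width, new_height

def build_new_images_info(base_name, orig_width, orig_height, sizes):
    """Return (newFilename, newWidth, newHeight) tuples.

    sizes is a list of (suffix, max_edge) pairs; both are hypothetical.
    """
    info = []
    for suffix, max_edge in sizes:
        width, height = fit_within(orig_width, orig_height, max_edge)
        info.append(("%s-%s.jpg" % (base_name, suffix), width, height))
    return info

info = build_new_images_info("IMG_0001", 2048, 1536,
                             [("thumb", 128), ("medium", 640)])
print(info)
```

Computing the sizes separately keeps the Quartz and PIL code paths interchangeable, since both can consume the same list of tuples.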

If you're on a Panther machine with the Developer Tools installed, you can find the examples I started with in:

/Developer/Examples/Quartz/Python/

In retrospect, it seems obvious where they would be. Thanks to the folks on the MacPython channel in iChat for pointing me to them.

  11/08/03 04:38 PM  Mac OS X, Python, Software

October 17, 2003
“As Seen On xml.com”

My XMLFilter package was mentioned in Uche Ogbuji’s latest Python XML article on xml.com:

XMLFilter is one of those great examples of a unglamorous but extremely valuable program. Based on its description (and I expect to try it out and report on it in this column soon), it is a must-have for anyone building SAX programs. It provides a fallback SAX parser/driver to avoid SAXReaderNotAvailable errors that users encounter on some platforms. It also offers a safety net against the XMLGenerator bug that bit me earlier in this series. Its main feature, however, is a framework for SAX filters. See Andrew Shearer's announcement.

Thanks, Uche!

  10/17/03 12:46 AM  Python, Software

October 17, 2003
Pictures. Now 100% Bigger!

A few days ago, I made changes to my photo album software. Now all current and past photo albums have an optional “large” size with double the pixel count, preserving more detail for users with large screens.
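
Doubling the pixel count does not mean doubling the width and height; each edge grows by the square root of two. A small sketch of that arithmetic follows, assuming a 640 by 480 medium size purely as an example (the post does not state the actual sizes).

```python
import math

def double_pixel_count(width, height):
    """Return dimensions holding about twice as many pixels, same aspect ratio."""
    scale = math.sqrt(2)
    return int(round(width * scale)), int(round(height * scale))

large = double_pixel_count(640, 480)  # 640x480 is an assumed medium size
print(large)
```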

(There are also some other minor improvements, such as a photo count for each album, links to the next and previous albums by date, and more links to related sites.)

  10/17/03 12:37 AM  Python, Software, Pictures

October 9, 2003
Image Size Reduction for the Web

Tim Bray is looking for a better way to post photos to his web site. To judge from the sample photo, his current method doesn't antialias the image, so sharp edges in the original look jagged when reduced in size.

I went through the same thing with iPhoto, which has an HTML Export feature that is similarly broken: it doesn’t antialias at all. It’s a strange limitation, considering that the Mac OS X graphics system has fast, high-quality antialiasing everywhere else, including fonts and Dock icons. It’s as if Apple turned off a global switch in iPhoto for better performance when displaying large numbers of images onscreen, but forgot to turn it back on for HTML exporting, where quality should count for much more.

In any case, the quality of iPhoto’s exports was poor, so I wrote a Python script to handle the export using the Python Imaging Library. (Contact me if you’d like the code. So far, I’ve publicly released only the general-purpose plist parser that I wrote to handle the AlbumData.xml file.)

The script reads the titles and comments assigned in iPhoto, and parses them for category and other tagging information I’ve appended to the comments. Then it generates date-based and category-based HTML page hierarchies for all the albums whose names start with "Web-", and generates any thumbnails or medium-sized images that are missing.
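
As a rough illustration of the album-selection step only, here is a sketch that uses the standard plistlib module from a current Python 3 rather than the plist parser mentioned above. The key names "List of Albums" and "AlbumName" are assumptions about the AlbumData.xml layout, and the sample data is invented.

```python
import plistlib

# Invented sample standing in for a (much larger) AlbumData.xml file.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>List of Albums</key>
  <array>
    <dict><key>AlbumName</key><string>Web-Providence</string></dict>
    <dict><key>AlbumName</key><string>Last Import</string></dict>
    <dict><key>AlbumName</key><string>Web-Kayaking</string></dict>
  </array>
</dict>
</plist>
"""

def web_album_names(plist_bytes, prefix="Web-"):
    """Return the names of albums whose names start with prefix."""
    data = plistlib.loads(plist_bytes)
    albums = data.get("List of Albums", [])  # assumed key name
    return [album["AlbumName"] for album in albums
            if album.get("AlbumName", "").startswith(prefix)]

web_albums = web_album_names(SAMPLE)
print(web_albums)
```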

The Python Imaging Library, or PIL, is very easy to install with MacPython 2.3’s Package Manager.

There are some drawbacks, though:

  • I had to push the JPEG quality setting very high to avoid obvious macro-blocking (squares showing up around detailed areas), and pushing the quality any higher caused PIL to fail by throwing an exception.
  • The BICUBIC setting for image reduction didn’t appear to work at all. The image ended up non-antialiased, the same as Photoshop's "Nearest Neighbor" setting. Only ANTIALIAS had any effect. This may result in bilinear instead of bicubic interpolation, but the documentation isn’t clear.
  • The Thumbnail setting produces images quickly, but they are very low-quality.
  • The Progressive setting for JPEGs seemed to cause even more exceptions when trying to save at high quality levels, so I was forced not to use it.
  • It’s not nearly as fast as Mac OS X’s Core Graphics image reduction. But then again, I wouldn’t expect it to be.

On the positive side, the antialiasing looks good, and PIL can also read embedded EXIF data. Images that I've tagged as deserving more info automatically get the aperture and shutter speed printed on the page.

The code for actually reducing and saving the image, ignoring the EXIF and album manipulations for now, is as simple as this:

                if not os.path.exists(newPath):
                    shrunkImage = im.resize(size, resample = PIL.Image.ANTIALIAS)
                    shrunkImage.save(newPath, 'JPEG', quality = 90)

You can see samples in my Pictures section. Check out the first batch of Providence photos for some night examples with shutter speeds and apertures shown, and the Providence and Boston kayaking photos for examples of pictures with lots of edges that would have looked much worse without antialiasing.

  10/09/03 12:49 AM  Software

September 28, 2003
HTMLFilter 1.1 released

HTMLFilter, in its first public standalone release, is a module for Python programs. It parses an HTML 4 document, allowing subclasses to pass through or modify the text and tags as they go by. The resulting copy will be an otherwise exact replica of the original, including whitespace and comments. ASP, PHP, JSP, or other server-side code will generally survive the round trip. (The only exception is if the code is embedded inside an HTML tag you’re actually modifying, not just passing through, and in most cases any tag attributes not explicitly modified are safe.)

The use can be as simple as adding a <meta> tag to an existing web page without disturbing the rest, or as complex as merging two HTML pages (as it’s used in ShearerSite, which intelligently merges content pages into template pages).

You can also use it to generate HTML from scratch, with HTMLFilter taking care of the attribute encoding for tags.
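
To give a feel for the pass-through idea, here is a minimal sketch built on the standard library's html.parser in a current Python 3. It is not HTMLFilter's API: the class and function names are invented, and this stand-in is less faithful than the description above (end tags are regenerated rather than copied, and processing instructions are not handled).

```python
from html.parser import HTMLParser

class PassThroughFilter(HTMLParser):
    """Re-emits the markup it parses; subclasses override handlers to edit it."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)  # regenerated, so case is normalized

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def result(self):
        return "".join(self.out)

class AddMetaFilter(PassThroughFilter):
    """Inserts one <meta> tag right after the opening <head> tag."""

    def handle_starttag(self, tag, attrs):
        PassThroughFilter.handle_starttag(self, tag, attrs)
        if tag == "head":
            self.out.append('<meta name="generator" content="sketch">')

def add_meta(html_text):
    html_filter = AddMetaFilter()
    html_filter.feed(html_text)
    html_filter.close()
    return html_filter.result()

page = ('<html><head><title>T</title></head><body><!-- note -->'
        '<p class="x">A &amp; B</p></body></html>')
print(add_meta(page))
```

Everything except the added tag comes back unchanged for this input, which is the property that matters when editing pages containing markup the filter does not understand.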

HTMLFilter. Python-licensed. Unicode and encoding-savvy. Tested with Python 1.5.2 through 2.3.

  09/28/03 03:52 PM  Software, Python

September 15, 2003
XMLFilter

I just released version 1.1 of XMLFilter, which marks the first public standalone release. XMLFilter is an open-source Python module you can include with your programs to provide XML parsing even if the target system lacks a working xml.sax package. You can use it to quickly adapt existing xml.sax-compatible scripts to work out of the box, for example, on Jaguar (Mac OS X 10.2), which lacks expat.

It works by using the older xmllib module as a fallback for xml.sax. A test suite verifies call-by-call compatibility no matter which module ends up being used.
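
The fallback pattern can be sketched as follows. This is only an illustration, written for a current Python 3 where xmllib no longer exists; the function name and the fallback_factory argument are hypothetical, standing in for whatever pure-Python parser would be used when xml.sax has no working driver.

```python
import io
import xml.sax
from xml.sax import SAXReaderNotAvailable
from xml.sax.handler import ContentHandler

def make_parser_with_fallback(fallback_factory,
                              primary_factory=xml.sax.make_parser):
    """Return the normal SAX parser, or the fallback if none is available."""
    try:
        return primary_factory()
    except SAXReaderNotAvailable:
        # Raised when no SAX driver (such as expat) can be loaded.
        return fallback_factory()

class ElementCounter(ContentHandler):
    """Counts start tags; works with any parser offering the SAX interface."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1

def count_elements(xml_bytes, parser):
    handler = ElementCounter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.count

parser = make_parser_with_fallback(lambda: None)
print(count_elements(b"<a><b/><b/></a>", parser))
```

Because the handler only sees SAX callbacks, it does not need to know which parser ended up delivering them.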

Other features include XML event-stream filtering, writing, and creation, with support for writing CDATA sections. (Using these classes also avoids bugs in some versions of xml.sax.)
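
Event-stream filtering in general looks something like the sketch below, which uses the standard library's XMLFilterBase and XMLGenerator rather than XMLFilter's own classes. The RenameElementFilter class and rename_elements function are invented for illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class RenameElementFilter(XMLFilterBase):
    """Passes SAX events through, renaming one element along the way."""

    def __init__(self, parent, old_name, new_name):
        XMLFilterBase.__init__(self, parent)
        self._old_name = old_name
        self._new_name = new_name

    def _map(self, name):
        return self._new_name if name == self._old_name else name

    def startElement(self, name, attrs):
        XMLFilterBase.startElement(self, self._map(name), attrs)

    def endElement(self, name):
        XMLFilterBase.endElement(self, self._map(name))

def rename_elements(xml_text, old_name, new_name):
    out = io.StringIO()
    # Chain: parser -> filter -> writer
    event_filter = RenameElementFilter(xml.sax.make_parser(), old_name, new_name)
    event_filter.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    event_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()

renamed = rename_elements("<a><b>x</b></a>", "b", "c")
print(renamed)
```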

Generally, the newer your version of Python, the faster it goes. For example, if xml.sax and expat are working, they give a factor-of-3 speedup over the pure-Python xmllib, and on Python 2.3, Unicode encoding conversions will use xmlcharrefreplace for faster writing of XML numeric entities.
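
The xmlcharrefreplace error handler is a standard codec feature from Python 2.3 onward. A short sketch of how it might be used when writing ASCII-only XML follows; the to_ascii_xml name is made up, and escape() is applied first because the error handler only deals with characters the target encoding cannot represent, not with markup characters.

```python
from xml.sax.saxutils import escape

def to_ascii_xml(text):
    """Escape markup characters, then turn non-ASCII into numeric references."""
    return escape(text).encode("ascii", "xmlcharrefreplace")

encoded = to_ascii_xml(u"caf\u00e9 & cr\u00e8me")
print(encoded)
```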

Python-licensed. Tested all the way down to Python 1.5.2 and up to Python 2.3. xml.sax-compatible, Unicode-savvy (wherever Python is), and optionally namespace-aware.

  09/15/03 08:47 PM  Software, Mac OS X, Python

August 5, 2003
Netscape is dead, long live Mozilla

Matthew Thomas: Netscape is dead, long live Mozilla.

Netscape’s control over Mozilla was the single biggest factor in making Mozilla’s usability suck, from the project’s inception until two days ago. That’s a pretty tall order — to make an interface design crappier even faster than hundreds of volunteer geeks are — but somehow, Netscape managed it. That was the main reason I used to get so angry, so often.

  08/05/03 06:53 PM  Software

June 15, 2003
Comment APIs, going once, going twice

Joe Gregorio has a RESTy comment API based on RSS 2.0. His article compares it with the soup of other protocols available: TrackBack, PingBack, and Post-It. One problem: it links to the author’s home page rather than a specific post, so it’s not good as a link-notification mechanism, as TrackBack is. And John Gruber points out that TrackBack isn’t really that good for comments in practice, because the dominant implementation just resends the article summary.

His Referrers list, however, continues to show a lot of junk along with the real links, including one user’s local Radio Userland aggregator on port 5335.

  06/15/03 09:22 AM  Software

June 14, 2003
DirectRSS

Announcing DirectRSS. For when you want as little as possible between you and your RSS.

It’s an open-source MetaWeblog API implementation that modifies RSS 2.0 files in place. It also supports the Blogger and b2 APIs. No database required.
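
Editing an RSS 2.0 file in place amounts to parsing it, changing the item list, and writing it back. The sketch below shows the idea with the standard ElementTree module; it is not DirectRSS's code, and the function name and sample feed are invented.

```python
import xml.etree.ElementTree as ET

SAMPLE_RSS = (
    '<rss version="2.0"><channel>'
    '<title>Example</title><link>http://example.com/</link>'
    '<description>Sample feed</description>'
    '<item><title>Old post</title></item>'
    '</channel></rss>'
)

def add_item(rss_text, title, link, description):
    """Return the feed with a new item placed ahead of the existing ones."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    item = ET.Element("item")
    for tag, text in (("title", title), ("link", link),
                      ("description", description)):
        ET.SubElement(item, tag).text = text
    # Insert before the first existing item so the newest post comes first.
    children = list(channel)
    index = len(children)
    for position, child in enumerate(children):
        if child.tag == "item":
            index = position
            break
    channel.insert(index, item)
    return ET.tostring(root, encoding="unicode")

updated = add_item(SAMPLE_RSS, "New post", "http://example.com/new", "Hello")
print(updated)
```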

With it, you can use a weblog editing client such as NetNewsWire or w.bloggar to update an RSS feed, then use XSLT or the companion HTML renderer to generate a web site.

To handle larger collections of posts, it supports Dave Winer’s blogBrowser format, which, instead of a single file, uses one RSS file per month and one folder per year. To the weblog client, it looks like one big file, with all posts editable. A file containing the few most recent postings is generated automatically, for the benefit of news aggregators.

It was originally written as an XML experiment, but it’s proven reliable. It’s packaged as a Python CGI script, and comes with its own pre-configured Python web server for running locally. If you already have Python installed, there’s no setup required to run the working tutorial. (If you don’t, it only takes a few minutes to install Python.) It’s compatible with the bundled Python in Mac OS X 10.2 (Jaguar) and other Pythons lacking an expat parser. (It falls back on xmllib.)

New features in this version include full support for namespaces in both the RSS file and the MetaWeblog API, post modification dates, and a tutorial showing how to render the posts into HTML.

Currently, it’s packaged with ShearerSite, the (awkwardly-named) web interface that also performs the rendering into HTML. I may split out just the RSS editing portion if there’s interest. The HTML renderer can display RSS 2.0 files or blogBrowser archives filtered by category, date range, or numeric range.

See the download page (hosted by SourceForge), tutorial, and revision history.

  06/14/03 08:16 PM  DirectRSS, ShearerSite, Software

June 14, 2003
Trackbacks, Referrers, Comments?

I pointed to Daring Fireball’s Trackback critique, but I didn’t comment. On the Internet, if I don’t do it, someone else will, and this article did. It’s a very well thought-out response.

To summarize: John Gruber’s criticisms of TrackBack are valid, but his referrer system has its own problems. It trades increased ease on the sending side for lower quality on the receiving side.

To improve TrackBack, it should be made easier. I don’t see why all comment forms on sites with TrackBack couldn’t be changed into combined comment/TrackBack forms. TrackBack would almost disappear to the web surfer; it would just be Remote Comments.

I hope to find a way on my own site to integrate comments with static main pages. (I have a few ideas.)

  06/14/03 07:45 PM  Software

June 14, 2003
IE on its way out

No one has been covering the Internet Explorer story from a web author's perspective as well as Zeldman.

2005? Are they kidding?: “Scoble says Longhorn will be available in 2005. Which is another way of saying IE/Win won't change for at least two years. It is not good enough to stay as it is. ...Can anyone tell us how two more years of flawed standards support is supposed to be a good thing?”

RIP:

...Our friends there [at Microsoft], we knew, were working on improvements, particularly in the areas of CSS and DOM support. Yet no significantly new browser version ever came of their activity. IE6/Win still had trouble with parts of CSS1, still did not support true native PNG transparency, and still did not incorporate Text Zoom...

Over the past weeks, the stories we and others have been covering (including the unavailability of an improved version of IE5/Mac outside the subscription-based MSN pay service, and the news that IE/Win was dead as a standalone product) painted a picture of a product on its way out. And now we know that that is the case.

We know that, after spending billions of dollars to defeat all competitors and to absolutely, positively own the desktop browsing space, Microsoft as a corporation is no longer interested in web browsers...

From here, as it has for several weeks now, it looks like a period of technological stasis and dormancy yawns ahead. Undoubtedly the less popular browsers will continue to improve. But few of us will be able to take advantage of their sophisticated standards support if 85% of the market continues to use an unchanged year 2000 browser.

OK, enough quoting. Go read the articles. It’s getting late, but I’ll comment on one thing. I’ll do it even though it requires another quote.

IE5/Mac, with its Tasman rendering engine, was the first browser to deliver meaningful standards compliance to the market, arriving in March, 2000, a few months ahead of Mozilla 1.0 and Netscape 6... IE5/Mac introduced innovations like DOCTYPE switching and Text Zoom that soon found their way into comparably compliant browsers like Navigator, Konqueror, and Safari. And all but Text Zoom eventually made it into IE6/Win...

Add to that feature list the printer equivalent of Text Zoom: interactive fit-to-page controls in the print preview window. A very useful solution for a problem I saw users on other browsers and platforms (including IE/Win) struggle with frequently.

IE 5/Mac was good because it had to be. It was fighting against a large installed base of Netscape 4 on its merits, and Microsoft couldn’t fall back on their Windows franchise to push it. It was designed to be better than Netscape 4, and it succeeded at that. (Also helping its market share was Microsoft’s public threat to pull Office for Mac, which resulted in Apple shipping IE as their default browser.) Still, the competition made Microsoft produce some of its best work.

Soon after, with the game won (or at least, with everyone but Microsoft having lost sufficiently), Microsoft went home. They may have even done that years ago, quietly.

IE 6/Win wasn’t much of an upgrade. (A CNET review: “Just about the only reason we can figure that IE 6 even deserves the full 6 version number is its release in conjunction with Windows XP. For those of you not upgrading to Windows XP, whether you run IE 5.x or Netscape 6.x, there's no need to rush for this download.”)

Which brings up a question: when was this decision made? It was made public only recently, but could have been in the air in the Microsoft executive suite for much longer. They have the money to keep the development teams going regardless of the outcome. (According to a Think Secret article, IE 6/Mac was largely finished last year, but according to a former developer “We were told by upper management to hold it back until they gave it the green light.”) Aside from a 2001 update just to keep up with the release of Mac OS X, there haven't been any real feature upgrades to Internet Explorer for either Mac or Windows for the past three years. Both of them might as well have been cancelled then.

We’ve been using a dead product all this time and didn’t even know it!

  06/14/03 01:28 AM  Software

June 13, 2003
TrackBacks, or Referrers?

Daring Fireball: The problems with Movable Type's TrackBack protocol.

  06/13/03 12:04 AM  Software

June 8, 2003
Cross-platform, cross-browser XML apps

Jon Udell writes: “Let's review what's happening in this screen shot. I'm running Mozilla Firebird on my Mac. The application is a structured search of my OSCOM slides. There's no search engine beyond the browser itself, which provides the JavaScript UI, the XPath-based search, and the XSLT-driven results display.”

This is great. It’s a working XPath lab in your browser. At least, if your browser is IE or a recent Mozilla derivative. (Come on, Apple, implement XSLT and the XMLDocument request APIs next in Safari.)

  06/08/03 11:14 PM  Software

June 8, 2003
Link: Accelerated PHP for Windows; Turck MMCache and FastCGI

PHP Everywhere: The Ultimate PHP for Windows with Turck MMCache (and FastCGI)

The old problem with running PHP on Windows is that many extensions are not thread safe and can only run safely in CGI mode. Unfortunately CGI is dead slow because the web server creates a new CGI process for each page request.

FastCGI is the solution to this. Instead of creating a new CGI process for each page view, it reuses existing CGI processes. This also improves database scalability because persistent db connections work properly.

...Well there is a lesser known GPL'ed accelerator, Turck MMCache that supports both Windows, Unix and Linux, written by Dmitry Stogov. I tested MMCache and FastCGI against several PHP scripts accessing MySQL, Oracle and Microsoft SQL Server. I gave the web-server a good pounding with a 60 minute stress test using fiendish scripts that cause PHP in ISAPI mode to crash. MMCache appears to be made of sterner stuff and passed with flying colours. While the test was running, I modified the source code of the test scripts; MMCache auto-detected the changes and recompiled.

[via PHP Everywhere]

  06/08/03 10:04 PM  Software

June 4, 2003
IE on ice?

CNET: Microsoft's browser play. The removal of IE as a free, downloadable software application could have a profound effect on the Web and the development of Web standards.

Zeldman: IE/AOL; the flip side. Why proposed negative (anti-IE) and positive (pro-luxury-browser) grassroots campaigns cannot change consumer behavior or alter Microsoft and AOL's business decisions. What is likely to happen to design and development methods over the next few years. Some of this is actually good news, really.

  06/04/03 08:15 PM  Software

June 3, 2003
Orphan IE

CNN: Microsoft to pay AOL $750M. Tech titans settle Netscape lawsuit, set seven-year licensing pact for AOL to use Internet Explorer.

CNET: Microsoft to abandon standalone IE.

So, AOL can now install IE with their product for no charge, just as MS terminates development of the installable IE. Great deal.

Jeffrey Zeldman has some great analysis.

Aside: does this mean AOL must use IE? From initial reports, the answer appears to be no. The alternative browser would just have to be free to AOL (or strategically valuable enough to justify its cost). However, AOL’s track record has it bundling IE for Windows even without a special agreement, even as it owned Netscape.

Presumably the termination of IE development means that users can only receive major browser updates by buying new versions of Windows. For the majority of users, who are likely to stick with their current OS version for a while, IE 6 SP1 is the end of the road. Its slow march toward compliance with CSS and other standards can go no further, no new web technologies will be added, and no more bugs will be fixed.

This is a huge problem for web programmers and designers. A large majority of web surfers—those using IE on probably all versions of Windows before Longhorn (scheduled for 2005)—have just had their browser orphaned, with no simple upgrade path. With all its warts, it’s going to stick around for a long time.

In other words, Internet Explorer 6 has been Netscape 4-ed.

Whopper of the day (from the CNET article): “Legacy OSes have reached their zenith with the addition of IE 6 SP1,” [IE program manager] Countryman said. “Further improvements to IE will require enhancements to the underlying OS.” Is he trying to make us believe that bug fixes, CSS3, XForms, etc., are impossible without a new operating system, due to some technical limitation? Maybe the quote was meant to look like a statement of technical possibility, while it was really a marketing dictum. As in: for the users to get further improvements in IE, they must first buy and install an updated OS. (Because we want it that way.)

Tim Bray, wrestling with page layout in IE, puts it more strongly:

The problem isn’t that CSS is too hard. The problem isn’t browser incompatibilities in general. The problem is specifically that Microsoft Internet Explorer is a mouldering, out-of-date, amateurish, out-of-date pile of dung. Did I say it’s out-of-date? As in past its sell-by, seen better days, mutton dressed as lamb, superannuated, time-worn. It’s so, like, you know, so twentieth-century.

Ron Green raised the alarm, which echoed through Scripting News and then around the usual hallways: “All this has lead me to ask if IE is dead.”

Firebird [mozilla.org] and Safari [apple.com] are looking really good right now.

  06/03/03 07:48 PM  Software, Open Source

May 31, 2003
OSCOM Day 3: Jon Udell Keynote

Excellent keynote. He started with a simple, obvious thing which we tend to get wrong because we’re blind to it: weblog item doctitles that show up properly in search engines. Then a bunch of specific things we can implement, and a look toward the future. Good, practical stuff.

Talked about the content side of content management. Importance of titles and topic sentences. Communication skills. Don’t hit.

Content is the expression of ideas, request for attention, or attempt to influence. Technologists don’t think hard enough about the effort & the reward of making content.

Showed an entry on Don Box's site that displayed its title perfectly in his aggregator NetNewsWire, but Google didn't see it, because it wasn't in the doctitle. Easy to make this mistake. (Reiterated point: Publishing is essentially engineering. We forget these issues because engineers think from the inside out.) What is the right unit of content? Radio Userland has the day’s posts on one page, with the date as doctitle; Movable Type one per page, so it can use the item's RSS title. Dave Winer's weblog comes in like an IV drip all day, but the audience for most weblogs isn't like that, and they need titles.

This affects how Jon Udell uses Radio Userland. Dave Winer interjected to ask if it would help to have a field to choose the day's title.

Brent’s Law of URLs: the more expensive the CMS, the crappier the URL. Showed a bunch of typical CMS & weblogging system URLs. Tim Bray’s homegrown site was best: example ended with 2002/02/13/NamingFinishing. Vignette’s > $200K product was worst with an awful, long numeric URL.

Structure in doctitles. Search results pages can parse & group the titles. Example: with doctitle like Magazine Name | Date | Dept | title, group search results by magazine issue. Showed good example of this on O'Reilly's site.
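
A sketch of what that grouping could look like in code, assuming doctitles follow the four-field pattern from the example. The function names and the sample titles are invented; titles that do not match the pattern are kept in a separate bucket rather than guessed at.

```python
from collections import defaultdict

FIELDS = ("magazine", "date", "dept", "title")

def parse_doctitle(doctitle):
    """Split 'Magazine | Date | Dept | title' into a dict, or return None."""
    parts = [part.strip() for part in doctitle.split("|")]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))

def group_by_issue(doctitles):
    """Map (magazine, date) to the list of article titles in that issue."""
    groups = defaultdict(list)
    for doctitle in doctitles:
        parsed = parse_doctitle(doctitle)
        if parsed is None:
            groups[None].append(doctitle)  # unstructured title, left as-is
        else:
            groups[(parsed["magazine"], parsed["date"])].append(parsed["title"])
    return dict(groups)

results = group_by_issue([
    "Example Monthly | May 2003 | Features | Titles Matter",
    "Example Monthly | May 2003 | Columns | URL Design",
    "Example Monthly | June 2003 | Features | Threaded Views",
    "Untitled page",
])
print(results)
```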

Great example of broken titles in just about every mailing list archive. All the titles are wrong—they are the same as the last message in the thread. Not scannable. Showed a mockup with meaningful titles.

A few of the examples had the common thread of repetition of data in the user interface. Search results kept repeating the site name in document titles. Discussion board forums kept repeating the same subject lines. The mailing list example he showed was pretty much wall-to-wall repetition of the same thing. Only difference between successive lines was indentation and author name. A better interface would strip it all out, summarize, whatever. I've run into all the things he mentioned and just gotten used to them. I have to look at them with new eyes.

Call to implement ThreadsML.

Discussion of SlideML. Showed his method of generating it, but it isn't usable by “civilians”. No help in writing the actual content apart from typing raw XHTML in Emacs.

CMS systems came from publishing & were ported to web. Weblogs are web-first.

Hypertextual writing is still stuck in 1995. In 1996, Netscape did as much as or more than we're doing today. We need a lightweight web-aware writing tool. Need to advance beyond emacs, TEXTAREAs or the shoddy Windows DHTML edit control. InfoPath still relies on a crummy XHTML editor.

Compound documents: tend to explode to meaningless names because the system has to add them (e.g. slide027.html). Discussion of old Netscape cid: protocol.

CMSs solve refactoring problems “in the large”: making consistent changes to many files, access, etc. Refactoring “in the small” sucks up a huge amount of time: reformatting email messages, etc.

Categorization is a heavyweight operation; there should be other lightweight ad-hoc ways. Example: All Consuming book aggregator finds book references in blogs.

Showed example of searching his SlideML markup with XPath for code examples.

Update: Here are the slides and notes from Bitflux: part one, part two.

  05/31/03 09:46 AM  OSCOM, Open Source, Software, Interface

May 29, 2003
OSCOM Day 2: Other sessions

10 Best Features from Commercial CMS

Browser-based image editing, pre-localized interfaces

Extra credit: In-context editing (Edit This Page), dependency reporting, semblance of autoclassification, relational viewing tools

Reporting: such as Never Logged In

Configurable, forms-based workflow (ingest Visio WFML?)

508/WA compliant output — accessibility. Table headings + row headings, alts, etc.

Browser-based content object development (schema, essentially)

OpenCourse educational site. opencourse.org. “It rhymes with open source!” (The presenter avoided saying this, but I'm sure he wanted to.) Slow-moving.

Dublin Core Metadata in CMS

On oscom.org presentation slide show, different DC formats for XHTML, HTML, RDF XML are linked.

Good reference impl.: DC-dot. Another: Reggie

Elements (such as DC.Subject.Keyword) appearing multiple times, yes. Comma-separated value lists, no.

Discussion on thesauri, search engines, etc. Overall, I didn't get a huge amount out of this session, at least not directly. I'll have to find the reference impls online.

  05/29/03 05:44 PM  Software, Technology, Open Source, OSCOM

May 29, 2003
OSCOM Day 2: WebDAV

Provides a standard way to place content on a web server, with metadata, file locking, versioning. Also can decouple filesystem layout from author's view. Uses HTTP for all logins, so no need to create full user accounts.

Very few clients support metadata so far. Cadaver does, but cmd-line based. Kcera? KExplorer? support properties.

To check out: Joe Orton's sitecopy. Twingle.

WebDAV for filesharing tested lighter than SMB on network traffic.

Question on ranged PUTs. WebDAV and mod_dav support it, but some servers don't. The Mac OS X WebDAV client can't use ranged PUTs for this reason, or it would risk replacing the entire file with the tiny part that was changed. They're working toward some kind of solution.
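
The risk is easy to see in a toy simulation. The sketch below is not real server or client code; apply_put is an invented function that models a server either honoring or ignoring the byte range sent with a PUT.

```python
def apply_put(stored, body, byte_range=None, honors_ranges=True):
    """Simulate a PUT; byte_range is an inclusive (start, end) pair or None."""
    if byte_range is None or not honors_ranges:
        # A server that ignores the range treats the body as the whole file.
        return body
    start, end = byte_range
    return stored[:start] + body + stored[end + 1:]

original = b"hello world"
patched = apply_put(original, b"W", byte_range=(6, 6))
clobbered = apply_put(original, b"W", byte_range=(6, 6), honors_ranges=False)
print(patched, clobbered)
```

A client cannot tell the two kinds of server apart in advance, which is why sending only the changed part is unsafe.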

Servers include Apache mod_dav (which the speaker wrote) and Zope, Tomcat. Jakarta Slide requires a lot of work to connect its memory-based store to something. Can even handle WebDAV with CGI except for OPTIONS method.

Subversion supports DeltaV WebDAV. You can mount & copy files from vanilla Windows & Mac OS X. But you can't modify them, because the clients don't support DeltaV. (There is an experimental "autoversion" plugin to server to allow this.)

Extensions: ACL. Remote management of ACLs; close to RFC status. DASL (DAV Searching & Locating). Yet another query language. Further off.

MS WebDAV does a little check for FrontPage first, but is pretty much straight WebDAV otherwise.

My question: best/simplest route to implement a change trigger for a WebDAV server, so I could run a script? Can I plug in easily to any of the existing servers?

A. Zope supports WebDAV and is programmable. It uses its own data store, though, not the filesystem. So the whole system would have to use Zope.

Best answer. Could look at logs / an Apache filter to implement change response. Great idea.
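
A first cut at the log-watching approach might look like the sketch below. It assumes the server writes Apache's Common Log Format and that successful WebDAV writes show up as 2xx responses to the methods listed; the function name and sample log lines are invented.

```python
import re

# WebDAV methods that change content on the server.
WRITE_METHODS = {"PUT", "DELETE", "MKCOL", "MOVE", "COPY", "PROPPATCH"}

# Matches the quoted request and the status code in a Common Log Format line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')

def changed_paths(log_lines):
    """Return (method, path) for each successful write request in the log."""
    changes = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if (match and match.group("method") in WRITE_METHODS
                and match.group("status").startswith("2")):
            changes.append((match.group("method"), match.group("path")))
    return changes

SAMPLE_LOG = [
    '10.0.0.5 - andrew [29/May/2003:17:40:01 -0400] "PUT /site/index.html HTTP/1.1" 204 -',
    '10.0.0.5 - andrew [29/May/2003:17:40:02 -0400] "GET /site/index.html HTTP/1.1" 200 5120',
    '10.0.0.5 - andrew [29/May/2003:17:40:03 -0400] "PUT /site/locked.html HTTP/1.1" 423 -',
    '10.0.0.5 - andrew [29/May/2003:17:40:04 -0400] "DELETE /site/old.html HTTP/1.1" 204 -',
]
changes = changed_paths(SAMPLE_LOG)
print(changes)
```

A script could run this over newly appended log lines and regenerate only the pages whose sources changed.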

Alternative: Author of FS watch & notify utils suggested those. They only run on Unixes, though. (I need Windows support, so I could look into NT's APIs for filesystem notification too.)

  05/29/03 05:42 PM  Open Source, Technology, Software, OSCOM

May 29, 2003
OSCOM Day 2: Dave Winer Keynote

Dave Winer (introduced as "King of the Blogging World") said that was a great introduction, and he didn't agree with anything in it. Call to open source & commercial software worlds to work with each other. Speaking as a commercial developer who has also released open source.

Q: "Proprietary" label used to be sold as a good word. Open source just used it to differentiate themselves.

"40-person company" is what he recommends would be best for customers. 2-3 people doesn't cut it. But those 40-person companies don't exist anymore. Users look at Unix-style OS and think it must be very difficult to write. But it's actually much harder to write software that's easy to use, while users won't recognize its complexity.

Halley Suitt: Is she missing the marketing for open source? What does Linux look like? There's something with a penguin. Someone helpfully brought up his laptop and opened it for her. "My Linux virginity is gone," she announced.

Internet Explorer: users are stranded. Has a development team, but they don't fix the bugs.

XML-RPC: Dave did design in 2 weeks, met with Don Box et al once. Secret of success: not overloaded with complexity. Extra features were aggressively not included. Has not changed since 1999.

Audience member disputed the assertion that there were no 40-person software firms. Many CMS packages (shrinkwrapped) come from such companies.

What audience member wants: to be able to fix software. Even if developer goes bankrupt. Dave: What you want is not to be locked in. You want open file formats. Another audience member: retraining is a large part of switching cost, not data conversion. Q: Source code escrow?

Q: With IE, doesn't want to be stranded. His weblog won't display properly in IE, and he can't fix it. Dave: Source code for IE should have been put in escrow and released already, because they're not working on it. He had strongly suggested that as a remedy in the MS antitrust trial.

Motivations for Open-Source Developers essay. To do: find link; it scrolled off my NetNewsWire aggregator before I read it.

Q: Audience member complained that Radio Userland has support issues, documentation issues.

Dave: They all do! There's no money in software! It's $39.95; that doesn't pay for a lot of support.

Sound bite about personally not liking Bill Gates or Richard Stallman. Neither of them take baths. This is quoted more accurately elsewhere.

Discussion of unifying variants of RSS.

And here we come to the climactic faceoff of the keynote. Apparently Dave Winer & Bill Kearney have never met in person before. I'll let the record speak for itself (search the web for both their names), but if you've ever seen their online mailing list discussions, you'd expect a matter vs. antimatter reaction if ever they were to meet.

Bill Kearney: I'm Bill Kearney, from Syndic8.

Dave: (no particular reaction) What's Syndic8?

Bill: (explains, happening to mention again that he's Bill Kearney)

Dave: Oh, you're Bill Kearney. My God.

[Bill starts talking about "democracy, rather than benevolent dictatorship"; discussion degenerates into shouting & swearing. Elapsed time: about 15 seconds. The play-by-play doesn't really matter, but if you want one, see Aaron's weblog. The OSCOM organizer Charlie steps in after a few minutes, but Dave is too rattled to move on and ends the session.]

I didn't get to ask my question.

  05/29/03 09:55 AM  Open Source, OSCOM, Software

November 20, 2002
Mac OS X Disk Image Installers

Steven Frank starts a user-interface discussion: .dmg Files Considered Harmful (via Daring Fireball).

I had some of the same misgivings about .dmg files, but there are also drawbacks to .sit and .tgz archives in that it's still not obvious to the untrained user how to install the programs after download. At least you can more reliably put instructions into a disk image window.

It's possible to make disk images user-friendly provided they satisfy three requirements:

  1. They should open a window automatically. After double-clicking (mounting) the image file, too many images show up in My Computer and the desktop, both of which may be hidden. So it looks like nothing happened. I just checked NetNewsWire, and it gets this right.
  2. They should have instructions in the main window on dragging the program to Applications. Again, NetNewsWire gets this right.
  3. They should show the Finder toolbar inside the window, so that the user can actually drag straight to the Applications icon. (Yes, this won't directly help the few users who may have manually removed this icon from their toolbars, but the majority of users will benefit.) NetNewsWire fails this test. So the user has to do some kind of dance in a different window, avoiding interfering with the image's window, either by opening up a separate Applications view or by finding a toolbar that does exist in an unrelated folder window. There are many opportunities for error here, including the Applications window obscuring the image or vice versa.

It would be interesting to see a new user try out a disk image. Without that, all I can do is speculate.

  11/20/02 03:52 PM  Software

November 19, 2002
SD East 2002 Notes

I posted my notes from day 1. There's no WiFi in the conference building, but I walked a few blocks down Newbury St., with free wireless access, and I'm posting this from a street corner.

  11/19/02 01:18 PM  Software

November 19, 2002
SD East 2002 Software Development Conference

I’m at the SD East conference in Boston every day this week.

  11/19/02 12:15 PM  Software

November 12, 2002
ShearerSite Template System Update

I posted diagrams and a status update on the template system.

  11/12/02 11:54 PM  software

October 30, 2002
Template System: RSS

This page is now generated by the template system directly. It now comes with a new feature: an RSS plug-in. More on this later.

This replaces pyblosxom, which I had used temporarily, piping the output to static files to avoid CGI overhead.

Below is the bolus of postings built up over the few days that I worked on it.

  10/30/02 09:52 AM  software