Friday, December 28, 2007
Using Web 2.0 for the Enterprise Portal
Enterprise portals may have become too complex and unwieldy; web browsers don't have the "programmatic sophistication" to support security, content management and much else. The article argues that companies "need better solutions and smarter technical and business approaches to take advantage of increasing desktop/laptop computing power and emerging service-oriented architectures in the business enterprise. Perhaps that is one of the things that Web 2.0 is all about—breaking the tyranny of the portal." It recommends adding Rich Internet Application (RIA) technology to expand function and improve inter-application data sharing.
Thursday, December 27, 2007
IBM Classification Module
Press release:
"IBM announced new capabilities in its content classification software used to automatically categorize large volumes of enterprise information, making it easier to find, access, and use in the context of enterprise content management systems. The IBM Classification Module provides seamless connection to the IBM FileNet P8 content management platform to tackle the categorization of vast amounts of unstructured content in the enterprise, especially content stored or arriving in FileNet repositories. It automates the process of determining whether content is important, and how it should be handled. It can also automatically classify vast amounts of previously unmanaged content or reclassify content already under management so it can be leveraged for business purposes such as records management.
IBM also announced that Cloudmark, a provider of carrier-grade messaging security, has selected IBM content classification software to support its customer base with improved online customer support. The IBM software is intended to help Cloudmark reduce the workload and cost of handling online customer queries."
Saturday, December 22, 2007
Webinar on Folksonomies and Taxonomies
Daniela Barbosa of Dow Jones Client Solutions will be leading a Webinar organized by the Dow Jones InfoPro Alliance about Folksonomies & Taxonomies in the Enterprise on January 10, 2008. Registration is free. (Link is in that posting)
Among the topics:
* Business value of a taxonomy/folksonomy
* Impact of social networking tools on the enterprise
* Governance tools
* Merging folksonomies and existing taxonomies
* Some best practices and common obstacles
This entry comes from her weblog - daniela barbosa chitchatting about information delivery.
Postscript Feb 15, 2008 - webcast for this session is available from Factiva InfoPro probably until mid 2008. Uses Event On24.
Friday, December 07, 2007
Demonstrating Value of a Taxonomy
"In this primer/roundtable, Montague Institute founder Jean Graef will show how concepts from IT, library science, and corporate publishing can be used to communicate taxonomy benefits to different stakeholder groups. She will also summarize the experiences of Society members in selling taxonomy to their management."
Full description, price, and registration at http://www.montague.com/roundtable43.html
Exalead Enterprise 4.6
Exalead, notable for its search product that can extract related terms, has released exalead one:enterprise 4.6 with "several new enhancements that are designed to help organizations easily configure and customize business applications, including hybrid vertical search applications."
Thursday, November 29, 2007
Teragram for managing ontologies
"Teragram has unveiled Semantic Term Manager (STM) 2.0, software that enables management of content and maintenance of ontologies in enterprise content repositories and databases. STM 2.0 is designed to help corporate librarians maintain ontologies and integrate this information directly with Teragram’s TK240 taxonomy management tool. The combination of these two programs allows knowledge workers to maintain metadata across repositories and databases and to automatically tag documents according to the defined taxonomies. These tools help to simplify the enterprise search and retrieval process, says the company."
Enterprise Search: Information Architecture that includes users
Jayne Dutra is the Lead Enterprise Information Architect at the Jet Propulsion Laboratory, California Institute of Technology. She understands the importance of information architecture and the need to engage users in tagging content as well.
Users are now in the habit of adding "metadata" to describe what something means to them and how it can be useful to others. But that alone won't be sufficient.
"Successful enterprise search today doesn't mean making keywords work well. It means creating a holistic information architecture designed for the enterprise that allows input and evolution by the users themselves. Ironically, this usually relies on the time honored and humble practice of generating metadata and controlled vocabularies that enable data connectedness and intuitive recall. For years, we've heard that users won't fill out metadata fields. Then how does one account for the phenomenal success of Flickr? If one enters a set of bookmarks in del.icio.us, doesn't that tell us something about the person's interests and background? New Web 2.0 technologies generate metadata in the wild that can be domesticated if we are wily enough to recognise the opportunity."
Dutra also argues for installing the foundation pieces - specifically the creation of a "metadata core specification" and an associated taxonomy.
"The ultimate goal is an information environment enhanced by metadata and served up through a number of rich user interactions facilitated by role based access. "
Monday, November 19, 2007
Recommind does Federated Search
"Recommind (www.recommind.com), a provider of enterprise search, automatic categorization, and eDiscovery systems for law firms and enterprises, announced the availability of the MindServer 5.1 platform, which combines robust navigation and grouping controls over external content with multilayered security to deliver a federated search framework. This latest version of Recommind’s flagship MindServer enterprise search platform is designed to bring the full potential of federated content to organizations."
Sunday, November 18, 2007
Weinberger on organizing digital information
Weinberger explained that he doesn't mean miscellaneous as a jumble of things that are unrelated to each other but as "the aggregation of everything, with the important difference that with the digital miscellaneous, we find all sorts of ways that the things are alike, all sorts of connections and relationships". He believes in the power of user tagging - of using the relationships that people identify as the means for finding information in an enterprise. "Tagging systems let the users of information decide how they’re going to think about that information, or what that information means to them. Tagging within the corporation is potentially a very powerful tool for sharing knowledge and for enabling social networks to emerge around shared expertise."
On being asked if this replaces the traditional top-down taxonomies, Weinberger comes very close to saying yes, although in the end he seems to see them as being complementary.
"The real importance of a folksonomy is that it retains much more information than the traditional top-down taxonomy does. The top-down taxonomy only knows, typically, that x is a member of y and y is a member of z. With a folksonomy, you know that 17 percent of people think of x as a member of y, but 23 percent think of it as a member of q, and 42 percent of them think that it’s really the same thing as an x." ... "The folksonomy doesn’t have to replace the taxonomy with another static set of categories. It can instead allow the people who are in the minority a way of thinking about something to search the way that they want to. The folksonomy can surface those minority relationships."
Follow David Weinberger's musings about the organization of information at his blog, Everything Is Miscellaneous. The main page also has links to interviews, videos, and podcasts with Weinberger.
Professor Michael Wesch's video Information R/Evolution is especially recommended, as it brings home the point that organizing digital information is very different from what civilization worked out for paper.
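Weinberger's point that a folksonomy retains proportions, not just memberships, can be made concrete with a small sketch. The tagging events, user names, and item name below are invented for illustration; the code only shows the kind of bookkeeping involved and is not how any particular tagging product works.

```python
from collections import Counter, defaultdict

# Hypothetical tagging events: (user, item, tag). Invented for illustration.
TAGGINGS = [
    ("ann", "report-7", "budget"),
    ("bob", "report-7", "budget"),
    ("cy", "report-7", "forecast"),
    ("dee", "report-7", "planning"),
    ("eve", "report-7", "forecast"),
]

def tag_shares(taggings):
    """Return {item: {tag: fraction of that item's taggings}}."""
    counts = defaultdict(Counter)
    for _user, item, tag in taggings:
        counts[item][tag] += 1
    shares = {}
    for item, counter in counts.items():
        total = sum(counter.values())
        shares[item] = {tag: n / total for tag, n in counter.items()}
    return shares

def items_for_tag(taggings, tag):
    """Find items carrying a tag, even if only a minority applied it."""
    return sorted({item for _user, item, t in taggings if t == tag})

shares = tag_shares(TAGGINGS)
for tag, share in sorted(shares["report-7"].items()):
    print(f"{tag}: {share:.0%}")
print(items_for_tag(TAGGINGS, "planning"))
```

A top-down taxonomy would record only one home for "report-7"; here the minority tag "planning" still leads a searcher to the item.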
Friday, October 26, 2007
Taxonomies & the Semantic Web - Call Session
"Taxonomy is the art of adding value to information by placing it in a useful order that supports both direct searching and serendipitous browsing. Taxidermy is the art of stuffing and arranging the skins of dead animals to create lifelike effects. Taxonomies are a fundamental part of the Semantic Web: machine-readable hierarchies that enable intelligent agents to make logical inferences, thereby making information retrieval an entirely new, more sophisticated experience. However, recent books such as Dave Weinberger's Everything is Miscellaneous and Eric Abrahamson's A Perfect Mess suggest that taxonomy and taxidermy are closer than we care to acknowledge."
Montague Institute on Sharepoint
> Yahoo discussion list for Sharepoint search.
http://tech.groups.yahoo.com/group/sharepointsearch
"This topic includes all issues relating to creating, organizing, and finding documents on the Sharepoint platform. The group is intended as a forum for people who manage records, documents, intranet content, and external collaboration sites as well as for indexers, corporate taxonomists, and enterprise search managers."
> Web course Taxonomies, Search & Sharepoint -($$$)
Thursday, October 25, 2007
Semantic Web Blend
The Semantic Web - the structure that will enable us to see connections between databases and gather information with less effort - is getting closer. This article describes the objectives, the progress, and the players in making smarter tools for organizing and finding information. Ontologies are involved, as are user-generated tagsonomies.
"The Semantic Web community's grandest visions, of data-surfing computer servants that automatically reason their way through problems, have yet to be fulfilled. But the basic technologies that Miller shepherded through research labs and standards committees are joining the everyday Web. They can be found everywhere--on entertainment and travel sites, in business and scientific databases--and are forming the core of what some promoters call a nascent "Web 3.0.""
The writer traces the history of organizing information from Melvil Dewey's days, through the early days of directories on the Web, to the increasing acceptance of using metadata to describe information objects.
Eric Miller, an MIT-affiliated computer scientist, has been one of the contributors to furthering "semantic web" enabling technologies.
Meanwhile, social tagging has been gaining acceptance, imposing a "grassroots order" on collections.
"No one knows what organizational technique will ultimately prevail. But what's increasingly clear is that different kinds of order, and a variety of ways to unearth data and reuse it in new applications, are coming to the Web. There will be no Dewey here, no one system that arranges all the world's digital data in a single framework."
Saturday, October 20, 2007
Folksonomies and Tagging
Introduction: Folksonomies and Image Tagging: Seeing the Future? by Diane Neal
Check other articles in the issue:
- Why Are They Tagging, and Why Do We Want Them To?
- Trouble in Paradise: Conflict Management and Resolution in Social Classification Environments
- Image Indexing: How Can I Find a Nice Pair of Italian Shoes?
- Flickr Image Tagging: Patterns Made Visible
Thursday, October 18, 2007
Faceted Analysis of online media types
Interesting application of faceted analysis.
Abstract
"The emergence of new media types, many seemingly without counterparts in the non-digital world, challenges the readiness of existing knowledge organization schemes to accommodate them. A knowledge organization scheme based on a faceted analysis of existing classes of bibliographic materials is likely to accommodate new developments better than one based on a list of unanalyzed material types. The faceted analysis undertaken here, in which seven facets are recognized (content, generation of content, recording of content, publication/distribution, physical characteristics, perception/use, and relationships) shows the inadequacy of the traditional view of the bibliographic community of a fundamental distinction between content and carrier; interaction between content and carrier is common and enters into the characterization of material types. The facet analysis is validated by applying it to two new material types, wikis and blogs."
Wednesday, October 17, 2007
Enterprise Search Practice Blog
From time to time she comments on taxonomies and taxonomy development. For example, in The Marginal Influence of E-commerce Search and Taxonomies on Enterprise Search Technologies she wrote,
"The second distinction relates to taxonomies, and the increase in their development and use. I’ve seen a dramatic increase in job postings for “taxonomists” and have managed several projects for enterprises over the years to build these controlled lists of terms for categorizing content. What is noteworthy about recent job opportunities is that most seem to be for customer facing Web sites. Historically, organizations with substantial internal content (e.g. research reports, patents, laboratory findings, business documents) hired professionals to categorize materials for a narrowly defined audience of specialists. The terminology was often highly unique, could number in the hundreds or thousands of terms, even for a relatively small enterprise. This is no longer a common practice."
Saturday, October 13, 2007
Folksonomies meet Taxonomies
Thomas Vander Wal Talks with Talis about Folksonomies (Aug 3, 2007)
Wednesday, October 10, 2007
Book: Making Search Work
Reviewer Jothi Nedungadi suggests that better titles might be "Making Intranet Searches Work" or "Making Enterprise Searches Work". These titles capture the breadth and the intent better. A company would need to be visionary to adopt the practices necessary to really improve search. He doubts that many would but still recommends the book - "This is however an impressive compilation of information and a commendable effort by the author to address intranet search. His perspectives on making searches work are invaluable. I would recommend this book to those who are considering implementing an enterprise/intranet search engine."
Martin White gives a sample in an article written for Update Magazine.
The table of contents and sample chapter are at the publisher's site - Facet Publishing Online. There is some (small) mention of taxonomy management and social tagging.
Monday, October 01, 2007
Tagging Practices and Their Value
[Also available through http://eprints.rclis.org/archive/00011413/ ]
"Abstract: This paper examines the tagging practices evident on CiteULike, a research oriented social bookmarking site for journal articles. Tagging practices were examined using standard informetric measures for analysis of bibliographic information and term use. Additionally, tags were compared to author keywords and descriptors assigned to the same article."
Shows that user tagging can enhance findability over full-text search by keyword and indexing with controlled vocabulary.
Conclusion: "The differing terminology use in tag lists suggests that tagging may be a working example of Vannevar Bush's associative trails. He argued that associative trails better represented how users actually work with their documents: by association rather than by categorisation. (Bush 1945) This suggests that user tagging could provide additional access points to traditional controlled vocabularies and provide users with the associative classifications necessary to tie documents and articles to time and task relationships as well as other associations which are new and novel."
Classification Software - webcast
"In simple terms, classification software catalogues your information so that it can be found easily and logically, regardless of the file format or where it is stored. It should classify new information as it is brought in as well as sort through content that is already under management. It should learn more about your content over time, so that it becomes more reliable and knowledgeable. And it should streamline searches by HR, Finance, Customer Care agents, Marketing, and all the other decision makers in your organization—making them more productive and accelerating the time-to-value of your investment in ECM."
Sponsored by IBM
More information and registration at http://www.kmworld.com/Webinars/Details.aspx?EventID=251
Friday, September 21, 2007
Ways to Improve Enterprise Search
"Quality search results only come about through applied effort, requiring in particular the skills of an information architect. And IAs must be ready to go well beyond their traditional front-end role, digging into the functional backend and source data of the search engine. This article outlines how we can bolster findability and win back users’ confidence."
Excellent description of the design factors that contribute to better search, including the use of metadata, a standardized set of keywords, and an ontology.
Book Review: Glut: Mastering Information Through the Ages
" Information architects—and anyone curious about the roots of information management—will find much of interest in Glut’s thought-provoking tale."
Microsoft SharePoint 2007 - Taxonomy Piece
BA-Insight has published a white paper on the search capabilities of SharePoint -- SharePoint Search: Five Keys to a Successful Implementation
"BA-Insight is a Microsoft ISV partner whose product Longitude for SharePoint extends the search capabilities of SharePoint to deliver dramatic improvements in usability and relevance for the user."
One of the five keys listed is to "Plan and build an effective taxonomy in SharePoint". In the paper, BA-Insight recognizes that taxonomies provide a means for "clarifying" results, but argues that the "centralized, classified and managed store of content", attractive as it is, is impossible. They recommend leveraging SharePoint to create a "simple" taxonomy.
The first step is to accurately map the hierarchy in your organization to your SharePoint Site structure. A site per department is typical and useful. Secondly, identify what types of content are useful to each department, and create a document library and/or List for each type. Heavily leverage the Content Type construct in SharePoint to tag specific types of content. Finally, and most importantly, optimize the SharePoint ranking algorithm. Taxonomy is only necessary, because the ranking algorithm isn’t doing its job.
Some might take issue with the statement that "taxonomy is only necessary because the ranking algorithm isn't doing its job". A taxonomy can also provide navigational access, big picture, vocabulary, and greater precision in search.
BA-Insight's Longitude module adds to this the capability for users to tag content - claiming that "over time a rich set of metadata is derived". There is some value to user tagging - different points of view, meaningfulness to the user - but "rich set of metadata" is a stretch unless the tagging is managed or guided.
Microsoft SharePoint Server has a large presence in corporations and with SharePoint 2007 Microsoft is taking on content management along with enterprise search. Taxonomy design and use will need to be figured into the plans. There may be other vendors, like BA-Insight, who will develop products that will enhance the MOSS search function.
SharePoint 2007 Review: Six Pillars of MOSS (Nov 2006) at CMS Wire provides a full description of the direction and capabilities of SharePoint.
Thursday, September 20, 2007
Endeca Discovery Suite
Endeca is extending the capabilities of the Endeca Information Access Platform (IAP) with the Endeca Discovery Suite. Some of the improvements relate to tag extraction and tag-based visualization:
- tag extraction capabilities to pull together and reveal common themes, concepts and entities from text-based reviews, blogs and posts for use in site navigation, search relevancy and search engine optimization;
- tag-based visualization and navigation to complement static and dynamic site navigation, giving users more ways to explore and find desirable content and products;
- meta-relational capabilities to link different content types by common concepts, allowing people to dynamically summarize the user-generated content associated with any set of products.
Tuesday, September 11, 2007
Freebase for Structured Content
"Freebase, like Wikipedia, is an open encyclopedia that most anyone can edit. But alongside each free-form article in Freebase, there are database fields for relevant hard data points. If the article is about a movie, you'll find fields for its release date, director, producer, screenwriters and so on. If the article is about a city, it will have fields for the city's population and location. If the article is about an artist, it will have a field for every one of that artist's works."
Freebase is an open project for building structured data applications using types and defined properties. Films will have one set of properties, geographic places another. It will support complex queries.
From the FAQ - "Finally, while information in Freebase appears to be structured much like a conventional database, it’s actually built on a system that allows any user to contribute to the schemas—or frameworks—that hold the data. This wiki-like approach to structuring information lets many people organize the database without formal, centralized planning. And it lets subject experts who don’t have database expertise find one another, and then build and maintain the data in their domain of interest."
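The idea of typed topics with user-extensible schemas can be sketched in a few lines. This is only an illustration of the concept described above, not Freebase's actual data model or API; every type, property, and topic name here is hypothetical.

```python
# Hypothetical type schemas: each type names the properties its topics may carry.
SCHEMAS = {
    "film": {"release_year", "director"},
    "city": {"population", "country"},
}

TOPICS = []

def add_property(type_name, prop):
    """Wiki-style schema edit: any contributor may extend a type."""
    SCHEMAS.setdefault(type_name, set()).add(prop)

def add_topic(name, type_name, **props):
    """Store a topic, rejecting properties the type's schema does not define."""
    unknown = set(props) - SCHEMAS[type_name]
    if unknown:
        raise ValueError(f"undefined properties for {type_name}: {sorted(unknown)}")
    TOPICS.append({"name": name, "type": type_name, **props})

def query(type_name, **conditions):
    """Return names of topics of a type whose properties match all conditions."""
    return [
        t["name"] for t in TOPICS
        if t["type"] == type_name
        and all(t.get(k) == v for k, v in conditions.items())
    ]

add_topic("Example Film A", "film", release_year=1999, director="J. Doe")
add_topic("Example Film B", "film", release_year=2001, director="J. Doe")
add_topic("Exampleville", "city", population=12000, country="Nowhere")

# A contributor extends the film schema, then uses the new property.
add_property("film", "producer")
add_topic("Example Film C", "film", release_year=2001, producer="R. Roe")

print(query("film", director="J. Doe"))
print(query("film", release_year=2001))
```

Because each type declares its properties, a query can filter on fields rather than on free text, which is what separates this approach from a plain wiki article.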
Understanding OWL
In this ACM IT magazine article, Saha describes the Web Ontology Language (OWL), a principal part of enabling a "semantic web". The article has illustrations, examples, and explanations of code.
"Web Ontology Language (OWL) is a language for defining and instantiating web ontologies (a W3C Recommendation). OWL ontology includes description of classes, properties and their instances. OWL is used to explicitly represent the meaning of terms in vocabularies and the relationships between those terms. Such representation of terms and their interrelationships is called ontology. OWL has facilities for expressing meaning and semantics and the ability to represent machine interpretable content on the Web. OWL is designed for use by applications that need to process the content of information instead of just presenting information to humans. This is used for knowledge representation and also is useful to derive logical consequences from OWL formal semantics."
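To give a feel for what "deriving logical consequences" from classes and instances means, here is a rough sketch in plain Python. It is not OWL syntax and not a real reasoner; the class and individual names are invented, and only one kind of inference (membership inherited through subclass links) is shown.

```python
# Hypothetical mini-ontology; names are invented for illustration.
SUBCLASS_OF = {            # class -> direct superclass
    "Thesaurus": "ControlledVocabulary",
    "ControlledVocabulary": "KnowledgeOrganizationSystem",
}
INSTANCE_OF = {            # individual -> asserted class
    "ProductTerms": "Thesaurus",
}

def superclasses(cls):
    """Follow subclass links upward; return every ancestor class in order."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in found:
            break  # guard against an accidental cycle
        found.append(cls)
    return found

def inferred_classes(individual):
    """Asserted class plus every class implied by the subclass hierarchy."""
    asserted = INSTANCE_OF[individual]
    return [asserted] + superclasses(asserted)

print(inferred_classes("ProductTerms"))
```

Only one fact about "ProductTerms" was stated, yet an application can conclude that it is also a controlled vocabulary and a knowledge organization system. That is the sort of machine-processable meaning the quoted passage describes.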
Card Sorting Challenges
The author speaks from experience in this article about card sorting. It's a simple concept, deceptively so, and people may expect more than it can deliver.
"I’ve accepted the fact that card sort analysis—much like usability test analysis—is often messy and subjective. It’s part science, but mostly art. As with many aspects of our work, there isn’t necessarily a single correct, quantitative answer, but rather a number of different qualitative answers—all of which could be correct. Our job is to use our experience and our understanding of people to make judgment calls."
[Mentioned in InfoDesign: Understanding by Design ]
Wednesday, August 22, 2007
Debate about tagging
One wonders whether people really want to spend the extra few seconds to tag an item and, if they do tag, whether they will use something more useful than "read later". I suspect that tagging will remain personal, and that general access will depend on automatic categorization based on business rules.
Tuesday, August 21, 2007
Social search and taxonomies
Webinar will be available for 90 days at www.kmworld.com/webinars/bea/21aug2007
Thursday, August 16, 2007
Facets and Taxonomies
From the announcement: "We'll start with an overview of facets and faceted search and then hear from Peter Bell, one of the founders of Endeca, a faceted search company, about new developments in the field that allow a combination of unstructured and structured tagging and classification."
Thursday, July 19, 2007
SchemaLogic's Content Tagging
SchemaLogic provides a solution for customers to implement a collaborative process that enables writers, photographers, and editors to participate in the development and enrichment of the underlying “content tags” that describe information in a dynamic, ever-changing environment – and they do not have to change the way they use their own terminology. Content tagging is an advanced method of identifying and labeling information assets including audio, video, news stories, and other web content using text descriptions. SchemaLogic’s software manages the definition and relationships between content tags so that each individual in each department can continue to work in a way that makes sense for them, while the semantic differences are resolved by the technology.
SchemaLogic Delivers First Business Semantics Management Solution for Media and Publishing Enterprises, Press Release (July 16)
Friday, July 13, 2007
Authors argue that text mining - for drawing relationships between disparate data from many sources - needs a "semantic infrastructure that focuses on information quality and decision support".
Key point (bolding added) : "The interpretation of text is just the first step in making the information usable. Another key part is then organizing the resulting “text pieces” into some form of usable network. This is addressed by building taxonomies and ontologies that can be navigated to explore specific topics of interest. Finally, the results must be output in a format that can be interpreted and lead to knowledge discovery."
It identifies three parts to a text-mining system: parsing the text into parts, tagging extracted information, and organizing the parts using taxonomies and ontologies.
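The three parts can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the authors' system: the taxonomy, the documents, and the function names are all hypothetical, and real text mining would use far richer parsing and entity extraction than a word match.

```python
import re
from collections import defaultdict

# Hypothetical taxonomy: term -> broader category. Invented for illustration.
TAXONOMY = {
    "invoice": "Finance",
    "budget": "Finance",
    "patent": "Legal",
    "contract": "Legal",
}

def parse(text):
    """Step 1: split raw text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tag(tokens):
    """Step 2: keep the tokens that match known taxonomy terms."""
    return sorted({tok for tok in tokens if tok in TAXONOMY})

def organize(docs):
    """Step 3: file each document under the categories of its tags."""
    tree = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tag(parse(text)):
            tree[TAXONOMY[term]].add(doc_id)
    return {category: sorted(ids) for category, ids in tree.items()}

DOCS = {
    "memo-1": "The budget covers one late invoice.",
    "memo-2": "Review the contract before the patent filing.",
    "memo-3": "Budget approval depends on the contract.",
}
print(organize(DOCS))
```

The output groups documents by category so that a topic such as "Legal" can be browsed directly, which is the navigable network the key point refers to.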
Wednesday, July 11, 2007
Dow Jones Releases Synaptica 6.4 for Improved Business Semantic Management
In early June 2007 Dow Jones & Company introduced Synaptica 6.4 - its latest semantic Web-enabled knowledge organization system for the enterprise.
Synaptica 6.4 simplifies and standardizes vocabulary and metadata management in order to unlock valuable business intelligence.
“Computers can store, search and display enormous amounts of information, but until recently machines have not been able to understand the meaning of the content,” said Dave Clarke, global taxonomy director, Dow Jones. “Now, with the semantic Web being able to capture the meaning in a machine-readable way, users can discover latent information and make new connections between isolated content while benefiting from comprehensive and precise information recall.”
To read the full press release visit http://www.factiva.com/investigative/releases/20070605_synaptica.asp?node=menuElem1176
For more information about Synaptica 6.4, visit http://www.factiva.com/products/taxonomy/synaptica.asp?node=menuElem1511
To learn more about Dow Jones services, visit www.dowjones.com/clientsolutions
Sunday, July 08, 2007
Taxonomy Boot Camp 2007
Monday, July 02, 2007
Cogito semantic intelligence
Monday, May 28, 2007
Book and Blog about Organizing Knowledge
He explained, "Hence, as far as I know, this is also the first taxonomy book that combines a practical guide to taxonomy development with a broader explanation of how taxonomy work contributes to knowledge management in a variety of ways."
His weblog, Green Chameleon, has several categories related to knowledge management and to taxonomy which provide various insights. This one on Folksonomies and Rich Serendipity argues for the value of people as "knowledge aggregators". This is a very thoughtful piece that was later included in Lambe's book.
Patrick Lambe is a principal with Straits Knowledge, a consulting firm for information and knowledge management based in Singapore.
Tuesday, May 15, 2007
Enterprise Search
Siderean's Relational Navigation
Siderean's Seamark Navigator 4.5 enables identification of the relationships between sets of data across disparate sources whether structured or unstructured. Among other benefits, it may make it easier for users to create and manage taxonomies, but also to be able to see connections between facets - to be able to "branch out".
"The offering takes full advantage of semantic technology to enable users to harness content from across the enterprise and on the Web, greatly facilitating information access and discovery. Further, says Siderean, it enables more user participation than before, providing new tagging, voting, ranking and reviewing capabilities. It also helps knowledge workers to efficiently collaborate via commenting features and the ability to save and share searches."
Paula Hane at Information Today's Newsbreaks provides a longer description in Siderean Upgrades Its Relational Navigation Platform (May 14). Sue Feldman of IDC was among those interviewed for this article. She said that "Siderean handles dynamism — elements don’t have to be predetermined. And, while other discovery tools allow drill down through facets and hierarchies, Siderean can handle ‘sideways relationships’ in a unique way."
Siderean Software website has whitepapers and articles on its relational navigation. Especially recommended is the short video (3 minutes) on the Evolution of Search in which the VP of software engineering, Jack Berkowitz, compares keyword search, faceted search, and relational search.
David Weinberger in his blog, Everything Is Miscellaneous, references the Siderean patent for relational navigation, saying "Faceted classification and taxonomies both work by showing the user narrower and narrower results. That's often what we want, but in this crazy world, we may also want to leap off the branch we've walked onto." Siderean's relational navigation might be the method.
Tuesday, May 01, 2007
Endeca Finds Relationships
Endeca Technologies (www.endeca.com) is best known for Guided Navigation, a faceted view for discovering content at a website. This article describes improvements that make possible a "new data-driven approach with metarelational indexing that lets users navigate complex relationships between different types of information from different sources".
From the article - "Here’s a simplified description of how Endeca’s architecture works. Each document or record is a set of facets. Some facets are explicit, such as database fields or file metadata. In addition, the text itself can be transformed into explicit facets through entity and term extraction, classification, and other techniques. Finally, what the document is “about” from the user’s perspective is implicit, so Endeca keeps a full-text index of the document as another facet. Working with facets allows Endeca to adapt to disparate information types, to make connections and correlations, and to show the content interrelationships to users."
Article includes screenshots and links to a demo.
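The facet model in the quoted description (explicit facets plus the full text as one more facet) can be illustrated with a short sketch. This is not Endeca's architecture or API; the records, facet names, and functions are hypothetical and show only the general idea of narrowing by facet and then counting what remains.

```python
from collections import Counter

# Hypothetical records; each is a set of explicit facets plus its full text.
RECORDS = [
    {"id": 1, "facets": {"type": "report", "dept": "sales"}, "text": "quarterly sales forecast"},
    {"id": 2, "facets": {"type": "report", "dept": "legal"}, "text": "patent review summary"},
    {"id": 3, "facets": {"type": "memo", "dept": "sales"}, "text": "sales team travel policy"},
]

def refine(records, selections=None, keyword=None):
    """Keep records matching every selected facet value and the keyword, if given."""
    selections = selections or {}
    hits = []
    for rec in records:
        if any(rec["facets"].get(k) != v for k, v in selections.items()):
            continue
        if keyword and keyword not in rec["text"].split():
            continue
        hits.append(rec)
    return hits

def facet_counts(records):
    """Count remaining values per facet, as a navigation panel would show them."""
    counts = {}
    for rec in records:
        for name, value in rec["facets"].items():
            counts.setdefault(name, Counter())[value] += 1
    return counts

sales = refine(RECORDS, {"dept": "sales"})
print([rec["id"] for rec in sales])
print(facet_counts(sales))
```

After each selection the counts are recomputed over the remaining records, so the user sees only refinements that still lead somewhere.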
Thursday, April 26, 2007
Evaluating Classification
Measuring the Success Of a Classification System by Iain Barker, Boxes and Arrows (April 2007)
Barker adapted work by Donna Maurer for evaluating card-based classification and applied it to quantitatively showing the improvements to be obtained from a new classification system for the company intranet.
Tuesday, April 24, 2007
Tagging made easier
Tagsearch Technologies Launches Tagging and Web Collaboration Platform, EContent (Apr 24)
Friday, April 20, 2007
Recommended: Enterprise Search Sourcebook 2007
+ Seth Earley has an article on Taxonomies, MetaData and search.
+ Peter Morville is interviewed in Enterprise Search and the Future of Findability.
+ Susan Feldman writes on Search, The Quiet Revolution.
+ Francois Bourdoncle, CEO of Exalead, wrote Transform your Intranet into a Source of Knowledge.
There is also an index to advertisers and a showcase area for vendors.
I found the easiest way to navigate was to use the Contents button in the e-book control menu to get to articles, and then to set the view to single page at 100% for reading. Fortunately, it is possible to selectively print pages.
This e-book is a substantial resource on enterprise search, notable for its variety of topics, its excellent writers, and its size: 124 pages.
Wednesday, April 11, 2007
Metadata helps keyword search
Web site makeover: Legacy retrieval tools save time for users (March 2007)
The article examines the U.S. Supreme Court web site and presents a careful analysis of the audience for this site, alternative sources of information, and a makeover that would incorporate the best from each.
The main point: "it's time to rethink legacy retrieval tools in a Web context and consider a metadata repository as the implementation vehicle".
Monday, April 09, 2007
FAST buys Convera's RetrievalWare
Convera Corp has sold its RetrievalWare business to FAST. This gives FAST extra strength in the enterprise search market, along with a large number of client departments in the U.S. government.
"Convera was formed in December 2000 through the combination of the former Excalibur Technology Corp. and Intel’s Interactive Media Services Division. Excalibur had previously merged with Conquest in the mid-1990s."
The announcement states that FAST won't be developing RetrievalWare any further, but will "port" some capabilities from it to FAST's platform.
"However, according to Bauert [Peter Bauert, senior vice president of corporate development at FAST], RetrievalWare does offer some desirable features and functionality that FAST does not have now. He explained: “An example of a feature that we intend to ‘port’ from RetrievalWare to FAST ESP is the way that Convera is doing semantic mining using ontologies and taxonomies. While FAST already supports ontologies/taxonomies, Convera’s customers are used to a tool called the KnowledgeWorkbench and we intend to make that available to the customers as they migrate to FAST ESP.”"
Tuesday, April 03, 2007
Inxight Extracts Metadata
"Inxight Software, a provider of enterprise software solutions for information discovery, has launched the Inxight SmartDiscovery Metadata Management System (MMS), designed to allow users to review, cleanse, and augment automatically extracted metadata--the entities, relations, and event data trapped in electronic text."
New taxonomy at CIO.com
Of note are two navigational devices:
1) A new taxonomy covering technology and leadership.
2) A Google Custom Search that searches the CIO.com domain for content, and offers some standard tags for viewing the content: blogs, white papers, webcasts, advice/opinion.
Among the Web 2.0 elements are polls, RSS feeds, blogs with readers' comments and discussion.
Vendors will have more opportunity to add content, such as white papers and articles. Search for taxonomies as a start.
Thursday, March 22, 2007
Taxonomy Warehouse Update
Wednesday, March 21, 2007
Using The Thesaurus of Aging Terminology
This is a good application for seeing how the thesaurus has been constructed and how it is used to assist in finding materials in the AgeLine Database of articles and studies.
Thesaurus - http://www.aarp.org/research/ageline/thesaurus.html
"The Thesaurus of Aging Terminology is a controlled vocabulary of subject terms (also called keywords or descriptors) used to index all publications cited in AgeLine. Because AgeLine focuses on aging-related topics from a variety of disciplines, the Thesaurus can be very useful in constructing a thorough search of the database, in defining how a term is used in AgeLine, and in identifying references having a major focus on that topic."
It is not the easiest tool to use. In practice, you browse the PDF version of the thesaurus (272 pages), note the terms you'd like to use, and copy and paste them into the AgeLine search form.
"The Thesaurus of Aging Terminology is divided into three sections: Relational Terms, Rotated Terms, and Geographical Terms. The Relational Terms section indicates all levels of relationship among Thesaurus terms. The Rotated Terms section provides an alphabetized columnar listing of all words found within Thesaurus terms. The Geographical Terms section provides a ready reference list of state, province, country, regional, and continent names searchable as Descriptors."
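The Rotated Terms section described in the quote, an alphabetized listing of every word found within the thesaurus terms, can be sketched as follows. The function name and the sample terms are illustrative only; the terms are invented examples in the style of a controlled vocabulary, not entries copied from the Thesaurus of Aging Terminology.

```python
from collections import defaultdict

def rotated_index(terms):
    """Map each word to the alphabetized list of terms that contain it."""
    index = defaultdict(list)
    for term in terms:
        # set() avoids listing a term twice when a word repeats inside it.
        for word in set(term.lower().split()):
            index[word].append(term)
    # Sort the words, and sort the terms listed under each word.
    return {word: sorted(index[word]) for word in sorted(index)}

# Invented sample terms for illustration.
TERMS = ["Long term care", "Adult day care", "Day care centers"]

index = rotated_index(TERMS)
for word, matching_terms in index.items():
    print(f"{word}: {', '.join(matching_terms)}")
```

A searcher who remembers only the word "care" can look it up in such a listing and find every multi-word descriptor that contains it, whatever its position in the term.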
AgeLine Database - http://www.aarp.org/research/ageline/index.html
Search AgeLine - there are several options: basic keyword, subject, and multiple options. It covers a great range of aging topics related to health, living, and well-being.
Watch for the Descriptors on articles that are displayed, and navigate to other topics.
Thursday, February 15, 2007
Taxonomy Boot Camp 2006 Presentations
+ Taxonomy 101 by Marjorie Hlava, Data Harmony - introduces taxonomies and how to create them.
+ Semi-Automated Creation of Faceted Hierarchies by Marti Hearst, Berkeley
"Taxonomy Boot Camp 2006 offered nearly 200 enthusiastic attendees the rare chance to focus on taxonomies-all the time-and nothing but taxonomies."
The 2007 conference is scheduled for November 8-9 in San Jose, CA.
Thursday, February 08, 2007
Earley buys Wordmap
See Tantalizing taxonomies, KMWorld (Feb 7, 2007)
Seth Earley, principal of the firm, explained, "An interesting aspect of the Wordmap suite that differentiates it from many products on the market is the integration with content tagging and search. After all is said and done, taxonomies are only useful if they are presented to the user in a meaningful way. Wordmap modules have the ability to do that without a lot of API level coding. The tagging module overcomes many limitations of content management tools in presenting and applying taxonomies for tagging. The navigation module is an easy way to add faceted search (also called guided navigation) without having to acquire additional faceted search tools. In these ways, Wordmap adds value to existing search and content management environments."
Wednesday, January 17, 2007
Study into Tagging Practices
Abstract
This paper analyzes the tagging patterns exhibited by users of del.icio.us, to assess how collaborative tagging supports and enhances traditional ways of classifying and indexing documents. Using frequency data and co-word analysis matrices analyzed by multi-dimensional scaling, the authors discovered that tagging practices to some extent work in ways that are continuous with conventional indexing. Small numbers of tags tend to emerge by unspoken consensus, and inconsistencies follow several predictable patterns that can easily be anticipated. However, the tags also indicated intriguing practices relating to time and task which suggest the presence of an extra dimension in classification and organization, a dimension which conventional systems are unable to facilitate.
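The frequency data and co-word matrices mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code or data: the function names and the sample bookmarks are invented, and the multi-dimensional scaling step the paper applies to the matrix is left out.

```python
from collections import Counter
from itertools import combinations

def tag_frequencies(bookmarks):
    """Count how many bookmarks carry each tag."""
    return Counter(tag for tags in bookmarks for tag in set(tags))

def coword_counts(bookmarks):
    """Count how often each pair of tags appears on the same bookmark."""
    pairs = Counter()
    for tags in bookmarks:
        # Sorting makes (a, b) and (b, a) the same key.
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs

# Invented sample data: each inner list is the tag set of one bookmark.
BOOKMARKS = [
    ["python", "programming", "toread"],
    ["python", "programming"],
    ["design", "toread"],
]

frequencies = tag_frequencies(BOOKMARKS)
pairs = coword_counts(BOOKMARKS)
```

In this toy data a tag such as "toread" co-occurs with unrelated subject tags, which is the kind of time-and-task tagging the abstract says conventional subject indexing does not capture.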