news-brainstorming

From Microformats Wiki

News Brainstorming

There have been several efforts to define data formats for news content. Almost all have focused on interchanging news content between systems and organizations, and so contain dozens (if not hundreds) of fields aimed at "news management": a mix of content management, metadata management, versioning and other operations undertaken by news organizations.

This page documents the brainstorming and ideas that came out of analyzing news examples from real-world sites and systems, with the goal of designing a simple news microformat. - Jonathan Malek

Contributors

  • Jonathan Malek
  • Stuart Myles
  • Martin Moore
  • Mark Ng
  • Todd Martin

The Problem

While there are dozens of formats used on thousands of news sites, there is no single standardized format for presentation of news on the web. Having a standardized news format for web publishing would significantly benefit readers, aggregators, search engines and researchers alike. With no standard format for news, search engines are forced to parse unstructured data, and errors can be costly (see Wired.com, 2008).

Thoughts on a Microformat for News

We found significant overlap with hAtom, so we simplified our initial effort at a data format for news by dropping any fields already described in hAtom or its superset, Atom, with the expectation that future versions of the hAtom draft specification would approach feature parity with Atom. Instead, we focused on the news fields that are not in hAtom.

In much the same way that one extends Atom, we are looking to extend hAtom with the most vital news-specific fields.

The fields we've selected combine the common fields from many of the news formats currently in use with one new field, principles.

Common News Fields

  • hAtom fields
  • source-org: the source organization for this particular news story. It should be considered different from the atom:source element because it represents the source organization rather than the source feed (and so should use hCard). We're using the name source-org to avoid a name conflict with hAtom should the draft decide to include the atom:source element.
  • dateline: using text or hCard, not to be confused with date (see dateline for more information).
    • would an hCalendar event (which can contain an hCard location) make sense for a dateline, or is the 'date' part more often omitted? Kevin Marks 18:32, 24 August 2009 (UTC)
    • Confusingly, the journalistic term "dateline" isn't anything to do with a date or time. It is the location from which a report is filed and is generally the main location associated with a story. Generally, a dateline consists of a city (e.g. "Rome") but could be the name of a ship at sea or even a space station. Stuart Myles 21:12, 24 August 2009 (UTC)
  • geo: using geo, a simple way of providing the location information that services need in order to offer readers local news content
    • is geo really in use here, or would using an hCard (that can contain geo) be a better way of representing locations referred to in the story, as more human readable? Kevin Marks 18:32, 24 August 2009 (UTC)
  • item-license: to express licensing around the item
  • principles: using the draft format rel-principles
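
To make the field list above more concrete, here is a minimal sketch of how a consumer might pull these fields out of a marked-up story. Everything in it is an assumption made for illustration: the sample markup, the choice to express item-license and principles as rel values on links, the example.org URLs, and the helper names (NewsFieldParser, extract_news_fields) are hypothetical and are not part of any agreed draft. It uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser

# Hypothetical sample markup. The class and rel names follow the field list
# above, but the exact markup pattern is an assumption, not a settled format.
SAMPLE = """
<div class="hentry">
  <h1 class="entry-title">Example headline</h1>
  <span class="source-org vcard"><span class="fn org">Example News Agency</span></span>
  <span class="dateline">Rome</span>
  <span class="geo">
    <span class="latitude">41.9</span>, <span class="longitude">12.5</span>
  </span>
  <a rel="item-license" href="http://example.org/license">License</a>
  <a rel="principles" href="http://example.org/principles">Principles</a>
</div>
"""

# Fields read from element text (matched by class name).
CLASS_FIELDS = {"entry-title", "source-org", "dateline", "latitude", "longitude"}
# Fields read from link targets (matched by rel value).
REL_FIELDS = {"item-license", "principles"}
# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class NewsFieldParser(HTMLParser):
    """Collects text for known class names and hrefs for known rel values."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        # One entry per open element: the field names that element captures.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        rels = set((attrs.get("rel") or "").split())
        for rel in rels & REL_FIELDS:
            self.fields[rel] = attrs.get("href")
        if tag not in VOID:
            self._stack.append(sorted(classes & CLASS_FIELDS))

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that captures a field,
        # so nested markup (e.g. an hCard inside source-org) is included.
        for names in self._stack:
            for name in names:
                self.fields[name] = self.fields.get(name, "") + data


def extract_news_fields(html):
    """Return a dict of the news fields found in the given HTML string."""
    parser = NewsFieldParser()
    parser.feed(html)
    parser.close()
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in parser.fields.items()}


fields = extract_news_fields(SAMPLE)
```

With the sample above, fields["dateline"] is the text "Rome" and fields["principles"] is the link target. The sketch assumes well-formed, properly nested markup. A real parser would also need to handle the hCard and geo parsing rules in full.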

Naming

Here are candidate names for a news microformat:

  • hNews