any23
Apache Anything To Triples (Any23) is a library, a web service, and a command-line tool that extracts structured data in RDF format from a variety of Web documents.
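To illustrate what "extracting structured data in RDF format" means for microformats, here is a minimal sketch in plain Python. It does not use Any23 and is not how Any23 is implemented. The class `HCardParser`, the function `to_ntriples`, the sample markup, and the choice of the W3C vCard vocabulary are all assumptions made for illustration. It reads the `p-name` and `u-url` properties of an h-card and prints them as N-Triples; nested markup inside `p-name` is deliberately not handled.

```python
from html.parser import HTMLParser

# Assumed vocabulary for this illustration: the W3C vCard ontology.
VCARD = "http://www.w3.org/2006/vcard/ns#"


class HCardParser(HTMLParser):
    """Hypothetical helper: collects p-name text and u-url href values."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._capture = None  # name of the property currently collecting text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "u-url" in classes and attrs.get("href"):
            self.props["url"] = attrs["href"]
        if "p-name" in classes:
            self._capture = "fn"
            self.props["fn"] = ""

    def handle_data(self, data):
        if self._capture:
            self.props[self._capture] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends text capture.
        self._capture = None


def to_ntriples(html, subject):
    """Return N-Triples lines describing `subject` from the h-card in `html`."""
    parser = HCardParser()
    parser.feed(html)
    lines = []
    if "fn" in parser.props:
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        name = parser.props["fn"].strip().replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{VCARD}fn> "{name}" .')
    if "url" in parser.props:
        lines.append(f'<{subject}> <{VCARD}url> <{parser.props["url"]}> .')
    return lines


if __name__ == "__main__":
    sample = ('<div class="h-card">'
              '<a class="p-name u-url" href="http://example.org/">Jane Doe</a>'
              '</div>')
    for line in to_ntriples(sample, "http://example.org/#me"):
        print(line)
```

A real extractor such as Any23 covers many more formats and properties; the supported-formats page listed below is the authoritative reference for what it handles.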
Project pages:
- Homepage: http://any23.apache.org/
- Supported I/O Formats: https://any23.apache.org/supported-formats.html
- Microformats Extractor Support: https://any23.apache.org/dev-microformat-extractors.html
- Microformats Extractor Javadoc: https://any23.apache.org/apidocs/org/apache/any23/extractor/html/package-summary.html
- Project Issue Management: https://issues.apache.org/jira/browse/ANY23
Implemented Microformats
Microformats2 support
Any23 supports microformats2; this support was implemented in [1].
Clients
The WebDataCommons [2] project uses Any23 to extract a large and varied volume of microformats data from the Common Crawl corpus [3].
Web Service
TODO (lewismc 2017-03-28)