Thomas Steiner, PhD

Reutlingen, Baden-Württemberg, Germany

About

Developer Relations Engineer at Google, focused on the Web and Project Fugu. Dad-of-3…

Experience

  • Google

    Hamburg Area, Germany

  • -

    Heilbronn, Baden-Württemberg, Germany

  • -

    Greater Hamburg Area

  • -

    Hamburg Area, Germany

  • -

    Lyon Area, France

  • -

    Hamburg Area, Germany

Education

  • Universitat Politècnica de Catalunya

    Completed a PhD on event summarization based on social networks, co-supervised by Joaquim Gabarro (UPC) and Michael Hausenblas (formerly DERI, now MapR). The title of my thesis is "Enriching Unstructured Media Content About Events to Enable Semi-Automated Summaries, Compilations, and Improved Search by Leveraging Social Networks". The thesis was submitted for review in October 2013.

  • -

    Postdoctoral researcher working on models and tools for publishing and analyzing annotated videos of live performances.

Publications

  • Exploring entity recognition and disambiguation for cultural heritage collections

    Literary and Linguistic Computing

    Unstructured metadata fields such as ‘description’ offer tremendous value for users to understand cultural heritage objects. However, this type of narrative information is of little direct use within a machine-readable context due to its unstructured nature. This article explores the possibilities and limitations of named-entity recognition (NER) and term extraction (TE) to mine such unstructured metadata for meaningful concepts. These concepts can be used to leverage otherwise limited searching and browsing operations, but they can also play an important role to foster Digital Humanities research. To catalyze experimentation with NER and TE, the article proposes an evaluation of the performance of three third-party entity extraction services through a comprehensive case study, based on the descriptive fields of the Smithsonian Cooper–Hewitt National Design Museum in New York. To cover both NER and TE, we first offer a quantitative analysis of named entities retrieved by the services in terms of precision and recall compared with a manually annotated gold-standard corpus, and then complement this approach with a more qualitative assessment of relevant terms extracted. Based on the outcomes of this double analysis, the conclusions present the added value of entity extraction services, but also indicate the dangers of uncritically using NER and/or TE, and by extension Linked Data principles, within the Digital Humanities. All metadata and tools used within the article are freely available, making it possible for researchers and practitioners to repeat the methodology. By doing so, the article offers a significant contribution towards understanding the value of entity recognition and disambiguation for the Digital Humanities.

  • Capturing the functionality of Web services with functional descriptions

    Multimedia Tools and Applications

    Many have left their footprints on the field of semantic RESTful Web service description. Albeit some of the propositions are even W3C Recommendations, none of the proposed standards could gain significant adoption with Web service providers. Some approaches were supposedly too complex and verbose, others were considered not RESTful, and some failed to reach a significant majority of API providers for a combination of the reasons above. While we neither have the silver bullet for universal Web service description, with this paper, we want to suggest a lightweight approach called RESTdesc. It expresses the semantics of Web services by pre- and postconditions in simple N3 rules, and integrates existing standards and conventions such as Link headers, HTTP OPTIONS, and URI templates for discovery and interaction. This approach keeps the complexity to a minimum, yet still enables service descriptions with full semantic expressiveness. A sample implementation on the topic of multimedia Web services verifies the effectiveness of our approach.

  • Google Scholar Profile

    Please click through to see my full list of publications.

Patents

  • Dynamic adjustment of video quality

    Issued: US20150121226 A1

    A video quality module receives data indicating a visibility status of a tab of a web browser running on a user device. The video quality module determines, based on the data indicating the visibility status of the tab whether the tab of the web browser is currently visible to a user of the user device, the tab of the web browser comprising a streaming media player. If the tab of the web browser is not currently visible to the user, the video quality module decreases a quality of a video component of a streaming media file playing in the streaming media player.

  • Non-textual user input

    Issued: US8935638 B2

    A computing device receives a first user input at presence-sensitive display of the computing device, wherein the first user input corresponds to a portion of a desired non-textual object. The device displays a first graphical representation indicating the first user input at the touchscreen display, and determines a first non-textual object suggestion based upon at least the first user input. The device displays a second graphical representation indicating the first non-textual object suggestion, wherein the second graphical representation of the first non-textual object suggestion is displayed differently than the first graphical representation of the first user input, and detects an indication whether the first non-textual object suggestion corresponds to the desired non-textual object.


Projects

  • RESTdesc

    - Present

    RESTdesc allows you to capture the functionality of hypermedia APIs, so automated agents can use them.


Languages

  • German

  • French

  • Spanish

  • Catalan

  • English

  • Latin

Organizations

  • Association for Computing Machinery (ACM)

    Professional Membership

    - Present
