DBTA Workshop on Information Retrieval:
|
10:15-10:30 | Welcome and Introduction |
|
10:30-11:15 | Web 2.0 Research at Yahoo!
Several semantic sources can be found on the Web, either explicit, e.g. Wikipedia, or implicit, e.g. derived from Web data. Most of them are related to Web 2.0 or what is called user generated content (UGC). In this talk we show several applications of mining the wisdom of the crowds behind this data to assess its quality, to improve image search or to generate new semantic resources. Our final goal is to produce a virtuous feedback circuit, based on machine learning, that leverages the data itself. |
11:15-11:30 | Coffee break |
|
11:30-12:00 | Web Reputation Manager: a monitoring and reporting tool for brands, products and people
Reputation Manager is a commercial tool that enables companies to monitor the web reputation of their brands, products and C-level officers on a daily basis. The system is based upon a multi-stage concurrent architecture and is able to analyse, interpret and report on a variety of web channel inputs, ranging from sites to blogs to videos. In this presentation some of the key system components and the architecture will be described, with particular reference to the integration of Web 2.0 concepts in a standard enterprise application. |
|
12:00-12:30 | Topical Opinion Retrieval: a Dictionary-based Approach
We present a method for automatically constructing dictionaries relative to a specific context, and show how to retrieve information with such background knowledge. The methodology is applied to the case study of sentiment analysis (topical opinion retrieval). In general, the contextual retrieval problem can be represented as a pair of queries, the topic and the context, where the context is represented by a weighted dictionary. I will describe the strategy used to perform contextual document retrieval. The derived contextual ranking formula is shown to be very robust and contains no parameters to tune or learn. I will then show how to reduce the size of the dictionary while maintaining good system performance. Because we are able to reduce the size of the dictionaries, we can boost the retrieval of opinionated and relevant documents in real time at negligible computational cost. |
|
12:30-14:00 | Lunch break |
|
14:00-14:30 | DelosDLMS: a Novel Infrastructure for Web-based Digital Libraries
DelosDLMS is an innovative Digital Library Management System that has been developed as an integration effort within the DELOS Network of Excellence. A key aspect of DelosDLMS is its novel generic infrastructure, which makes it easy to generate Digital Library Systems out of a set of Web-based Digital Library (DL) services in a modular and extensible way. It is the result of integrating into the OSIRIS platform various specialized DL services, such as feature extraction, visualization, intelligent browsing, media-type-specific indexing, relevance feedback and many others, provided as Web services by partners of the DELOS network. Based on these services, DelosDLMS supports content-based retrieval in image, audio, video, and 3D collections, as well as any combination of these media types with keyword queries. It allows users to annotate retrieved information, provides a rich set of advanced graphical user interfaces for browsing and exploring large collections, and supports interaction with the system through a speech interface and interactive paper. DelosDLMS thus showcases a wide variety of the functionality outlined in the DELOS vision for future Digital Library Systems. |
|
14:30-15:00 | Tag Data and Personalized Information Retrieval
Tag data from Social Bookmarking sites has been shown to be a useful source of information for improving Web Search. In this talk I will discuss the use of this data for personalizing Web Search. In particular, I will answer the questions: Are social bookmarking data and query logs comparable and, if so, how similar are they? Do we really need query logs, or can we just use public tag data as an initial testbed for evaluating personalized Information Retrieval systems? |
|
15:00-15:15 | Coffee break |
|
15:15-15:45 | Harvesting Adjacent Metadata in Large-Scale Tagging Systems
In this talk we consider the problem of tag prediction in collaborative tagging systems where users share and annotate resources on the Web. We put forward HAMLET, an algorithm to automatically propagate tags from one document to similar documents in Web 2.0 tagging applications. We present the core principles underlying tag propagation, for which we derive suitable scoring models. We will conclude the talk by presenting experiments on real-world data sets. |
|
15:45-16:15 | Web 2.0 and Personal Information Management
Web 2.0 applications are increasingly being used, not just to share personal information, but also to manage it. As a result, personal data and its management become fragmented, not only across desktop applications, but also between desktop applications and various Web 2.0 applications. We will discuss some issues of personal information management in the realm of Web 2.0 and then present a data management architecture designed to support a separation of concerns between the management and the sharing of personal information. |
|
16:15 | Closing and Apero |
|
There is no charge to attend the workshop, but registration is required to help with catering arrangements. If you would like to register, please send an email to shima.gerani@lu.unisi.ch before 01 October 2008.
If you would like to be notified about news of this and other DBTA events by email, you can join the DBTA mailing list.