
The project, with partners here at EPFL, conducts research in the design, use and interoperability of topic-specific search engines with the goal of developing an open source prototype of a distributed, semantic-based search engine (the architecture is reported in the picture below).

Existing search engines provide a poor foundation for semantic web operations, and US companies such as Google are becoming monopolies, distorting the entire information landscape. Our approach is not the traditional Semantic Web approach with coded or semi-automatically extracted metadata, but rather an engine that can build on content through automatic analysis. Linguistic processing is built into the search engine, and a probabilistic document model provides a principled evaluation of relevance to complement existing standard authority scores. This facilitates semantic retrieval and incorporates pre-existing domain ontologies using facilities for import and maintenance. The distributed design is based on exposing search objects as resources, and on using implicit and automatically generated semantics (not ontologies) to distribute queries and merge results. Because semantic expressivity and interoperability are competing goals, developing a system that is both distributed and semantic-based is the key challenge: research involves both the statistical and linguistic format of semantic internals, and determining the extent to which the semantic internals are exposed at the interface.



Alvis



The Open Directory Project


The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.

The idea behind it is that search engines are increasingly unable to capture the growing complexity of the web. Their solution is to let people do this job: a community of editors (71053 to date) reviews hundreds of sites each day and catalogues them.

Dmoz



According to Global Exchange (full text here), an NGO that promotes social, economic and environmental justice, corporations carry out some of the most horrific human rights abuses of modern times. In this report they focus on the 14 worst companies:

Caterpillar -> contracting with known violators of human rights, enabling house demolition, supplying equipment that kills Palestinian civilians and American peace activists

Chevron -> environmental destruction, health violations, and violent killings

Coca-Cola -> violent killings, kidnap and torture, water privatization, health violations, and discriminatory practices

Dow Chemical -> creation of chemical weapons, marketing poisonous chemicals, illegal dumping of toxins into populated areas, environmental destruction, health problems, death

Dyncorp / CSC -> causing health problems, environmental devastation and death; endangering lives; physically abusing individuals; sex trafficking

Ford Motor Company -> environmental degradation, climate change, fueling wars for oil

Kellogg, Brown and Root -> Overcharging and providing unnecessary services on the taxpayer’s dollar, bribery, exploiting third country nationals

Lockheed Martin -> War profiteering, warmongering

Monsanto -> Displacement, health violations, and child labor

Nestle USA -> Abusive child labor, repression of worker rights, aggressive marketing of harmful products, violation of national health and environmental laws

Philip Morris USA -> aggressively marketing lethal products

Pfizer -> Killer price-gouging

Suez-Lyonnaise des Eaux -> Water privatization

Wal-Mart -> worker rights violations, labor discrimination, union busting



I-Spy: a Collaborative Search engine


I-Spy is a community-based Internet meta-search engine that provides you with search results that are informed by similar users. By joining an I-Spy community and using it to search the Web you not only benefit from high-quality meta-search results, but also from the result selections of users within that community. The idea behind this is that users with similar interests are likely to find the same results interesting for similar queries.

I-Spy implements collaborative ranking, borrowing ideas from collaborative filtering. It is a meta-search engine in the sense that it builds on top of an existing web search engine (in this case Google).
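To make the collaborative ranking idea concrete, here is a minimal sketch. It assumes that a community keeps a table counting how often each result was selected for each query, and that results the community picked before are promoted ahead of the underlying engine's order. The class name `CommunityRanker`, its methods and the scoring rule are my own illustrative assumptions, not I-Spy's actual implementation.

```python
from collections import defaultdict


class CommunityRanker:
    """Re-rank meta-search results using a community's past selections.

    Hypothetical sketch: names and scoring rule are assumptions.
    """

    def __init__(self):
        # hits[query][url] = how many times the community selected url for query
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_selection(self, query, url):
        """Store the fact that a community member clicked url for query."""
        self.hits[query][url] += 1

    def relevance(self, query, url):
        """Share of the community's selections for query that went to url."""
        row = self.hits.get(query)
        if not row:
            return 0.0
        return row.get(url, 0) / sum(row.values())

    def rerank(self, query, results):
        """Promote previously selected results; sorted() is stable, so
        results without any selections keep the underlying engine's order."""
        return sorted(results, key=lambda url: -self.relevance(query, url))


ranker = CommunityRanker()
ranker.record_selection("jaguar", "cars.example/jaguar")
ranker.record_selection("jaguar", "cars.example/jaguar")
ranker.record_selection("jaguar", "zoo.example/jaguar")

engine_results = ["wiki.example/jaguar", "zoo.example/jaguar", "cars.example/jaguar"]
print(ranker.rerank("jaguar", engine_results))
```

In this toy run the car page moves to the top because the community selected it most often for that query, while the page nobody selected falls to the bottom.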

I-Spy



FlashLite 2.0 is here


Flash Lite 2 is based on the Flash 7 standard for content. This means that content developed in the latest Flash authoring environment can be re-purposed for mobile and consumer electronic devices. It supports loading and parsing of external XML data in Flash content using the same XML handling methods as Flash Player 7.

Flash Lite 2 supports the ability to locally store and retrieve relevant, application-specific information such as preferences, high scores, usernames, etc. This provides a much more robust development environment.

Flash Lite 2 enables dynamic loading of multimedia content such as images, sound and video, based on supported codecs available on the device. This includes loading and handling XML data and SWF content. Flash Lite 2 also provides video support and external multimedia support. This includes in-place video as well as image loading (GIF, JPEG, PNG with transparency) and audio loading.

Flash Lite 2 enables developers to easily create sophisticated vector graphics and animated shapes, at runtime, using ActionScript 2.0.




MSN History Visualization


Messing around on the web 1.0 ;-) I found this nice visualization tool that I like. I guess it could even be used for some particular research purpose, maybe in the Mutual Modeling project. The application reads the XML files stored by MSN and builds a graphical display that lets you compare conversations with different people. It tries to answer the following questions:



  • how many words do I use in each utterance?
  • which are the words that I use the most?
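The two questions above can be answered with very little code. The sketch below is only an illustration: it assumes a simplified log layout (a `Log` root with `Message` elements, each holding a `From/User` element with a `FriendlyName` attribute and a `Text` element), which may differ from the files MSN actually writes. The sample data and the function names are made up.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample log; the element and attribute names are assumptions.
SAMPLE_LOG = """<Log>
  <Message><From><User FriendlyName="me"/></From><Text>hello there how are you</Text></Message>
  <Message><From><User FriendlyName="friend"/></From><Text>fine thanks</Text></Message>
  <Message><From><User FriendlyName="me"/></From><Text>hello again</Text></Message>
</Log>"""


def utterances_by(xml_text, speaker):
    """Return the text of every message sent by `speaker`."""
    root = ET.fromstring(xml_text)
    texts = []
    for message in root.iter("Message"):
        user = message.find("From/User")
        if user is not None and user.get("FriendlyName") == speaker:
            texts.append(message.findtext("Text", default=""))
    return texts


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[\w']+", text.lower())


def words_per_utterance(utterances):
    """Average number of words per utterance (0.0 if there are none)."""
    counts = [len(tokenize(u)) for u in utterances]
    return sum(counts) / len(counts) if counts else 0.0


def most_used_words(utterances, n=5):
    """The n most frequent words with their counts."""
    counter = Counter(word for u in utterances for word in tokenize(u))
    return counter.most_common(n)


mine = utterances_by(SAMPLE_LOG, "me")
print(words_per_utterance(mine))
print(most_used_words(mine, n=3))
```

Running the same two functions on the utterances of different contacts would give the per-person comparison the tool displays graphically.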

Msn History



Lorys L. Pognon wrote a white paper on techniques to determine location on UMTS networks. The paper answers questions such as whether we can get location information on UMTS networks the same way we get it with Cell-ID over GSM networks.

LBS (Location-Based Services) is a recent concept that denotes applications integrating geographic location (spatial coordinates). One of the important aspects of LBS is the location of the mobile user. Depending on the network, the location techniques are different. This paper gives a brief overview of some existing technologies that could be used for mobile user localisation.

Download the [pdf] here. (via)



ISKODOR is an experimental search system developed at the University of Bonn, whose goal is the implementation of ‘congenial web search’, a user-centered approach where search quality is constantly evaluated through explicit feedback.

ISKODOR implements personalized ranking matrices and collaborative information retrieval in the form of peer groups, which are used to limit the scope of a search.

The Web provides a global platform for knowledge sharing. However, several shortcomings still arise from the absence of personalization and collaboration in Web searches. More effective retrieval techniques could be provided by means of transforming explicit knowledge into implicit knowledge. Iskodor is based on a peer-to-peer architecture and aims at complementing classical Web searches in terms of personalized ranking lists. These local rankings can be accumulated and evaluated in order to supplement the process of knowledge generation by building Virtual Knowledge Communities. Furthermore, the aggregation of ranking lists can be used to identify topics as well as communities of interest. Together with social aspects for community support, a framework for congenial Web search is defined.

Mysearch



Social Information Retrieval


S. M. Kirsch. Social information retrieval. Diploma thesis in computer science, Rheinische Friedrich-Wilhelms-Universität Bonn, Institut für Informatik III, Bonn, Germany, 22nd of November 2005.

———————

The goal of this thesis is to combine well-established retrieval methodologies with recent social network analysis. The opening claim is that a modern information retrieval system should determine the exact nature of the user’s information needs. This can be achieved by looking at information that comes from immediate contacts, which is usually preferred to information that comes from anonymous sources.

Current search engines, according to the author, are susceptible to a form of tyranny of the majority: they can only display those sites that will be relevant to the majority of their users, but not to the actual user who submitted the query. Two viable solutions are identified in the literature and studied in depth: personalization of search and the addition of collaborative elements.

This thesis therefore defines the social information retrieval task and describes its domain. A formalization on the basis of associative networks is provided, as well as search procedures for these networks. An evaluation compares the described methods to conventional information retrieval methods.



K. Lund, C. Burgess, and C. Audet. Dissociating semantic and associative word relationships using high-dimensional semantic space. In Proceedings of the Cognitive Science Society, pages 603–608, Hillsdale, N.J., USA, 1996. Erlbaum Press. [pdf]

————————

The paper studies the lexical/semantic priming effect, whose associative nature is in question. The aim of the paper is to shed some light on this question, for which two crucial points are tackled: first, an operational definition of semantic and associative relatedness is needed; second, a framework for modelling semantic representations must be defined.

Their proposal for the first point is that semantically related words (TABLE - BED) are instances of the same category and share a number of features. Associated words (MOLD - BREAD) are those which are associated as determined by human word association norms. There is also a third type of word pair that is both semantically and associatively related (UNCLE - AUNT).

To solve the second point they propose a framework called HAL (Hyperspace Analogue to Language) that makes it possible to simulate different experiments. The methodology is based on the computation of a matrix of co-occurrence vectors, one for each word, which can be analyzed for semantic content. Co-occurrence is defined using a window-size parameter (co-occurrence within n words). Then a similarity is computed between the vectors using a Euclidean distance measure.
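As a rough illustration of the co-occurrence idea, here is a minimal sketch that builds one vector per word from a token list and compares two words by Euclidean distance. It counts every co-occurrence inside the window equally and ignores word order, which is a simplification I am assuming for brevity and not necessarily the scheme used in the paper. The function names and the toy corpus are made up.

```python
import math


def cooccurrence_vectors(tokens, window=2):
    """Build one co-occurrence vector per word.

    Two words co-occur when they appear within `window` positions of each
    other. Counts are symmetric and unweighted (a simplifying assumption).
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {word: [0] * len(vocab) for word in vocab}
    for pos, word in enumerate(tokens):
        # Pair the current word with each of the next `window` words.
        for neighbour in tokens[pos + 1 : pos + 1 + window]:
            vectors[word][index[neighbour]] += 1
            vectors[neighbour][index[word]] += 1
    return vocab, vectors


def distance(vectors, a, b):
    """Euclidean distance between the vectors of words a and b."""
    return math.dist(vectors[a], vectors[b])


corpus = "the cat drinks milk the dog drinks water".split()
vocab, vectors = cooccurrence_vectors(corpus, window=2)
print(distance(vectors, "cat", "dog"))
print(distance(vectors, "cat", "milk"))
```

In this toy corpus "cat" and "dog" appear in nearly the same contexts, so their vectors end up closer to each other than the vectors of "cat" and "milk". `math.dist` requires Python 3.8 or later.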

Using a certain dataset, the authors simulated a certain kind of association between word pairs. They then repeated the experiment with human subjects and compared the results. The conclusion was that the notion that associativity can be characterized by temporal association in language receives little or no support from their corpus analysis. Word association seemed to be more a function of semantic neighborhood.

Another interesting result was that the distinction between associative and semantic information corresponds to the distinction between local co-occurrence and global co-occurrence. Temporal information is reflected in local co-occurrence. The global pattern of co-occurrence across a vocabulary is connected to semantic information.




About

This blog is about my PhD research: grasping structures in 'spatialised' communication, how people make inferences and ground their communication while explicitly using space as referential context. Some other keywords: - Information retrieval - Spatial clustering - Text data mining - Cognitive Semantics - Spatial Cognition - e-Government - social navigation - CSCL (computer supported collaborative learning) - constructionism - mobile learning - mental maps - cognitive geography - urban planning. [My portal], [My bio], [Contacts]

contact: martigan (at) gmail.com

Mauro Cherubini’s moleskine is powered by WordPress 1.5.2 and K2 Beta One 96