Diploma Thesis 2005 (DA05): Thesis Archive
 
DA Bao 05/5 - Automation of News Feeds (RSS, Webcrawler et al.)
Students: Boun Lor, lorbou
  Fabio Pisacane, pisacfab

Supervisors: Gerold Baudinot, baug
  Eduard Mumprecht, mpre

The problem of organizing different types of documents and other information sources within a database system has often been studied and implemented. This is the job of so-called document management systems (DMS). The company InfoCodex SA has developed its own solution. The InfoCodex DMS software uses content recognition based on a linguistic database linked to a universal taxonomy. It is therefore not language dependent, and it can display results produced by self-organizing neural networks. As a result, terms are grouped according to their meaning. Users can search the contents of these documents with the following options: full-text, synonym, indexed, and similarity search. The company Kendox Systems GmbH has developed an application programming interface, the InfoCodex API, which allows developers to add their own functions to the application. In this diploma thesis, the API is used to add further information sources to InfoCodex in the form of news feeds. These are standardized text files in XML (Extensible Markup Language) format. Their advantage is that they contain specific information on one or more topics, and one can subscribe to them individually with a so-called feed reader. By contrast, a search with a search engine on the World Wide Web returns a much wider range of articles, many of which are commercial and therefore useless. There are different types of news feeds. The application developed supports three of them: RSS 1.0, RSS 2.0, and Atom 0.3.
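To illustrate what supporting the three feed types involves, the following is a minimal sketch in Python that detects the format from the root element and extracts the title and link of each entry. It is not the thesis implementation, which was built on the InfoCodex API; the function names `detect_feed_format` and `parse_feed` are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the respective feed formats.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS1_NS = "http://purl.org/rss/1.0/"
ATOM03_NS = "http://purl.org/atom/ns#"


def detect_feed_format(root):
    """Return the feed type based on the root element (hypothetical helper)."""
    if root.tag == f"{{{RDF_NS}}}RDF":
        return "RSS 1.0"
    if root.tag == "rss" and root.get("version") == "2.0":
        return "RSS 2.0"
    if root.tag == f"{{{ATOM03_NS}}}feed" and root.get("version") == "0.3":
        return "Atom 0.3"
    raise ValueError(f"unsupported feed format: {root.tag}")


def parse_feed(xml_text):
    """Return (format, [(title, link), ...]) for a supported feed."""
    root = ET.fromstring(xml_text)
    fmt = detect_feed_format(root)
    entries = []
    if fmt == "RSS 1.0":
        # RSS 1.0 items are direct children of rdf:RDF in the RSS 1.0 namespace.
        for item in root.findall(f"{{{RSS1_NS}}}item"):
            title = item.findtext(f"{{{RSS1_NS}}}title", "")
            link = item.findtext(f"{{{RSS1_NS}}}link", "")
            entries.append((title, link))
    elif fmt == "RSS 2.0":
        # RSS 2.0 items live inside the single channel element, without a namespace.
        for item in root.findall("channel/item"):
            entries.append((item.findtext("title", ""), item.findtext("link", "")))
    else:
        # Atom 0.3 entries carry the link in the href attribute of a link element.
        for entry in root.findall(f"{{{ATOM03_NS}}}entry"):
            title = entry.findtext(f"{{{ATOM03_NS}}}title", "")
            link_el = entry.find(f"{{{ATOM03_NS}}}link")
            link = link_el.get("href", "") if link_el is not None else ""
            entries.append((title, link))
    return fmt, entries
```

Dispatching on the root element keeps the three formats separate, which matters because RSS 1.0 and Atom 0.3 use XML namespaces while RSS 2.0 core elements do not. A real importer would also need to handle descriptions, dates, and character encodings, which this sketch leaves out.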
