1
Knowledge Extraction from the Web (ISEWO)
Franz-Josef Katzdobler, Heraldo Pimenta Borges Filho
DEI – FCTUC, 2008/2009
Project ISKM, 2008/2009
2
Content
Introduction
Objectives of the Project
Architecture
Demo
Conclusion
Future Work
3
Introduction
Increasing amount of knowledge
Extracting relevant information from the Web is a complex and important task
4
Objectives
5
Architecture
6
Web-Harvest
Open-source web data extraction tool
Written in Java
Mainly focused on HTML/XML-based websites
Supports XSLT, XQuery, and regular expressions
Configured with a config file
Produces XML as output (convenient for processing in the next stages)
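The slide lists regular expressions among the extraction techniques Web-Harvest supports. As a rough illustration of that single idea, the sketch below pulls name and price pairs out of an HTML fragment using only java.util.regex. It does not use Web-Harvest or its config-file format; the class name RegexExtractionSketch, the markup layout, and the CSS class names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: plain java.util.regex, not the Web-Harvest API.
final class RegexExtractionSketch {

    // Hypothetical markup: each product is a <div class="product"> holding
    // a name span followed by a price span.
    private static final Pattern PRODUCT = Pattern.compile(
        "<div class=\"product\">\\s*<span class=\"name\">(.*?)</span>\\s*"
            + "<span class=\"price\">(\\d+\\.\\d{2})</span>",
        Pattern.DOTALL);

    // Returns one {name, price} pair per product found in the page.
    static List<String[]> extract(String html) {
        List<String[]> products = new ArrayList<>();
        Matcher m = PRODUCT.matcher(html);
        while (m.find()) {
            products.add(new String[] { m.group(1).trim(), m.group(2) });
        }
        return products;
    }

    public static void main(String[] args) {
        String html = "<div class=\"product\">"
            + "<span class=\"name\">Arroz Basmati Continente </span>"
            + "<span class=\"price\">2.13</span></div>";
        for (String[] p : extract(html)) {
            System.out.println(p[0] + " costs " + p[1]);
        }
    }
}
```

A regex like this breaks as soon as the page layout changes, which is one reason a tool that also offers XQuery and XSLT may be preferable for real sites.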
7
XML Output
<supermarket>
  …
  <product>
    <name>Pate Bocadelia c/ Caranguejo La Piara </name>
    <price>2.10</price>
    <isAvailable>false</isAvailable>
    <isSpecialOffer>false</isSpecialOffer>
  </product>
  <name>Molho Base p/ Engrossar Molhos Express Maizena </name>
  <price>2.49</price>
  <name>Arroz Basmati Continente </name>
  <price>2.13</price>
  ...
</supermarket>
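One way a later stage might consume this XML is with the DOM parser that ships with the JDK. The sketch below is only an assumption about how that step could look: SupermarketXmlReader and Product are hypothetical names, every <product> element is assumed to carry a <name> and a <price>, and a missing flag is read as false.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the <supermarket> XML shown on the slide.
final class SupermarketXmlReader {

    // Hypothetical value type mirroring the four child elements of <product>.
    static final class Product {
        final String name;
        final double price;
        final boolean available;
        final boolean specialOffer;

        Product(String name, double price, boolean available, boolean specialOffer) {
            this.name = name;
            this.price = price;
            this.available = available;
            this.specialOffer = specialOffer;
        }
    }

    // Parses every <product> element; assumes each one has <name> and <price>.
    static List<Product> read(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("product");
            List<Product> products = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element el = (Element) nodes.item(i);
                products.add(new Product(
                    text(el, "name"),
                    Double.parseDouble(text(el, "price")),
                    Boolean.parseBoolean(text(el, "isAvailable")),
                    Boolean.parseBoolean(text(el, "isSpecialOffer"))));
            }
            return products;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse supermarket XML", ex);
        }
    }

    // Trimmed text of the first child element with the given tag, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<supermarket><product><name>Arroz Basmati Continente </name>"
            + "<price>2.13</price><isAvailable>true</isAvailable>"
            + "<isSpecialOffer>false</isSpecialOffer></product></supermarket>";
        for (Product p : read(xml)) {
            System.out.println(p.name + " " + p.price);
        }
    }
}
```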
8
Ontology
9
Jena
Framework for building Semantic Web applications
Possibility to load an ontology from an existing file
Persistent model available: the programmer only needs to provide a database connection
The OntModel class can be used to manipulate the data
10
Demo
11
Conclusion
Advantages
  Get the cheapest product from several different markets
  No need to visit all the different web pages
Limitations
  The markets are predefined
  Config files are created manually
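The main advantage, finding the cheapest offer for a product across several markets, comes down to a minimum over price. The slides do not show how the project implements this lookup, so the following is only a sketch under assumptions: the Offer class, the market names, and the prices in main are hypothetical, and the data is held in a plain list, not in the ontology model.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: pick the cheapest offer for one product across markets.
final class CheapestOfferSketch {

    // Hypothetical record of one product offered by one market.
    static final class Offer {
        final String market;
        final String product;
        final double price;

        Offer(String market, String product, double price) {
            this.market = market;
            this.product = product;
            this.price = price;
        }
    }

    // Returns the lowest-priced offer for the product, or empty if no market lists it.
    static Optional<Offer> cheapest(List<Offer> offers, String product) {
        return offers.stream()
            .filter(o -> o.product.equalsIgnoreCase(product))
            .min(Comparator.comparingDouble((Offer o) -> o.price));
    }

    public static void main(String[] args) {
        // Market names and prices are made up for illustration.
        List<Offer> offers = Arrays.asList(
            new Offer("Market A", "Arroz Basmati", 2.13),
            new Offer("Market B", "Arroz Basmati", 1.99));
        cheapest(offers, "Arroz Basmati")
            .ifPresent(o -> System.out.println(o.market + ": " + o.price));
    }
}
```

Matching on the exact product name is the weak point of this approach, since different markets rarely label the same product identically.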
12
Possible Future Work
More flexibility
  Automatic creation of config files
Extend the ontology
  e.g. categories for the products
Give suggestions to the user ("Product X may also be interesting for you...")
13
End of Presentation
Questions / Discussion