Mining wikipedia categories

From AIRWiki
Revision as of 22:32, 29 May 2009 by DavidLaniado (Talk | contribs)

Title: Wikipedia category map
Image:wikipedia_categories.png

Description: Wikipedia articles are organized in a hierarchy of categories, manually assigned by users. This process amounts to a huge collective effort to categorize human knowledge; the result is a large and disordered graph that can provide valuable information for a variety of applications (natural language processing, information retrieval, ontology building, and others).

The aim of the project is to develop a tool for visualizing this graph. The project can be extended into a thesis in various directions, such as developing advanced visualization features or creating and populating an ontology.

Tutor: DavidLaniado (david.laniado@gmail.com), RiccardoTasso (tasso@elet.polimi.it), MarcoColombetti (colombet@elet.polimi.it)
Start: Anytime
Students: 1 - 2
CFU: 5 - 20
Research Area: Ontologies and Semantic Web
Research Topic: Wiki analysis
Level: Bs+Ms
Type: Course
Status: Proposal

Tools and instruments
The software can be implemented in any programming language; we have already developed a Java prototype that queries the Wikipedia API, which can be used as a starting point.
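
As an illustration of what querying the API for category data might look like, here is a minimal Java sketch. It is not the existing prototype: the class name `CategoryQuery`, the method names, the default category, and the choice of the English Wikipedia endpoint are all assumptions made for this example. It uses the MediaWiki `categorymembers` list module to ask for the direct subcategories of one category, which is the basic step needed to walk the category graph.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the existing prototype.
public class CategoryQuery {

    // Assumption: the English Wikipedia endpoint; other language editions
    // expose the same API under their own host name.
    static final String ENDPOINT = "https://en.wikipedia.org/w/api.php";

    /** Builds the request URL listing the direct subcategories of a category. */
    static String buildSubcategoryUrl(String categoryName) {
        // Category pages live in the "Category:" namespace; the title is
        // URL-encoded so that spaces and the colon are safe in a query string.
        String title = URLEncoder.encode("Category:" + categoryName, StandardCharsets.UTF_8);
        return ENDPOINT
                + "?action=query&list=categorymembers"
                + "&cmtitle=" + title
                + "&cmtype=subcat"   // only subcategories, not articles or files
                + "&cmlimit=50"      // at most 50 results per request
                + "&format=json";
    }

    /** Sends the request and returns the raw JSON response body. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                // Identify the client; the value here is a placeholder.
                .header("User-Agent", "CategoryMapSketch/0.1 (student project)")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String category = args.length > 0 ? args[0] : "Artificial intelligence";
        String url = buildSubcategoryUrl(category);
        System.out.println(url);
        // The network call is made only when a category is given explicitly.
        if (args.length > 0) {
            System.out.println(fetch(url));
        }
    }
}
```

Running the class with a category name as argument prints the request URL followed by the JSON response. A visualization tool would parse the returned subcategory titles and repeat the query for each of them to build the graph; long result lists are split across several responses, so a complete crawler would also need to follow the API's continuation mechanism.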