User:Physikerwelt/Sandbox/MathSearchTask

= Free Wikipedia Subtask =

Dataset overview
Three formats are available:
 * 1) HTML
 ** Standard output of future Wikipedia
 ** Parsed with PHP
 * 2) XHTML
 ** Standard output of new Visual Editor
 ** Parsed with JavaScript (Parsoid)
 * 3) WikiText
 ** Raw source data, not parsed at all

Format of math elements
The id attribute is composed as follows:



An additional list with Page-ID and Page-Title mappings can be found here

Wrapper for the page content
In the HTML format, the PHP output is wrapped with the following HTML elements:

Query 1: 1/〈x〉 ≤ 〈1/x〉
 * Uses standard Content MathML
 * Systems must use information given in the query only
 * Systems must decide on result format
 * Pre-processing of the Wikipedia input data necessary
 * Other languages can be used

Query 2: E=mc²
 * Content Query
 * Uses content dictionaries, not the presentation layout
 * System must use information given in the query only
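A content query of this kind could be encoded as sketched below. This is only an illustration: the actual query file for the task is not shown here, so the choice of Strict Content MathML with `csymbol` references into the OpenMath content dictionaries `relation1` and `arith1`, and the helper names `csymbol` and `build_emc2`, are assumptions.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def csymbol(parent, cd, name):
    """Append a <csymbol> pointing at symbol `name` in content dictionary `cd`."""
    node = ET.SubElement(parent, "csymbol", {"cd": cd})
    node.text = name
    return node


def build_emc2():
    """Build a content-only encoding of E = m * c^2 (assumed encoding, not the official query)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    equation = ET.SubElement(math, "apply")
    csymbol(equation, "relation1", "eq")          # E = ...
    ET.SubElement(equation, "ci").text = "E"
    product = ET.SubElement(equation, "apply")
    csymbol(product, "arith1", "times")           # m * ...
    ET.SubElement(product, "ci").text = "m"
    power = ET.SubElement(product, "apply")
    csymbol(power, "arith1", "power")             # c ^ 2
    ET.SubElement(power, "ci").text = "c"
    ET.SubElement(power, "cn", {"type": "integer"}).text = "2"
    return math


query_xml = ET.tostring(build_emc2(), encoding="unicode")
print(query_xml)
```

The encoding carries no layout information at all (no superscript element, no spacing), which is the point of querying by content rather than by presentation.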

Performance queries
 * NTCIR-10 query format
 * The link provided must be included in the result
 * Participants should report the position of the reference hit
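Reporting the position of the reference hit amounts to looking up the provided link in a system's ranked result list. A minimal sketch, assuming results are a plain ordered list of link strings; the function name `reference_position` and the example links are made up for illustration and do not reflect the real identifier format.

```python
from typing import Optional, Sequence


def reference_position(results: Sequence[str], reference_link: str) -> Optional[int]:
    """Return the 1-based rank of the reference link, or None if it is missing."""
    for rank, link in enumerate(results, start=1):
        if link == reference_link:
            return rank
    return None


# Hypothetical ranked output of a search system for one performance query.
ranked = ["page_a#formula_1", "page_b#formula_7", "page_c#formula_3"]
position = reference_position(ranked, "page_b#formula_7")
missing = reference_position(ranked, "page_z#formula_0")
```

Returning `None` for a missing link keeps "not found" distinct from any valid rank, so it can be counted separately when reporting.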

Final Presentations
 * No evaluation
 * Presentation judged by audience
 * Requirements
 ** Length 30 min (hard cut) (no official NTCIR session)
 ** Structure (recommended)
 *** System overview
 *** Performance Queries
 *** Demonstration Queries
 *** Summary


Sample structure
 * Presentation: System overview
 ** Test hardware description
 ** Software used
 ** Licenses and cost
 ** Reproducibility of the setup
 ** Effort spent, i.e. development time

 * Presentation: Performance Queries
 ** Time measurement
 *** Indexing
 *** Queries
 ** Overall time, incl. reading of input and output of the result in NTCIR format or similar
 ** Coverage
 *** Were all seeds found?
 *** Average position

 ** Result format
 *** How much information is returned to the user?
 *** How is the portion of the result selected?
 *** Goal: answer the user's information need very quickly

 * Presentation: Demonstration Queries
 ** What are the most impressive results?
 ** How were those results achieved?
 ** Why does that principle work independently of the particular queries?
 ** Which key information was missing to achieve better results?
 ** What are further steps?

 * Presentation: Summary
 ** Key features of the presented system
 ** What are the differences to other systems?
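The figures asked for in the performance part of the sample structure (indexing time, query time, whether all seeds were found, average position) could be collected along the following lines. This is a sketch under assumptions: `timed`, `summarize`, and the toy `build_index` are hypothetical helpers, and a real system would substitute its own indexer and query runner.

```python
import time
from statistics import mean
from typing import Callable, Dict, Optional


def timed(func: Callable, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def summarize(positions: Dict[str, Optional[int]]) -> dict:
    """Aggregate per-query reference positions (None means the seed was not found)."""
    found = [p for p in positions.values() if p is not None]
    return {
        "all_seeds_found": len(found) == len(positions),
        "coverage": len(found) / len(positions) if positions else 0.0,
        "average_position": mean(found) if found else None,
    }


def build_index(documents):
    """Toy stand-in for a real indexer: maps each document id to its tokens."""
    return {doc_id: text.split() for doc_id, text in documents.items()}


documents = {"d1": "E = m c ^ 2", "d2": "a ^ 2 + b ^ 2"}
index, indexing_seconds = timed(build_index, documents)

# Hypothetical positions of the reference hit for three queries.
summary = summarize({"q1": 1, "q2": 4, "q3": None})
```

Averaging only over the seeds that were found means the average position should always be read together with the coverage value; a system that finds few seeds at rank 1 would otherwise look better than it is.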

ToDos

 * Find a public web server
 * Update the URLs
 * Publish Task Description (as you see here)
 * Make video announcement (this presentation), probably a wontfix?