Publication Date
12-2005
Advisor(s) - Committee Chair
Guangming Xing
Degree Program
Department of Computer Science
Degree Type
Master of Science
Abstract
Extensible markup and platform independence make XML [5] a fitting document format for a wide range of applications, both online and offline. Computing the edit distance between XML documents and schemata, and transforming XML documents to conform to a schema, are critical for various document engineering and document mining tasks.
This thesis focuses on the problem of finding the minimum edit distance and an optimum sequence of edit operations to transform an XML document so that it conforms to a schema. A few proposed solutions for this problem [1, 2, 3, 4] have been studied, and two of them [1, 2] have been implemented. A schema in DTD is translated to a normalized regular hedge grammar [2], and an XML document is represented as a node-labeled ordered tree [1, 2]. A comparative study of the performance of the implemented algorithms [1, 2] is presented.
Document size is a major restriction on the practical application of such algorithms. A divide-and-conquer strategy is developed to adapt the second algorithm [2] to process documents larger than it can otherwise handle.
Disciplines
Computer Sciences | Physical Sciences and Mathematics
Recommended Citation
Malla, Chaitanya, "Experimental Studies of Edit Distances Between XML Documents & Schemata" (2005). Masters Theses & Specialist Projects. Paper 3407.
https://digitalcommons.wku.edu/theses/3407
Comments
Access granted to WKU students, faculty and staff only.
After an extensive unsuccessful search for the author, this thesis is considered an orphan work, which may be protected by copyright. The inclusion of this orphan work on TopScholar does not guarantee that that orphan work may be used for any purpose and any use of the orphan work may subject the user to a claim of copyright infringement. The reproduction of this work is made by WKU without any purpose of direct or indirect commercial advantage and is made for purposes of preservation and research.
See also WKU Archives - Authorization for Use of Thesis, Special Project & Dissertation