Abstract—In this paper we present an open source text mining system, UMagic, which identifies key domain entities and their relationships in text documents. UMagic extracts the key information components from a textual document and transforms them into UML-based diagrams without human intervention. UMagic performs linguistic processing of the given text using an open source tool, GATE, to mark entities and the relationships between them. It then automatically generates an ER diagram from the marked text. Although the task is very complex, particularly when carried out in a fully automated fashion, it has wide applications in real-world scenarios. From a software engineering perspective, this approach can be employed to bridge the gap between the analysis phase and the design phase of the software development process. This reduces the time and complexity of the design phase and improves the correctness of the design documents.
Index Terms—Artificial intelligence, ERD, GATE, natural language processing, text mining, UML, XML.
Iram Shahzadi, Qanita Ahmad, Imran Sarwar, and Waqar Mahmood are with the Al-Khawarizmi Institute of Computer Science, University of Engineering & Technology, Lahore, Pakistan (e-mail: firstname.lastname@example.org).
Kiran Fatima is with the Department of Computer Science & Engineering, University of Engineering & Technology, Lahore, Pakistan (e-mail: email@example.com).
Cite: Iram Shahzadi, Qanita Ahmad, Kiran Fatima, Imran Sarwar, and Waqar Mahmood, "UMagic! THE UML Modeler for Text Documents," International Journal of Computer Theory and Engineering, vol. 5, no. 1, pp. 166-169, 2013.