Chinese text analyser

Mak, Sai-wai David

Author:	Mak, Sai-wai David
Title:	Chinese text analyser
Degree:	M.Sc.
Year:	2001
Subject:	Hong Kong Polytechnic University -- Dissertations; Chinese language -- Data processing; Chinese language -- Discourse analysis; Text processing (Computer science)
Department:	Multi-disciplinary Studies; Department of Computing
Pages:	96 leaves : ill. ; 30 cm
Language:	English
Abstract:	As more and more content becomes available on the Internet and in other media, there will be strong demand for smart tools that search for accurate, comprehensive and relevant information. A search tool that simulates the human perspective will therefore be essential for finding information quickly and accurately. Currently, most search tools are based on exact "text-matching", which does not guarantee the accuracy or relevance of the retrieved information. Accuracy can be improved by searching the content through a hierarchy that reflects the relations between words and clauses. Documents are then sorted so that those with the highest counts of words and clauses come first. If multiple documents are searched against the hierarchy, a list of contents of similar meaning or relevance is retrieved. Some researchers have built similar tools, but these are designed specifically for the structure of Western languages. As more and more Chinese content (traditional or simplified) becomes available on the Internet and in other media, it would be convenient to have a tool that helps Chinese users search for and select relevant information. Because the lexical structures of Chinese and English differ, this dissertation investigates a generic method of analysis that does not rely on many complicated linguistic rules. It serves Chinese content in particular and is expected to run on general operating platforms. The proposed methodology identifies and removes the useless words or phrases in a passage so that the important content can be extracted. Some statistical rules are then employed to calculate and summarise the relationships between the important words and phrases. A primitive semantic network is built, which can be used for further processing of other Chinese documents.
Rights:	All rights reserved
Access:	restricted access
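
Illustration:	The pipeline outlined in the abstract (remove unimportant words, keep the important ones, then relate them statistically in a simple network) might be sketched roughly as follows. This is a minimal sketch, not the dissertation's implementation. It assumes the text has already been segmented into words, uses a tiny hypothetical stop list, and swaps in plain sentence-level co-occurrence counts for the unspecified "statistical rules". The names `STOP_WORDS`, `content_words`, `build_network` and `related` are all invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop list; the dissertation's actual list is not given in this record.
STOP_WORDS = {"的", "了", "是", "在", "和"}


def content_words(tokens):
    """Drop stop words, keeping the tokens assumed to carry content."""
    return [t for t in tokens if t not in STOP_WORDS]


def build_network(sentences):
    """Count how often two content words appear in the same sentence.

    Returns a Counter keyed by sorted word pairs. The counts act as edge
    weights of a simple co-occurrence network (an assumed stand-in for
    the "primitive semantic network" mentioned in the abstract).
    """
    edges = Counter()
    for tokens in sentences:
        words = sorted(set(content_words(tokens)))
        edges.update(combinations(words, 2))
    return edges


def related(edges, word):
    """List the words linked to `word`, strongest link first."""
    scores = Counter()
    for (a, b), n in edges.items():
        if a == word:
            scores[b] += n
        elif b == word:
            scores[a] += n
    return scores.most_common()


# Pre-segmented toy sentences (segmentation itself is outside this sketch).
sentences = [
    ["中文", "的", "文本", "分析"],
    ["文本", "分析", "是", "研究"],
    ["中文", "和", "研究"],
]
network = build_network(sentences)
neighbours = related(network, "文本")
```

Calling `related(network, word)` on a network built from one set of documents gives a ranked list of associated words, which could then be used to score other documents against the same network.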

Files in This Item:

File	Description	Size	Format
b16681496.pdf	For All Users (off-campus access for PolyU Staff & Students only)	2.81 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/100