Chinese text analyser

Mak, Sai-wai David

Full metadata record

DC Field	Value	Language
dc.contributor	Multi-disciplinary Studies	en_US
dc.contributor	Department of Computing	en_US
dc.creator	Mak, Sai-wai David	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/100	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	-
dc.rights	All rights reserved	en_US
dc.title	Chinese text analyser	en_US
dcterms.abstract	As more and more content becomes available on the Internet and in other media, there will be strong demand for smart tools that can search for accurate, comprehensive and relevant information. A search tool that simulates the human perspective will therefore be essential for finding information quickly and accurately. Currently, most search tools are based on exact "text-matching", which does not guarantee the accuracy or relevance of the retrieved information. Accuracy can be improved by searching the content through a hierarchy that reflects the relations among words and clauses. The documents are sorted by their counts of matching words and clauses, highest first. If multiple documents are searched against the hierarchy, a list of contents with similar meaning or relevance will be retrieved. Some researchers have built similar tools, but these are designed specifically for the structure of Western languages. As more and more Chinese content (traditional or simplified) becomes available on the Internet and in other media, it will be convenient to have a tool that helps Chinese users search for and select relevant information. Because the lexical structures of Chinese and English differ, this dissertation investigates a generic method of analysis that does not rely heavily on complicated linguistic rules. It serves Chinese content in particular and is expected to run on general operating platforms. The proposed methodology identifies and removes the useless words or phrases in a passage and then extracts the important content. Statistical rules are employed to calculate and summarise the relationships between the important words and phrases. A primitive semantic network is built, which can be used for further processing of other Chinese documents.	en_US
dcterms.extent	96 leaves : ill. ; 30 cm	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2001	en_US
dcterms.educationalLevel	All Master	en_US
dcterms.educationalLevel	M.Sc.	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.LCSH	Chinese language -- Data processing	en_US
dcterms.LCSH	Chinese language -- Discourse analysis	en_US
dcterms.LCSH	Text processing (Computer science)	en_US
dcterms.accessRights	restricted access	en_US
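
Illustrative sketch (not taken from the dissertation): the abstract describes removing useless words from a passage, keeping the remaining content, and using statistics to relate the important words and phrases in a primitive semantic network. One simple way to read that description is shown below in Python. The stop list, the clause punctuation, the use of within-clause co-occurrence counts as the "statistical rule", and the names split_clauses, extract_terms and build_network are all assumptions made for illustration; the record does not state which rules or data structures the author actually used.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical stop list; a real one would be far larger.
STOP_WORDS = ["的", "是", "和", "在", "了", "很"]


def split_clauses(text):
    """Split a passage into clauses on common Chinese punctuation and whitespace."""
    return [c for c in re.split(r"[，。；！？、\s]+", text) if c]


def extract_terms(clause, stop_words=STOP_WORDS):
    """Cut the clause at every stop word; the remaining chunks are the terms."""
    if not stop_words:
        return [clause]
    # Longest stop words first, so a long entry wins over its own prefix.
    ordered = sorted(stop_words, key=len, reverse=True)
    pattern = "|".join(re.escape(w) for w in ordered)
    return [t for t in re.split(pattern, clause) if t]


def build_network(text):
    """Return (term frequencies, co-occurrence graph) for one passage.

    Assumption: two terms are related once for each clause they share.
    """
    term_counts = Counter()
    edges = defaultdict(Counter)
    for clause in split_clauses(text):
        terms = extract_terms(clause)
        term_counts.update(terms)
        for a, b in combinations(sorted(set(terms)), 2):
            edges[a][b] += 1
            edges[b][a] += 1
    return term_counts, edges


counts, network = build_network("電腦的速度很快，電腦和網絡是工具。")
```

In this sketch the network is a plain dictionary of weighted neighbours, so network["電腦"] lists every term that shares a clause with 電腦. A second document could be scored against it by summing the weights of the term pairs it contains, which is one possible reading of the "further processing" the abstract mentions.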

Files in This Item:

File	Description	Size	Format
b16681496.pdf	For All Users (off-campus access for PolyU Staff & Students only)	2.81 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/100