Study on Information Extraction of Clinic Expert Information from Hospital Portals

Journal of Biomedical Engineering

Authors：

ZHANG Yuanpeng, DONG Jiancheng, QIAN Danmin, GENG Xingyun, WU Huiqun, WANG Li

Department of Medical Informatics, Nantong University, Nantong 226001, China

Corresponding author：

WANG Li, Email: wangli@ntu.edu.cn

Keywords：

information extraction; clinic expert information; domain model; block importance model; support vector machine

DOI：

10.7507/1001-5515.20150222

Clinic expert information is an important reference for residents seeking hospital care. Such information usually resides in the deep web, where search engines cannot index it directly. The first challenge in extracting clinic expert information from the deep web is to judge whether a form is a search interface of the relevant domain. This paper proposes a method based on a domain model, a tree structure built from the attributes of search interfaces. With this model, a search interface can be classified into a domain and filled in with domain keywords. The second challenge is to extract information from the result pages returned by the search interfaces. To filter out noise on these pages, a block importance model is proposed. The experimental results indicate that the domain model yielded a precision 10.83% higher than that of the rule-based method, and the block importance model yielded an F₁ measure 10.5% higher than that of the XPath method.
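The abstract does not spell out how the domain model is built or applied, so the following is only an illustrative sketch of the general idea: a tree whose nodes are search-interface attributes, against which the field labels of a form are matched. The class `DomainNode`, the functions `match_form` and `belongs_to_domain`, the example labels, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DomainNode:
    """One node of a hypothetical domain tree: an attribute and its known labels."""
    name: str
    labels: set[str] = field(default_factory=set)
    children: list[DomainNode] = field(default_factory=list)

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def match_form(root: DomainNode, form_labels: list[str]) -> dict[str, str]:
    """Map each normalized form label to the tree attribute that lists it."""
    normalized = {label.strip().lower() for label in form_labels}
    matches = {}
    for node in root.walk():
        for label in normalized & node.labels:
            matches[label] = node.name
    return matches


def belongs_to_domain(root: DomainNode, form_labels: list[str],
                      threshold: float = 0.5) -> bool:
    """Accept a form when the share of matched labels reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    normalized = {label.strip().lower() for label in form_labels}
    if not normalized:
        return False
    return len(match_form(root, form_labels)) / len(normalized) >= threshold


# Hypothetical tree for the clinic expert domain (labels invented for illustration).
EXPERT_TREE = DomainNode("clinic_expert", children=[
    DomainNode("name", {"name", "doctor name", "expert name"}),
    DomainNode("department", {"department", "dept"}),
    DomainNode("title", {"title", "professional title"}),
])
```

Under these assumptions, a form with the fields "Doctor Name" and "Department" would be accepted as belonging to the domain, while a login form with "Username" and "Password" would be rejected. The matched attributes returned by `match_form` indicate which fields could then be filled in with domain keywords.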

Citation： ZHANG Yuanpeng, DONG Jiancheng, QIAN Danmin, GENG Xingyun, WU Huiqun, WANG Li. Study on Information Extraction of Clinic Expert Information from Hospital Portals. Journal of Biomedical Engineering, 2015, 32(6): 1249-1254. doi: 10.7507/1001-5515.20150222

