[1]董丽丽,高 山,张 翔.集成学习算法在实体关系抽取中的应用[J].西安建筑科技大学学报:自然科学版,2011,43(03):446-450.[doi:10.15986/j.1006-7930.2011.03.007]
DONG Li-li, GAO Shan, ZHANG Xiang. Application of the research on extraction of entity relationship based on integrated learning algorithm[J]. J. Xi’an Univ. of Arch. & Tech.: Natural Science Edition, 2011, 43(03): 446-450. [doi:10.15986/j.1006-7930.2011.03.007]

Application of Ensemble Learning Algorithms in Entity Relation Extraction (集成学习算法在实体关系抽取中的应用)

Journal of Xi'an University of Architecture & Technology (Natural Science Edition) [ISSN: 1006-7930 / CN: 61-1295/TU]

Volume:
43
Issue:
2011, No. 03
Pages:
446-450
Section:
Publication date:
2011-06-30

Article Info

Title:
Application of the research on extraction of entity relationship based on integrated learning algorithm
Article ID:
1006-7930(2011)03-0446-05
Author(s):
DONG Li-li (董丽丽), GAO Shan (高山), ZHANG Xiang (张翔)
(1. School of Information and Control Engineering, Xi'an Univ. of Arch. & Tech., Xi'an 710055, China;
2. State Key Laboratory of Architecture Science and Technology in West China (XAUAT, in preparation), Xi'an 710055, China)
Keywords:
integrated learning (集成学习); extraction of entity relationship (实体关系抽取); feature vector (特征向量); AdaBoost.MH
CLC number:
TP391
DOI:
10.15986/j.1006-7930.2011.03.007
Document code:
A
Abstract:
To address the limited classification accuracy of the classification algorithms used in
feature-vector-based entity relation extraction, an entity relation extraction method based on
ensemble (integrated) learning is proposed. The method combines entity features and converts them
into feature vectors, then uses the AdaBoost.MH algorithm from ensemble learning to construct the
relation extraction classifier, with decision trees as the weak classifiers. Classification accuracy
is improved by raising the weights of classifiers that perform well and the weights of misclassified
samples, which in turn allows the entity relation categories to be recognized. In tests on the
People's Daily (《人民日报》) corpus, the method achieved fairly good results.
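
The weighting scheme described in the abstract can be illustrated with a minimal AdaBoost.MH sketch. It is not the authors' implementation: it assumes binary (0/1) feature vectors, uses one-level decision trees (stumps) as weak classifiers where the paper only says "decision trees", and the names `train_adaboost_mh` and `predict` as well as the toy features and relation labels are hypothetical.

```python
import math


def train_adaboost_mh(X, y, labels, rounds=50):
    """Train AdaBoost.MH with one-level decision trees on 0/1 feature vectors.

    X: list of feature vectors, y: one relation label per vector,
    labels: all relation classes. Returns (alpha, feature index, votes) tuples.
    """
    m, k, d = len(X), len(labels), len(X[0])
    # Y[i][l] = +1 if example i carries label l, otherwise -1.
    Y = [[1 if y[i] == lab else -1 for lab in labels] for i in range(m)]
    # One weight per (example, label) pair, uniform at the start.
    D = [[1.0 / (m * k)] * k for _ in range(m)]
    model = []
    for _ in range(rounds):
        best = None
        for j in range(d):
            # The stump outputs +1 when feature j is on, -1 when it is off.
            phi = [1 if X[i][j] else -1 for i in range(m)]
            # Weighted agreement between the stump and each label column.
            gammas = [sum(D[i][l] * Y[i][l] * phi[i] for i in range(m))
                      for l in range(k)]
            edge = sum(abs(g) for g in gammas)
            if best is None or edge > best[0]:
                best = (edge, j, [1 if g >= 0 else -1 for g in gammas])
        edge, j, votes = best
        if edge < 1e-12:  # no stump is better than chance; stop early
            break
        edge = min(edge, 1.0 - 1e-10)  # keep the logarithm finite
        # Weak classifiers with a larger edge receive a larger weight.
        alpha = 0.5 * math.log((1.0 + edge) / (1.0 - edge))
        model.append((alpha, j, votes))
        # Misclassified (example, label) pairs gain weight, correct ones lose it.
        for i in range(m):
            phi_i = 1 if X[i][j] else -1
            for l in range(k):
                D[i][l] *= math.exp(-alpha * Y[i][l] * votes[l] * phi_i)
        z = sum(sum(row) for row in D)
        D = [[w / z for w in row] for row in D]
    return model


def predict(model, x, labels):
    """Return the label with the highest weighted vote for feature vector x."""
    scores = [0.0] * len(labels)
    for alpha, j, votes in model:
        phi = 1 if x[j] else -1
        for l in range(len(labels)):
            scores[l] += alpha * votes[l] * phi
    return labels[max(range(len(labels)), key=scores.__getitem__)]


# Hypothetical toy data: each column is an indicator feature of an entity pair.
labels = ["employment", "location", "kinship"]
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y = ["employment", "location", "kinship"]
model = train_adaboost_mh(X, y, labels)
print([predict(model, x, labels) for x in X])
```

AdaBoost.MH reduces the multi-class relation problem to binary decisions over (example, label) pairs, which is why the weights `D` are kept per pair rather than per example.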

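Memo

*Received: 2010-10-20   Revised: 2011-04-12
Funding: Natural Science Foundation of Shaanxi Province (2009JM8006); Special Scientific Research Project of the Shaanxi Provincial Department of Education (2010JK620)
About the author: DONG Li-li (董丽丽, b. 1960), female, from Fuzhou, Fujian; professor and master's supervisor. Main research areas: distributed systems and computer network applications, data mining.
Last Update: 2015-11-02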