
[Paper Reading Notes 29] Biomedical Text Summarization

Paper: Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation

Afzal M, Alam F, Malik K, Malik G
Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation
J Med Internet Res 2020;22(10):e19810
URL: https://www.jmir.org/2020/10/e19810
DOI: 10.2196/19810

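Abstract

*Background:* Deep learning has shown a clear advantage over traditional methods for automatic text summarization (ATS), but it has not yet been explored for medical text.

*Objective:* Address fundamental problems of traditional methods, such as capturing clinical context, assessing evidence quality, and selecting the passages targeted for the summary, so that the extracted information is accurate, concise, and consistent.

*Methods:* The paper proposes a framework, Biomed-Summarizer, which summarizes biomedical text in a quality-aware, context-enabled way based on Patient/Problem, Intervention, Comparison, and Outcome (PICO).

Step 1: develop a binary classifier for quality recognition, which filters out low-quality scientific studies;

Step 2: develop a Bi-LSTM as a context-aware classifier, which identifies PICO sentences;

Step 3: semantic similarity. Jaccard similarity is used to score how close the query is to each PICO text sequence, enriched with semantics from medical ontologies;

Step 4: generate the summary from the highest-scoring PICO sentences.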

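Results:

1. Quality recognition: 95.41% (2562/2686);

2. Classification into 5 classes (aim, population, intervention, results, outcome): 93% (16127/17341);

3. The semantic similarity algorithm improved on the baseline by 8.9%;

4. The generated summaries were rated by three experts along several dimensions. The ratings showed a fairly high positive correlation, indicating that the automatic summarization system is satisfactory.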
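The two accuracy figures above are quoted together with their raw counts, so they can be recomputed directly. The snippet below is only a quick arithmetic check; the helper name `accuracy_percent` is made up for this note and does not come from the paper.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 2)


# Counts as quoted in the results list.
pqr_accuracy = accuracy_percent(2562, 2686)    # quality recognition
cca_accuracy = accuracy_percent(16127, 17341)  # 5-class sentence classification

print(pqr_accuracy, cca_accuracy)
```

The second ratio matches the quoted 93%. The first comes out at about 95.38%, marginally below the quoted 95.41%; these notes do not explain the difference.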

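Conclusion: Applying the proposed Biomed-Summarizer achieves high accuracy in ATS and enables seamless curation of research evidence from the biomedical literature for use in clinical decision-making.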

Overall framework

[Figure: overall framework]

It consists of four main parts:

1. data preprocessing

2. quality recognition

3. contextual text classification

4. text summarization

Workflow (example)

[Figure: workflow example]

Data preprocessing

[Figure: data preprocessing]

Prognosis quality recognition (PQR) model

Five features: two data features (title and abstract) and three metadata features (article type, publishing journal, authors).

[Figure: PQR model]

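CCA classification model

The clinical context-aware (CCA) classification model mainly classifies sentences into the PICO classes.

[Figure: CCA classification model]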

ATS scoring

Automatic text summarization

Sentence extraction is implemented with a sentence-scoring mechanism based on a multi-feature matrix.

[Figure: sentence scoring]

Relevance score

Jaccard similarity metric: Jaccard similarity with semantic enrichments (JS2E)

Semantic enrichment: biomedical ontologies (SNOMED CT, MedDRA, NBO, NIFSTD)

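Steps:

Step 1: clean the query sentence;

Step 2: annotate the sentence using BioPortal;

Step 3: enrich each token semantically using the ontology's "definition," "synonyms," and "prefLabel" fields;

Step 4: retrieve the annotations of the text;

Step 5: build the metatokens data structure;

Step 6: compute the similarity value.

[Figure: JS2E similarity computation]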
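The JS2E steps can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `TOY_ONTOLOGY` is a hypothetical in-memory stand-in for the BioPortal annotation lookup, the stopword list is arbitrary, and the function names (`clean`, `enrich`, `jaccard`, `js2e`) are made up for this note.

```python
import re

# Arbitrary stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "it", "and", "to"}

# Hypothetical stand-in for ontology annotations (the notes use BioPortal).
TOY_ONTOLOGY = {
    "rupture": {
        "prefLabel": "rupture",
        "synonyms": ["bursting"],
        "definition": "tearing of a tissue",
    },
    "aneurysm": {
        "prefLabel": "aneurysm",
        "synonyms": ["aneurysmal dilatation"],
        "definition": "bulging of a blood vessel wall",
    },
}


def clean(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def enrich(tokens, ontology):
    """Build the metatoken set: original tokens plus ontology-derived tokens."""
    metatokens = set(tokens)
    for token in tokens:
        entry = ontology.get(token)
        if entry is None:
            continue
        texts = [entry["prefLabel"], entry["definition"], *entry["synonyms"]]
        for text in texts:
            metatokens.update(clean(text))
    return metatokens


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def js2e(query, sentence, ontology):
    """Jaccard similarity computed over semantically enriched token sets."""
    return jaccard(enrich(clean(query), ontology), enrich(clean(sentence), ontology))


query = "rupture of aneurysm"
sentence = "bursting of the vessel wall"
plain = jaccard(set(clean(query)), set(clean(sentence)))
enriched = js2e(query, sentence, TOY_ONTOLOGY)
print(plain, enriched)
```

In this toy case the plain Jaccard score is 0 because the two texts share no surface tokens, while the enriched score is positive because the synonyms and definitions bring in "bursting", "vessel", and "wall".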

Study Type

Study type plays an important role in clinical practice. It is scored by domain experts.

Venue Credibility

This is also scored by domain experts.

Freshness

[Figure: freshness score]
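The four features above (relevance, study type, venue credibility, freshness) feed one sentence score. The sketch below shows one way such a combination could look. The linear freshness decay, the 10-year window, and the equal weights are all assumptions made for illustration; these notes do not preserve the paper's actual formulas.

```python
from dataclasses import dataclass


@dataclass
class SentenceFeatures:
    relevance: float   # e.g. a similarity score in [0, 1]
    study_type: float  # expert-assigned score, assumed scaled to [0, 1]
    venue: float       # expert-assigned venue credibility, assumed in [0, 1]
    year: int          # publication year of the source study


def freshness(year, current_year, window=10):
    """Assumed linear decay: 1.0 for this year, 0.0 once `window` years old."""
    age = max(0, current_year - year)
    return max(0.0, 1.0 - age / window)


def combined_score(features, current_year, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four features (equal weights are an assumption)."""
    parts = (
        features.relevance,
        features.study_type,
        features.venue,
        freshness(features.year, current_year),
    )
    return sum(w * p for w, p in zip(weights, parts))


example = SentenceFeatures(relevance=1.0, study_type=1.0, venue=1.0, year=2020)
print(combined_score(example, current_year=2020))
```

Keeping the weights as a parameter makes it easy to try other trade-offs, for example giving relevance more influence than freshness.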

Text Selection for Summary

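(1) PICO-based summary: for each PICO part, select the top-K sentences and build the summary from them;

(2) non-PICO-based summary: sentence classification is ignored and the overall top-K sentences are taken directly.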
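The two selection strategies differ only in whether sentences are grouped by class before the top-K cut. A minimal sketch, assuming each sentence is a `(text, label, score)` tuple; the tuple layout, the label names, and the function names are illustrative and not taken from the paper.

```python
from collections import defaultdict

# Assumed class order for assembling the PICO-based summary.
PICO_ORDER = ("aim", "population", "intervention", "results", "outcome")


def top_k(sentences, k):
    """Return the k highest-scoring sentences, best first."""
    return sorted(sentences, key=lambda s: s[2], reverse=True)[:k]


def pico_summary(sentences, k, order=PICO_ORDER):
    """Take the top-k sentences within each class, in class order."""
    groups = defaultdict(list)
    for sentence in sentences:
        groups[sentence[1]].append(sentence)
    summary = []
    for label in order:
        summary.extend(top_k(groups[label], k))
    return summary


def non_pico_summary(sentences, k):
    """Ignore the class labels and take the overall top-k sentences."""
    return top_k(sentences, k)


scored = [
    ("A1", "aim", 0.9),
    ("R1", "results", 0.8),
    ("R2", "results", 0.7),
    ("O1", "outcome", 0.4),
    ("R3", "results", 0.6),
]
pico_texts = [s[0] for s in pico_summary(scored, k=1)]
flat_texts = [s[0] for s in non_pico_summary(scored, k=3)]
print(pico_texts, flat_texts)
```

With this toy input the PICO-based summary keeps a low-scoring outcome sentence that the flat top-K cut would drop, which is the point of selecting per class.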

Example Case

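Input: "How does family history affect rupture probability in intracranial aneurysms; is it a significant factor?" Summary query: intracranial aneurysm family history

[Figure: example case]

First, query extraction;

Second, the PubMed search service: the search returned 239 studies, of which 130 were prognosis studies;

Third, the 130 prognosis studies are filtered for quality by the PQR model;

Fourth, the PICO classification model: Aim (32), Patients (9), Intervention (1), Results (168), and Outcome (49);

Fifth, compute the semantic similarity;

Sixth, combine the scores and rank the sentences;

Seventh, generate the summary.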

Experiments

Dataset: BioMed_Summarizer. Brain_Aneurysm_Research. GitHub URL: https://github.com/smileslab/Brain_Aneurysm_Research/tree/master/BioMed_Summarizer [accessed 2020-10-07]

Tool: RapidMiner

PQR results:

[Figure: PQR results]

CCA results:

[Figure: CCA results]

**Proposed Semantic Similarity Algorithm (JS2E)** results:

[Figure: JS2E results]

Evaluation of the extracted text:

[Figure: evaluation of the extracted summaries]

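Conclusion

This is a fairly comprehensive systems paper, mainly an application of deep learning. Its components are worth consulting when designing a medical summarization system in the future.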

Related techniques

ATS in the biomedical domain

Summarization falls into two categories: abstractive and extractive.

Survey article: Gambhir M, Gupta V. Recent automatic text summarization techniques: a survey. Artif Intell Rev 2016 Mar 29;47(1):1-66. [doi: 10.1007/s10462-016-9475-9]

Categories: statistical-, topic-, graph-, discourse-, and machine learning–based approaches;

Item set–based mining approaches extract domain concepts to generate graph-based summaries.

Nasr Azadani M, Ghadiri N, Davoodijam E. Graph-based biomedical text summarization: An itemset mining and sentence clustering approach. J Biomed Inform 2018 Aug;84:42-58 [FREE Full text] [doi: 10.1016/j.jbi.2018.06.005] [Medline: 29906584]

Moradi M. Small-world networks for summarization of biomedical articles. arXiv 2019 Mar 7:1903.02861 [FREE Full text]

Quantifying the informativeness for biomedical literature summarization: An itemset mining method. Comput Methods Programs Biomed 2017 Jul;146:77-89. [doi: 10.1016/j.cmpb.2017.05.011] [Medline: 28688492]

Based on statistical features such as term frequency, sentence position, and similarity with the title

Luhn HP. The Automatic Creation of Literature Abstracts. IBM J Res Dev 1958 Apr;2(2):159-165. [doi: 10.1147/rd.22.0159]

Ferreira R, de Souza Cabral L, Freitas F, Lins RD, de França Silva G, Simske SJ, et al. A multi-document summarization system based on statistics and linguistic treatment. Exp Syst Appl 2014 Oct;41(13):5780-5787. [doi: 10.1016/j.eswa.2014.03.023]
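The three statistical features named above (term frequency, sentence position, similarity with the title) can be combined into a simple extractive scorer. The sketch below is a generic illustration with equal weights, not the method of either cited paper; the normalizations and the function names are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def score_sentences(title, sentences):
    """Score each sentence by the mean of three features in [0, 1]."""
    title_tokens = set(tokenize(title))
    tf = Counter(tok for s in sentences for tok in tokenize(s))
    max_tf = max(tf.values()) if tf else 1
    n = len(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        if not tokens:
            scores.append(0.0)
            continue
        # Average document-level term frequency, normalized by the maximum.
        tf_score = sum(tf[t] for t in tokens) / (len(tokens) * max_tf)
        # Earlier sentences score higher.
        position_score = 1.0 - i / n
        # Jaccard overlap between the sentence and the title.
        unique = set(tokens)
        title_score = len(unique & title_tokens) / len(unique | title_tokens)
        scores.append((tf_score + position_score + title_score) / 3)
    return scores


title = "Aneurysm rupture risk"
sentences = [
    "Aneurysm rupture risk rises with family history.",
    "The weather was mild.",
]
scores = score_sentences(title, sentences)
print(scores)
```

In this toy input the first sentence ranks higher because it comes first and shares words with the title.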

Introducing semantic information from external sources

Ferreira R, de Souza Cabral L, Freitas F, Lins RD, de França Silva G, Simske SJ, et al. A multi-document summarization system based on statistics and linguistic treatment. Exp Syst Appl 2014 Oct;41(13):5780-5787. [doi: 10.1016/j.eswa.2014.03.023]

Lynn HM, Choi C, Kim P. An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms. Soft Comput 2017 Apr 27;22(12):4013-4023. [doi: 10.1007/s00500-017-2612-9]

Identification of PICO elements in text (3 main types)

Type 1: individual PICO element identification;

Bui DDA, Del Fiol G, Hurdle JF, Jonnalagadda S. Extractive text summarization system to aid data extraction from full text in systematic review development. J Biomed Inform 2016 Dec;64:265-272 [FREE Full text] [doi: 10.1016/j.jbi.2016.10.014] [Medline: 27989816]

Boudin F, Shi L, Nie J. Improving medical information retrieval with PICO element detection. : Springer; 2010 Mar Presented at: European Conference on Information Retrieval; March 2010; Berlin, Heidelberg. [doi: 10.1007/978-3-642-12275-0_8]

Huang K, Chiang I, Xiao F, Liao C, Liu C, Wong J. PICO element detection in medical text without metadata: are first sentences enough? J Biomed Inform 2013 Oct;46(5):940-946 [FREE Full text] [doi: 10.1016/j.jbi.2013.07.009] [Medline: 23899909]

Type 2: sentence classification;

Jin D, Szolovits P. PICO Element Detection in Medical Text via Deep Neural Networks. In: BioNLP 2018 Workshop.: Association for Computational Linguistics; 2018 Jul Presented at: Proceedings of the BioNLP 2018 workshop; July 2018; Melbourne, Australia URL: https://www.aclweb.org/anthology/papers/W/W18/W18-2308/ [doi: 10.18653/v1/w18-2308]

Kim S, Martinez D, Cavedon L, Yencken L. Automatic classification of sentences to support Evidence Based Medicine. BMC Bioinformatics 2011;12(Suppl 2):S5. [doi: 10.1186/1471-2105-12-s2-s5]

Type 3: question and answer with summarization;

Bui DDA, Del Fiol G, Hurdle JF, Jonnalagadda S. Extractive text summarization system to aid data extraction from full text in systematic review development. J Biomed Inform 2016 Dec;64:265-272 [FREE Full text] [doi: 10.1016/j.jbi.2016.10.014] [Medline: 27989816]

Demner-Fushman D, Lin J. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Comp Ling 2007 Mar;33(1):63-103. [doi: 10.1162/coli.2007.33.1.63]

element level and sentence level

Zlabinger M, Andersson L, Hanbury A, Andersson M, Quasnik V, Brassey J. Medical entity corpus with pico elements and sentiment analysis. In: European Language Resources Association (ELRA). 2018 May Presented at: Eleventh International Conference on Language Resources and Evaluation (LREC 2018); 2018; Miyazaki, Japan

machine learning and rule-based methods

Chabou S, Iglewski M. PICO Extraction by combining the robustness of machine-learning methods with the rule-based methods. : IEEE; 2015 Jun Presented at: 2015 World Congr Inf Technol Comput Appl WCITCA 2015 Internet IEEE; June 2015; Hammamet, Tunisia. [doi: 10.1109/wcitca.2015.7367038]

supervised distant supervision approach

Wallace B, Kuiper J, Sharma A, Zhu M, Marshall I. Extracting PICO Sentences from Clinical Trial Reports using Supervised Distant Supervision. J Mach Learn Res 2016;17:132 [FREE Full text] [Medline: 27746703]

naive Bayes–based classifier

Huang K, Chiang I, Xiao F, Liao C, Liu C, Wong J. PICO element detection in medical text without metadata: are first sentences enough? J Biomed Inform 2013 Oct;46(5):940-946 [FREE Full text] [doi: 10.1016/j.jbi.2013.07.009] [Medline: 23899909]

multiple supervised classification algorithms

Boudin F, Nie J, Bartlett JC, Grad R, Pluye P, Dawes M. Combining classifiers for robust PICO element detection. BMC Med Inform Decis Mak 2010 May 15;10(1):29 [FREE Full text] [doi: 10.1186/1472-6947-10-29] [Medline: 20470429]

Quality of biomedical research

Towards automatic recognition of scientifically rigorous clinical research evidence (2009)
An overview of the design and methods for retrieving high-quality studies for clinical care (2005)
Developing optimal search strategies for detecting clinically sound prognostic studies in MEDLINE: an analytic survey (2004)
Text categorization models for high-quality article retrieval in internal medicine (2005)
A comparison of citation metrics to machine learning filters for the identification of high quality MEDLINE documents (2006)
MEDLINE clinical queries are robust when searching in recent publishing years (2013; Medical Subject Heading (MeSH) terms)
A Deep Learning Method to Automatically Identify Reports of Scientifically Rigorous Clinical Research from the Biomedical Literature: Comparative Analytic Study (2018; Medical Subject Heading (MeSH) terms)
Impact of Automatic Query Generation and Quality Recognition Using Deep Learning to Curate Evidence From Biomedical Literature: Empirical Study (2019)
Comparison of the time-to-indexing in PubMed between biomedical journals according to impact factor, discipline, and focus (2017)

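Sentence scoring and ranking for summarization

[45]: the most common approach is frequency-based;

[5]: a sentence built around words from the title should receive a higher score;

[4,6,44]: keyword-based techniques;

[3,46-49]: deep learning has also been introduced;

[49]: multidocument summarization;

[46]: a query-focused summarization system called AttSum (AttSum: Joint learning of focusing and summarization with neural attention).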

Reference

brain aneurysm: [medical] cerebral aneurysm

prognosis: the expected course of a disease, predicted from experience

scientifically sound: scientifically reasonable

happyprince; https://blog.csdn.net/ld326/article/details/115012807

